An external view of the central hub of the station with a show title overlay

Upscaling Star Trek Deep Space Nine

Star Trek: Deep Space Nine is my favourite Star Trek series, and my second favourite TV series overall, right after Babylon 5. I’ve been rewatching the series every two or three years for the last couple of decades or so, and I am due for another round just about now.

While some of the other older Star Trek series have been re-released in HD quality on Bluray, Deep Space Nine and Voyager notably have not. Since streaming services have proven to be more and more unreliable, I usually buy my favourite series on Bluray – or occasionally on DVDs. I have owned the DVD boxsets of Deep Space Nine for years.

While looking for any new updates on a potential Bluray release of DS9, as I occasionally do, I came across a number of discussions about fan upscale efforts, including this article. So I decided to see what I could do with my own discs. It’s been quite an interesting learning experience too. In this post I will share what I found out, and what I ended up doing to upscale my DVDs.

Legal Note

To perform this upscaling, I needed to extract the content from my DVDs. Whether or not that is technically legal varies from place to place. Where I live, it is fine for personal use, but I will not be covering that part of the process in this post.

Researching Video Upscaling

There are some examples of official releases out there using upscaling techniques. For instance, I own a Bluray release of Stargate SG-1 where several seasons have been upscaled. They do look okay-ish, but not necessarily great. The discs even come with a disclaimer about it when loading. Based partially on this, I’ve been a bit sceptical of upscales because results really do vary quite a lot.

While doing the preliminary research, I found a number of discussions online, mostly on Reddit, about AI upscaling of both DS9 and Voyager, as well as other older TV shows. So I started looking at software options. As I usually do, I first looked into Open Source solutions that I could run on my Linux desktop. I found Video2x, and several discussions about this solution. I tested it on a small clip from one of my DVDs, and the result was just so-so – not very impressive. Of course, I may not have been able to make it perform optimally.

The algorithm used in Video2x is primarily designed for animated content, and it generated a lot of weird results for live action video. It had trouble with fine details like design patterns on items and clothing, often creating black lines between the pattern details. It was also quite tricky to get it running properly on my hardware, and it renders PNG images as output, which is tedious and resource-consuming to work with.

Several times when reading about AI upscaling I saw TopazLabs being mentioned. Since it only runs on Windows and MacOS, I mostly dismissed it – until I came across another Reddit thread discussing it in more detail. It turns out you can download a trial, so I did just that, and installed it in a VirtualBox VM running Windows 10. I assigned it 8 CPU cores to give it something to work with, and set it to render a few short clips from DS9. I was really surprised by how good those clips turned out compared to others I’d seen, despite the trial version plastering a massive watermark across the middle of each frame.

I needed more processing power though, and the VM wasn’t really cutting it with a framerate at about 0.1-0.2 fps. I was looking at about 9 hours per episode. Luckily, I had just bought a new desktop PC for my home office (I work from home), so the old one was now free, sitting in a corner of my office waiting to go into storage. It has a decent AMD Radeon graphics card in it too. I installed Windows 10 on it, and set up remote desktop so I could access it from my new Debian desktop. Then I installed Topaz Video AI, and dished out the steep $299 for the full license.

Time to go all in on this project!

Figuring Out the AI Settings

Then started a lot of fiddling with AI engines and their settings. Topaz has a brand new engine named Iris, which was released just a few months ago. They wrote a post about its release: “Iris v1: Face Enhancement for Low-Quality Videos (June 2023)”. It seemed perfect for what I was trying to do, so I gave it a spin.

I started with the first episode of DS9, which is not really in the best state on the DVD, as you can see on the left in the image below. It is quite grainy and has a lot of colour artifacts and noise. I rendered a sample image from one of the early scenes, using the estimated settings of the Iris AI engine at a 2x upscale. The result is on the right in the image.

Comparison between two versions of the same frame. One original, and one upscaled.

Left: the original DVD. Right: the AI upscaled version. Click for a larger version.

So, already from the start this looked quite promising. However, there are a lot of settings for the Iris AI engine, and while they are fairly self-explanatory in terms of what they do, it isn’t immediately obvious what the number values mean and what effects they have at various levels.

A screenshot of the settings panel for the Iris AI model

Topaz AI settings for Iris

I did a lot of manual tuning tests of these settings. Sometimes I could see a clear difference when tweaking them, sometimes not. I tried running them on “Auto”, but at least for the early episodes, which were the most challenging, I wasn’t entirely satisfied.

I did a bit of reading, and saw someone recommend using the estimated settings as a starting point, and making sample estimates of various scenes of the episodes. I reasoned that the seasons are mostly consistent in quality, and looking through a few episodes seemed to confirm this. So I set up a spreadsheet in LibreOffice Calc, where I listed all the options in rows per season. Then I picked a scene from each episode of a season, trying to hit a variety of scene types and lighting, and ran the estimate feature in Topaz. I recorded the numbers, and used the averages to assemble a set of values for each season.
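In code terms, the averaging step amounts to something like this (a minimal sketch with made-up numbers – I did the actual bookkeeping in the spreadsheet):

season_estimates = {
    "compression": [0.28, 0.32, 0.30],
    "details":     [0.12, 0.14, 0.13],
}
# Average each option across the sampled episodes of a season.
season_params = {k: round(sum(v) / len(v), 2) for k, v in season_estimates.items()}
print(season_params)  # {'compression': 0.3, 'details': 0.13}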

Running some sample upscale jobs, this approach seemed to work fairly well. The only setting I consistently adjusted was “Revert Compression”. Tuning it higher than the estimated value generally produced a better result on the DS9 DVD sources.

I have to say that the Topaz Video AI user interface isn’t great. I was initially using version 3.3.10, then 3.4. The former had a number of issues. For instance, I could not use the Iris AI engine with cropping enabled: it would always error when running the job, but work fine when generating the preview. The application also always generates two large video export files, one with a “temp” suffix. As far as I can figure out, this is the file used to generate the video preview in the app itself. Looking at the underlying FFmpeg command, it splits the video stream after the AI engine run and encodes two outputs with slightly different settings, one of them optimised for playback while encoding. This is really wasteful, as I have no need to watch the actual encoding jobs – especially when planning to queue up 173 episodes!

Topaz AI on Command Line

Luckily, Topaz Video AI uses FFmpeg under the hood to handle all of the video processing and encoding. You can even set up a job in the app and then, from the Topaz menu, generate the command needed to run it from the command line. Since I’m reasonably familiar with FFmpeg, I figured that was the best approach to running such a large batch of jobs.

To manage all of this, I set up a Python script. I split up the various steps of the process as they appear in the commands generated by Topaz, and added places to inject my own per-episode and per-season settings. I also stripped out the preview encoding process, as I have no need for that, and added my own sample clip generating logic that extracts 20 second preview samples before running the full jobs. These were particularly useful for fine-tuning my cropping settings.
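Conceptually, the sample clips are just a matter of splicing a seek and a duration limit into the same FFmpeg command, so the full filter chain only runs on a short span. A minimal sketch of the idea (the function name and the 5-minute start offset are mine, not necessarily what the actual script uses):

def limit_to_sample(cmd: list[str], start: int = 300, length: int = 20) -> list[str]:
    # Insert "-ss" (seek) and "-t" (duration) as input options in front of
    # "-i", so only a 20-second span starting at `start` seconds is processed.
    i = cmd.index("-i")
    return cmd[:i] + ["-ss", str(start), "-t", str(length)] + cmd[i:]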

I’ve made the Python script I used available on my GitHub, so feel free to grab it and use it as a starting point.

The script is written in such a way that the arguments to FFmpeg are defined as Python dictionaries, which are converted to command line arguments when the job is run. This makes it easier to play around with the settings, and comment them in or out when testing.
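The conversion itself is straightforward – roughly like this (a sketch, assuming the dictionary layout shown later in this post):

def dict_to_args(options: dict[str, str]) -> list[str]:
    # {"c:v": "prores_ks", "profile:v": "1"}
    #   -> ["-c:v", "prores_ks", "-profile:v", "1"]
    args: list[str] = []
    for key, value in options.items():
        args += [f"-{key}", value]
    return args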

Another benefit of handling this on the command line was that I could control the cropping and reshaping of the PAL video format as I wanted. For instance, Topaz insists on scaling before cropping, and would add black bars afterwards to compensate. I did not want that. I wanted to crop the frame first, then force it back up to the full PAL frame size of 768 by 576 pixels. The figure below illustrates what I wanted to do.

Two versions of the same video frame, side by side. One has red cropping lines, the other is the result, rescaled to fill the frame.

Left: Original frame with red cropping lines. Right: Cropped and rescaled frame.

Exactly what the correct shape of the PAL image is supposed to be is a bit hard to tell. However, the forced stretching I’m doing here is small enough that it isn’t visible anyway. In most of these cases, I stretched slightly more horizontally than vertically.

A Bit About PAL DVDs

This section is a bit of a side note, so if you just want to see the settings I used, you can just skip it.

As mentioned, my DVD boxsets of Deep Space Nine are PAL DVDs. There are a few issues with that format, as indicated above. For starters, the original material is US-made and film-sourced at approximately 24 frames per second (fps) – 24000/1001 fps to be precise – while the PAL standard uses 25 fps. When converting to PAL, they often just increase the frame rate, effectively speeding up the playback by approximately 4%. They do the same with the audio track. It is not really noticeable unless you compare the two versions directly, but as a result, each PAL episode is a couple of minutes shorter than the corresponding NTSC version of the episode.
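The arithmetic behind that is simple enough to sanity-check:

ntsc_film_fps = 24000 / 1001       # ~23.976 fps
pal_fps = 25
speedup = pal_fps / ntsc_film_fps  # ~1.0427, i.e. roughly 4% faster
print(f"{45 / speedup:.1f}")       # a 45-minute episode becomes ~43.2 minutes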

Another thing about these old formats is that they were designed to be compatible with old analog TVs, despite DVDs themselves being digital. You know, the 4:3 format heavy tube TVs we’ve almost forgotten existed. Or CRT TVs as they’re technically called (CRT stands for Cathode Ray Tube).

CRT TVs don’t really work in pixels like modern screens and TVs do – at least not in the same way. They instead draw lines across the screen. In a digital format like the PAL DVD, each frame is stored as 576 lines of 720 pixels each. If you do the math, 720:576 reduces to 5:4, which does not correspond to a 4:3 screen ratio – it is off by a ratio of 16:15. So what they do, in addition, is stretch each pixel along each line a little to make the image the correct shape. These are called anamorphic pixels. During the upscale process, I converted all of the PAL video files to a clean 4:3 format with square pixels. That is, I stretched them to fit the frame.
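A quick check of that 16:15 figure:

storage_ratio = 720 / 576             # 1.25, i.e. 5:4
display_ratio = 4 / 3                 # ~1.333
print(display_ratio / storage_ratio)  # 1.0666... = 16/15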

Another quirk of old tube TVs is that the image drawn on the screen is usually larger than the screen’s image area. So a lot of DVDs have black borders on the sides, and sometimes also top and bottom. The former is called “pillarboxing” and the latter is called “letterboxing”. To get a clean HD image, it is useful to get rid of these black areas too. Hence the cropping mentioned in the previous section. This too required some stretching to get the frames “back in shape” so to speak.

Colour Encoding

As an addendum that isn’t really relevant to the upscaling: a note on the colour format used for video material. Computers generally work with RGB colours – red, green and blue – just like our eyes do. However, broadcast signals were originally black and white, so they only carried a single video signal for luminance.

When colour TVs came around, the broadcast signals needed to add the colour information, but not in a way that made old black and white TVs unusable. So they kept the luminance signal unchanged, and added two more signals that told your colour TV, if you had one, how to split the luminance signal into a total of three colour signals. The encoding of the signal is often referred to as “YUV” coding, as opposed to “RGB”. “Y” is the black and white component, and “U” and “V” the two colour difference components.

Video is still generally encoded as YUV, even on Blurays and streaming services. It’s a standard that just stuck around, and there is actually a benefit in keeping it. Since our eyes are more sensitive to light variations than colour variations, most video formats store a value for every pixel in the “Y” component, but only for every fourth pixel in each colour component. That works out to 1 + ¼ + ¼ = 1.5 samples per pixel instead of 3, which saves storage space. Usually this is done in blocks of 2 by 2 pixels, so it is generally recommended that video resolutions have even dimensions. This is important to keep in mind when you crop videos.
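If you script your cropping, a trivial guard like this (purely illustrative, not from my actual script) keeps the dimensions safe for such 4:2:0 content:

def snap_even(n: int) -> int:
    # Round down to an even number so the 2x2 chroma blocks stay aligned.
    return n - (n % 2)

print(snap_even(575))  # 574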

Preparing the PAL Source

In order to figure out the cropping settings of each episode of DS9, I loaded them all, one season at a time, into Topaz Video AI. It has a visual cropping tool that you can use to drag sliders to crop the input video. I wrote down each corresponding number for each episode into my Python script.

For the first four seasons, most episodes were cropped to 702 by 574 pixels, with an x-offset around 11 and a y-offset of usually 1, but sometimes 2. The remaining seasons mostly did not need cropping at all, except for a handful of episodes in season 6.

Here is an example of the settings in the Python script for the first few episode files:

JOBS = [
    # Season 1
    {"file": "DS9_1_1_t00.mkv", "crop": "w=702:h=572:x=12:y=2", "params": "S1"},
    {"file": "DS9_1_1_t01.mkv", "crop": "w=702:h=572:x=11:y=2", "params": "S1"},
    {"file": "DS9_1_1_t02.mkv", "crop": "w=702:h=572:x=11:y=2", "params": "S1"},
    {"file": "DS9_1_2_t00.mkv", "crop": "w=702:h=572:x=11:y=2", "params": "S1"},
]

My Python script generates 20 second clips of each episode, and I used those clips to check that my cropping was OK. I had to tweak a few here and there.

As you can see, I also have a setting called params. These are the Iris AI engine parameters that I mentioned in an earlier section. I stored those per-season, so S1 is the key indicating that these episodes use the Season 1 parameter set.

The per-season parameters I used were:

AI_PARAMS = {
    "S1": "compression=0.30:details=0.13:blur=0.28:noise=0.06:halo=0.09:preblur=-0.05:blend=0.2",
    "S2": "compression=0.30:details=0.15:blur=0.28:noise=0.06:halo=0.09:preblur=-0.05:blend=0.2",
    "S3": "compression=0.30:details=0.14:blur=0.26:noise=0.06:halo=0.09:preblur=-0.10:blend=0.2",
    "S4": "compression=0.25:details=0.12:blur=0.20:noise=0.05:halo=0.08:preblur=-0.10:blend=0.2",
    "S5": "compression=0.25:details=0.12:blur=0.20:noise=0.05:halo=0.08:preblur=-0.10:blend=0.2",
    "S6": "compression=0.25:details=0.12:blur=0.20:noise=0.05:halo=0.06:preblur=-0.05:blend=0.2",
    "S7": "compression=0.25:details=0.12:blur=0.20:noise=0.07:halo=0.06:preblur=-0.05:blend=0.2",
}

I didn’t completely follow the averages. For instance, I kept the compression setting high. I checked everything by generating 20 second samples of all 173 episode files before starting the full run.

The Complex Filter Settings

All of the transformations of the video source are handled by a single FFmpeg option called filter_complex. In my Python script, this filter is assembled from the following pieces, with a short sketch of the assembly after the list:

COMPLEX_FILTER = [
    "crop={crop}",
    "scale=w=768:h=576",
    "setsar=1",
    "tvai_up=model=iris-1:scale=2:{params}:device=0:vram=1:instances=1",
    "scale=w=1440:h=1080:flags=lanczos:threads=0",
]

The filter consists of five elements, one per entry in the Python list:

  1. The initial cropping of black edges. These are per-episode, so they are inserted at the {crop} mark when the command is run, based on the values defined together with each episode in the JOBS list.
  2. After cropping, we brute force rescale to 768 by 576 pixels. This is the standard PAL resolution after the anamorphic pixel ratio has been accounted for. Normally, you would never brute force stretch like this, but we’re running this through an AI engine after, so it should be fine.
  3. Then we force the pixel ratio value to be 1:1, which is what the setsar command does.
  4. Then we call the Topaz AI engine. We’re running iris-1, with an upscale factor of 2. The parameters are defined per-season, so they are loaded dynamically into {params} later. The remaining parameters were generated by Topaz for how the upscaling is processed on my hardware.
  5. Finally, we actually downscale the result from the AI engine. Doubling 768 by 576 results in a slightly larger image than the standard 1440 by 1080 that is HD for 4:3 video. I am still uncertain whether this was a good choice or not, but I wanted to stick with a standard format.
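The assembly itself amounts to something like this (a minimal sketch – the real logic is in the script on my GitHub):

def build_filter(crop: str, params: str) -> str:
    # Join the five elements into one comma-separated filter_complex string,
    # substituting the per-episode crop and the per-season AI parameters.
    return ",".join(COMPLEX_FILTER).format(crop=crop, params=params)

# e.g. build_filter("w=702:h=572:x=11:y=2", AI_PARAMS["S1"])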

This is a rather time-consuming step. It runs much faster on a decent graphics card though. I have an AMD Radeon RX 5700 in the PC I ran this on, and it gave me a framerate of 10 fps. That is, for each minute of video, it took 2.5 minutes to upscale. Note that the CPU is also used quite a lot in the process. I suspect it’s handling the encoding of the result – which encoder I used affected CPU usage quite a bit.

Encoding Settings

The remaining settings to FFmpeg were mostly just picked from the command line arguments Topaz gave me, except I stripped away the preview video output as mentioned. I picked the following output format settings:

VIDEO_ENC = {
    # Apple ProRes LT
    "c:v": "prores_ks",
    "profile:v": "1",
    "vendor": "apl0",
    "quant_mat": "lt",
    "bits_per_mb": "525",
    "pix_fmt": "yuv422p10le",
}

These are standard settings for the ProRes LT option in the Export section of Topaz. It produces a video file of about 20 ± 3 GB per episode, with a bitrate around 60-70 Mbps. That’s a pretty high quality result.
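Those numbers are easy to sanity-check against each other, assuming a typical ~43-minute PAL episode:

bitrate_mbps = 65                                # middle of the 60-70 Mbps range
minutes = 43
size_gb = bitrate_mbps * minutes * 60 / 8 / 1000
print(f"{size_gb:.0f} GB")                       # ~21 GB per episode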

For audio settings, I just copied track 0. I already ensured that I only had the English sound track in the files I extracted from my DVDs. The other 4 dubbed languages were discarded. I also discarded the DVD subtitles. They wouldn’t work very well here anyway as they are images, not text.

With all of this queued up in my Python script, my PC churned out about 12 episodes per day, or one every two hours.

Using the FFmpeg command directly turned out to work pretty well. You don’t get the same progress information that the Topaz user interface gives you, but it’s not like I was going to sit and watch a progress bar for two weeks anyway. FFmpeg does print its own progress output if you need to check where a job is at.
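If you want that progress in machine-readable form, FFmpeg can emit it with the -progress option. A minimal sketch (the file names are placeholders):

import subprocess

# "-progress pipe:1" makes FFmpeg write key=value progress pairs to stdout;
# "-nostats" silences the default in-place status line.
cmd = ["ffmpeg", "-i", "input.mkv", "-progress", "pipe:1", "-nostats", "output.mkv"]
with subprocess.Popen(cmd, stdout=subprocess.PIPE, text=True) as proc:
    for line in proc.stdout:
        if line.startswith(("frame=", "fps=", "out_time=")):
            print(line.strip())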

The cleaned up FFmpeg commands are fairly straightforward, and aside from generating previews and figuring out some preliminary settings, I don’t really have much need for the Topaz app itself. It is after all the AI engine I’m interested in.

Here’s an example of the assembled FFmpeg command for one of the episodes:

ffmpeg -i D:\Upscale\Input\DS9_6_2_t01.mkv -sws_flags spline+accurate_rnd+full_chroma_int -color_trc 2 -colorspace 2 -color_primaries 2 -map_metadata 0 -map_metadata:s:v 0:s:v -map_metadata:s:a:0 0:s:i:0 -filter_complex crop=w=712:h=574:x=5:y=1,scale=w=768:h=576,setsar=1,tvai_up=model=iris-1:scale=2:compression=0.25:details=0.12:blur=0.20:noise=0.05:halo=0.06:preblur=-0.05:blend=0.2:device=0:vram=1:instances=1,scale=w=1440:h=1080:flags=lanczos:threads=0 -c:v prores_ks -profile:v 1 -vendor apl0 -quant_mat lt -bits_per_mb 525 -pix_fmt yuv422p10le -map 0:a -c:a copy D:\Upscale\Output\DS9_6_2_t01_upscale.mkv

Final Encoding Step

The videos from the upscaling job were, as mentioned, about 20 GB for a single episode, and twice that for the three double episodes. Since this is a massive size, I encoded all of them one more time with x265. I generally just use Handbrake for straightforward encoding jobs. You can define encoding presets, load up each upscaled episode as it becomes ready, and just hit “Add to Queue”. Handbrake is then perfectly happy to sit there for weeks on end, encoding your videos.

Some would insist on using FFmpeg here too, for the added control, but you can pass any custom parameters you want to the encoding library from Handbrake as well, so as long as I don’t need any of the other special FFmpeg settings, I usually don’t bother.

For these files I used the x265 10-bit library. The AI upscale job spits out YUV422, and this step reduces that to YUV420 as well. I set the RF scale to 16, and used the “slower” preset. I then added the following custom settings, which are the ones I usually use for HD content:

strong-intra-smoothing=0:sao=0:rect=1:aq-mode=3:limit-refs=3:rd=4:bframes=8:pools=8

The bframes=8 setting is technically redundant since it is the default for the “slower” preset, but I also use these encoding settings with the “slow” preset, where I still want bframes kept at 8. The pools=8 setting restricts the encoding job to 8 CPU cores. I have an Intel i9-12900K in my Linux PC, which has 8 performance (fast) cores and 8 efficiency cores, so this lets me use the computer for other things while it’s encoding.
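For reference, a rough FFmpeg equivalent of this Handbrake preset would look something like the line below (an untested sketch assuming a 10-bit libx265 build; Handbrake’s RF value corresponds to x265’s CRF here, and the output file name is made up):

ffmpeg -i DS9_6_2_t01_upscale.mkv -c:v libx265 -preset slower -crf 16 -pix_fmt yuv420p10le -x265-params strong-intra-smoothing=0:sao=0:rect=1:aq-mode=3:limit-refs=3:rd=4:bframes=8:pools=8 -c:a copy DS9_6_2_t01_final.mkv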

The encoding speed here is roughly the same as the upscale job. I get about 10-12 fps on average. That’s a total of 4 hours of processing time per episode, but since they’re on different computers, they run in parallel. The whole job takes about two weeks to complete for the 173 episodes of Deep Space Nine, with the hardware I have.

Each final video file is about 1-2 GB in size with the above settings. x265 (HEVC) is a very efficient video codec, and the quality is great!

Examples and Conclusion

Finally, here are some 20 second example clips from a few of the upscaled episodes. They are best viewed on a computer and by right-clicking and opening them in a new tab. The layout of the website here is smaller than the video resolution, so your browser is scaling it down a bit when you play it directly in the blog post.

These clips are from the last two seasons, where the source material is of pretty good quality in the first place. The clips are not using the same encoding as I used in my final encoding pass as that encoding isn’t generally compatible with browsers. These are encoded with x264 and wrapped in mp4 containers.

The results from the earlier seasons were not as clean as this. Some of the artifacts from the original DVDs came through the upscale process – in particular, colour distortions on external views of the Deep Space Nine station, which were filmed using a model. From time to time there is also a sort of shimmer or flicker on some surfaces that wasn’t in the original. But overall, the quality is pretty great, in particular for faces, which Iris is optimised for. I included a clip with a lot of hair below to show that it handles that pretty well too.

Of course, DS9 was filmed for TV. Text on props that is unreadable on the DVD video is also unreadable in the upscale, and some of the alien makeup looks more fake when upscaled. It’s noticeable on close-ups of Ferengis, for instance.

In any case, I hope this run-through of my upscale project was useful. I certainly had a lot of fun experimenting with everything, and I’m already a season into my rewatching of Deep Space Nine with these upscaled episodes!

The Python script I used is available on my GitHub.

A clip from Season 6, Episode 13 “Far Beyond the Stars”

A clip from Season 7, Episode 7 “Once More Unto the Breach”

A clip from Season 7, Episode 18 “‘Til Death Do Us Part”

Comments

You can use your Mastodon account to comment on this post by replying to this thread.

Alternatively, you can copy and paste the URL below into the search field of your Fediverse app or the web interface of your Mastodon server.

The Mastodon integration is based on the implementation by Carl Schwan.