HDR - Flavors and Best Practices

Better Pixels.

Over the last decade we have had a bit of a renaissance in imaging display technology. The jump from SD to HD was a huge bump in image quality. HD to 4k was another noticeable step in making better pictures, but had less of an impact than the previous SD to HD jump. Now we are starting to see 8k displays and workflows. Although this is great for very large screens, this jump has diminishing returns for smaller viewing environments. In my opinion, we are at the point where we do not need more pixels, but better ones. HDR, or High Dynamic Range, images along with wider color gamuts are allowing us to deliver that next major increase in image quality. HDR delivers better pixels!

Last Summer_lake.jpg

Stop… What is dynamic range?

When we talk about the dynamic range of a particular capture system, what we are referring to is the delta between the blackest shadow and the brightest highlight captured. This is measured in stops, typically with a light meter. A stop is a doubling or a halving of light. This power-of-2 way of measuring light correlates well with the logarithmic nature of our eyes. Your eyeballs never “clip” and a perfect HDR system shouldn’t either. The brighter we go, the harder it becomes to see differences, but we never hit a limit.
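Since a stop is a doubling, the number of stops between two light levels is just a base-2 logarithm of their ratio. Here is a minimal sketch of that arithmetic; the function name is my own and the inputs are assumed to be linear light readings in the same units.

```python
import math

def stops_between(darkest: float, brightest: float) -> float:
    """Dynamic range in stops between two linear light levels (same units)."""
    if darkest <= 0 or brightest <= 0:
        raise ValueError("light levels must be positive")
    return math.log2(brightest / darkest)

# Each stop is a doubling, so a 256:1 contrast ratio spans 8 stops.
print(stops_between(1.0, 256.0))  # 8.0
```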

Unfortunately, digital camera sensors do not work in the same way as our eyeballs. Digital sensors have a linear response (a gamma of 1.0) and do clip. Most high-end cameras convert this linear signal to a logarithmic one for post manipulation.
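To make the linear versus log idea concrete, here is a rough sketch. The clipping function mimics a linear sensor, and the log curve is a made-up toy that gives every stop an equal share of the signal. It is not any camera vendor's actual log format, and the 12-stop span is an assumption for illustration.

```python
import math

def sensor_linear(scene_light: float, clip_level: float = 1.0) -> float:
    """Linear response (gamma 1.0) that hard-clips at the sensor's limit."""
    return min(max(scene_light, 0.0), clip_level)

def toy_log_encode(linear: float, stops_below_clip: float = 12.0) -> float:
    """Map linear light to 0..1 so every stop gets an equal slice of signal.

    Hypothetical curve for illustration only, not a real camera log format.
    """
    floor = 2.0 ** -stops_below_clip      # darkest level the toy curve encodes
    linear = max(linear, floor)
    return (math.log2(linear) + stops_below_clip) / stops_below_clip

# In linear, each stop down halves the value. In log, each stop down
# costs the same fixed amount of signal.
for stop in range(4):
    level = sensor_linear(2.0 ** -stop)
    print(stop, level, toy_log_encode(level))
```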

lin_curve_anim_wide.gif

I was never a huge calculus buff but this one thought experiment has served me well over the years.

Say you are at one side of the room. How many steps will it take to get to the wall if each time you take a step, the step is half the distance of your last. This is the idea behind logarithmic curves.

It will take an infinite number of steps to reach the wall, since we can always halve the half.
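The thought experiment is easy to run in a few lines. This sketch assumes the room is 1.0 wide and that the first step covers half of it, which the story above does not actually specify.

```python
def distance_covered(steps: int, first_step: float = 0.5) -> float:
    """Total distance after `steps` steps, each half the length of the last.

    Assumes a room 1.0 wide; `first_step` is the length of the first step.
    """
    total = 0.0
    step = first_step
    for _ in range(steps):
        total += step
        step /= 2.0
    return total

# The gap to the wall halves with every step but never reaches zero.
for n in (1, 2, 4, 8, 16):
    print(n, 1.0 - distance_covered(n))
```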

range_shift_ANIM.gif

Someday we will be able to account for every photon in a scene, but until that sensor is made we need to work within the confines of the range that can be captured.

For example, if the darkest parts of a sampled image are the shadows and the brightest part is 8 stops brighter, we have a range of 8 stops for that image. The way we expose a sensor or a piece of celluloid changes based on a combination of factors: aperture, exposure time and the general sensitivity of the imaging system. Depending on how you set these variables, you can move the captured range up or down in the scene.

Let’s say you had a scene range of 16 stops. This goes from the darkest shadow to direct hot sun. Our imaging device in this example can only handle 8 of the 16 present stops. We can shift the exposure to be weighted towards the shadows, the highlights, or the Goldilocks sweet spot in the middle. There is no right or wrong way to set this range. It just needs to yield the picture that helps to promote the story you are trying to tell in the shot. A 16-bit floating point EXR file can handle roughly 30 stops of range at full precision, and more with reduced precision in the deepest shadows. That is much more than any capture system can deliver currently.
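Here is a small sketch of that sliding window, plus a quick check on how many stops a 16-bit half float spans. The function names and the "shift" parameter are my own inventions for illustration. The half-float limits (65504 as the largest value, 2^-14 as the smallest normal value, 2^-24 as the smallest subnormal) are properties of the format, and the stop count you quote depends on whether you count the reduced-precision subnormal range.

```python
import math

def captured_window(scene_stops: float, sensor_stops: float, shift: float):
    """Return (low, high) of the captured slice, in stops above the darkest shadow.

    shift = 0 weights the exposure to the shadows; larger values slide the
    window toward the highlights. The window is kept inside the scene range.
    """
    max_shift = max(scene_stops - sensor_stops, 0.0)
    shift = float(min(max(shift, 0.0), max_shift))
    return shift, float(min(shift + sensor_stops, scene_stops))

def half_float_stops(include_subnormals: bool = False) -> float:
    """Stops between the smallest and largest positive half-float values."""
    largest = 65504.0
    smallest = 2.0 ** -24 if include_subnormals else 2.0 ** -14
    return math.log2(largest / smallest)

print(captured_window(16, 8, 0))   # weighted to the shadows
print(captured_window(16, 8, 4))   # the Goldilocks middle
print(captured_window(16, 8, 8))   # weighted to the highlights
print(half_float_stops(), half_float_stops(include_subnormals=True))
```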

Latitude is how far you can recover a picture from over or under exposure. Often latitude is conflated with dynamic range. In rare cases they are the same, but more often than not your latitude is less than the available dynamic range.

Film, the original HDR system.

Film from its creation always captured more information than could be printed. Contemporary stocks have a dynamic range of 12 stops. When you print that film you have to pick the best 8 stops to show via printing with more or less light. The extra dynamic range was there in the negative but was limited by the display technology.

Flash forward to our digital cameras today. Cameras from Arri, Red, Blackmagic and Sony all boast dynamic ranges over 13 stops. The challenge has always been the display environment. This is why we need to start thinking of cameras not as the image creators but more as the photon collectors for the scene at the time of capture. The image is then “mapped” to your display creatively.

Scene referred grading.

The problem has always been how do we fit 10 pounds of chicken into an 8 pound bag? In the past when working with these HDR camera negatives we were limited to the range of the display technology being used. The monitors and projectors before their HDR counterparts couldn’t “display” everything that was captured on set even though we had more information to show. We would color the image to look good on the device for which we were mastering. “Display Referred Grading,” as this is called, limits your range and bakes in the gamma of the display you are coloring on. This was fine when the only two mediums were SDR TV and theatrical digital projection. The difference between 2.4 video gamma and 2.6 theatrical gamma was small enough that you could make a master meant for one look good on the other with some simple gamma math. Today the deliverables and masters are numerous with many different display gammas required. So before we even start talking about HDR, our grading space needs to be “Scene Referred.” What this means is that once we have captured the data on set, we pass it through the rest of the pipeline non-destructively, maintaining the relationship to the original scene lighting conditions. “No pixels were harmed in the making of this major motion picture.” is a personal mantra of mine.
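The “simple gamma math” for moving between a 2.4 and a 2.6 display can be sketched like this. It is a simplified illustration under my own assumptions: it only re-encodes the gamma so the relative light off the screen matches, and it ignores the difference in peak brightness and viewing surround between a video monitor and a theater.

```python
def retarget_gamma(code_value: float, source_gamma: float = 2.4,
                   target_gamma: float = 2.6) -> float:
    """Re-encode a 0..1 display-referred value for a display with a different gamma.

    Decode with the source gamma to get relative display light, then encode
    for the target gamma so the light leaving the screen stays the same.
    """
    display_light = code_value ** source_gamma
    return display_light ** (1.0 / target_gamma)

# Mid-range values get lifted slightly when going from 2.4 to 2.6.
print(retarget_gamma(0.5))
```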

I’ll add the tone curve later.

There are many different ways of working scene-referred. The VFX industry has been working this way for decades. The key point is we need to have a processing space that is large enough to handle the camera data without hitting the boundaries, i.e. clipping or crushing in any of the channels. This “bucket” also has to have enough samples (bit depth) to be able to withstand aggressive transforms. 10 bits are not enough for HDR grading. We need to be working in full 16-bit floating point.

This is a bit of an exaggeration, but it illustrates the point. Many believe that a 10-bit signal is sufficient for HDR. I think for color work 16-bit is necessary. This ensures we have enough steps to adequately describe the meat-and-potatoes part of the image in addition to the extra highlight data at the top half of the code values.

Bit-depth is like butter on bread. Not enough and you get gaps in your tonal gradients. We want a nice smooth spread on our waveforms.
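To put a rough number on the butter, this sketch counts how many distinct code values a subtle gradient lands on at different bit depths. It is a simplification that uses integer quantization of a normalized 0..1 signal, not the 16-bit float encoding discussed above, and the gradient endpoints are arbitrary values I picked.

```python
def distinct_levels(start: float, end: float, bits: int, samples: int = 4096) -> int:
    """Count the distinct integer code values a smooth 0..1 ramp lands on."""
    max_code = (1 << bits) - 1
    codes = set()
    for i in range(samples):
        value = start + (end - start) * i / (samples - 1)
        codes.add(round(value * max_code))
    return len(codes)

# A gentle sky-like gradient covering 2% of the signal range.
print("10-bit levels:", distinct_levels(0.50, 0.52, 10))
print("16-bit levels:", distinct_levels(0.50, 0.52, 16))
```

Fewer levels across the same gradient means bigger jumps between neighbors, which is where banding comes from once you start pushing the image around.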

Now that we have our non-destructive working space, we use transforms or LUTs to map to our displays for mastering. ACES is a good starting point for a working space and a set of standardized transforms, since it works scene-referred and is always non-destructive if implemented properly. The gist of this workflow is that the sensor linearity of the original camera data has been retained. We are simply adding our display curve for our various masters.

Stops measure scenes, Nits measure displays.

For measuring light on set we use stops. For displays we use a measurement unit called a nit. Nits are a measure of peak brightness, not dynamic range. A nit is equal to 1 cd/m². I’m not sure why there are two units with different nomenclature for the same measurement, but for displays we use the nit. Perhaps candelas per meter squared was just too much of a mouthful. A typical SDR monitor has a brightness of 100 nits. A typical theatrical projector has a brightness of 48 nits. There is no set standard for what is considered HDR brightness. I consider anything over 600 nits HDR. 1000 nits, or 10 times brighter than legacy SDR displays, is what most HDR projects are mastered to. The Dolby Pulsar monitor is capable of displaying 4000 nits, which is the highest achievable today. The PQ signal accommodates values up to 10,000 nits.
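If you think in stops, it can help to express those peak levels as stops above the 100 nit SDR reference. This is only a sketch of the ratio math with a function name of my own. It compares peak brightness only and says nothing about black level, so it is not a dynamic range figure.

```python
import math

def stops_above(reference_nits: float, peak_nits: float) -> float:
    """How many stops brighter (or darker, if negative) a peak is than a reference."""
    return math.log2(peak_nits / reference_nits)

peaks = [("theatrical projector", 48), ("SDR monitor", 100),
         ("typical HDR master", 1000), ("Dolby Pulsar", 4000),
         ("PQ ceiling", 10000)]
for name, nits in peaks:
    print(f"{name}: {nits} nits, {stops_above(100, nits):+.2f} stops vs SDR")
```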

The Sony x300 has a peak brightness of 1000 nits and is the current gold standard for reference monitors.

The Dolby Pulsar is capable of 4000 nit peak white

P-What?

Rec2020 color primaries with a D65 white point

The most common scale to store HDR data is the PQ Electro-Optical Transfer Function. PQ stands for perceptual quantizer. The PQ EOTF was standardized when SMPTE published the transfer function as SMPTE ST 2084. The color primaries most often associated with PQ are Rec2020. BT.2100 is used when you pair the two: PQ transfer function with Rec2020 primaries and a D65 white point. This is similar to how BT.1886 is commonly understood as Rec709 primaries with an implicit 2.4 gamma and a D65 white point. It is possible to have a PQ file with different primaries than Rec2020. The most common variant would be P3 primaries with a D65 white point. Ok, sorry for the nerdy jargon but now we are all on the same page.
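For the curious, here is a minimal sketch of the PQ curve in both directions, using the constants published in SMPTE ST 2084. The function names are my own, and it works on a single luminance value rather than full RGB images.

```python
# Constants as published in SMPTE ST 2084
M1 = 2610 / 16384
M2 = 2523 / 4096 * 128
C1 = 3424 / 4096
C2 = 2413 / 4096 * 32
C3 = 2392 / 4096 * 32
PEAK_NITS = 10000.0  # the luminance that a PQ signal of 1.0 represents

def pq_encode(nits: float) -> float:
    """Inverse EOTF: absolute luminance in nits -> PQ signal in 0..1."""
    y = min(max(nits / PEAK_NITS, 0.0), 1.0)
    y_m1 = y ** M1
    return ((C1 + C2 * y_m1) / (1.0 + C3 * y_m1)) ** M2

def pq_decode(signal: float) -> float:
    """EOTF: PQ signal in 0..1 -> absolute luminance in nits."""
    e = min(max(signal, 0.0), 1.0) ** (1.0 / M2)
    return PEAK_NITS * (max(e - C1, 0.0) / (C2 - C3 * e)) ** (1.0 / M1)

for nits in (48, 100, 1000, 4000, 10000):
    print(nits, round(pq_encode(nits), 4))
```

Notice that 100 nits lands at roughly half of the signal range, which leaves the upper half of the code values for highlights.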



2.4_vs_PQ.png

HDR Flavors

There are four main HDR flavors in use currently. All of them use a logarithmic approach to retain the maximum amount of information in the highlights.

Dolby Vision

Dolby Vision is the most common flavor of HDR out in the field today. The system works in three parts. First, you start with your master that has been graded using the PQ EOTF. Next, you “analyse” the shots in your project to attach metadata about where the shadows, highlights and meat and potatoes of your image are sitting. This is considered layer 1 metadata. This metadata is then used to inform the Content Mapping Unit, or CMU, how best to “convert” your picture to SDR and lower nit formats. The colorist can “override” this auto conversion using a trim that is then stored in layer 2 metadata, commonly referred to as L2. The trims you can make include lift, gamma, gain and sat. In version 4.0, out now, Dolby has given us the tools to also have secondary controls for six-vector hue and sat. Once all of these settings have been programmed, they are exported into an XML sidecar file that travels with the original master. Using this metadata, a Dolby Vision equipped display can use the trim information to tailor the presentation to accommodate the max nits it is capable of displaying on a frame by frame basis.

HDR 10

HDR 10 is the simplest of the PQ flavors. The grade is done using the PQ EOTF. Then the entire show is analysed. The peak brightness and the highest frame-average brightness are calculated. These two metadata points are called MaxCLL (Maximum Content Light Level) and MaxFALL (Maximum Frame Average Light Level). Using these, a downstream display can adjust the overall brightness of the program to accommodate the display’s max brightness.
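The analysis pass boils down to two running maximums. Here is a rough sketch, assuming each frame is already a list of (R, G, B) pixels in linear light measured in nits. Real tools read image files and handle details this toy version skips, and the function name is my own.

```python
def content_light_levels(frames):
    """Return (MaxCLL, MaxFALL) in nits.

    `frames` is an iterable of frames; each frame is a list of (R, G, B)
    pixels in linear light (nits). The brightest channel of each pixel is
    used as that pixel's light level.
    """
    max_cll = 0.0   # brightest single pixel anywhere in the program
    max_fall = 0.0  # highest frame-average light level of any frame
    for frame in frames:
        pixel_peaks = [max(r, g, b) for r, g, b in frame]
        if not pixel_peaks:
            continue
        max_cll = max(max_cll, max(pixel_peaks))
        max_fall = max(max_fall, sum(pixel_peaks) / len(pixel_peaks))
    return max_cll, max_fall

# Two tiny two-pixel "frames" as a usage example.
show = [
    [(100, 50, 20), (10, 10, 10)],
    [(900, 1000, 800), (0, 0, 0)],
]
print(content_light_levels(show))
```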

HDR 10+

HDR 10+ is similar to Dolby Vision in that you analyse your shots and can set a trim that travels in metadata per shot. The difference is you do not have any color controls. You can adjust points on a curve for a better tone map. These trims are exported as an XML file from your color corrector.

HLG

Hybrid log gamma is a logarithmic extension of the standard gamma curve of legacy displays. The lower half of the code values follow a conventional gamma curve and the top half use a log curve. Combining the legacy gamma with a log curve for the HDR highlights is what makes this a hybrid system. This version of HDR is backwards compatible with existing displays and terrestrial broadcast distribution. There is no dynamic quantification of the signal. The display just shows as much of the signal as it can.
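Here is a minimal sketch of the HLG camera-side curve (the OETF) using the constants from BT.2100. As specified there, the gamma-like lower segment is a square root, and the log segment takes over above a signal level of 0.5. The function name is my own, and the input is assumed to be normalized scene-linear light in 0..1.

```python
import math

# Constants from the BT.2100 HLG definition
A = 0.17883277
B = 1.0 - 4.0 * A
C = 0.5 - A * math.log(4.0 * A)

def hlg_oetf(scene_linear: float) -> float:
    """Normalized scene-linear light (0..1) -> HLG signal (0..1)."""
    e = min(max(scene_linear, 0.0), 1.0)
    if e <= 1.0 / 12.0:
        return math.sqrt(3.0 * e)            # gamma-like lower segment
    return A * math.log(12.0 * e - B) + C     # log segment for the highlights

for light in (0.0, 1.0 / 12.0, 0.25, 0.5, 1.0):
    print(round(light, 4), round(hlg_oetf(light), 4))
```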

HDR_curve_8_19_anim.gif

Deliverables

Deliverables change from studio to studio. I will list the most common ones here that are on virtually every delivery instruction document. Depending on the studio, the names of these deliverables will change but the utility of them stays the same.

PQ 16-bit Tiffs

This is the primary HDR deliverable and derives some of the other masters on the list. These files typically have a D65 white point and are either Rec2020 or P3 limited inside of a Rec2020 container.

GAM

The Graded Archival Master has all of the color work baked in but does not have any output transforms. This master can come in three flavors, all of which are scene referred:

ACES AP0 - Linear gamma 1.0 with ACES primaries, sometimes called ACES prime.

Camera Log - The original camera log encoding with the camera’s native primaries. For example, for Alexa, this would be LogC Arri Wide Gamut.

Camera Linear - This flavor has the camera’s original primaries with a linear gamma 1.0

NAM

The non-graded assembly master is the equivalent of the VAM back in the day. It is just the edit with no color correction. This master needs to be delivered in the same flavor that your GAM was.

ProRes XQ

This is the highest quality ProRes. It can hold 12-bits per image channel and was built with HDR in mind.

Dolby XML

This XML file contains all of the analysis and trim decisions. For QC purposes it needs to be able to pass a check from Dolby’s own QC tool Metafier.

IMF

Interoperable Master Format files can do a lot. For the scope of this article we are only going to touch on the HDR delivery side. The IMF is created from an MXF made from JPEG 2000 files. The jp2k files typically come from the PQ tiff master. It is at this point that the XML file is married with picture to create one nice package for distribution.


Near Future

Currently we master for theatrical first for features. In the near future I see the “flippening” occurring. I would much rather spend the bulk of the grading time on the highest quality master rather than the 48nit limited range theatrical pass. I feel like you get a better SDR version by starting with the HDR since you have already corrected any contamination that might have been in the extreme shadows or highlights. Then you spend a few days “trimming” the theatrical SDR for the theaters. The DCP standard is in desperate need of a refresh. 250Mbps is not enough for HDR or high resolution masters. For the first time in film history you can get a better picture in your living room than most cinemas. This of course is changing and changing fast.

Sony and Samsung both have HDR cinema solutions that are poised to revolutionize the way we watch movies. Samsung has their 34 foot Onyx system, which is capable of 400 nit theatrical exhibition. You can see a proof of concept model in action today if you live in the LA area. Check it out at the Pacific Theatres Winnetka in Chatsworth.

Sony has, in my opinion, the winning solution at the moment. They have their CLED wall, which is capable of delivering 800 nits in a theatrical setting. These types of displays open up possibilities for filmmakers to use a whole new type of cinematic language without sacrificing any of the legacy storytelling devices we have used in the past.

For example, this would be the first time in the history of film where you could effect a physiologic change in the viewer. I have often thought about a shot I graded for “The Boxtrolls” where the main character, Eggs, comes out from a whole life spent in the sewers. I cheated an effect where the viewer’s eyes were adjusting to an overly bright world. To achieve this I cranked the brightness and blurred the image slightly. I faded this adjustment off over many shots until your eye “adjusted” back to normal. The theatrical grade was done at 48 nits. At this light level, even at its brightest, the human eye is not irised down at all. But what if I had more range at my disposal? Today I would crank that shot until it made the audience’s irises close down. Then over the next few shots the audience would adjust back to the “new” brighter scene and it would appear normal. That initial shock would be similar to the real world shock of coming from a dark environment to a bright one.

Another such example that I would like to revisit is the myth of “L’Arrivée d’un train en gare de La Ciotat.” In this early Lumière picture a train pulls into a station. The urban legend is that this film had audiences jumping out of their seats and ducking for cover as the train came hurtling towards them. Imagine if we set up the same shot today but in a dark tunnel. We could make the headlight so bright in HDR that, coupled with the sound of a rushing train, it would cause most viewers, at the very least, to look away as it rushes past. A 1000 nit peak after your eyes have been acclimated to the dark can appear shockingly bright.

I’m excited for these and other examples yet to be created by filmmakers exploring this new medium. Here’s to better pixels and constantly progressing the art and science of moving images!

Please leave a comment below if there are points you disagree with or have any differing views on the topics discussed here.

Thanks for reading,

John Daro

Questioning Color

I was recently interviewed for an article on the power of color. The questions were very thought-provoking, and some topics I hadn’t thought about in years. I think it’s always a good practice to periodically approach the profession with a kindergarten mindset, ask the important “why’s”, and question or reaffirm first principles.


You can check out the article here and see the complete list of questions and my responses below.

Hopefully, this helps my fellow hue benders out there. Let me know if you disagree with anything in the comments. I always appreciate new ways of looking at things.


  • John Daro, Senior Colourist, Warner Bros

    John Daro is a Lead Digital Intermediate (DI) colourist at Warner Post Production Creative Services, a division of Warner Media. He has supervised the finishing and grading of many feature films, television pilots, commercials, and music videos for clients, including all the major studios, in addition to independent productions.

    Daro started his career at the film lab FotoKem. His first notable achievement was architecting a direct-to-disk dailies pipeline. From that role, he moved on to film scanning-recording and, with the DI process's creation, his current position as a finishing colourist. His past jobs gave him a mastery over colour transforms, and he started to couple those strengths with the art of cinematography. He continued to pioneer post-production techniques, including 3D conversion and the early days of HDR imaging. As a founding team member of their digital film services department, he helped FotoKem achieve its status as one of the premier post houses in the film and television post-production industry.

    In this interview, Daro talks to us about how colour can be used to shape an audience’s interpretation of a film and provides examples of how he’s used colour to help communicate a narrative in the past. 

  • How do you think colour shapes the way audiences perceive film?

    It's funny that you asked how colour shapes the audience's perception because, in a way, the colour process is literally “shaping” what we want you to see and what we don't. At its most basic, colour finishing is the process of highlighting and subduing certain key areas which directs the viewer’s attention to where the filmmakers intended. Before digital colour grading, cinematographers highlighted these key areas through shadow, light, depth of field and lens effects. Photochemical timing changes were limited to colour and density.  Nowadays, the sky's the limit with shapes, articulate roto masks and matte channels. Ultimately, the end goal of all these tools is to make it feel natural and true to the story and highlight key moments necessary for the viewer to absorb the supporting narrative. It's an old cliche, but a picture tells a thousand words. 

  • How have you used colour to communicate with an audience?

    A couple of examples that popped into my mind involve using dynamics to simulate coming out into a bright area. I think I used this technique most effectively in Laika’s The Boxtrolls. When Eggs, the film’s protagonist, came out of the sewer for the first time, we applied a dynamic luminance adjustment to simulate what your eyes would do when adjusting to a bright light after coming from the darkness. 

    Another example is Steven Soderbergh's Contagion, where colour choices define geographical location to help the viewer know where they were without needing additional information.  I also used this technique in Natalie Portman’s adaptation of Amos Oz’s A Tale of Love and Darkness. For this film, we were dipping through memories and a fantasy world.  Colour choices defined the real world versus the world inside the character’s head.

  • Can you talk us through your involvement in choosing and implementing a colour palette for a specific scene/film?

    So many of these decisions are made upfront through lighting, production design and wardrobe. It's always a pleasure to be brought in on a project early enough to have been part of those conversations and understand the motivation behind the choices. I feel there's great value for everybody to be on the same page and know how specific colours on the set will render in the final output.

    Look development can have many muses. A style guide, concept art and a reference deck are great tools. I like to start by picking two or three key colours that are important to the story, symbolize a character, or represent a message. Then the objective is to find the best way to shift and enhance to make a split complementary palette. Contrast and simplicity are at the heart of the finest design in my opinion.

  • Do you think colour is the backbone of emotion in film?

    I don't think so, no. I think sound is. Smell would be even more evocative, but luckily for the Jackass movies, smell-o-vision didn't catch on!

    I prefer to grade with the latest mix. It's the sum of all the parts that make a great cinematic symphony. Colour and sound must be playing in concert, similar in tone or in contrast but always together.

    I think that kind of sensory feeling might be ingrained in our DNA. Whenever I hear the opening to Peter and the Wolf, I visualize spring greens. It doesn't work the other way, however. I don't hear Peter and the Wolf every time I see green. Colour must be taken in context. Green can make you feel safe and calm like a lush field of grass blowing in the wind does. At the same time, it can evoke feelings of jealousy or sickness. It all depends on context and the motivation behind the story being told.

  • How do you know when a specific colour scheme does or doesn’t work?

    There are a few academic reasons I could give you. For example, certain colours clash with other colours. Certain colour harmonies are not particularly pretty, but that doesn't mean that you can't use those mismatches. Especially if what you're going for is to make the viewer uncomfortable or uneasy. Colour is subjective; balance is not. 

    Ultimately, the real answer is feeling it in your gut. You know when it’s right. I have an internal rule when taking my passes. The reel is done when I can watch it down and have less than three tweaks to make. You are never really finished. Most often, you just run out of time.

  • How do you work with the director and cinematographer to achieve a specific look in a film?

    I first watch a rough cut or read the script and get an idea of the story. Next, I take camera tests and start to build a basic look. The goal with this V1 transform is to find something that works for the cinematographer and gets them repeated results in different lighting conditions. Obviously, it should also have an aesthetically pleasing visual component. It's important when building a look to ensure that the camera still behaves as expected regarding sensitivity and dynamic range. You don’t want to bake anything that could hamper the original photography. Essentially, make sure mid-grey still maps to mid-grey. 

    Once we have dailies, the process begins again, where I might have a V2, V3, or V4 version of the look that we're going for. I put those in a still gallery on a server for remote viewing, and we constantly update the conversation forum page with feedback from the creatives. I maintain before and afters of all versions to ensure we improve and never go backwards creatively. The last step is to ensure that the look works for the story once the film is assembled. Tweaks are made to the show look and certain scenes get special treatments for effect. 

    CDLs are a vital part of this process as well. Grading your dailies is very important for ensuring that there are no surprises when everyone gets to the DI theatre. I've had past experiences where producers see something that has the final grade, but it's too far of a departure from what the look was in editorial. To combat this reaction, we always want to ensure that the look is consistently maintained from the first shot out of the camera through to the final finish.

  • How can colour set the tone for a scene? 

    To set the tone, it's all about warmer, colder, brighter, or darker. As I've already touched on, it's really important to pick a few colours that you want to enhance and then let the background support that enhancement, whether through a complementary value or making it recede. There are also the obvious washes that you can do. For example, if you make a scene very red, the warmth evokes a sense of romance or love. It can also yield a literal hot vibe. Something very cold and very desaturated evokes a sense of bleakness or a dystopian feeling. Magenta's a weird one because it can be warm and cold at the same time. It depends on the context and how it's used. Green-yellow also functions similarly. It can seem sickly and off, but it can also be romantic and warm, depending on what side of the hue you're on. I don't know where these generalities came from, but they're almost universal at this point. My gut is that human evolution has something to do with it. I think the responses to these colours helped us survive at some point. When I say it’s in our DNA, I do mean just that.

  • What technical difficulties do you come across with undertaking this critical part of film production?

    You can avoid many technical difficulties by ensuring you’re doing no harm to your pixels. The most important thing is that you have a colour-managed pipeline, in that you’re never working on what the film looks like, but rather what the film was captured as. Make sure you’re always working in a photon real-world scene-referred way. At that point, the displays don’t matter as much. They simply target what you’re trying to hit. They can always be adjusted after the fact if there are technical concerns. I always work with soft display-referred transforms that gradually roll off your highs and also have a nice toe in the black. This generally helps cut down on the number of technical issues. 

    Past that, it's all about keeping your eye on the scopes and ensuring there are no technical glitches in the actual capture or renders like quantization, dead pixels, hits or bad frames. All it takes is an eye for detail and a great QC department.

  • How does your role as colourist differ when working on animation compared with live action films? 

    At the core, the two are very similar. The same principles that make a strong image still hold true regardless of how the image was created. Modern render engines are very good at doing what light does. So much so that a lot of DoP buddies of mine are pre-vising lighting setups virtually.

    Nowadays, that line is being blurred even further, where some films can be comprised of mostly VFX shots. Many of these setups, especially with the advent of digi doubles, are not very different from a fully animated picture.

    Now, if we’re defining a “live-action” shot as being something captured with a camera, and no further manipulation, then the biggest difference comes from the workflow. For example, if you take a live-action setup that has been shot with clouds and maybe some inconsistent lighting situations, your first step is technical colour correction just to balance the shots together. You don't have this problem with animation, but you do have a similar situation where you might have many artists working on the same scene. This can sometimes lead to slight inconsistencies that must be smoothed out.

    A huge advantage of CG-originated shots is that they tend to have advanced matte channels. These could be as simple as a matte for the main character or as complicated as depth or normals. These additional tools allow for more complex grades but also increase the time that you spend on each shot.

  • Can you tell us about your grading suite? What could you not be without while at work?

    My grading suite looks like a hot mess. Imagine the Great Wall of monitors. I have two x300s, and two GUI monitors for Baselight, one extra wide LG for what I call my Swiss Army box and an admin computer. The most important display is my Christie 4k projector. I also have an LG C2 to simulate a consumer experience.

    The most important machine in my tool set is the Swiss Army box. Essentially, it's a Supermicro chassis with four A6000s that has every single piece of post-production software that has ever been useful. I also use this box for coding and the development of my own in-house tools. I would consider this machine mission-critical. Second to that, the next most important piece of gear would be an external scope. Your eyes can lie to you, but scopes never do. Software scopes have made huge advancements in recent years. I can't say I use the external one every day, but when you need one, there really isn't a substitute.

  • How does Baselight aid your role as a colourist?

    There's a lot of great software out there and I always say use the right tool for the right job. For most of my jobs, that ends up being Baselight. The reasons are straightforward. Firstly, the colour science in the machine is second to none. Next, I appreciate the simplicity of the interface. When colouring long form, most of what you're doing is manipulating groups of many shots. Baselight makes this very easy to do. The other thing I can't live without is how Baselight organises and categorises. When you get towards the end of the project, things get hectic, and it's very nice to be able to sort and view in any way that a project demands. Often this has to do with missing visual effects or work I need to get to after the session. I use categories and marks so that I always know what the status of a scene is at any given time. This organisation also aids in communication. I can always keep post supervisors up to date with reports. Additionally, my in-house team always knows what needs to be done and what is already completed based on the organisation that we have put in place. I've always felt that Baselight was built by people who do the job of colour – not by committees or nonpracticing theoreticians. 

  • What are you working on now/next?

    I’m currently finishing a docuseries directed by Allen Hughes for FX about Tupac Shakur’s life and relationship with his mother called Dear Mama. The interviews have an exciting cognac look that I can’t wait to share. The first part premiered at the Toronto International Film Festival and was very well received.

    Later this month, I will be finishing a feature called Sweetwater. It’s about the story of the first black NBA player, Nat “Sweetwater” Clifton. It is a period piece, so a lot of fun in the colour department. We contemplated a black-and-white double X look for the show but ultimately landed on a derivative of an Ektachrome simulation that I had built a while ago. It has a super cool look if I do say so myself.

    We are also supporting pre-production and dailies on A Gun on Second Street. This show is shooting on film, which is always pleasurable and exciting. The look for the show is a straightforward Kodak film vibe, expertly lensed by Leo Hinstin.

    While we are on the topic of film shows, I’m also supervising the remastering of Superman II (yes, both cuts). This will be released in early 2023, just in time to get people excited about Michael Shannon’s General Zod in The Flash. Kneel before Zod!

    Additionally, my team and I will return to animation early next year for an upcoming Netflix feature. More on that later at www.johndaro.com.



HDR - Flavors and Best Practices

Better Pixels.

Over the last decade we have had a bit of a renaissance in imaging display technology. The jump from SD to HD was a huge bump in image quality. HD to 4K was another noticeable step in making better pictures, but had less of an impact than the previous SD to HD jump. Now we are starting to see 8K displays and workflows. Although this is great for very large screens, this jump has diminishing returns for smaller viewing environments. In my opinion, we are at the point where we do not need more pixels, but better ones. HDR, or High Dynamic Range, images along with wider color gamuts are allowing us to deliver that next major increase in image quality. HDR delivers better pixels!

Last Summer_lake.jpg

Stop… What is dynamic range?

When we talk about the dynamic range of a particular capture system, what we are referring to is the delta between the blackest shadow and the brightest highlight captured. This is measured in stops, typically with a light meter. A stop is a doubling or a halving of light. This power-of-2 way of measuring light correlates well with our eyes’ logarithmic nature. Your eyeballs never “clip” and a perfect HDR system shouldn’t either. The brighter we go, the harder it becomes to see differences, but we never hit a limit.

Unfortunately, digital camera sensors do not work in the same way as our eyeballs. Digital sensors have a linear response, a gamma of 1.0, and they do clip. Most high-end cameras convert this linear signal to a logarithmic one for post manipulation.

lin_curve_anim_wide.gif

I was never a huge calculus buff but this one thought experiment has served me well over the years.

Say you are at one side of the room. How many steps will it take to get to the wall if each time you take a step, the step is half the distance of your last? This is the idea behind logarithmic curves.

It will take an infinite number of steps to reach the wall, since we can always halve the half.
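
If you want to see the thought experiment play out in numbers, here is a minimal sketch. It assumes the first step covers half the room (my assumption, the experiment works with any starting step), and the function name is just made up for illustration.

```python
def distance_covered(steps: int, room_length: float = 1.0) -> float:
    """Total distance walked after `steps` steps, each half the previous one.

    Assumption: the first step is half the room, so the steps are
    1/2, 1/4, 1/8, ... of the room length.
    """
    total = 0.0
    step = room_length / 2.0
    for _ in range(steps):
        total += step
        step /= 2.0
    return total


# The gap to the wall keeps halving but never reaches zero.
for n in (1, 2, 4, 8, 16):
    remaining = 1.0 - distance_covered(n)
    print(f"after {n:2d} steps, {remaining:.6f} of the room is still ahead")
```

Every extra step buys you half of what is left, which is the same behavior a log curve has as it creeps toward its ceiling.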

range_shift_ANIM.gif

Someday we will be able to account for every photon in a scene, but until that sensor is made we need to work within the confines of the range that can be captured.

For example, if the darkest part of a sampled image is the shadows and the brightest part is 8 stops brighter, we have a range of 8 stops for that image. The way we expose a sensor or a piece of celluloid changes based on a combination of factors. These include aperture, exposure time, and the general sensitivity of the imaging system. Depending on how you set these variables, you can move the total range up or down in the scene.

Let’s say you had a scene range of 16 stops. This goes from the darkest shadow to direct hot sun. Our imaging device in this example can only handle 8 of the 16 present stops. We can shift the exposure to be weighted towards the shadows, the highlights, or the Goldilocks sweet spot in the middle. There is no right or wrong way to set this range. It just needs to yield the picture that helps to promote the story you are trying to tell in the shot. A 16-bit half-float EXR file can handle roughly 30 stops of range at full precision, much more than any capture system can currently deliver.
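
Here is a rough sketch of that stop arithmetic. The helper names and the 0 to 1 "bias" control are hypothetical, made up purely to illustrate sliding an 8 stop window around a 16 stop scene. The last line measures a 16-bit half float from its smallest normal value to its largest finite value, which is one reasonable way to count its range.

```python
import math


def stops_between(darkest: float, brightest: float) -> float:
    """Dynamic range in stops: every stop is a doubling of light."""
    return math.log2(brightest / darkest)


def captured_window(scene_stops: int, sensor_stops: int, bias: float):
    """Return the (first, last) scene stop captured, counted up from the darkest.

    bias = 0.0 weights the exposure toward the shadows, 1.0 toward the
    highlights, and 0.5 sits in the middle. Hypothetical helper.
    """
    spare = scene_stops - sensor_stops
    first = round(spare * bias)
    return first, first + sensor_stops


print(stops_between(1.0, 256.0))  # 256 is 8 doublings of 1

for name, bias in (("shadows", 0.0), ("middle", 0.5), ("highlights", 1.0)):
    print(name, captured_window(16, 8, bias))

# Half float: smallest normal value is 2**-14, largest finite value is 65504.
print(round(stops_between(2.0 ** -14, 65504.0), 2))
```

Nothing about the window position is right or wrong in the math. It only tells you which 8 stops you chose to keep.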

Latitude is how far you can recover a picture from over- or underexposure. Often latitude is conflated with dynamic range. In rare cases they are the same, but more often than not your latitude is less than the available dynamic range.

Film, the original HDR system.

Film from its creation always captured more information than could be printed. Contemporary stocks have a dynamic range of 12 stops. When you print that film you have to pick the best 8 stops to show via printing with more or less light. The extra dynamic range was there in the negative but was limited by the display technology.

Flash forward to our digital cameras today. Cameras from Arri, Red, Blackmagic, and Sony all boast dynamic ranges over 13 stops. The challenge has always been the display environment. This is why we need to start thinking of cameras not as the image creators but more as the photon collectors for the scene at the time of capture. The image is then “mapped” to your display creatively.

Scene referred grading.

The problem has always been how do we fit 10 pounds of chicken into an 8 pound bag? In the past when working with these HDR camera negatives we were limited to the range of the display technology being used. The monitors and projectors before their HDR counterparts couldn’t “display” everything that was captured on set even though we had more information to show. We would color the image to look good on the device for which we were mastering. “Display Referred Grading,” as this is called, limits your range and bakes in the gamma of the display you are coloring on. This was fine when the only two mediums were SDR TV and theatrical digital projection. The difference between 2.4 video gamma and 2.6 theatrical gamma was small enough that you could make a master meant for one look good on the other with some simple gamma math. Today the deliverables and masters are numerous with many different display gammas required. So before we even start talking about HDR, our grading space needs to be “Scene Referred.” What this means is that once we have captured the data on set, we pass it through the rest of the pipeline non-destructively, maintaining the relationship to the original scene lighting conditions. “No pixels were harmed in the making of this major motion picture.” is a personal mantra of mine.
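
The “simple gamma math” can be sketched like this. It is a simplification under stated assumptions: it treats both displays as pure power functions on normalized 0 to 1 code values and ignores the difference in peak brightness, white point, and primaries between a video master and a theatrical one. The function name is made up for the example.

```python
def regamma(code_value: float, source_gamma: float, target_gamma: float) -> float:
    """Re-encode a normalized 0-1 display code value for a different display gamma."""
    # Relative light the source display would emit for this code value.
    light = code_value ** source_gamma
    # Code value that makes the target display emit that same relative light.
    return light ** (1.0 / target_gamma)


# Take a mid-grey from a 2.6 theatrical master and re-encode it for a 2.4 video display.
theatrical = 0.5
video = regamma(theatrical, source_gamma=2.6, target_gamma=2.4)
print(f"2.6 code value {theatrical:.4f} becomes 2.4 code value {video:.4f}")
```

Notice the gamma is baked into the numbers either way. That is exactly the display-referred limitation a scene-referred pipeline avoids.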

I’ll add the tone curve later.

There are many different ways of working scene-referred. The VFX industry has been working this way for decades. The key point is we need to have a processing space that is large enough to handle the camera data without hitting the boundaries, i.e. clipping or crushing in any of the channels. This “bucket” also has to have enough samples (bit depth) to be able to withstand aggressive transforms. 10 bits are not enough for HDR grading. We need to be working in full 16-bit floating point.

This is a bit of an exaggeration, but it illustrates the point. Many believe that a 10-bit signal is sufficient for HDR. I think for color work 16-bit is necessary. This ensures we have enough steps to adequately describe the meat and potatoes part of the image in addition to the extra highlight data at the top half of the code values.

Bit-depth is like butter on bread. Not enough and you get gaps in your tonal gradients. We want a nice smooth spread on our waveforms.
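
To put a number on the butter, here is a small sketch that counts how many distinct code values a subtle gradient lands on at different bit depths. It uses plain integer quantization of a normalized ramp, which is my simplification. A real grade would be in 16-bit float, where the spacing of values differs, but the idea of running out of steps is the same.

```python
def distinct_levels(bits: int, lo: float, hi: float, samples: int = 4096) -> int:
    """Count the distinct integer code values a smooth ramp from lo to hi lands on."""
    max_code = (1 << bits) - 1
    codes = set()
    for i in range(samples):
        value = lo + (hi - lo) * i / (samples - 1)
        codes.add(int(value * max_code + 0.5))  # round to the nearest code value
    return len(codes)


# A gentle sky-like gradient covering only 2% of the signal range.
for bits in (8, 10, 12, 16):
    steps = distinct_levels(bits, 0.50, 0.52)
    print(f"{bits:2d}-bit: {steps} steps across the gradient")
```

A couple dozen steps stretched across a wide gradient is where banding shows up, especially once an aggressive transform pulls those steps further apart.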

Now that we have our non-destructive working space, we use transforms or LUTs to map to our displays for mastering. ACES is a good starting point for a working space and a set of standardized transforms, since it works scene referred and is always non-destructive if implemented properly. The gist of this workflow is that the sensor linearity of the original camera data has been retained. We are simply adding our display curve for our various masters.

Stops measure scenes, Nits measure displays.

For measuring light on set we use stops. For displays we use a measurement unit called a nit. Nits are a measure of peak brightness, not dynamic range. A nit is equal to 1 cd/m². I’m not sure why there are two names for the same measurement, but for displays we use the nit. Perhaps candelas per meter squared was just too much of a mouthful. A typical SDR monitor has a brightness of 100 nits. A typical theatrical projector has a brightness of 48 nits. There is no set standard for what is considered HDR brightness. I consider anything over 600 nits HDR. 1000 nits, or 10 times brighter than legacy SDR displays, is what most HDR projects are mastered to. The Dolby Pulsar monitor is capable of displaying 4000 nits, which is the highest achievable today. The PQ signal accommodates values up to 10,000 nits.
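
Since stops are just doublings, you can express those nit figures as stops above or below a 100 nit SDR peak. This is a quick sketch using the numbers quoted above. It only describes highlight headroom, not the full dynamic range of a display, which would also need the black level.

```python
import math

SDR_REFERENCE_NITS = 100.0  # typical SDR monitor peak, per the figures above


def stops_relative_to_sdr(peak_nits: float) -> float:
    """How many stops brighter (or darker) a peak is than a 100 nit SDR peak."""
    return math.log2(peak_nits / SDR_REFERENCE_NITS)


displays = (
    ("theatrical projector", 48),
    ("my HDR threshold", 600),
    ("typical HDR master", 1000),
    ("Dolby Pulsar", 4000),
    ("PQ signal ceiling", 10000),
)
for name, nits in displays:
    print(f"{name:22s} {nits:6d} nits  {stops_relative_to_sdr(nits):+.2f} stops")
```

Ten times the nits sounds enormous, but it works out to a little over three stops of extra highlight room.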

The Sony X300 has a peak brightness of 1000 nits and is the current gold standard for reference monitors.

The Dolby Pulsar is capable of 4000 nit peak white

P-What?

Rec2020 color primaries with a D65 white point

The most common scale to store HDR data is the PQ Electro-Optical Transfer Function. PQ stands for perceptual quantizer. The PQ EOTF was standardized when SMPTE published the transfer function as SMPTE ST 2084. The color primaries most often associated with PQ are Rec2020. BT.2100 is used when you pair the two: the PQ transfer function with Rec2020 primaries and a D65 white point. This is similar to how the definition of BT.1886 is Rec709 primaries with an implicit 2.4 gamma and a D65 white point. It is possible to have a PQ file with different primaries than Rec2020. The most common variant would be P3 primaries with a D65 white point. OK, sorry for the nerdy jargon, but now we are all on the same page.
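
For the curious, here is a minimal sketch of the PQ curve in both directions, using the constants as I understand them from ST 2084. The function names are my own, and it works on single normalized values rather than real image buffers.

```python
# Constants from SMPTE ST 2084 (as I understand the published values).
M1 = 2610 / 16384
M2 = 2523 / 4096 * 128
C1 = 3424 / 4096
C2 = 2413 / 4096 * 32
C3 = 2392 / 4096 * 32
PEAK_NITS = 10000.0  # the top of the PQ signal


def pq_encode(nits: float) -> float:
    """Inverse EOTF: absolute luminance in nits to a normalized 0-1 PQ signal."""
    y = max(nits, 0.0) / PEAK_NITS
    y_m1 = y ** M1
    return ((C1 + C2 * y_m1) / (1.0 + C3 * y_m1)) ** M2


def pq_decode(signal: float) -> float:
    """EOTF: normalized 0-1 PQ signal to absolute luminance in nits."""
    e = max(signal, 0.0) ** (1.0 / M2)
    return PEAK_NITS * (max(e - C1, 0.0) / (C2 - C3 * e)) ** (1.0 / M1)


for nits in (0.1, 1, 48, 100, 600, 1000, 4000, 10000):
    print(f"{nits:8.1f} nits -> PQ signal {pq_encode(nits):.4f}")
```

Run it and you will see that 100 nits lands right around the middle of the signal. Half of the code values are spent below SDR peak white and the other half on the highlights above it.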



2.4_vs_PQ.png

HDR Flavors

There are four main HDR flavors in use currently. All of them use a logarithmic approach to retain the maximum amount of information in the highlights.

Dolby Vision

Dolby Vision is the most common flavor of HDR out in the field today. The system works in three parts. First, you start with your master that has been graded using the PQ EOTF. Next, you “analyse” the shots in your project to attach metadata about where the shadows, highlights, and meat and potatoes of your image are sitting. This is considered layer 1 metadata. This metadata is then used to inform the Content Mapping Unit, or CMU, how best to “convert” your picture to SDR and lower nit formats. The colorist can “override” this auto conversion using a trim that is then stored in layer 2 metadata, commonly referred to as L2. The trims you can make include lift, gamma, gain, and sat. In version 4.0, out now, Dolby has given us the tools to also have secondary controls for six-vector hue and sat. Once all of these settings have been programmed, they are exported into an XML sidecar file that travels with the original master. Using this metadata, a Dolby Vision equipped display can use the trim information to tailor the presentation to accommodate the max nits it is capable of displaying on a frame-by-frame basis.

HDR 10

HDR 10 is the simplest of the PQ flavors. The grade is done using the PQ EOTF. Then the entire show is analysed. The peak brightness and average brightness are calculated. These two metadata points are called MaxCLL (Maximum Content Light Level) and MaxFALL (Maximum Frame Average Light Level). Using these, a downstream display can adjust the overall brightness of the program to accommodate the display’s max brightness.
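
Here is a rough sketch of how those two numbers fall out of a show. It follows the usual definition as I understand it: a pixel’s light level is its brightest channel, MaxCLL is the brightest pixel anywhere in the program, and MaxFALL is the highest per-frame average. The tiny four pixel “frames” and the function name are made up, and real tools work on decoded linear-light images, not Python lists.

```python
from typing import Iterable, List, Tuple

Pixel = Tuple[float, float, float]  # linear R, G, B in nits
Frame = List[Pixel]                 # assumed non-empty


def content_light_levels(frames: Iterable[Frame]) -> Tuple[float, float]:
    """Return (MaxCLL, MaxFALL) in nits for a sequence of frames."""
    max_cll = 0.0
    max_fall = 0.0
    for frame in frames:
        # A pixel's light level is taken as its brightest channel.
        pixel_levels = [max(pixel) for pixel in frame]
        max_cll = max(max_cll, max(pixel_levels))
        max_fall = max(max_fall, sum(pixel_levels) / len(pixel_levels))
    return max_cll, max_fall


# A dark frame with one hot specular highlight, and an evenly bright frame.
dark_with_highlight = [(2, 2, 2), (2, 2, 2), (2, 2, 2), (950, 400, 200)]
bright_overall = [(300, 300, 300)] * 4

max_cll, max_fall = content_light_levels([dark_with_highlight, bright_overall])
print(f"MaxCLL = {max_cll} nits, MaxFALL = {max_fall} nits")
```

The two values can come from completely different frames, which is why one static pair for a whole show is such a blunt instrument compared to per-shot metadata.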

HDR 10+

HDR 10+ is similar to Dolby Vision in that you analyse your shots and can set a trim that travels in metadata per shot. The difference is you do not have any color controls. You can adjust points on a curve for a better tone map. These trims are exported as an XML file from your color corrector.

HLG

Hybrid log gamma is a logarithmic extension of the standard 2.4 gamma curve of legacy displays. The lower half of the code values uses a legacy-style gamma curve and the top half uses a log curve. Combining the legacy gamma with a log curve for the HDR highlights is what makes this a hybrid system. This version of HDR is backwards compatible with existing displays and terrestrial broadcast distribution. There is no dynamic quantification of the signal. The display just shows as much of the signal as it can.
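
The hybrid idea is easy to see in code. This is a minimal sketch of the HLG encoding curve with the constants as I understand them from BT.2100, where the gamma-style lower segment is a square root and the upper segment is a log. It takes normalized scene light from 0 to 1, and the function name is my own.

```python
import math

# Constants as I understand them from ITU-R BT.2100.
A = 0.17883277
B = 1.0 - 4.0 * A
C = 0.5 - A * math.log(4.0 * A)


def hlg_oetf(scene_light: float) -> float:
    """Normalized scene light (0-1) to HLG signal (0-1)."""
    e = min(max(scene_light, 0.0), 1.0)
    if e <= 1.0 / 12.0:
        return math.sqrt(3.0 * e)           # gamma-style segment, lower half of the signal
    return A * math.log(12.0 * e - B) + C   # log segment, upper half of the signal


for e in (0.0, 0.01, 1.0 / 12.0, 0.25, 0.5, 1.0):
    print(f"scene light {e:.4f} -> HLG signal {hlg_oetf(e):.4f}")
```

The two segments meet at a signal value of 0.5, so the bottom half of the code values looks familiar to a legacy display while the top half carries the HDR highlights.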

HDR_curve_8_19_anim.gif

Deliverables

Deliverables change from studio to studio. I will list the most common ones here that are on virtually every delivery instruction document. Depending on the studio, the names of these deliverables will change but the utility of them stays the same.

PQ 16-bit Tiffs

This is the primary HDR deliverable and derives some of the other masters on the list. These files typically have a D65 white point and are either Rec2020 or P3 limited inside of a Rec2020 container.

GAM

The Graded Archival Master has all of the color work baked in but does not have any output transforms. This master can come in three flavors, all of which are scene referred:

ACES AP0 - Linear gamma 1.0 with ACES primaries, sometimes called ACES prime.

Camera Log - The original camera log encoding with the camera’s native primaries. For example, for Alexa, this would be LogC Arri Wide Gamut.

Camera Linear - This flavor has the camera’s original primaries with a linear gamma 1.0

NAM

The non-graded assembly master is the equivalent of the VAM back in the day. It is just the edit with no color correction. This master needs to be delivered in the same flavor that your GAM was.

ProRes XQ

This is the highest quality ProRes. It can hold 12 bits per image channel and was built with HDR in mind.

Dolby XML

This XML file contains all of the analysis and trim decisions. For QC purposes it needs to be able to pass a check from Dolby’s own QC tool Metafier.

IMF

Interoperable Master Format files can do a lot. For the scope of this article, we are only going to touch on the HDR delivery side. The IMF is created from an MXF made from JPEG 2000s. The JP2K files typically come from the PQ TIFF master. It is at this point that the XML file is married with picture to create one nice package for distribution.


Near Future

Currently we master for theatrical first for features. In the near future I see the “flippening” occurring. I would much rather spend the bulk of the grading time on the highest quality master rather than the 48 nit limited range theatrical pass. I feel like you get a better SDR version by starting with the HDR since you have already corrected any contamination that might have been in the extreme shadows or highlights. Then you spend a few days “trimming” the theatrical SDR for the theaters. The DCP standard is in desperate need of a refresh. 250 Mbps is not enough for HDR or high resolution masters. For the first time in film history you can get a better picture in your living room than in most cinemas. This of course is changing and changing fast.
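
To show why 250 Mbps feels tight, here is a back-of-the-envelope sketch. The frame sizes (2K and 4K DCI containers), 12 bits per channel, and 24 fps are my assumptions for a typical DCP, and the function names are made up for the example.

```python
DCP_MAX_BITRATE_MBPS = 250.0  # the ceiling quoted above


def uncompressed_mbps(width: int, height: int, bits_per_channel: int,
                      fps: float, channels: int = 3) -> float:
    """Raw picture data rate in megabits per second."""
    return width * height * channels * bits_per_channel * fps / 1_000_000


def required_compression(width: int, height: int, bits_per_channel: int,
                         fps: float) -> float:
    """How many times the picture must be squeezed to fit under the DCP ceiling."""
    return uncompressed_mbps(width, height, bits_per_channel, fps) / DCP_MAX_BITRATE_MBPS


for name, (w, h) in (("2K", (2048, 1080)), ("4K", (4096, 2160))):
    raw = uncompressed_mbps(w, h, 12, 24)
    ratio = required_compression(w, h, 12, 24)
    print(f"{name}: {raw:8.1f} Mbps raw, about {ratio:.1f}:1 compression needed")
```

The 4K picture has four times the pixels of 2K but the same bit budget, so it has to be compressed four times as hard. Adding HDR highlights or higher frame rates only makes that squeeze worse.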

Sony and Samsung both have HDR cinema solutions that are poised to revolutionize the way we watch movies. Samsung has their 34 foot Onyx system, which is capable of 400 nit theatrical exhibition. You can see a proof of concept model in action today if you live in the LA area. Check it out at the Pacific Theatres Winnetka in Chatsworth.

Sony has, in my opinion, the winning solution at the moment. They have their CLED wall, which is capable of delivering 800 nits in a theatrical setting. These types of displays open up possibilities for filmmakers to use a whole new type of cinematic language without sacrificing any of the legacy storytelling devices we have used in the past.

For example, this would be the first time in the history of film where you could effect a physiologic change in the viewer. I have often thought about a shot I graded for “The Boxtrolls” where the main character, Eggs, comes out from a whole life spent in the sewers. I cheated an effect where the viewer’s eyes were adjusting to an overly bright world. To achieve this I cranked the brightness and blurred the image slightly. I faded this adjustment off over many shots until your eye “adjusted” back to normal. The theatrical grade was done at 48 nits. At this light level, even at its brightest, the human eye is not iris’ed down at all, but what if I had more range at my disposal? Today I would crank that shot until it made the audience’s irises close down. Then over the next few shots the audience would adjust back to the “new” brighter scene and it would appear normal. That initial shock would be similar to the real world shock of coming from a dark environment to a bright one.

Another such example that I would like to revisit is the myth of “L’Arrivée d’un train en gare de La Ciotat.” In this early Lumière picture a train pulls into a station. The urban legend is that this film had audiences jumping out of their seats and ducking for cover as the train came hurtling towards them. Imagine if we set up the same shot today but in a dark tunnel. We could make the headlight so bright in HDR that, coupled with the sound of a rushing train, it would cause most viewers, at the very least, to look away as it rushes past. A 1000 nit peak after your eyes have been acclimated to the dark can appear shockingly bright.

I’m excited for these and other examples yet to be created by filmmakers exploring this new medium. Here’s to better pixels and constantly progressing the art and science of moving images!

Please leave a comment below if there are points you disagree with or have any differing views on the topics discussed here.

Thanks for reading,

John Daro