PSINext
When I joined the PSINext forums two years ago (it was called PS3Insider back then), one of my reasons for doing so was to glean knowledge of the Cell architecture from several individuals who seemed ahead of the curve in their understanding of the Broadband Engine. Marco Salvi (aka nAo) was one of those individuals.
Thus it seems only fitting that two years later, here I am asking him questions once more; this time in a more formal setting.
The topic of this interview is the custom high dynamic range lighting format Marco has helped to develop in his brief time with Ninja Theory. This rendering format - NAO32 - is one of the key reasons why Heavenly Sword has emerged as one of the most beautiful games to have made an appearance at this year's E3, winning spots on the top ten lists of several publications.
PSINext: Marco, you've been with Ninja Theory for almost a year now, is that right?
Marco: I moved from Italy to the UK last September, so I've been living in this country for something like 8 months now (wow, time flies!). I'm not the biggest gamer out there; I work in this industry because I love cutting edge technology and computer graphics. Moreover, I have a passion for CPU and GPU architectures, so it was a natural thing for me to start working in games development.
PSINext: What inspired you to move to England and get involved with Ninja Theory in the first place?
Marco: After having spent 3 years in Milano working on a PS2/XBox title, I was kinda bored and started thinking about going abroad and trying something different. At the same time, a well-known and respected bloke called Dean Calver (lead programmer @ NT) asked me if I wanted to send over my CV because they were looking for someone that could join them and work on their in-house PS3 engine. I was so excited at the idea that I could not believe him... then quickly came the realization that it was an exceptional opportunity for me. I had an interview in Cambridge and 3 weeks later, I relocated to the UK and started to work on Heavenly Sword.
PSINext: High dynamic range lighting - or HDR - is a term we're starting to hear more and more of these days. Can you explain to us exactly what HDR is, and what it means to people sitting in front of their screens playing a game?
Marco: Yeah, it's a very common term nowadays and it's often misused.
When we render an HDR image, we are basically storing all of the information we need to represent the amount of light that passes through every pixel of our picture, so that we can capture a scene in all its color and luminance-range richness. Unfortunately, current low-cost display technologies (common LCDs, CRTs, etc.) can't properly display an HDR image, so we need to go through a process called 'tone mapping' to remap our image to an LDR (Low Dynamic Range) image in order to display it on a screen.
Tone mapping can also be used to simulate the way our eyes slowly adapt to different light conditions. In the end it means a better visual experience for gamers, although it will take time for the industry to get it right.
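[Editor's sketch, not from the interview: one common way to do the remapping Marco describes is a global Reinhard-style operator, combined with a simple exponential model of eye adaptation. The interview does not say which operator Heavenly Sword uses; the function names, the `key` value of 0.18 and the adaptation time constant `tau` below are illustrative assumptions.]

```python
import math

def log_average_luminance(luminances, eps=1e-4):
    """Geometric mean of scene luminance, a common measure of overall brightness."""
    total = sum(math.log(eps + lum) for lum in luminances)
    return math.exp(total / len(luminances))

def tone_map(lum, adapted_lum, key=0.18):
    """Map an HDR luminance in [0, inf) to a displayable value in [0, 1)."""
    scaled = key * lum / adapted_lum   # expose the pixel relative to the adapted level
    return scaled / (1.0 + scaled)     # compress highlights instead of clipping them

def adapt(current, target, dt, tau=1.0):
    """Move the adapted luminance towards the scene average over time (eye adaptation)."""
    return current + (target - current) * (1.0 - math.exp(-dt / tau))

if __name__ == "__main__":
    scene = [0.01, 0.5, 2.0, 50.0, 4000.0]      # hypothetical HDR luminances
    adapted = 1.0                               # luminance the "eye" starts adapted to
    target = log_average_luminance(scene)
    for frame in range(3):
        adapted = adapt(adapted, target, dt=1.0 / 30.0)
        print(frame, [round(tone_map(lum, adapted), 3) for lum in scene])
```

[Because `adapted` only drifts towards the scene average a little each frame, a sudden change in lighting produces the slow "eyes adjusting" effect rather than an instant exposure change.]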
PSINext: What are some examples of scenery within a game that benefit from HDR?
Marco: Compared to LDR rendering, every scene that requires a very low or very high global luminance image, or an image with a high contrast ratio, would benefit from HDR lighting.
It's a way to avoid globally saturated images where everything comes out too bright or too dark and the viewer loses all that precious lighting/color information that enriches an image and makes it believable.
Furthermore, a specific tone mapping operator can give to a developer an extra tool to improve story telling, because it can be tweaked to simulate particular behaviors of human vision. It's important to understand that HDR is not just about nice bloom effects - it's much more than that. There's no such thing as LDR imagery in our every day lives.
PSINext: People have come to associate high dynamic range lighting with on-chip hardware support for either FP16 or FP32 HDR rendering. How does Ninja Theory implement NAO32, and can it be considered 'real' HDR?
Marco: The FP16 and FP32 rendering formats give a developer the opportunity to collect per pixel information (respectively 8 and 16 bytes per pixel); hence they easily enable us to render and to store an HDR image. Unfortunately, these framebuffer formats are inherently slow because they require more memory bandwidth and increased memory space: an FP16 720p image with 4X anti-aliasing requires about 30 MBytes of memory!
At the same time it's important to understand that it does not matter how we store our HDR images so long as we find a way to encode them without losing too much information.
The RGB color space is not very efficient at encoding HDR images, so after a bit of research we found another color space that is far more efficient at representing HDR images. Its name is CIE Luv, and it splits a color into 3 components: one is not normalized and represents how intense a color is (luminance), while the other 2 components are normalized between 0 and 1.
Gregory Ward, a pioneer of HDR imaging, exploited this color space many years ago to store HDR images in a file format he called LogLuv, so we built upon that work and we customized it to our purposes.
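[Editor's note: the 30 MByte figure quoted earlier in this answer can be checked with simple arithmetic. The sketch below assumes a 1280x720 colour buffer in which every multisample is stored in full, and ignores depth buffers and any hardware compression; `framebuffer_bytes` is a hypothetical helper name.]

```python
def framebuffer_bytes(width, height, bytes_per_pixel, msaa_samples=1):
    """Size of a colour buffer when every MSAA sample is stored in full."""
    return width * height * bytes_per_pixel * msaa_samples

if __name__ == "__main__":
    formats = {"RGBA8": 4, "FP16": 8, "FP32": 16}   # bytes per pixel, four channels each
    for name, bpp in formats.items():
        size = framebuffer_bytes(1280, 720, bpp, msaa_samples=4)
        print(f"{name}: {size:,} bytes ({size / 2**20:.1f} MiB)")
    # FP16 at 720p with 4X AA: 29,491,200 bytes, i.e. the "about 30 MBytes" quoted above.
```

[An RGBA8 buffer at the same resolution and sample count takes exactly half that, which is the saving a 4 bytes per pixel HDR encoding is after.]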
PSINext: So NAO32 is a means for you to preserve memory while at the same time retaining image quality, is that correct?
Marco: Correct. The main idea behind NAO32 is that we want to trade shading power to regain memory space and bandwidth (very precious resources on a console). So instead of encoding our HDR colors into a FP16 or FP32 frame buffer, we devised a scheme to use RSX pixel shading units to convert an RGB color into a CIE Luv color that only requires a common RGBA8 frame buffer (4 bytes per pixel, half the space of a FP16 pixel) to be fully stored.
The quality of this format is really outstanding. Even if it uses half the space/bandwidth of common HDR rendering solutions, it really makes no compromises at all in image quality.
There's no magic here: HDR rendering costs are shifted from memory to shaders, and so our shaders are a bit longer now (between 3 and 5 cycles). We believe it's a very good trade-off. Furthermore, it enables HDR rendering and multisample anti-aliasing on GPUs that do not natively support AA with floating point render targets such as FP16 and FP32.
We also developed a faster 3 bytes per pixel version called NAO24 (predictable, isn't it?) that supports a narrower dynamic range with less accuracy. And although the quality was quite decent in most cases, we decided against making any compromises, and so in the end we did not use it.
My final answer is totally positive: NAO32 can be considered a real HDR format.
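[Editor's sketch, not from the interview: the exact NAO32 bit layout is not given here, so the code below only illustrates the general idea in the spirit of Ward's LogLuv: two normalized chromaticity values (CIE u', v') in one byte each, and the logarithm of the luminance in the remaining 16 bits. The RGB to XYZ matrix (sRGB primaries), the scale constants and the byte order are assumptions, and on the console this would run in a pixel shader rather than in Python.]

```python
import math

UV_SCALE = 410.0    # maps u', v' (both below ~0.63) onto 0..255, as Ward's LogLuv does
LOG_SCALE = 256.0   # 256 code values per doubling of luminance
LOG_BIAS = 64.0     # offset so the stored log2 luminance is non-negative

def rgb_to_xyz(r, g, b):
    """Linear RGB with sRGB primaries (assumed) to CIE XYZ."""
    return (0.4124 * r + 0.3576 * g + 0.1805 * b,
            0.2126 * r + 0.7152 * g + 0.0722 * b,
            0.0193 * r + 0.1192 * g + 0.9505 * b)

def xyz_to_rgb(x, y, z):
    """Inverse of rgb_to_xyz, to the precision of the published matrices."""
    return (3.2406 * x - 1.5372 * y - 0.4986 * z,
            -0.9689 * x + 1.8758 * y + 0.0415 * z,
            0.0557 * x - 0.2040 * y + 1.0570 * z)

def encode_logluv32(r, g, b):
    """Pack an HDR RGB colour into four bytes: (u', v', log-luminance high, low)."""
    x, y, z = rgb_to_xyz(max(r, 0.0), max(g, 0.0), max(b, 0.0))
    denom = x + 15.0 * y + 3.0 * z
    if y <= 0.0 or denom <= 0.0:
        return (0, 0, 0, 0)                      # all zeros is reserved for black
    u = 4.0 * x / denom                          # normalized chromaticity
    v = 9.0 * y / denom
    ue = min(255, max(0, round(UV_SCALE * u)))
    ve = min(255, max(1, round(UV_SCALE * v)))   # keep v' non-zero for decoding
    le = round(LOG_SCALE * (math.log2(y) + LOG_BIAS))
    le = min(65535, max(1, le))                  # unbounded luminance, stored as a log
    return (ue, ve, le >> 8, le & 0xFF)

def decode_logluv32(pixel):
    """Recover linear RGB from the four bytes written by encode_logluv32."""
    ue, ve, le_hi, le_lo = pixel
    le = (le_hi << 8) | le_lo
    if le == 0 or ve == 0:
        return (0.0, 0.0, 0.0)
    y = 2.0 ** (le / LOG_SCALE - LOG_BIAS)
    u, v = ue / UV_SCALE, ve / UV_SCALE
    x = y * 9.0 * u / (4.0 * v)
    z = y * (12.0 - 3.0 * u - 20.0 * v) / (4.0 * v)
    return xyz_to_rgb(x, y, z)

if __name__ == "__main__":
    for colour in [(4.0, 2.0, 0.5), (0.002, 0.001, 0.003), (9000.0, 7000.0, 100.0)]:
        packed = encode_logluv32(*colour)
        print(colour, "->", packed, "->", decode_logluv32(packed))
```

[With 256 steps per doubling, neighbouring luminance codes differ by under 0.3%, over a range far wider than any scene needs. The price is the extra arithmetic on every write and read, which is the memory-for-shader-cycles trade described above.]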
PSINext: The advantages of reduced space and preserved quality seem like they would have merit in a number of environments. Do you think there may be a place for NAO32 on the desktop, or even on Microsoft's or Nintendo's offerings?
Marco: Dunno about Nintendo's offering, but it might have merit on Microsoft's console if developers wanted something that takes the same storage space as an FP10 render target, but with a much higher level of quality. NAO32 on Xenos would cost developers shading power relative to FP10, however, and they would lose the ability to use the eDRAM for blending as well. So at this time, I believe something like NAO32 makes more sense on RSX than on Xenos.
PSINext: We've spoken here about a number of NAO32's advantages. Are there any notable drawbacks?
Marco: Yes, there are drawbacks too; it's not an all-win situation. GPUs usually support hardware blending in RGB color spaces, hence hardware assisted blending is not going to work on a NAO32 frame buffer - it would produce incorrect results.
There are various ways to overcome this limitation though, such as by emulating blending operations in a pixel shader, or performing blending on a FP16 render target and then composing the result… or even just blending in LDR in a common RGBA8 buffer.
It would be nice to have a GPU that natively supports a CIE Luv frame buffer though.
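[Editor's sketch, not from the interview: a small numeric illustration of why fixed-function blending gives wrong results on a log-encoded buffer, and of the "emulate blending in a pixel shader" workaround. Only the luminance channel is modelled, as a plain base-2 logarithm; the function names are hypothetical.]

```python
import math

def encode(lum):
    """Stand-in for the log-luminance part of a LogLuv-style encoding."""
    return math.log2(lum)

def decode(code):
    return 2.0 ** code

def lerp(a, b, t):
    return a + (b - a) * t

def hardware_blend(dst_code, src_code, alpha):
    """What fixed-function blending does: it mixes the stored codes directly."""
    return lerp(dst_code, src_code, alpha)

def shader_blend(dst_code, src_code, alpha):
    """Emulated blending: decode, mix in linear light, then re-encode."""
    return encode(lerp(decode(dst_code), decode(src_code), alpha))

if __name__ == "__main__":
    dark, bright = 1.0, 100.0
    expected = lerp(dark, bright, 0.5)                                # 50.5 in linear light
    naive = decode(hardware_blend(encode(dark), encode(bright), 0.5))
    emulated = decode(shader_blend(encode(dark), encode(bright), 0.5))
    print("expected:", expected)
    print("blending the encoded values:", naive)    # geometric mean of 1 and 100, about 10
    print("decode, blend, re-encode:", emulated)
```

[Mixing the stored logarithms yields the geometric mean of the two luminances rather than their weighted average, so a half-transparent bright surface comes out far too dark. The emulated path matches the linear result but has to read the destination pixel in the shader, which is why it costs more than hardware blending.]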
PSINext: If I may ask, what originally inspired the name NAO32?
Marco: I didn't name it, actually Dean did on Beyond3D. You should ask him! Internally, we used to call it "the funky color space."
PSINext: That's right, I remember that now that you mention it. NAO32 is the better name, for sure.
Straying from HDR for a bit before we finish up, now that Sony has announced the hard drive to be standard in every console, how do you feel this might affect game development in general, and Heavenly Sword in particular?
Marco: I believe it's a big opportunity for all developers to make better games. This generation the ratio between available memory and optical disc read speeds is much higher than what we had in the previous generation, and a standard hard drive is going to help us reduce loading times and give gamers a better experience.
Regarding HS, this announcement is not going to change our plans.
[It has been confirmed that Heavenly Sword will be making use of the hard drive]
PSINext: After getting to play the game at E3, it's clear that a number of intensive effects such as HDR - courtesy of NAO32 of course - and full soft shadowing are in place. What can you tell us about the levels of anti-aliasing and anisotropic filtering, if any, Ninja Theory is utilizing for Heavenly Sword?
Marco: Yep, shadowing is completely dynamic and everything can cast/receive shadows. Soft shadows are achieved by taking 12 jittered samples per pixel. Antialiasing is set to 4X (multisampling), but quality-wise it is not as good as it could be; we need to work on it, and hopefully it will improve over the next several months.
Anisotropic filtering is being used on some specific meshes (floors, walls, etc...) and AFAIK is set to 8x + trilinear.
PSINext: Have any internal resolution and frame-rate targets been set yet for HS?
Marco: Our target resolution for Heavenly Sword is 720p with 4x MSAA, which we've already achieved. The frame rate target is not something completely set in stone at this time. Though our E3 demo was running at over 30 frames per second, I'm willing to bet the final game will run at 30 FPS. Hopefully this will allow us to push even more effects on screen.
PSINext: As previously discussed, beyond its high quality, one of the primary reasons for the use of NAO32 is that it saves bandwidth in a bandwidth-hungry environment. In the future do you feel RSX will be at a disadvantage to Xenos when it comes to framebuffer effects due to the 128-bit bus and lack of eDRAM?
Marco: Not at all; in fact for many framebuffer effects I believe RSX will have an edge over Xenos. Don't want to go into details, but let me just point out that RSX is connected to two separate buses, not just one.
PSINext: After speaking with your Ninja colleagues Dean and Mike on the floor of E3, I've been told that Ninja Theory is presently using SPEs for tasks such as sound, physics, animation, and more. Since you're working on the engine, going forward what sort of performance headroom do you feel is present on Cell?
Marco: I can't give any numbers that would make much sense, but I'm certain at this time most developers are barely using Cell's power.
We all have to learn how this machine works and how to get the best out of it. It will take time, but gamers are going to experience some truly amazing stuff.
Very interesting article. I'm not a big fan of the game, but positive comments from an actual developer are always nice to read, if a bit biased (they are trying to sell a product).