Publisher
Joined: 4/23/2009
Posts: 1283
Since Warp asked for this, here goes. These are samples from when we were discussing in the IRC channel the different ways one can re-sample the color when it is lost. I'll go in order and then explain all of them. From top to bottom:

- Original RGB
- Convert to YV12 with Lanczos re-sample, then convert back to RGB with Lanczos re-sample
- Convert to YV12 with Lanczos re-sample, then convert back to RGB with Point re-sample
- Convert to YV12 with Lanczos re-sample, then convert back to RGB with the Windows default (my GPU?) re-sample
- Convert to YV12 with Point re-sample, then convert back to RGB with Lanczos re-sample
- Convert to YV12 with Point re-sample, then convert back to RGB with Point re-sample
- Convert to YV12 with Point re-sample, then convert back to RGB with the Windows default (my GPU?) re-sample

The reason there is a conversion back to RGB is that our displays only output RGB, so the conversion is done automatically regardless; but I have a way to force a method, which is why I do it here to show the difference. Note: view each picture in full screen to see it better.
Player (142)
Joined: 7/16/2009
Posts: 686
EDIT: I uploaded a picture to demonstrate the differences (it had all the images underneath each other), but it was scaled down on upload for some silly reason and thus came out very blurry. So, something useful: the Sonic pictures show the difference, but that is during a very jagged level transition (or something like it); is there any noticeable difference during normal gameplay?
Publisher
Joined: 4/23/2009
Posts: 1283
Scepheo wrote:
Honestly, I don't see any difference. At all.
Yeah, I forgot to mention: view them in full screen to see it better. =p
Active player, Editor (296)
Joined: 3/8/2004
Posts: 7468
Location: Arzareth
Scepheo wrote:
Honestly, I don't see any difference. At all.
Your image is scaled-down and thus irrelevant. In the original one, take a careful look at the ⑤ ball. Also, the power meter.
Publisher
Joined: 4/23/2009
Posts: 1283
Bisqwit wrote:
Your image is scaled-down and thus irrelevant.
It is also all one image, which makes it hard to compare at full screen.
Joined: 7/2/2007
Posts: 3960
What you should do to compare is open one image in its own window, and then a second image in that window, and flip back and forth between them. The differences are subtle enough that you can't easily see them in Aktan's post, but they're there.
Pyrel - an open-source rewrite of the Angband roguelike game in Python.
Banned User
Joined: 3/10/2004
Posts: 7698
Location: Finland
I was more interested in seeing snapshots of OoT, because that was the topic of the discussion, but anyway:

I don't know if it's just me, my monitor, or a combination, but I just can't see a visible difference between the images, even when zoomed. The differences are probably there (and could probably be corroborated by checking the actual pixel component values), I don't doubt that, but I just can't see any. If there is a difference, it's practically indiscernible. And this was with an example image which ought to show any coloration difference best. I can only imagine that with something like OoT any difference is going to be even less visible.

So my question is: why such a huge worry about a possible minuscule difference in coloration which nobody is going to notice even if you put the original and the transformed images side by side (much less if they look at the transformed image all by itself)? Everybody talks about the H.264 lossless compression as if it were changing all the colors around, turning bright blues into aquamarines and reds into magentas. I just don't see that.
Publisher
Joined: 4/23/2009
Posts: 1283
Derakon wrote:
What you should do to compare is open one image in its own window, and then a second image in that window, and flip back and forth between them. The differences are subtle enough that you can't easily see them in Aktan's post, but they're there.
Here is Sonic, which shows it much better. Same order as before:
Publisher
Joined: 4/23/2009
Posts: 1283
Warp wrote:
I don't know if it's just me, my monitor, or a combination, but I just can't see a visible difference between the images, even when zoomed.
Check the sonic ones =D
Joined: 11/11/2006
Posts: 1235
Location: United Kingdom
Check the reds in the images. They get hit the worst. (Compare Sonic's fire shield in pics 1 and 2.)
<adelikat> I am annoyed at my irc statements ending up in forums & sigs
Banned User
Joined: 3/10/2004
Posts: 7698
Location: Finland
Aktan wrote:
Check the sonic ones =D
Doesn't H.264 always perform the YUV transformation before encoding, regardless of which compression method is used? Wouldn't that mean that the above artifact would show up in a regular published encode as well? Does it?
Publisher
Joined: 4/23/2009
Posts: 1283
Warp wrote:
Doesn't H.264 always perform the YUV transformation before encoding, regardless of which compression method is used? Wouldn't that mean that the above artifact would show up in a regular published encode as well? Does it?
It all depends on how the encoder chooses to re-sample to YUV (whether they know it or not). Capturing to lossless H.264 basically means you let the converter inside the encoder choose whatever it wants. Luckily, it doesn't choose the fastest way possible, which is Point re-sampling; if it did, it would show up in the published Sonic 3 encode.

The point of an RGB raw is that the person encoding then gets to choose a re-sample to their liking. For example, I'd rather choose Lanczos re-sampling, while other people might like a different one (Bisqwit actually prefers Point re-sampling for Lunar Ball, though I doubt he'd prefer it for Sonic =D). Another reason is that the encoder has more flexibility to do whatever he needs on the encode, such as adding a middle border in DS games. The main thing is: the conversion to YUV should be done last, just before your final output, not at the source level. You can feed x264 a YV12 stream, and it won't do another YV12 conversion.
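To make that last point concrete, here is a minimal sketch (just an illustration, not my actual tooling; it assumes JFIF-style full-range conversion constants and 2x2 chroma averaging) that does the RGB to 4:2:0 conversion itself and hands x264 a file it will not convert again:

import numpy as np

def write_y4m_420(rgb, path):
    """Convert one RGB frame (H x W x 3, uint8, even dimensions) to 4:2:0
    and write a one-frame YUV4MPEG file, which x264 reads without doing
    any further colorspace conversion of its own."""
    r, g, b = (rgb[..., i].astype(np.float64) for i in range(3))
    # JFIF/BT.601 full-range constants (an assumption, for illustration)
    y  =  0.299 * r + 0.587 * g + 0.114 * b
    cb = -0.169 * r - 0.331 * g + 0.500 * b + 128.0
    cr =  0.500 * r - 0.419 * g - 0.081 * b + 128.0
    h, w = y.shape
    # The encode-side choice under discussion: here, average each 2x2
    # block ("area"); Point would instead take one pixel of the four.
    cb = cb.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))
    cr = cr.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))
    with open(path, "wb") as f:
        f.write(b"YUV4MPEG2 W%d H%d F30:1 Ip A1:1 C420\n" % (w, h))
        f.write(b"FRAME\n")
        for plane in (y, cb, cr):
            f.write(plane.round().clip(0, 255).astype(np.uint8).tobytes())

# Hypothetical usage; x264 then encodes the planes as-is:
#   write_y4m_420(frame, "frame.y4m")
#   x264 --qp 0 -o frame.mkv frame.y4m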
Skilled player (1305)
Joined: 9/7/2007
Posts: 1354
Location: U.S.
The second Sonic picture looks the best imo.
Post subject: Magnification of details (Lunar Ball version)
Active player, Editor (296)
Joined: 3/8/2004
Posts: 7468
Location: Arzareth
Warp wrote:
I don't know if it's just me, my monitor, or a combination, but I just can't see a visible difference between the images, even when zoomed.
Make sure you compare each picture to the one vertically next to it. The two horizontally adjacent images are completely redundant. Aktan forgot to deobfuscate his post. Here is a magnification of the interesting bits in his images. I have also included a grayscale version of each image, to show how the luma is affected (or not affected) by each conversion.

Original material (no artifacts)

Encoding: RGB→YV12 using Lanczos
Decoding: YV12→RGB using Lanczos (high-quality software)
Comment: The spheres are very round and pleasant to look at, but there are ringing artifacts where the chroma changes are most dramatic (the 5 and 6 balls look like they're floating on jello).

Encoding: RGB→YV12 using Lanczos
Decoding: YV12→RGB using nearest-neighbor (software default)
Comment: There are extra grays and the spheres appear blockier. The 3 ball has grown a ghost image. The 5 ball has changed shape entirely. Even the wall has grown a shadow for no reason at all.

Encoding: RGB→YV12 using Lanczos
Decoding: YV12→RGB using the Windows default (maybe hardware-accelerated, maybe software quadratic scaling)
Comment: The difference to Lanczos-Lanczos is almost unnoticeable. The ringing artifacts are smaller, though.

Encoding: RGB→YV12 using nearest-neighbor
Decoding: YV12→RGB using Lanczos (high-quality software)
Comment: This combination smudges stuff profusely. The smudging still only happens on chroma; the luma (luminance, which is what the grayscale image shows) is unaffected and appears as in the original.

Encoding: RGB→YV12 using nearest-neighbor
Decoding: YV12→RGB using nearest-neighbor (software default)
Comment: The effect of halving the chroma resolution is very noticeable. Some pixels are simply the wrong color. The score meter is very blocky. But there are no ringing artifacts whatsoever. This is the picture to look at if you want to learn what chroma subsampling ― using the same chroma for every 2x2 pixel group ― means and what its practical consequences are. All the others use mathematical filter trickery to attempt to make the fact less noticeable.

Encoding: RGB→YV12 using nearest-neighbor
Decoding: YV12→RGB using the Windows default (maybe hardware-accelerated, maybe software quadratic scaling)
Comment: Better than the above two. There is an artifact beside the number "5" that does not exist in the Lanczos-encoded versions. The power meter edges are very blurry.

I am actually surprised that the grayscale images did not turn out completely identical. Wasn't the point of these filters to scale the chroma, not the luma? In retrospect, this is probably an artifact of RGB conversions; there are YUV values that cannot be expressed losslessly in RGB, so additional brightness is "loaned" from other color components by decreasing the saturation. Or something. Waiting for a better explanation.

SUMMARY:
Encode    Decode     Prob*   Appearance & score
------------------------------------------------
Lanczos   Lanczos    10%     Good       (9pts)
Lanczos   Point      40%     Horrible   (1pts)
Lanczos   Something  50%     Good       (9pts)
Point     Lanczos    10%     Horrible   (1pts)
Point     Point      40%     Tolerable  (6pts)
Point     Something  50%     Good       (9pts)

Prob* = estimated probability of being decoded this way

As an encoder, you can only control how you encode stuff. You cannot control what will be used to decode it. You should maximize the chances of it looking good when decoded.
Summary of the consequences of your choices according to this chart:

Lanczos: 60% * 9pts + 40% * 1pts              = 5.8pts
Point:   50% * 9pts + 40% * 6pts + 10% * 1pts = 7.0pts
I would also like to see the encoding case of "average of four pixels"; it is what MEncoder does. It would also change the Sonic case dramatically: with nearest-neighbor, I think you get a sort of blinking between dramatically different chroma values. But lacking that option, of this selection, I recommend nearest-neighbor scaling in the encoding phase.
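For reference, a toy sketch of the difference between those two encode-side choices (assuming JFIF-style constants; this is not MEncoder's actual code). Nearest-neighbor hands the whole 2x2 block whichever pixel's chroma sits top-left, so shifting the pattern by one pixel flips the block's color completely (the blinking), while the average of the four pixels is stable under such shifts:

def rgb_to_yuv(r, g, b):
    # JFIF-style RGB -> (Y, Cb, Cr), with Cb/Cr centered on 0
    y  =  0.299 * r + 0.587 * g + 0.114 * b
    cb = -0.169 * r - 0.331 * g + 0.500 * b
    cr =  0.500 * r - 0.419 * g - 0.081 * b
    return y, cb, cr

# A 2x2 block alternating between two very different colors,
# the way dithered pixel art often does:
block = [(255, 0, 0), (255, 255, 255),
         (255, 255, 255), (255, 0, 0)]
yuv = [rgb_to_yuv(*px) for px in block]

# Nearest-neighbor: the whole block gets the top-left pixel's chroma.
point_cb, point_cr = yuv[0][1], yuv[0][2]

# Average of four pixels: every pixel contributes equally.
avg_cb = sum(p[1] for p in yuv) / 4
avg_cr = sum(p[2] for p in yuv) / 4

print(round(point_cb, 1), round(point_cr, 1))  # -43.1 127.5: fully red
print(round(avg_cb, 1), round(avg_cr, 1))      # -21.5  63.8: halfway to gray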
Post subject: Re: Magnification of details (Lunar Ball version)
Joined: 2/19/2010
Posts: 248
Bisqwit wrote:
I am actually surprised that the grayscale images did not turn out completely identical. Wasn't the point of these filters to scale the chroma, not the luma? In retrospect, this is probably an artifact of RGB conversions; there are YUV values that cannot be expressed losslessly in RGB, so additional brightness is "loaned" from other color components by decreasing the saturation. Or something. Waiting for a better explanation.
I suspect your greyscale images are not accurate representations of the luma channel.

Luma usually refers to a weighted average of RGB. In JPEG (more accurately, in the JFIF container), luma (Y) is:

Y = 0.299 R + 0.587 G + 0.114 B

though other constants are possible. Note that 0.299 + 0.587 + 0.114 == 1.

Greyscale usually refers to a simple average of RGB. Using the letter X to represent greyscale: X = R/3 + G/3 + B/3. If you simply generate a "greyscale" image, this is what you get in most image editing programs.

If the image is optimised to preserve luma, it will not preserve greyscale, and vice versa; i.e. chroma lossage will be noticeable in simple greyscale images.

YUV luma uses this weighted average because our eyes (specifically the cones, which provide daylight color vision) are most sensitive at green wavelengths and least sensitive at blue wavelengths.

EDIT: to answer the obvious followup question, "How do I extract the luma channel from a YV12 image?": I don't know, sorry :( I'm just a theory guy, I don't know the tools.
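A quick sketch of how far apart the two can be (plain Python, constants from the formula above):

def luma(r, g, b):
    # Weighted average: what the Y plane of YUV preserves
    return 0.299 * r + 0.587 * g + 0.114 * b

def greyscale(r, g, b):
    # Simple average: what a naive greyscale conversion computes
    return (r + g + b) / 3

# Pure green and pure blue: identical naive greyscale, wildly different luma.
print(luma(0, 255, 0), greyscale(0, 255, 0))  # 149.685 85.0
print(luma(0, 0, 255), greyscale(0, 0, 255))  # 29.07   85.0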
Publisher
Joined: 4/23/2009
Posts: 1283
Bisqwit, very nicely done post. While I respect your conclusion that point re-sampling is more likely to produce good output, I can't help but think that if we all used point re-sampling, we would see problems like those in the three point-re-sampled-to-YV12 Sonic pictures. Since, as you know, point just takes the color of one pixel out of the four in a 2x2 pixel block, what was shown in the Sonic pictures can happen quite often in pixel art.
Post subject: Re: Magnification of details (Lunar Ball version)
Banned User
Joined: 3/10/2004
Posts: 7698
Location: Finland
Bisqwit wrote:
The two horizontally adjacent images are completely redundant. Aktan forgot to deobfuscate his post.
That's what confused me. It's much clearer now.
Publisher
Joined: 4/23/2009
Posts: 1283
That's a reasonable explanation of Bisqwit's findings, rhebus! Thanks for the information.
Active player, Editor (296)
Joined: 3/8/2004
Posts: 7468
Location: Arzareth
Aktan wrote:
That's a reasonable explanation of Bisqwit's findings, rhebus! Thanks for the information.
I tried both Imagemagick's -type grayscale and -colorspace gray options, and they both gave the same result, which according to Imagemagick documentation, is calculated with that Y formula (in both cases). -fx (R+G+B)/3 gives the other grayscale result, which was not shown here. So no, that was a good theory, but not the right explanation. The real reason is still unknown.
Active player, Editor (296)
Joined: 3/8/2004
Posts: 7468
Location: Arzareth
I decided to go and do independent tests with the same basic idea. The test results are here, as is the shell script that I used to run ffmpeg to create all these images. http://bisqwit.iki.fi/kala/yv12scaledemo/ Further comments I wrote in IRC while nobody seemed to be listening:
For some reason, I'm not getting the artifacts you do on Sonic. http://bisqwit.iki.fi/kala/yv12scaledemo/ I have tried offsetting the Sonic image by a pixel or two in every direction, but no matter what I try, I cannot seem to get the horribly red-distorted version you have, Aktan, regardless of the choice of filters. I think there's something wrong in your setup instead. Maybe you converted the images back and forth a few times in succession? It would explain the much heavier ringing artifacts than in my results, too...
I used ffmpeg for both conversion phases. The input (RGB) is a PNG file; it is converted into a single-frame AVI with the FFV1 codec, which uses YUV420p (4:2:0) by default while being lossless in all other aspects, and then back into a PNG file, which requires an RGB conversion. YUV420p is what H.264 (and thus x264) also uses, and is thus relevant here.*

*) MEncoder does output different messages when encoding into FFV1 and when encoding into x264:

SwScale: scaling 162x37 RGB 24-bit to 162x38 Planar YV12 (FFV1)
SwScale: scaling 162x37 RGB 24-bit to 162x38 Planar I420 (X264)

However, further study reveals that the only practical difference between these two is how the data is stored; the pixel transformations are exactly the same. I420 planes are numbered 0,1,2; YV12 planes are numbered 0,2,1.

EDIT: Found the way to reintroduce those artifacts. So I did, and I fixed bugs, and updated the score table. The winner is: gauss.
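For anyone who wants to repeat the round trip without the full script, a rough sketch (not the script itself; the flag names are from a current ffmpeg build and may differ from the build used here):

import subprocess

def roundtrip(src_png, out_png, enc_filter="lanczos", dec_filter="lanczos"):
    # PNG -> single-frame lossless FFV1 AVI in 4:2:0 -> PNG,
    # forcing the swscale filter used for each colorspace conversion
    subprocess.run(["ffmpeg", "-y", "-i", src_png,
                    "-sws_flags", enc_filter,   # RGB -> YUV420p filter
                    "-c:v", "ffv1", "-pix_fmt", "yuv420p",
                    "step.avi"], check=True)
    subprocess.run(["ffmpeg", "-y", "-i", "step.avi",
                    "-sws_flags", dec_filter,   # YUV420p -> RGB filter
                    out_png], check=True)

# e.g. the nearest-neighbor encode / Lanczos decode case:
# roundtrip("original.png", "result.png", "neighbor", "lanczos")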
Publisher
Joined: 4/23/2009
Posts: 1283
Good job on the extra testing! It is weird that you don't get the same effect on Sonic. Here are two zoomed-in images that really show point re-sampling at work, meaning the top-left pixel color in each 2x2 box is taken. I've put URLs instead since the images are so large: http://img826.imageshack.us/img826/1094/18919974.png http://img202.imageshack.us/img202/8847/69921030.png
Joined: 2/19/2010
Posts: 248
Bisqwit wrote:
I tried both Imagemagick's -type grayscale and -colorspace gray options, and they both gave the same result, which according to Imagemagick documentation, is calculated with that Y formula (in both cases). -fx (R+G+B)/3 gives the other grayscale result, which was not shown here. So no, that was a good theory, but not the right explanation. The real reason is still unknown.
Thanks for checking this. It looks like you're entirely correct; and it makes sense that my explanation was wrong, because it doesn't explain why luma is distorted in some places but not others.

I have another theory: the RGB colour cube is embedded in a larger YUV colour cube. This means that there are YUV values which do not give sensible RGB values; for example, when you plug the numbers into the formulae, you get negative RGB values. Perhaps this is causing issues?

Let's study your nearest-neighbour example, because it's conceptually the simplest. AIUI, nearest neighbour tries to maintain Y values and to pick one UV pair and share it between all pixels in a 2x2 block. But that can lead to YUV values out of the RGB range. Looking at the lower-right quadrant of the 5-ball, we find that two different pixel values want to share the same 2x2 block: the pink of the 5-ball (228,0,88) and the green of the "felt" (0,68,0). We decide we will take the UV values from the pink, because it's in the top-left, and leave the Y values as they are. What happens to the green? (I use the formula from my previous post; Y \in 0..255, U,V \in -128..127.)
Pink  (228,0,88) ->YUV-> (78,6,107)
Green (0,68,0)   ->YUV-> (40,X,X)
Resultant YUV: (40,6,107)
Resultant RGB: (190,-38,50)
Clamping that impossible negative green component to zero gives (190,0,50): exactly the value that results in the image. And clamping is precisely the kind of operation that doesn't preserve luma; the clamped pixel's luma works out to about 63 instead of the original 40, which is the sort of distortion the grayscale images show. So there's my prediction: problems will occur at places where the YUV colour cube doesn't map back to RGB; this happens at extremes of chroma and luma. So it doesn't happen in the shot power bar, because the luma is pretty much bang in the middle of the range there; it doesn't happen at the top-left of the 5-ball, because the pink of the 5-ball is of pretty average luma; but it *does* happen at the bottom-right of the 5-ball (and of most other balls), because the high-chroma pink of the 5-ball combines with the low-luma dark green of the felt.
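A small sketch to check that arithmetic (JFIF full-range constants assumed, matching the formula above; values rounded):

def rgb_to_yuv(r, g, b):
    y = 0.299 * r + 0.587 * g + 0.114 * b
    u = -0.169 * r - 0.331 * g + 0.500 * b   # Cb, centered on 0
    v = 0.500 * r - 0.419 * g - 0.081 * b    # Cr, centered on 0
    return y, u, v

def yuv_to_rgb(y, u, v):
    return (y + 1.402 * v,
            y - 0.344 * u - 0.714 * v,
            y + 1.772 * u)

y_pink, u_pink, v_pink = rgb_to_yuv(228, 0, 88)  # ~ (78, 5.5, 106.9)
y_green, _, _ = rgb_to_yuv(0, 68, 0)             # ~ 40

# The green pixel keeps its own luma but inherits pink's chroma:
r, g, b = yuv_to_rgb(y_green, u_pink, v_pink)    # ~ (190, -38, 50)

# Per-component clamping, which is what most converters do:
clamped = tuple(min(255, max(0, round(c))) for c in (r, g, b))
print(clamped)                  # (190, 0, 50): the value seen in the image
print(rgb_to_yuv(*clamped)[0])  # luma ~62.5, up from the original ~40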
Active player, Editor (296)
Joined: 3/8/2004
Posts: 7468
Location: Arzareth
rhebus wrote:
My guess is that some hackery maps this "fake" RGB value into the real RGB value that results in the image (190,0,50) in some way which doesn't preserve luma -- and at least one of luma or chroma must be diddled to get back to real RGB values.
That is pretty much my theory too. YUV can express colors that are not possible to express in RGB. For example, a fully saturated red that is brighter than 30% of the brightness of the brightest white. In RGB, if you try to increase the brightness of #FF0000 (assuming 8 bits per channel), you lose saturation.

Which means: if you've got a fully saturated green (#00FF00, brightness 59% of #FFFFFF), i.e. luma = 59%, and you change the chroma into fully saturated red, you've got a YUV that cannot be converted losslessly into RGB. The best you can get is:

- #FF7817 (RGB leak with Y weighting; preserves luma (59%), but becomes noticeably yellow), or
- #FF4BFF (RGB leak with Y weighting; also preserves luma (59%), but becomes noticeably purple), or
- #FF6969 (non-discriminating RGB leak with Y weighting; luma = 59%; some saturation lost), or
- #FF2424 (simple RGB leak calculation; luma = only 40%; some saturation lost too), or
- #FF0000 (preserves chroma, but luma is only 30%; this is the clamping conversion, which is what most software does).

EDIT: Here's an image I created to illustrate this better. It has the basic SMPTE color bars*. Vertical axis: luma; horizontal axis: saturation (used to multiply the chroma). Eight different saturation values are shown. A pair of black circles marks the interval on each color bar outside which the RGB conversion is lossy. Values between the circles are accurate representations; values outside have had their saturation reduced in order to fit into the RGB gamut. The grayscale conversion shows that luma is preserved in all values. Gamma = 1.6.

EDIT 2: In the above image set, I went to painstaking efforts to preserve the luma in the unrepresentable colors. Here is what most software does instead, i.e. clamping: To my surprise, the same also happens in the upper region. Once the luma becomes small enough, too stark chroma values will cause RGB components to go negative. When only simple clamping is done, these colors will appear anything but black. I am not sure what to make of this. On the other hand, the color distribution in these graphs correlates with the color of the noise patterns seen on badly tuned TV receivers…

*) Not exactly right. The colors are the same, but the real SMPTE color bars are 100% saturated color bars in which the RGB components are always either 0 or 75%. Consequently, the luma varies. They are sorted in order of increasing luma.
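To show the two strategies in code, a sketch (JFIF-style constants and no gamma handling assumed, so the exact values won't match the gamma-1.6 renderings above): scaling the chroma toward gray until the color fits the RGB cube preserves luma exactly, while plain per-component clamping does not.

def yuv_to_rgb(y, u, v):
    return (y + 1.402 * v, y - 0.344 * u - 0.714 * v, y + 1.772 * u)

def luma(r, g, b):
    return 0.299 * r + 0.587 * g + 0.114 * b

def in_gamut(rgb):
    return all(0 <= c <= 255 for c in rgb)

def desaturate_to_fit(y, u, v, steps=1000):
    # Shrink the chroma toward gray until we reach the RGB cube;
    # the luma is untouched by construction.
    for i in range(steps, -1, -1):
        rgb = yuv_to_rgb(y, u * i / steps, v * i / steps)
        if in_gamut(rgb):
            return tuple(round(c) for c in rgb)

def clamp(y, u, v):
    return tuple(min(255, max(0, round(c))) for c in yuv_to_rgb(y, u, v))

# Fully saturated green's luma (59%) with fully saturated red's chroma:
y = luma(0, 255, 0)                   # ~150 of 255, i.e. 59%
u, v = -0.169 * 255, 0.500 * 255      # the chroma of pure red

fitted = desaturate_to_fit(y, u, v)
clamped = clamp(y, u, v)
print(fitted, round(luma(*fitted)))    # (255, 105, 105), luma ~150: preserved
print(clamped, round(luma(*clamped)))  # (255, 73, 73),   luma ~127: distorted

Note that the chroma-scaled result, (255, 105, 105), is #FF6969: the non-discriminating leak variant from the list above.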
Post subject: Re: RGB and YUV conversions
Banned User
Joined: 3/10/2004
Posts: 7698
Location: Finland
Bisqwit wrote:
YUV can express colors that are not possible to express in RGB.
This is a complete nitpick, because I and everybody else know perfectly well what you are talking about, but since I love nitpicking, let me correct that statement a bit: YUV can express colors that are not possible to express in RGB which uses a limited gamut.

The problem is not RGB per se, but the limited accuracy of its representation in most image formats and display systems. Namely, most of these formats use a linear mapping between color components and (typically) an 8-bit or 16-bit integer. This not only limits the maximum (and obviously minimum) value of each component, but also introduces quantization artifacts (which can sometimes even be visible, e.g. as banding in slow grayscale gradients).

There are RGB-based color spaces which have no such limitations, typically using either larger integers or floating-point numbers for each color component, and which can go far beyond the upper (and lower) limits of the traditional RGB-based image formats. scRGB is one example of such a color space.

Of course this is completely theoretical stuff in this context, because we have to limit ourselves to the traditional limited-gamut RGB when encoding and viewing videos.
Post subject: Re: Independent tests
Joined: 11/11/2006
Posts: 1235
Location: United Kingdom
Bisqwit wrote:
I decided to go and do independent tests with the same basic idea. The test results are here, as is the shell script that I used to run ffmpeg to create all these images. http://bisqwit.iki.fi/kala/yv12scaledemo/
This is an exceptionally good comparison. Thank you very much for doing it, Bisqwit. Guess I'll be using Area when possible from now on.
<adelikat> I am annoyed at my irc statements ending up in forums & sigs