The PNG Guide is an eBook based on Greg Roelofs' book, originally published by O'Reilly.

Color Representation

Before we start putting chunks together, however, a brief interlude on the representation and terminology of color is useful. Color fundamentally refers to a property of light--namely, its wavelength. Each color in the rainbow, from red to purple, is a relatively pure strain of wavelengths of light, and none of these colors can be generated by adding together any of the others.[59] Furthermore, despite what our eyeballs would have us think, the spectrum does not end at deep purple; beyond that are the ultraviolet, X-ray, and gamma-ray domains. Nor does it end at dull red--smoke on the water glows in the infrared, if only we could see it, and still further down the spectrum are radio waves.[60] Each of these wavelength regions, from radio on up to gamma, is a color.

[59] Mathematically, this is known as orthogonality and is the basis for Fourier decomposition, among other things.

[60] It is probably no coincidence that the range of light visible to our water-filled orbs just happens to be the precise range of wavelengths that is not strongly absorbed by water.

So when someone refers to an RGB image--that is, containing only red, green, and blue values--as ``truecolor,'' what twisted logic lies behind such a claim? The answer lies not in physics but in physiology. Human eyes contain only three classes of color sensors, which trigger color sensations in the brain in ways that are not yet fully understood. One might guess that these sensors (the cones) are tuned to red, green, and blue light, but that turns out not to be the case, at least not directly. Instead, signals from the three types of cones are added and subtracted in various ways, apparently in more than one stage. The details are not especially important; what matters is that the end result is a set of only three signals going into the brain, corresponding to luminosity (or brightness), a red-versus-green intensity level, and a yellow-versus-blue level. In addition, the cones are not narrow-band sensors, but instead each responds to a broad range of wavelengths. The upshot is that the human visual system is relatively poor at analyzing colors, so feeding it different combinations of red, green, and blue light suffices to fool it into thinking it is seeing an entire spectrum. Keep in mind, however, that while true yellow and a combination of red and green may look identical to us, to spectrometers (or nonhuman eyes) they are quite different.

In fact, even printers ``see'' color differently. Since they employ pigments, which absorb light rather than emit it, the RGB color space that works so well for computer monitors is inappropriate. Instead, they use a ``dual'' color space based on cyan, magenta, yellow, and black, or CMYK for short.[61] And in video processing, television, and the JPEG image format, yet another set of color spaces is popular: YUV, YIQ, and YCbCr, all of which represent light as an intensity value (Y) and a pair of orthogonal color vectors (U and V, or I and Q, or Cb and Cr). All of these color spaces are beyond the scope of this book, but note that every single one of them has its basis in human physiology. Indeed, if YUV and its brethren sound quite a lot like the set of three signals going into the brain that I just discussed, rest assured that it's no coincidence. Not a single color space in common use today truly represents the full continuum of physical color.

[61] The K is for black. Since black is the preferred color for a huge class of printed material, including text, it is more efficient and considerably cheaper to use a single pigment for it than always to be mixing the other three. Some printing systems actually use five, six, or even seven distinct pigments.
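
Although these color spaces are outside the scope of this discussion, a small sketch may make the idea of ``an intensity value and a pair of color vectors'' more concrete. The Python below is an illustration only, not part of PNG. The function names are hypothetical, the YCbCr coefficients are the full-range ones commonly used with JPEG/JFIF files, and the CMYK formula is the naive textbook one that ignores real inks and color management entirely.

```python
def rgb_to_ycbcr(r, g, b):
    """Convert 8-bit RGB to 8-bit full-range YCbCr (JFIF-style coefficients).

    Y is a weighted sum of the three inputs (the intensity); Cb and Cr are
    blue-difference and red-difference values centered on 128.
    """
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b
    # Round to integers and clamp to the 8-bit range.
    return tuple(min(255, max(0, round(v))) for v in (y, cb, cr))


def rgb_to_cmyk_naive(r, g, b):
    """Naive, device-independent RGB -> CMYK, each result in 0.0..1.0.

    Assumption: ideal pigments. Real printers need color profiles.
    """
    if max(r, g, b) == 0:
        return (0.0, 0.0, 0.0, 1.0)  # pure black: use only the K pigment
    rf, gf, bf = r / 255, g / 255, b / 255
    k = 1 - max(rf, gf, bf)          # pull the shared darkness into K
    c = (1 - rf - k) / (1 - k)
    m = (1 - gf - k) / (1 - k)
    y = (1 - bf - k) / (1 - k)
    return (c, m, y, k)


if __name__ == "__main__":
    print(rgb_to_ycbcr(128, 128, 128))   # a neutral gray has no color offset
    print(rgb_to_cmyk_naive(255, 0, 0))  # red = magenta + yellow pigments
```

Note how a neutral gray ends up with both color components at the midpoint value of 128, so that only Y carries information; this is the sense in which the intensity is separated from the color.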

Finally, note that image files may represent the appearance of a scene not only as a self-contained item, but also in reference to a background or to other images or text. In particular, transparency information is often desirable. The simplest approach to transparency in computer graphics is to mark a particular color as transparent, but more complex applications will generally require a completely separate channel of information. This is known as an alpha channel (or sometimes an alpha mask) and enables the use of partial transparency, such as is often used in television overlays. In the text that follows, I will refer to an RGB image with an alpha channel as an RGBA image. PNG adheres to the usual convention that alpha represents opacity; that is, an alpha value of 0 is fully transparent, and the maximum value for the pixel depth is completely opaque. PNG also uses only unassociated alpha, wherein the actual gray or color values are stored unchanged and are only affected by the alpha channel at display time. The alternative is associated or premultiplied alpha, in which the pixel values are effectively precomposited against a black background; although this allows slightly faster software compositing, it amounts to a lossy transformation of the image data and was therefore rejected in the design of PNG.
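
The difference between unassociated and premultiplied alpha can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not code from any PNG library: the function names are hypothetical, samples are plain integers, the default depth is 8 bits, and gamma is ignored.

```python
def composite(fg, alpha, bg, depth=8):
    """Blend an unassociated-alpha pixel over an opaque background.

    alpha == 0 is fully transparent; alpha == 2**depth - 1 is fully opaque.
    """
    max_val = (1 << depth) - 1
    # Integer arithmetic with rounding to the nearest sample value.
    return tuple((f * alpha + b * (max_val - alpha) + max_val // 2) // max_val
                 for f, b in zip(fg, bg))


def premultiply(fg, alpha, depth=8):
    """Scale the color samples by alpha (i.e., precomposite against black)."""
    max_val = (1 << depth) - 1
    return tuple((f * alpha + max_val // 2) // max_val for f in fg)


def unpremultiply(pm, alpha, depth=8):
    """Try to recover the original samples from premultiplied ones."""
    max_val = (1 << depth) - 1
    if alpha == 0:
        return (0,) * len(pm)  # the original color is simply gone
    return tuple(min(max_val, (p * max_val + alpha // 2) // alpha) for p in pm)


if __name__ == "__main__":
    red, blue = (255, 0, 0), (0, 0, 255)
    print(composite(red, 255, blue))   # opaque: foreground wins
    print(composite(red, 0, blue))     # transparent: background shows
    print(composite(red, 128, blue))   # roughly half and half

    original = (200, 100, 50)
    stored = premultiply(original, 3)  # nearly transparent pixel
    print(stored, unpremultiply(stored, 3))
```

In the last example, a nearly transparent pixel of (200, 100, 50) is stored as (2, 1, 1) once premultiplied, and dividing the alpha back out yields (170, 85, 85) rather than the original values. With unassociated alpha the color samples are stored untouched, so no such loss occurs.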

Last Update: 2010-Nov-26