The PNG Guide is an eBook based on Greg Roelofs' book, originally published by O'Reilly.

International Text Annotations (iTXt)

  • Status:   PNG Extensions [87]
  • Location:   anywhere
  • Multiple:   yes

[87] As this book went to press, the iTXt chunk had just been approved for inclusion in the core PNG specification, but it was temporarily placed in the PNG extensions document pending completion and approval of extensive ISO-related changes to the core spec. (Note that these changes are almost entirely of an organizational or editorial nature; the technical content of the specification is expected to change only minimally from version 1.1.) Version 1.2 of the PNG specification is expected around mid-1999 or later. In the meantime, iTXt can be found in version 1.1.1 (and possibly later versions) of the extensions document, which is available electronically from

I previously noted that, as of early 1999, PNG was in the midst of joint ISO/IEC standardization. One of the technical issues raised in the first Committee Draft vote was the lack of support for non-Western languages, specifically in the text chunks. In fact, the PNG Development Group had already discussed a more general text chunk in mid-1998, but its vote was deferred until there was external interest in it. The ISO comments from Japan and the United States clearly fell into the category of external interest, however, so the iTXt chunk was voted on and approved as part of the PNG specification in early February 1999.

The layout of iTXt is a generalization of tEXt and zTXt, as shown in Table 11-2.

Table 11-2. iTXt Chunk

Field                 Length and Valid Range
Keyword               1-79 bytes (Latin-1 text)
Null separator        1 byte (0)
Compression flag      1 byte (0, 1)
Compression method    1 byte (0)
Language tag          k bytes (ASCII text)
Null separator        1 byte (0)
Translated keyword    m bytes (Unicode UTF-8 text)
Null separator        1 byte (0)
Text                  n bytes (Unicode UTF-8 text)
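
To make the layout in Table 11-2 concrete, here is a minimal Python sketch of a parser for the data field of an iTXt chunk. It is an illustration only, not code from any PNG library: the names ITXt and parse_itxt are hypothetical, the input is assumed to be the chunk's data bytes alone (without the length, type, or CRC fields), and decompression is delegated to Python's standard zlib module.

```python
import zlib
from typing import NamedTuple


class ITXt(NamedTuple):
    """Decoded fields of one iTXt chunk (hypothetical container)."""
    keyword: str             # Latin-1, 1-79 bytes
    language_tag: str        # ASCII, may be empty
    translated_keyword: str  # UTF-8, may be empty
    text: str                # UTF-8


def parse_itxt(data: bytes) -> ITXt:
    """Split the data field of an iTXt chunk according to Table 11-2."""
    keyword, sep, rest = data.partition(b"\x00")
    if not sep or not 1 <= len(keyword) <= 79:
        raise ValueError("keyword missing, unterminated, or too long")

    if len(rest) < 2:
        raise ValueError("compression flag and method bytes are missing")
    flag, method = rest[0], rest[1]
    if flag not in (0, 1):
        raise ValueError("compression flag must be 0 or 1")
    if flag == 1 and method != 0:
        raise ValueError("only compression method 0 (zlib/deflate) is defined")

    language, sep, rest = rest[2:].partition(b"\x00")
    if not sep:
        raise ValueError("language tag is not null-terminated")

    translated, sep, text = rest.partition(b"\x00")
    if not sep:
        raise ValueError("translated keyword is not null-terminated")

    # The main text runs to the end of the chunk and has no terminator.
    if flag == 1:
        text = zlib.decompress(text)

    return ITXt(
        keyword.decode("latin-1"),
        language.decode("ascii"),
        translated.decode("utf-8"),
        text.decode("utf-8"),
    )
```

For example, parse_itxt(b"Title\x00\x00\x00fr\x00Titre\x00Caf\xc3\xa9") yields the keyword Title, the language tag fr, the translated keyword Titre, and the text Café. A real decoder would also want to enforce the keyword's character restrictions, which this sketch leaves out.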

The first field is a keyword, with exactly the same restrictions and officially registered values (Author, Description, and so on) as the tEXt and zTXt chunks. Latin-1 (ISO/IEC 8859-1) was chosen so that existing PNG source code could be used without modification to parse and optionally recognize the keyword.

The keyword is followed by a null separator byte and two compression-related bytes. The first indicates whether the main text is compressed (if its value is 1) or not (if it's 0). If the text is compressed, the next byte indicates its compression method, which currently must be zero for the zlib-encoded deflate algorithm. The two bytes could have been combined, but for historical reasons relating to the method byte in IHDR, the split approach was favored.

After the compression bytes is an optional case-insensitive field indicating the (human) language used in the remaining two text fields. This is necessary not only to render Unicode text properly but also to allow decoders to distinguish between multiple iTXt chunks, which may consist of the same text in different languages--but possibly identical keywords. Unlike both the keyword and the main text, the language tag is plain ASCII text (specifically, the ``invariant'' ASCII subset of ISO 646, which is itself a subset of both Latin-1 and Unicode UTF-8) conforming to Internet Standard RFC 1766. It consists of hyphen-separated ``words'' of between one and eight characters each, where the first word is either a two-letter ISO language code (ISO 639), the letter i for tags registered by the Internet Assigned Numbers Authority (IANA)[88] or the letter x for private tags. The second ``word'' is interpreted as an ISO 3166 country code if it is exactly two characters long or as an IANA-registered code if it is between three and eight characters. Subsequent ``words'' may be anything, as long as they conform to the general rules. Examples of language tags include zh (Chinese), en-US (American English), no-bok (Norwegian bokmål or ``book language''), i-navajo (Navajo), and x-klingon (Klingon, from the fictional Star Trek universe).
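
The syntactic rules for the language tag are simple enough to check mechanically. The following Python sketch tests only the shape described above (hyphen-separated words of one to eight ASCII letters, with a first word that is two letters long, i, or x); the function name is_plausible_language_tag is hypothetical, and the sketch makes no attempt to confirm that a code is actually registered with ISO or IANA.

```python
import re

# One "word" of a tag: one to eight ASCII letters, as in the RFC 1766 grammar.
_WORD = re.compile(r"[A-Za-z]{1,8}")


def is_plausible_language_tag(tag: str) -> bool:
    """Check the shape of an iTXt language tag, not its registration."""
    if tag == "":
        return True  # the field is optional, so an empty tag is allowed

    words = tag.split("-")
    if not all(_WORD.fullmatch(word) for word in words):
        return False

    # Tags are case-insensitive, so normalize before comparing.
    first = words[0].lower()
    return len(first) == 2 or first in ("i", "x")
```

With these rules, en-US, no-bok, i-navajo, and x-klingon are all accepted, while english (a first word that is neither two letters nor i or x) and en--US (an empty word) are rejected.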

[88] As this is written, indications are that IANA will eventually be replaced by ICANN, the Internet Corporation for Assigned Names and Numbers. This transition may not occur until 2000, however.

A null separator byte terminates the language tag, which is followed by an optional translation of the keyword into the given language. The translated keyword is represented in the UTF-8 encoding of the Unicode character set, which is described in the International Standard ISO/IEC 10646-1, in Internet RFC 2279, and in the Unicode Consortium's reference, The Unicode Standard. Like the primary keyword, it should not contain any newline characters, and it is also followed by a null byte.

The remaining chunk data is the main UTF-8 text, either zlib-compressed or not, according to the compression flag. Since its length can be determined from the chunk length, it is not null-terminated. As with the other two text chunks, newlines should be represented by single line-feed characters (decimal 10), and all other control characters (1-9, 11-31, and 127-159) are discouraged. Note, however, that UTF-8 encodings may contain any of the bytes between 128 and 159; what is discouraged is the set of Unicode characters whose four-byte integer values are 128-159.

That last point is confusing, so perhaps a quick primer on Unicode is in order. The Unicode character set is a mapping between graphic characters (or glyphs) and integers. The simplest representation is called UCS-4 and consists of 4-byte integers, potentially allowing more than two billion characters to be defined. On top of that are a number of possible transformations or encodings of the character set; UTF-8 is one of the more popular ones, encoding 4-byte UCS-4 characters into anywhere from 1 to 6 bytes. All Unicode characters below 128 are encoded as single bytes in UTF-8, and because Unicode characters 1-127 are identical to US-ASCII characters 1-127, the Unicode character set (and UTF-8 in particular) may be thought of as a very large superset of 7-bit ASCII.
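
A few concrete encodings may help. The Python sketch below prints the UTF-8 bytes for a handful of sample characters; the helper name utf8_bytes and the choice of samples are mine. Note that Python's built-in codec follows a later, narrower definition of UTF-8 that stops at 4-byte sequences, so longer forms cannot be demonstrated this way.

```python
def utf8_bytes(ch: str) -> list[int]:
    """Return the UTF-8 encoding of one character as decimal byte values."""
    return list(ch.encode("utf-8"))


# One sample for each sequence length that Python's codec can produce.
for ch in ["A", "\u00c0", "\u20ac", "\U00010348"]:
    print(f"U+{ord(ch):04X} -> {utf8_bytes(ch)}")

# Plain ASCII text is byte-for-byte identical in ASCII and in UTF-8.
assert "PNG".encode("ascii") == "PNG".encode("utf-8")
```

The character A (65) encodes as the single byte 65, while every byte of the longer sequences is 128 or greater.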

Multibyte UTF-8 encodings, on the other hand, are composed entirely of byte values between 128 and 253--which means that bytes 1-9, 11-31, and 127 will never be found in valid UTF-8-encoded text except when representing the characters 1-9, 11-31, and 127. So about half of the control characters that are discouraged in iTXt can be detected simply by checking for those single bytes. The remaining half, characters 128-159, are all encoded with 2-byte sequences that happen to begin with byte value 194: 194 128 through 194 159. The fact that character 128 is discouraged in iTXt's UTF-8 text fields therefore means that the 2-byte encoding 194 128 is discouraged, but the 2-byte encoding 195 128 (À or ``Latin capital letter A with grave accent'') is completely acceptable.
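
The byte-level test just described can be sketched in a few lines of Python. The function name find_discouraged is hypothetical, and the sketch assumes its input is already valid, uncompressed UTF-8; under that assumption byte 194 can only ever be the first byte of a 2-byte sequence, so looking at the byte that follows it is sufficient.

```python
def find_discouraged(data: bytes) -> list[int]:
    """Return the byte offsets of discouraged control characters.

    Assumes `data` is valid UTF-8 text taken from an iTXt chunk
    after any decompression.
    """
    offsets = []
    for i, byte in enumerate(data):
        if 1 <= byte <= 9 or 11 <= byte <= 31 or byte == 127:
            # Characters 1-9, 11-31, and 127 are always single bytes.
            offsets.append(i)
        elif byte == 194 and i + 1 < len(data) and 128 <= data[i + 1] <= 159:
            # Characters 128-159 are encoded as 194 128 through 194 159.
            offsets.append(i)
    return offsets
```

Given the bytes 120 194 128 121 (the letter x, character 128, and the letter y), the function reports offset 1. Given 195 128 (À), it reports nothing, and line feeds (decimal 10) are likewise left alone.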

Last Update: 2010-Nov-26