The PNG Guide is an eBook based on Greg Roelofs' book, originally published by O'Reilly.



Text Chunks

The other item worth looking at is the interactive text-entry code. Most windowing systems will have more elegant ways to read in text than I use here, but even they should ensure that the entered text conforms to the recommended format for PNG text chunks.

PNG text is required to use the Latin-1 character set; strictly speaking, that does not restrict the use of control characters (character code 127 and any code below 32 decimal), but in practice only line feeds (code 10) are necessary. The use of carriage-return characters (code 13) is explicitly discouraged by the spec in favor of single line feeds; this has implications for DOS, OS/2, Windows, and Macintosh systems. Horizontal tabs (code 9) are discouraged as well since they don't display the same way on all systems, but there are legitimate uses for tabs in text.

The section of the spec dealing with security considerations implicitly recommends against the use of the escape character (code 27), which is commonly used to introduce ANSI escape sequences. Since these can include potentially malicious macros, encoders should restrict the use of the escape character for the sake of overly simple-minded decoders.

That leaves codes 9, 10, 32-126, and 160-255 as valid from a practical standpoint, with use of the first (tab) discouraged. Note that codes 128-159 are not valid Latin-1 characters, at least not in the printable sense. They are reserved for specialized control characters.
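The carriage-return advice suggests normalizing line endings before text is stored. The following is only a minimal sketch of one way to do that; normalize_newlines() is a hypothetical helper, not part of wpng, and it assumes the text is an ordinary NUL-terminated string that may be modified in place.

```c
#include <stddef.h>

/* Hypothetical helper (not part of wpng): convert CR LF pairs and lone
 * CRs to single LFs, in place.  Returns the new string length, not
 * counting the terminating NUL. */
static size_t normalize_newlines(char *s)
{
    size_t in = 0, out = 0;

    while (s[in] != '\0') {
        if (s[in] == '\r') {
            s[out++] = '\n';        /* a CR becomes an LF */
            if (s[in + 1] == '\n')
                ++in;               /* swallow the LF of a CR LF pair */
        } else {
            s[out++] = s[in];       /* everything else is copied as is */
        }
        ++in;
    }
    s[out] = '\0';
    return out;
}
```

Because the output is never longer than the input, the conversion can safely reuse the same buffer.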

The specification also recommends that lines in each text block be no more than 79 characters long; I've chosen to restrict mine to 72 characters each, plus provide for one or two newline characters and a trailing NUL. The spec does not specifically address the issue of the final newline, but does require omitting the trailing NUL; logically, one might extend that to include trailing newlines, so I have.
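Both rules are easy to check mechanically. The sketch below uses two hypothetical helpers (neither name comes from wpng): one strips trailing line feeds, and the other reports the longest line so that a caller can compare it against 79 or 72.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: remove any trailing line feeds from s.
 * Returns the new length of the string. */
static size_t strip_trailing_newlines(char *s)
{
    size_t len = strlen(s);

    while (len > 0 && s[len - 1] == '\n')
        s[--len] = '\0';
    return len;
}

/* Hypothetical helper: length of the longest line in s, where lines
 * are separated by single line feeds (the LF itself is not counted). */
static size_t longest_line(const char *s)
{
    size_t cur = 0, max = 0;

    for (; *s != '\0'; ++s) {
        if (*s == '\n')
            cur = 0;                /* start counting a new line */
        else if (++cur > max)
            max = cur;
    }
    return max;
}
```

A caller could then refuse or rewrap any block for which longest_line() returns more than the chosen limit.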

Finally, I have arbitrarily allowed only six predetermined keywords: Title, Author, Description, Copyright (all officially registered), and E-mail and URL (unregistered). Description is limited to nine lines, mainly so that the little line-counter prompts for each line are single digits and therefore line up nicely; the others are limited to one line each. Thus the code for reading the Title keyword, once the text buffer (textbuf) has been allocated, looks like this:

    do {
        valid = TRUE;
        p = textbuf + TEXT_TITLE_OFFSET;
        fprintf(stderr, "  Title: ");
        fflush(stderr);
        if (FGETS(p, 74, keybd) && (len = strlen(p)) > 1) {
            if (p[len-1] == '\n')
                p[--len] = '\0';    /* remove trailing newline */
            wpng_info.title = p;
            wpng_info.have_text |= TEXT_TITLE;

            if ((result = wpng_isvalid_latin1((uch *)p, len)) >= 0) {
                fprintf(stderr, "    " PROGNAME " warning:  character"
                  " code %u is %sdiscouraged by the PNG\n"
                  "    specification [first occurrence was at"
                  " character position #%d]\n", (unsigned)p[result],
                  (p[result] == 27)? "strongly " : "", result+1);
                fflush(stderr);
#ifdef FORBID_LATIN1_CTRL
                wpng_info.have_text &= ~TEXT_TITLE;
                valid = FALSE;
#else
                if (p[result] == 27) {      /* escape character */
                    wpng_info.have_text &= ~TEXT_TITLE;
                    valid = FALSE;
                }
#endif
            }
        }
    } while (!valid);

Aside from some subtlety with the keybd stream that I won't cover here (it has to do with reading from the keyboard even if standard input is redirected), the only part of real interest is the test for nonrecommended Latin-1 characters, which is accomplished in the wpng_isvalid_latin1() function:

static int wpng_isvalid_latin1(uch *p, int len)
{
    int i, result = -1;

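    for (i = 0;  i < len;  ++i) {
        /* line feed, printable ASCII, and codes 160-255 are acceptable */
        if (p[i] == 10 || (p[i] > 31 && p[i] < 127) || p[i] >= 160)
            continue;
        /* remember the first offender, but let an escape take priority */
        if (result < 0 || (p[result] != 27 && p[i] == 27))
            result = i;
    }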

    return result;
}

If the function finds a control character that is discouraged by the PNG specification, it returns the offset of the first one found. The only exception is if an escape character (code 27) is found later in the string; in that case, its offset is what gets returned. The main code then tests for a non-negative value and prints a warning message. What happens next depends on how the program has been compiled. By default, the presence of an escape character forces the user to re-enter the text, but all of the other discouraged characters are allowed. If the FORBID_LATIN1_CTRL macro is defined, however, the user must re-enter the text whenever any of the "bad" control characters is found. The default behavior results in output similar to the following:

Enter text info (no more than 72 characters per line);
to skip a field, hit the <Enter> key.
  Title: L'Arc de Triomphe
  Author: Greg Roelofs
  Description (up to 9 lines):
    [1] This line contains only normal characters.
    [2] This line contains a tab character here: ^I
    [3]
    wpng warning:  character code 9 is discouraged by the PNG
    specification [first occurrence was at character position #85]
  Copyright: We attempt an escape character here: ^[
    wpng warning:  character code 27 is strongly discouraged by the PNG
    specification [first occurrence was at character position #38]
  Copyright: Copyright 1981, 1999 Greg Roelofs
  E-mail: roelofs@pobox.com
  URL: http://www.libpng.org/pub/png/pngbook.html

Note that the Copyright keyword had to be entered twice since the first attempt included an escape character. The Description keyword also would have had to be reentered if the program had been compiled with FORBID_LATIN1_CTRL defined.
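The accept, warn, or reject decision can be condensed into a single function. This is only a sketch under stated assumptions: the enum, first_discouraged(), and judge_text() are hypothetical names, the forbid_all_ctrl argument stands in for compiling with FORBID_LATIN1_CTRL, and codes 160-255 are treated as acceptable, following the list of practical codes given earlier.

```c
#include <string.h>

typedef unsigned char uch;

/* Hypothetical names for the three possible outcomes. */
enum text_verdict { VERDICT_OK, VERDICT_WARN_ONLY, VERDICT_REJECT };

/* Offset of the first discouraged character, or of the first escape
 * (code 27) if one occurs anywhere in the string; -1 if none. */
static int first_discouraged(const uch *p, int len)
{
    int i, result = -1;

    for (i = 0; i < len; ++i) {
        if (p[i] == 10 || (p[i] > 31 && p[i] < 127) || p[i] >= 160)
            continue;
        if (result < 0 || (p[result] != 27 && p[i] == 27))
            result = i;
    }
    return result;
}

/* forbid_all_ctrl != 0 mimics a build with FORBID_LATIN1_CTRL defined. */
static enum text_verdict judge_text(const char *s, int forbid_all_ctrl)
{
    const uch *p = (const uch *)s;
    int pos = first_discouraged(p, (int)strlen(s));

    if (pos < 0)
        return VERDICT_OK;          /* nothing to complain about */
    if (forbid_all_ctrl || p[pos] == 27)
        return VERDICT_REJECT;      /* user must re-enter the text */
    return VERDICT_WARN_ONLY;       /* warn, but keep the text */
}
```

With forbid_all_ctrl set to 0, a string containing only a tab yields VERDICT_WARN_ONLY, while any string containing an escape yields VERDICT_REJECT, which mirrors the Description and Copyright cases in the sample session.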

Returning to more mundane issues, wpng_info is the struct by which the front end communicates with the PNG-writing back end. It is of type mainprog_info, and it is defined as follows:

typedef struct _mainprog_info {
    double gamma;
    long width;
    long height;
    time_t modtime;
    FILE *infile;
    FILE *outfile;
    void *png_ptr;
    void *info_ptr;
    uch *image_data;
    uch **row_pointers;
    char *title;
    char *author;
    char *desc;
    char *copyright;
    char *email;
    char *url;
    int filter;
    int pnmtype;
    int sample_depth;
    int interlaced;
    int have_bg;
    int have_time;
    int have_text;
    jmp_buf jmpbuf;
    uch bg_red;
    uch bg_green;
    uch bg_blue;
} mainprog_info;

As in the previous programs, we use the abbreviated typedefs uch, ush, and ulg in place of the more unwieldy unsigned char, unsigned short, and unsigned long, respectively. The title element is simply a pointer into the text buffer, and the struct contains similar pointers for the other five keywords. have_text is more than a simple Boolean (TRUE/FALSE) value, however. Because the user may not want all six text chunks, the program must keep track of which ones were provided with valid data. Thus, have_text is a set of bit flags, and TEXT_TITLE is the mask for the bit corresponding to the Title keyword; that bit is set only if the length of the entered string is greater than one.

The user indicates that a field should be skipped by hitting the Enter key, and the fgets() function includes the newline character in the string it returns; thus a string of length one contains nothing but the newline.
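The bit-flag bookkeeping can be illustrated with a short sketch. Only the name TEXT_TITLE appears in the text; the other five macro names and all six numeric values below are assumptions chosen so that each keyword gets its own bit, and count_text_fields() is a hypothetical helper.

```c
/* Assumed mask values, one distinct bit per keyword.  Only TEXT_TITLE
 * is named in the text; the other names and all values are hypothetical. */
#define TEXT_TITLE   0x01
#define TEXT_AUTHOR  0x02
#define TEXT_DESC    0x04
#define TEXT_COPY    0x08
#define TEXT_EMAIL   0x10
#define TEXT_URL     0x20

/* Hypothetical helper: count how many of the six text fields are
 * marked as present in have_text. */
static int count_text_fields(int have_text)
{
    int mask, n = 0;

    for (mask = TEXT_TITLE; mask <= TEXT_URL; mask <<= 1)
        if (have_text & mask)
            ++n;
    return n;
}
```

As in the Title code, a field is marked with have_text |= TEXT_TITLE and unmarked with have_text &= ~TEXT_TITLE; a back end could test have_text & TEXT_TITLE to decide whether to write the corresponding chunk.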




Last Update: 2010-Nov-26