The Java Course provides a general introduction to programming in Java. It is based on A.B. Downey's book, How to Think Like a Computer Scientist.


Parsing

To implement the algorithm from the previous section, we need to be able to traverse a string and break it into operands and operators. This process is an example of parsing, and the results (the individual chunks of the string) are called tokens.

Java provides a built-in class called StringTokenizer that parses strings and breaks them into tokens. To use it, you have to import java.util.StringTokenizer.

In its simplest form, the StringTokenizer uses whitespace (spaces, tabs and newlines) to mark the boundaries between tokens. A character that marks a boundary is called a delimiter.

We can create a StringTokenizer in the usual way, passing as an argument the string we want to parse.

    StringTokenizer st = new StringTokenizer ("Here are four tokens.");

The following loop is a standard idiom for extracting the tokens from a StringTokenizer.

    while (st.hasMoreTokens ()) {
        System.out.println (st.nextToken());
    }

The output is

Here
are
four
tokens.
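As a minimal sketch, the pieces above can be assembled into a complete program. The class name TokenDemo and the helper method tokenize are our own names, chosen for illustration. Collecting the tokens in a list is just one convenient option.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

class TokenDemo {

    // Hypothetical helper: collects the tokens of a string,
    // using the default whitespace delimiters.
    static List<String> tokenize (String s) {
        StringTokenizer st = new StringTokenizer (s);
        List<String> tokens = new ArrayList<> ();
        while (st.hasMoreTokens ()) {
            tokens.add (st.nextToken ());
        }
        return tokens;
    }

    public static void main (String[] args) {
        // Prints each of the four tokens on its own line.
        for (String token : tokenize ("Here are four tokens.")) {
            System.out.println (token);
        }
    }
}
```

Note that the period stays attached to the last token, because a period is not one of the default delimiters.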

For parsing expressions, we have the option of specifying additional characters that will be used as delimiters:

    StringTokenizer st = new StringTokenizer ("11 22+33*", " +-*/");

The second argument is a String that contains all the characters that will be used as delimiters. Running the same loop, the output is now:

11
22
33
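The following sketch illustrates that the second argument is treated as a set of single-character delimiters, not as one separator string, and that a run of adjacent delimiters does not produce empty tokens. The class name OperandDemo and the helper operands are hypothetical names introduced here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

class OperandDemo {

    // Hypothetical helper: returns only the operands of an expression,
    // discarding spaces and the four arithmetic operators.
    static List<String> operands (String expr) {
        StringTokenizer st = new StringTokenizer (expr, " +-*/");
        List<String> result = new ArrayList<> ();
        while (st.hasMoreTokens ()) {
            result.add (st.nextToken ());
        }
        return result;
    }

    public static void main (String[] args) {
        System.out.println (operands ("11 22+33*"));      // prints [11, 22, 33]
        System.out.println (operands ("11  22 + 33 *"));  // prints [11, 22, 33]
    }
}
```

Both calls give the same operands, which shows why this form cannot tell us which operators appeared or where.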

This succeeds at extracting all the operands, but we have lost the operators. Fortunately, there is one more option for StringTokenizers.

    StringTokenizer st = new StringTokenizer ("11 22+33*", " +-*/", true);

The third argument says, "Yes, we would like to treat the delimiters as tokens." Now the output is

11

22
+
33
*

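The line that looks blank is the space between 11 and 22, which is now returned as a token like any other delimiter. Apart from that space, which we can skip, this is just the stream of tokens we would like for evaluating this expression.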
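One possible way to use this token stream is sketched below. It assumes that space tokens should simply be dropped and that every remaining delimiter token is an operator. The class ExpressionTokens and the methods tokenize and isOperator are hypothetical names, not part of the Java library.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

class ExpressionTokens {

    // Assumption: operands are separated by spaces or by these four operators.
    static final String OPERATORS = "+-*/";

    // Splits an expression into operands and operators, skipping spaces.
    static List<String> tokenize (String expr) {
        StringTokenizer st = new StringTokenizer (expr, " " + OPERATORS, true);
        List<String> tokens = new ArrayList<> ();
        while (st.hasMoreTokens ()) {
            String token = st.nextToken ();
            // Each delimiter comes back as a one-character token;
            // we keep the operators and drop the spaces.
            if (!token.equals (" ")) {
                tokens.add (token);
            }
        }
        return tokens;
    }

    // Returns true if the token is one of the four operator characters.
    static boolean isOperator (String token) {
        return token.length () == 1 && OPERATORS.indexOf (token.charAt (0)) >= 0;
    }

    public static void main (String[] args) {
        for (String token : tokenize ("11 22+33*")) {
            String kind = isOperator (token) ? "operator" : "operand";
            System.out.println (token + " is an " + kind);
        }
    }
}
```

Separating the check in isOperator from the loop keeps the tokenizing step independent of whatever the evaluation algorithm later does with each token.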



Last Update: 2011-01-24