An LL parser is called an LL(k) parser if it uses k tokens of look-ahead when parsing a sentence. If such a parser exists for a grammar and can parse the sentences of that grammar without backtracking, then the grammar is called an LL(k) grammar. Of these grammars, LL(1) grammars, although fairly restrictive, are very popular because the corresponding LL parsers only need to look at the next token to make their parsing decisions.
A table-based top-down parser can be schematically presented as in Figure 1.
                +---+---+---+---+---+---+
        Input:  | ( | 1 | + | 1 | ) | $ |
                +---+---+---+---+---+---+
                      ^
                      |
  Stack:              |
                +----------+
  +---+         |          |
  | + |<--------|  Parser  |-----> Output
  +---+         |          |
  | S |         +----------+
  +---+               ^
  | ) |               |
  +---+               |
  | $ |         +----------+
  +---+         | Parsing  |
                |  table   |
                +----------+
The parser has an input buffer, a stack on which it keeps symbols from the grammar, and a parsing table which tells it what grammar rule to use given the symbol on top of its stack and the next symbol on its input tape. To explain its workings we will use the following small grammar:
(1) S -> F
(2) S -> ( S + F )
(3) F -> 1
The parsing table for this grammar looks as follows:
|   | ( | ) | 1 | + | $ |
| S | 2 | - | 1 | - | - |
| F | - | - | 3 | - | - |
(Note that there is also a column for the special terminal $ that is used to indicate the end of the input stream.) Depending on the top-most symbol on the stack and the current symbol in the input stream, the parser applies the rule stated in the matching row and column of the parsing table (e.g. if there is an 'S' on the top of the parser stack and a '1' in the front-most position of the input stream, the parser executes rule number 1, i.e. it replaces the 'S' on its stack by 'F'). A '-' marks a cell for which no rule exists.
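As a rough illustration, the parsing table can be held in a dictionary keyed by the pair (symbol on top of the stack, current input symbol). This is only a sketch: the names RULES, TABLE and lookup are made up for this example, and the rule numbers are taken from the parenthesised example in the text.

```python
# Grammar rules, numbered as in the text:
# rule number -> (left-hand side, right-hand side).
RULES = {
    1: ("S", ["F"]),
    2: ("S", ["(", "S", "+", "F", ")"]),
    3: ("F", ["1"]),
}

# Parsing table: (nonterminal on top of the stack, current input symbol)
# -> rule number. Cells shown as '-' in the table are simply left out.
TABLE = {
    ("S", "("): 2,
    ("S", "1"): 1,
    ("F", "1"): 3,
}


def lookup(nonterminal, lookahead):
    """Return the rule number for this table cell, or None for an empty cell."""
    return TABLE.get((nonterminal, lookahead))


if __name__ == "__main__":
    rule = lookup("S", "1")
    print(rule, RULES[rule])
```

Leaving the empty cells out of the dictionary keeps the table small; a lookup that yields None then corresponds to a '-' entry.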
When the parser starts, its stack always contains
[ S, $ ]
where $ is a special terminal that marks the bottom of the stack and the end of the input stream, and S is the start symbol of the grammar. The parser will attempt to rewrite the contents of this stack to what it sees on the input stream. However, it only keeps on the stack what still needs to be rewritten. For example, let's assume that the input is "( 1 + 1 )". When the parser reads the first "(" it knows that it has to rewrite S to "( S + F )" and writes the number of this rule to the output. The stack then becomes:
[ (, S, +, F, ), $ ]
In the next step it removes the '(' from its input stream and from its stack:
[ S, +, F, ), $ ]
Now the parser sees a '1' on its input stream, so it knows that it has to apply rule (1) and then rule (3) from the grammar and write their numbers to the output stream. This results in the following stacks:
[ F, +, F, ), $ ]
[ 1, +, F, ), $ ]
In the next two steps the parser reads the '1' and '+' from the input stream and also removes them from the stack, resulting in:
[ F, ), $ ]
In the next three steps the 'F' will be replaced on the stack with '1', the number 3 will be written to the output stream, and then the '1' and ')' will be removed from the stack and the input stream. The parser thus ends with '$' both on its stack and on its input stream. In this case it will report that it has accepted the input string, and on the output stream it has written the list of numbers [ 2, 1, 3, 3 ], which is indeed a leftmost derivation of the input string (the derivation goes like this: S -> ( S + F ) -> ( F + F ) -> ( 1 + F ) -> ( 1 + 1 )).
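To check that the output [ 2, 1, 3, 3 ] really spells out a leftmost derivation, one can replay the rule numbers, each time rewriting the leftmost nonterminal. The following is a minimal sketch; the function name leftmost_derivation and the RULES dictionary are illustrative assumptions, with the rules numbered as in the text.

```python
# Rules numbered as in the text: number -> (left-hand side, right-hand side).
RULES = {
    1: ("S", ["F"]),
    2: ("S", ["(", "S", "+", "F", ")"]),
    3: ("F", ["1"]),
}
NONTERMINALS = {"S", "F"}


def leftmost_derivation(rule_numbers, start="S"):
    """Apply each rule to the leftmost nonterminal; return every sentential form."""
    form = [start]
    steps = [" ".join(form)]
    for number in rule_numbers:
        lhs, rhs = RULES[number]
        # Position of the leftmost nonterminal, or None if only terminals remain.
        index = next((i for i, sym in enumerate(form) if sym in NONTERMINALS), None)
        if index is None or form[index] != lhs:
            raise ValueError(f"rule {number} cannot be applied to {' '.join(form)!r}")
        form[index:index + 1] = rhs  # replace the nonterminal by the right-hand side
        steps.append(" ".join(form))
    return steps


if __name__ == "__main__":
    print(" -> ".join(leftmost_derivation([2, 1, 3, 3])))
    # prints: S -> ( S + F ) -> ( F + F ) -> ( 1 + F ) -> ( 1 + 1 )
```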
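As can be seen from the example, the parser performs three types of steps depending on whether the top of the stack is a nonterminal, a terminal or the special symbol $:
If the top is a nonterminal, the parser looks up in the parsing table, on the basis of this nonterminal and the current symbol on the input stream, which rule of the grammar it should use to replace the nonterminal on the stack. The number of the rule is written to the output stream. If the table contains no rule for this combination, the parser reports an error and stops.
If the top is a terminal, the parser compares it to the current symbol on the input stream. If they are equal, both are removed; if not, the parser reports an error and stops.
If the top is $ and the input stream also shows $, the parser reports that it has successfully parsed the input; otherwise it reports an error. In both cases the parser stops.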
These steps are repeated until the parser stops, and then it will have either completely parsed the input and written a leftmost derivation to the output stream or it will have reported an error.
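The loop described above can be sketched in a few lines of Python. This is an illustrative sketch rather than a reference implementation: the names parse, RULES and TABLE are assumptions made for this example, each input character is treated as one token, and the input is assumed not to contain the end marker '$' itself.

```python
# Rules numbered as in the text: number -> (left-hand side, right-hand side).
RULES = {
    1: ("S", ["F"]),
    2: ("S", ["(", "S", "+", "F", ")"]),
    3: ("F", ["1"]),
}
# Parsing table: (nonterminal, look-ahead) -> rule number; '-' cells are omitted.
TABLE = {("S", "("): 2, ("S", "1"): 1, ("F", "1"): 3}
NONTERMINALS = {"S", "F"}
END = "$"


def parse(tokens):
    """Return the list of applied rule numbers, or raise SyntaxError."""
    tokens = list(tokens) + [END]  # assumes the input itself contains no '$'
    stack = [END, "S"]             # the top of the stack is the end of the list
    position = 0
    output = []
    while True:
        top = stack[-1]
        lookahead = tokens[position]
        if top in NONTERMINALS:
            # Nonterminal on top: consult the table and expand it.
            rule = TABLE.get((top, lookahead))
            if rule is None:
                raise SyntaxError(f"no rule for {top!r} with look-ahead {lookahead!r}")
            output.append(rule)
            stack.pop()
            # Push the right-hand side reversed so its first symbol ends up on top.
            stack.extend(reversed(RULES[rule][1]))
        elif top == END:
            # Bottom of the stack: accept only if the input is exhausted too.
            if lookahead == END:
                return output
            raise SyntaxError(f"unexpected {lookahead!r} after the end of the sentence")
        else:
            # Terminal on top: it must match the current input symbol.
            if top != lookahead:
                raise SyntaxError(f"expected {top!r} but found {lookahead!r}")
            stack.pop()
            position += 1


if __name__ == "__main__":
    print(parse("(1+1)"))  # prints [2, 1, 3, 3]
```

Pushing the right-hand side in reverse keeps the leftmost symbol of the rule on top of the stack, which is what makes the recorded rule numbers a leftmost derivation.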