%%list: /* nothing */ | list EOLN | list expr EOLN { printf( "%sn",exprstring); } ;expr: primary | expr primary MUL { if (strlen(exprstring) > 0) { sprintf(tmpstring,"(%s * %g)",exprstring, pop()); } else { sprintf(tmpstring,"( %g * %g )",pop(),pop()); } strcpy(exprstring,tmpstring); } | expr primary DIV { temp=pop(); if (strlen(exprstring) > 0) { sprintf(tmpstring,"(%s / %g)",exprstring, temp); } else { sprintf(tmpstring,"( %g / %g )",pop(),temp); } strcpy(exprstring,tmpstring); } | expr primary PLUS { if (strlen(exprstring) > 0) { sprintf(tmpstring,"(%s + %g)",exprstring, pop()); } else { sprintf(tmpstring,"( %g + %g )",pop(),pop()); } strcpy(exprstring,tmpstring); } | expr primary MINUS { temp=pop(); if (strlen(exprstring) > 0) { sprintf(tmpstring,"(%s - %g)",exprstring, temp); } else { sprintf(tmpstring,"( %g - %g )",pop(),temp); } strcpy(exprstring,tmpstring); } ;primary: NUMBER { push($1); } ;%%

You can see the resulting output below:

4 5 + 6 *(( 4 + 5 ) * 6)

As mentioned in the original IBM article, we can pipe sequences together to show the parsing and calculation of an expression from different formats. For example:

$ rpntoequ|calc 4 5 + 6 *54

And even rpntoequ and equtorpn:

$ rpntoequ|equtorpn 4 5 + 6 *4 5 + 6 *

The current RPN translator as shown here is not as advanced as the main RPN system, and so it doesn’t support all the options, or expression formats, but you can get the general idea. You can download the code for this example: rpntoequ.tar.gz (Unix).

]]>

%%list: /* nothing */ | list EOLN | list expr EOLN { printf( "n" ); } ;expr: shift_expr ;shift_expr: pow_expr | shift_expr LEFTSHIFT pow_expr { printf("> "); } ;pow_expr: add_expr | pow_expr POW add_expr { printf("^ "); } ;add_expr: mul_expr | add_expr PLUS mul_expr { printf("+ "); } | add_expr MINUS mul_expr { printf("- "); } ;mul_expr: unary_expr | mul_expr MUL unary_expr { printf("* "); } | mul_expr DIV unary_expr { printf("/ "); } | mul_expr MOD unary_expr { printf("% "); } ;unary_expr: postfix_expr | MINUS primary %prec UNARYMINUS { printf("-"); } | INC unary_expr { printf("++ "); } | DEC unary_expr { printf("-- "); } ;postfix_expr: primary | postfix_expr INC { printf("++ "); } | postfix_expr DEC { printf("-- "); } | postfix_expr FACT { printf("! "); } ; primary: NUMBER { printf("%g ",$1); } | PI { printf("%g ", M_PI); } | OPENBRACKET expr CLOSEBRACKET { } | function_call ;function_call: SIN OPENBRACKET expr CLOSEBRACKET { printf("sin "); } | COS OPENBRACKET expr CLOSEBRACKET { printf("cos "); } | TAN OPENBRACKET expr CLOSEBRACKET { printf("tan "); } | ASIN OPENBRACKET expr CLOSEBRACKET { printf("asin "); } | ACOS OPENBRACKET expr CLOSEBRACKET { printf("acos "); } | ATAN OPENBRACKET expr CLOSEBRACKET { printf("atan "); } ;%%

Why does it work? It has to do with the parser evaluates the different components. When, for example, the parser identifies an addition with this rule:

add_expr: mul_expr | add_expr PLUS mul_expr { printf("+ "); }

The code that the parser generates evaluates the sub-rules first, and in both cases the rules will ultimately lead to the numerical value. Each time the number is seen, the value is printed. Once both rules have been resolved, it then matches the full expression and outputs the plus sign. In use, the parser generates all of the necessary RPN:

4+5*64 5 6 * + (4+5)*64 5 + 6 *

You can download the source for the equation to RPN parser: equtorpn.tar.gz (Unix)

]]>

(4+5)*6

I suggest a couple of extensions in the tutorial, namely a Reverse Polish Notation (RPN) calculator, and translators that convert to/from RPN and standard equation format. Here, we’re going to start with looking at the RPN calculator. The RPN system is more straightforward for people to learn when you think about typical equations, for example you might write:

4563 +

In RPN, you would enter this as:

45 63 +

From a parsing point of view, the process is also easier, because you can perform the calculation by pushing the numbers on to the stack and then performing a calculation with those two numbers. This hugely simplifies the parser, but it only has to push numbers and pop them off when it sees the operand, rather than having to extract both numbers and parser from the input text. Even better, compound calculations can be made easier because the result of one calculation can be pushed back on to the stack for the next part. For example, the following equation:

45 63 + 23 *

Can be read as:

- Push 45 on to stack
- Push 63 on to stack
- Pop value off stack, add it to another value popped off the stack
- Push result to stack
- Pop value off stack, multiply by value popped off stack

The lexical analysis component (i.e. the lex definitions) remain the same, it’s only the parser that changes. Before we examine the yacc rules, you need to see the simple stack system. It provides two functions, one pushes values on, and the other pops values off. All the values are stored in a simple array and a global stack pointer holds the current storage location so that values can be popped off or pushed back:

#include "globdefs.h"int sp=0;double val[MAXVAL];void push(f)double f;{ if (sp 0) return(val[--sp]); else { printf("error: stack emptyn"); return 0.0; }}

The yacc rules for a simple RPN parser are shown below (the rest of the surrounding code is identical).

%%list: /* nothing */ | list EOLN { printf( "%gn" , pop()); } | list exprlist EOLN { printf( "%gn" , pop()); } ;exprlist: shift_expr | exprlist shift_expr ;shift_expr: add_expr | shift_expr LEFTSHIFT { temp=pop(); push(((int)pop()) > ((int)temp)); } ;add_expr: mul_expr | add_expr PLUS { push(pop()+pop()); } | add_expr MINUS { temp=pop(); push(pop()-temp); } ;mul_expr: unary_expr | add_expr MUL { push(pop()*pop()); } | add_expr DIV { temp=pop(); push(pop()/temp); } | add_expr MOD { temp=pop(); push(fmod(pop(),temp)); } ; unary_expr: primary | MINUS primary %prec UNARYMINUS { push(-pop()); } | unary_expr INC { push(pop()+1); } | unary_expr DEC { push(pop()-1); } ; primary: NUMBER { push($1); } | PI { push(M_PI); } ;%%

You can see here that numbers are simply pushed onto the stack:

primary: NUMBER { push($1); } | PI { push(M_PI); } ;

While any calculation is a case of popping off the values and putting them back on the stack:

add_expr: mul_expr | add_expr PLUS { push(pop()+pop()); } | add_expr MINUS { temp=pop(); push(pop()-temp); } ;

The ruleset is shorter, partially because this RPN calculator is not as advanced, but also because the process is much simpler because the rules don’t need to take into account the complex structure of a typical equation line. In a future post we’ll cover the RPN to equation and equation to RPN parsers. You can download the complete code for the RPN calculator as rpn.tar.gz (Unix).

]]>