Parser Comparison
How GOLD Compares to YACC


Overview of YACC

One of the oldest and most respected parsing engine generators available to developers is YACC. Like "vi", "grep", and "awk", this software is considered the de facto standard in the UNIX world.

When developing a parser using YACC, the grammar description and the C source code are combined in a single file. Special notation defines the data type used to store reductions and marks where YACC's code links to the original C source. For instance, "$1" refers to the value of the first symbol of a reduced rule.
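The positional notation can be illustrated with a minimal sketch. This is not YACC's generated code; the names `reduce_rule` and `add_action` are hypothetical, and the sketch only assumes that the parser keeps a stack of values and hands the topmost ones to the rule's action when it reduces.

```python
# Hypothetical sketch: how positional references such as $1 and $3 map onto
# a parser's value stack when a rule is reduced. Not YACC's actual code.

def reduce_rule(value_stack, rhs_length, action):
    """Pop rhs_length values, run the action, and push its result."""
    split = len(value_stack) - rhs_length
    args = value_stack[split:]      # these play the role of $1, $2, ...
    del value_stack[split:]
    result = action(*args)          # the result plays the role of $$
    value_stack.append(result)
    return result

# Stand-in for a rule such as:  expr : expr '+' expr  { $$ = $1 + $3; }
def add_action(d1, d2, d3):
    return d1 + d3

stack = [2, "+", 5]
reduce_rule(stack, 3, add_action)
print(stack)  # [7]
```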

Naturally this source code, containing both YACC grammar information and normal C, cannot be compiled directly. Instead, YACC is used to analyze this source code and create a new C program with the compiled parser tables and engine "hard-wired" into it. After this process is complete, the new C program can be compiled and executed. This type of approach is known as a "compiler-compiler".

For developers using C or C++ on the UNIX platform, YACC is an ideal tool. However, the approach YACC uses has several drawbacks that limit its usefulness for developing modern interpreters and compilers. The most notable drawback is the limitation on the programming language used for development. There are several versions of YACC for different programming languages, but in each case the grammar definition is not directly portable. Another major drawback lies in the overall approach of compiler-compilers.

Comparison Between GOLD & YACC

Essentially, GOLD differs from the classic YACC parser generator in four important ways.

1. GOLD uses standard notation for specifying rules, tokens, sets, etc.

Rules are specified by Backus-Naur Form (non-enhanced at this time).

Terminals are represented through regular expressions.

Character sets are represented through set notation (more or less).
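The three kinds of notation can be modeled roughly as follows. This sketch does not use GOLD's grammar syntax; it only mirrors the idea in Python, and every name in it (`ID_HEAD`, `ID_TAIL`, `char_class`, `RULES`) is made up for illustration.

```python
import re
import string

# Character sets built with set operations, in the spirit of set notation.
LETTER = set(string.ascii_letters)
DIGIT = set(string.digits)
ID_HEAD = LETTER | {"_"}
ID_TAIL = ID_HEAD | DIGIT

def char_class(chars):
    """Turn a set of characters into a regex character class."""
    return "[" + "".join(re.escape(c) for c in sorted(chars)) + "]"

# A terminal defined as a regular expression over those sets.
IDENTIFIER = re.compile(char_class(ID_HEAD) + char_class(ID_TAIL) + "*")

# Rules written as plain BNF: each nonterminal maps to its alternatives.
RULES = {
    "<Assign>": [["Identifier", "=", "<Value>"]],
    "<Value>": [["Identifier"], ["Number"]],
}

print(bool(IDENTIFIER.fullmatch("_count1")))  # True
print(bool(IDENTIFIER.fullmatch("1count")))   # False
```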

2. The actual parser engine and tables exist separately.

One of the key differences between YACC and GOLD is how the grammar is used in conjunction with the developer's source code. Unlike YACC, which combines C source code and grammar constructs, GOLD separates the parse engine from the code that derives the table information.

Instead of integrating the grammar description into C source code, GOLD reads the grammar from a separate file and then generates the tables. Afterwards, the derived tables can be saved to a separate binary file which can be used at a later time.  

This allows developers to build a grammar on one platform, and then use this binary file to develop the interpreter/compiler on another. For instance, the developer can use GOLD Builder to analyze a grammar on an x86 machine and then use that binary file on UNIX, Mac, Linux, etc. Since the DFA and LALR algorithms are simple, creating a parsing engine on other platforms can be accomplished with a minimum of coding.
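The separation of tables and engine can be sketched with a small table-driven DFA tokenizer. The table layout below is hypothetical and uses JSON as a stand-in for the binary file; it is not GOLD's actual Compiled Grammar Table format. The functions `build_tables` and `next_token` are likewise assumptions made for the example.

```python
import json

# "Builder" side: DFA tables for identifiers and integers, serialized.
# The JSON layout is hypothetical; it is not GOLD's actual file format.
def build_tables():
    letters = "abcdefghijklmnopqrstuvwxyz"
    digits = "0123456789"
    tables = {
        "start": 0,
        "accept": {"1": "Identifier", "2": "Integer"},
        "edges": {
            "0": {**{c: 1 for c in letters}, **{c: 2 for c in digits}},
            "1": {c: 1 for c in letters + digits},
            "2": {c: 2 for c in digits},
        },
    }
    return json.dumps(tables)

# "Engine" side: knows nothing about the grammar, only the table layout.
def next_token(tables, text, pos):
    """Return (name, lexeme) for the longest match at pos, or None."""
    state = tables["start"]
    last_accept = None
    i = pos
    while i < len(text):
        nxt = tables["edges"].get(str(state), {}).get(text[i])
        if nxt is None:
            break
        state = nxt
        i += 1
        name = tables["accept"].get(str(state))
        if name is not None:
            last_accept = (name, text[pos:i])
    return last_accept

saved = build_tables()        # could be written to disk on one machine
loaded = json.loads(saved)    # and loaded by an engine on another
print(next_token(loaded, "abc12+", 0))  # ('Identifier', 'abc12')
print(next_token(loaded, "42x", 0))     # ('Integer', '42')
```

Because the engine only interprets data, the same loop could be rewritten in any language that can read the file, which is the point the paragraph above makes.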

3. GOLD is language-independent.

When a grammar is compiled and saved to a file by the GOLD Builder, the data exists independent of any particular programming language. As a result, the parsing engine that loads this file can be implemented in such languages as C, C++, Java, C#, Visual Basic, Eiffel, etc.

There is an ActiveX DLL provided with GOLD Builder that reads and parses the information stored in a Compiled Grammar Table file, but you can also create your own parser engine.

4. The GOLD Builder does not have short-cut notation for operator precedence.

YACC provides a mechanism for declaring operator precedence. The developer can specify the order of operator precedence and whether each operator associates left-to-right or right-to-left.

The GOLD Builder does not provide such a mechanism for an important reason. Operator precedence actually consists of a series of rules. In the case of YACC, the extra rules needed to implement the proper logic are created "behind the scenes". This makes sense for YACC since the additional rules can be hidden from the programmer and the special logic needed for the parser engine is already implemented.

However, if these "hidden" rules were saved to the Compiled Grammar Table file, reductions would take place in the parser engine that the grammar designer would not expect (nor plan for). Essentially, it would give ambiguity to the parsing process and defeat the purpose of a generalized parser generator such as GOLD.

Fortunately, implementing operator precedence is easy, though often tedious.
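As a rough illustration, precedence can be written into the grammar by giving each precedence level its own rule, as in the BNF shown in the comments below. The Python code is a hand-written recursive-descent stand-in, not an LALR engine and not GOLD's implementation; it exists only to show that the layered rules alone produce the expected grouping. The left-recursive rules are handled with loops, and "/" is treated as integer division to keep the sketch simple.

```python
import re

# Precedence expressed purely through grammar structure:
#   <Expr>   ::= <Expr> '+' <Term>   | <Expr> '-' <Term>   | <Term>
#   <Term>   ::= <Term> '*' <Factor> | <Term> '/' <Factor> | <Factor>
#   <Factor> ::= Number | '(' <Expr> ')'

def tokenize(text):
    return re.findall(r"\d+|[-+*/()]", text)

def parse_expr(tokens, pos):
    # Lowest precedence level: addition and subtraction, left-associative.
    value, pos = parse_term(tokens, pos)
    while pos < len(tokens) and tokens[pos] in ("+", "-"):
        op = tokens[pos]
        rhs, pos = parse_term(tokens, pos + 1)
        value = value + rhs if op == "+" else value - rhs
    return value, pos

def parse_term(tokens, pos):
    # Higher precedence level: multiplication and division.
    value, pos = parse_factor(tokens, pos)
    while pos < len(tokens) and tokens[pos] in ("*", "/"):
        op = tokens[pos]
        rhs, pos = parse_factor(tokens, pos + 1)
        value = value * rhs if op == "*" else value // rhs
    return value, pos

def parse_factor(tokens, pos):
    if pos >= len(tokens):
        raise SyntaxError("unexpected end of input")
    if tokens[pos] == "(":
        value, pos = parse_expr(tokens, pos + 1)
        if pos >= len(tokens) or tokens[pos] != ")":
            raise SyntaxError("expected ')'")
        return value, pos + 1
    return int(tokens[pos]), pos + 1

def evaluate(text):
    tokens = tokenize(text)
    value, pos = parse_expr(tokens, 0)
    if pos != len(tokens):
        raise SyntaxError("unexpected token: " + tokens[pos])
    return value

print(evaluate("2+3*4"))    # 14
print(evaluate("10-4-3"))   # 3
print(evaluate("(2+3)*4"))  # 20
```

Adding another precedence level means adding another rule between two existing ones, which is where the tedium comes from.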