Symbol Table

Overview

Since the Compiled Grammar Table file contains the grammar's symbol table, it is often useful to list each symbol, with its name and table index, within the skeleton program. This "list" can take the form of an enumerated constant definition, case statements, or anything else that the developer needs. The GOLD Parser Builder scans the template file and inserts a list of symbols wherever it locates a "symbols" block.

Essentially, a Symbol List contains a text block that is used to create the elements of the list. When a "list" is created, this text block is written to the skeleton program once for each symbol in the grammar's symbol table. The following diagram shows the format used to denote a Symbol List. The meaning of the ##Delimiter field is described below.

Structure

The ##Symbol-Table block was added in Version 2.5 of the Builder.

##SYMBOL-TABLE
...
[ ##SYMBOLS
...
##END-SYMBOLS ]
...
##END-SYMBOL-TABLE
or
##SYMBOLS
...
##END-SYMBOLS
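
The expansion described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the Builder's implementation: the function names `expand_symbols` and `substitute`, the sample symbol records, and the choice to leave unknown variables untouched are all assumptions made for the example.

```python
import re

# Hypothetical symbol records; a real table comes from the compiled grammar.
SYMBOLS = [
    {"Index": 0, "Name": "EOF", "Kind": 3},
    {"Index": 1, "Name": "Error", "Kind": 7},
    {"Index": 2, "Name": "Whitespace", "Kind": 2},
]


def substitute(text, symbol):
    """Replace %Variable% markers with values from one symbol record.

    Variables that the record does not define are left as written
    (an assumption of this sketch).
    """
    return re.sub(
        r"%([A-Za-z.]+)%",
        lambda m: str(symbol.get(m.group(1), m.group(0))),
        text,
    )


def expand_symbols(template, symbols):
    """Repeat the text between ##SYMBOLS and ##END-SYMBOLS once per symbol."""
    output = []
    body = None  # None while outside a block, otherwise the collected lines
    for line in template.splitlines():
        tag = line.strip().upper()
        if tag == "##SYMBOLS":
            body = []
        elif tag == "##END-SYMBOLS":
            if body is None:
                raise ValueError("##END-SYMBOLS without ##SYMBOLS")
            for symbol in symbols:
                output.extend(substitute(b, symbol) for b in body)
            body = None
        elif body is not None:
            body.append(line)
        else:
            output.append(line)
    return "\n".join(output)
```

Calling `expand_symbols` on a template whose block body is `%Index%: %Name%` yields one such line per record, with the text outside the block copied through unchanged.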

Options

The following tags can be used to enhance the text generated by the skeleton program.

[ ##ID-CASE { ProperCase | Uppercase | Lowercase | None } ]
[ ##ID-SEPARATOR Text ]
[ ##ID-SYMBOL-PREFIX Text ]
[ ##ID-RULE-PREFIX Text ]
[ ##DELIMITER Text ]

Name Description
##ID-CASE When the GOLD Parser Builder creates identifiers for each constant, it can put each in either ProperCase or Uppercase. This value should be set to match the standard conventions of the target language. Lowercase will be supported in the next version.
##ID-SEPARATOR For readability, many programming languages allow characters such as underscores and dashes to be used in identifiers. The value of this field will be used in the constant names.
##ID-SYMBOL-PREFIX The value of this field will be added to the front of each generated symbol constant.
##ID-RULE-PREFIX The value of this field will be added to the front of each generated rule constant.
##DELIMITER This tag specifies the characters used to separate the items of a list. The value is used in the construction of rule lists, symbol lists, and character sets.
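
To make the interaction of these options concrete, here is a rough Python sketch of how an identifier might be assembled from a symbol name. It is a guess at the general shape, not the Builder's actual algorithm: the function `make_id`, the `PUNCTUATION_NAMES` mapping, the decision to join the prefix with the separator, and the decision to leave the prefix's case untouched are all assumptions for illustration.

```python
import re

# Hypothetical spelled-out names for punctuation symbols. The Builder's real
# mapping is not documented in this section.
PUNCTUATION_NAMES = {"-": "Minus", "+": "Plus", "*": "Times", "/": "Div"}


def make_id(name, prefix="Symbol", separator="_", case="ProperCase"):
    """Build a constant name from a symbol name and the ##ID-* settings."""
    text = PUNCTUATION_NAMES.get(name, name)
    words = re.findall(r"[A-Za-z0-9]+", text)

    if case == "ProperCase":
        words = [w[:1].upper() + w[1:].lower() for w in words]
    elif case == "Uppercase":
        words = [w.upper() for w in words]
    elif case == "Lowercase":
        words = [w.lower() for w in words]
    elif case != "None":
        raise ValueError(f"unknown ##ID-CASE value: {case}")

    # Assumption: the prefix is kept exactly as written in the template.
    parts = ([prefix] if prefix else []) + words
    return separator.join(parts)
```

Under these assumptions, `make_id("-")` produces `Symbol_Minus`, and `make_id("Statement List", prefix="SYM", case="Uppercase")` produces `SYM_STATEMENT_LIST`.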

Variables

SYMBOL-TABLE Block

Name Description
%Count% The Count variable contains the number of symbols in the grammar's Symbol Table. This variable is useful for array declaration or setting the global variables before storing each of the actual symbols.

SYMBOLS Block

Name Description
%Delimiter% For each symbol in the table, this variable holds the value set with the ##Delimiter tag. For the last item in the list, the value is set to a number of spaces.
%Description% This variable contains a friendly description of the symbol, using Backus-Naur form. The variable is designed so that the developer can put comments directly into the code that describe the symbol.
%Description.XML% This variable will display the contents of the Description in XML format.
%ID% This variable contains a name generated by the GOLD Parser Builder for each symbol in the grammar. The format of the ID is specified by the template's parameter fields.
%ID.Padded% When creating a list, the value of this variable will contain added spaces so that each identifier is the same width. The variable is primarily used in the construction of enumerated constants or anywhere the text should "line up" for readability.
%Kind% Each symbol in the symbol table contains a "kind" which identifies the classification of the symbol. The value of this variable will match the numeric value stored in the Compiled Grammar Table file.
%Index% This variable contains the index of the symbol in the table.
%Name% This variable contains the formal name of the symbol.
%Name.XML% This variable contains the XML encoding of %Name%.
%Value% This variable simply contains the index of the symbol in the table. This variable has the same value as %Index%.
%Value.Padded% Like the ID.Padded variable, the value for each item in a generated list will contain added spaces so that each is the same width.
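
The padded variables and the special handling of the last delimiter can be illustrated with a short Python sketch. The function name `render_enum_lines` is made up for this example, and two details are assumptions rather than documented behaviour: values are right-aligned, and the final delimiter is replaced by as many spaces as the delimiter is long.

```python
def render_enum_lines(ids, delimiter=","):
    """Emit '<ID.Padded> = <Value.Padded><Delimiter>' for each identifier."""
    if not ids:
        return []

    id_width = max(len(identifier) for identifier in ids)
    value_width = len(str(len(ids) - 1))  # widest index in the table

    lines = []
    for index, identifier in enumerate(ids):
        is_last = index == len(ids) - 1
        # The last item gets spaces in place of the delimiter.
        delim = " " * len(delimiter) if is_last else delimiter
        padded_id = identifier.ljust(id_width)
        padded_value = str(index).rjust(value_width)
        lines.append(f"{padded_id} = {padded_value}{delim}")
    return lines
```

Because every identifier and value is padded to a common width, the generated lines all have the same length, which is what makes the resulting constant list line up in the skeleton program.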

Symbol 'Kind' Constants

Value Description
0 Normal Nonterminal
1 Normal Terminal
2 Whitespace Terminal
3 End Character - End of File. This symbol is used to represent the end of the file or the end of the source input.
4 Start of a block quote
5 End of a block quote
6 Line Comment Terminal
7 Error Terminal. If the parser encounters an error reading a token, this kind of symbol can be used to differentiate it from other terminal types.
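
In a skeleton program, the kind values above are often easier to work with as named constants. The following Python sketch mirrors the table. The numeric values come from the table, while the class name `SymbolKind`, the member names, and the helper `kind_name` are invented for this example.

```python
from enum import IntEnum


class SymbolKind(IntEnum):
    """Symbol kinds as stored in the Compiled Grammar Table file."""
    NONTERMINAL = 0     # Normal Nonterminal
    TERMINAL = 1        # Normal Terminal
    WHITESPACE = 2      # Whitespace Terminal
    END = 3             # End of file / end of source input
    COMMENT_START = 4   # Start of a block quote
    COMMENT_END = 5     # End of a block quote
    COMMENT_LINE = 6    # Line Comment Terminal
    ERROR = 7           # Error Terminal


def kind_name(value):
    """Return the member name for a %Kind% value, or 'UNKNOWN' if out of range."""
    try:
        return SymbolKind(value).name
    except ValueError:
        return "UNKNOWN"
```

For instance, the `Kind : 1` reported for the `-` symbol in the example output corresponds to `SymbolKind.TERMINAL`.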

Example: Symbol Table

The following template outputs the content of the Symbol Table as formatted text.

##SYMBOL-TABLE
Table Count: %Count%
##SYMBOLS
   Symbol %Index%
      Name : %Name%
      ID   : %ID%
      Value: %Value%
      Kind : %Kind%
##END-SYMBOLS
##END-SYMBOL-TABLE

If the "Simple" example grammar is used, the program template will create the following text for Symbol #6. The entries before and after #6 are omitted for brevity.

Table Count: 39
   .
   .
   .
   Symbol 6
      Name : -
      ID   : Symbol_Minus
      Value: 6
      Kind : 1

Example: Constant Lists

The following are a few examples of how to define various symbol lists. In most cases, only a list of enumerated constants would be defined in a template.

C++ Enumerated Constants

enum SymbolConstants
{
##DELIMITER ','
##SYMBOLS
    %ID% = %Value% %Delimiter% // %Description%
##END-SYMBOLS
};

Visual Basic Enumerated Constants

Enum SymbolConstants
##SYMBOLS
    %ID% = %Value% ' %Description%
##END-SYMBOLS
End Enum