What is a Terminal?

When talking about grammars, a terminal is any symbol, identifier, reserved word, or similar element that describes a piece of data found in the language. Essentially, anything that can be recognized as a meaningful piece of lexical information is a terminal. What is considered a terminal varies between programming languages, although in most cases there is significant overlap. For instance, the following piece of text would specify a simple mathematical expression in most languages:

1 + 2 * 3

In this example, the terminals are '1', '+', '2', '*', and '3'. Often, the format of a terminal is specified using a regular expression. This is true of GOLD and many other parsing systems.

Regular Expressions

Introduction
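
As a minimal sketch of how regular expressions can specify terminals, the following Python fragment splits the expression 1 + 2 * 3 into its terminals. It uses Python's standard re module rather than GOLD's own grammar notation, and the terminal names (Integer, Plus, Times, Whitespace) and their patterns are assumptions chosen for illustration.

```python
import re

# Hypothetical terminal definitions: (name, regular expression).
# These names and patterns are illustrative assumptions, not GOLD syntax.
TERMINALS = [
    ("Integer", r"[0-9]+"),
    ("Plus", r"\+"),
    ("Times", r"\*"),
    ("Whitespace", r"\s+"),
]

# Combine all definitions into one pattern made of named groups.
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TERMINALS))


def tokenize(text):
    """Return a list of (terminal name, matched text) pairs, skipping whitespace."""
    tokens = []
    pos = 0
    while pos < len(text):
        match = MASTER.match(text, pos)
        if match is None:
            raise ValueError(f"unrecognized character {text[pos]!r} at position {pos}")
        if match.lastgroup != "Whitespace":
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens


print(tokenize("1 + 2 * 3"))
# [('Integer', '1'), ('Plus', '+'), ('Integer', '2'), ('Times', '*'), ('Integer', '3')]
```

Each terminal is described only by the pattern of characters it may contain, so supporting a new terminal in this sketch means adding one more entry to the table.
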

Notation Variations

Many parsing systems have expanded the notation to include set literals and sometimes named sets. In the case of Lex, literal sets of characters are delimited using the square brackets '[' and ']', and named sets are delimited by the braces '{' and '}'. For instance, the text "[abcde]" denotes a set of characters consisting of the first five letters of the alphabet, while the text "{abc}" refers to a set named "abc". This type of notation permits a shorthand for regular expressions: the expression (a|b|c)+ can be written as [abc]+.

It should be noted that standard mathematical set notation uses curly brackets to denote a set itself, not the name of a set. As a result, the notation used by most parsing systems differs quite a bit from the official notation.

Examples
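
The set shorthand can be checked with a short Python sketch. Python's re module supports bracketed set literals but has no named sets, so the NAMED_SETS table and the expand helper below are hypothetical stand-ins for the brace notation, not part of Lex or GOLD. The comparison only tests the two expressions on a handful of sample strings; it is not a proof of equivalence.

```python
import re


def agree_on(pattern_a, pattern_b, samples):
    """Return True if both patterns accept or reject every sample string alike."""
    return all(
        bool(re.fullmatch(pattern_a, s)) == bool(re.fullmatch(pattern_b, s))
        for s in samples
    )


samples = ["a", "abc", "cabba", "abcd", "", "xyz"]
print(agree_on(r"(a|b|c)+", r"[abc]+", samples))  # True

# Hypothetical named-set table: maps a set name to a set literal.
NAMED_SETS = {"abc": "[abc]"}


def expand(pattern):
    """Replace each {name} reference with the set literal it names."""
    return re.sub(r"\{(\w+)\}", lambda m: NAMED_SETS[m.group(1)], pattern)


print(expand("{abc}+"))  # [abc]+
```
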