Learning Bison, Part 2: Bison grammar files

3.1 Outline of a bison grammar

A Bison grammar file has four main sections, laid out as follows:

%{

Prologue

%}

Bison declarations

%%

Grammar rules

%%

Epilogue

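Comments enclosed in /* ... */ may appear in any of the sections. As a GNU extension, // starts a comment that runs to the end of the line.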

3.1.1 The prologue

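The prologue contains C code, such as macro definitions and declarations of the functions and variables used in the actions. It is copied to the beginning of the parser implementation file, so that it precedes the definition of yyparse.

If you do not need any C declarations, you may omit the prologue, including its %{ and %} delimiters.

The prologue section ends at the first %} that is not inside a comment, a string literal, or a character constant.

You may have more than one prologue section, intermixed with the Bison declarations.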
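As a rough illustration of why two prologue sections can be useful, the sketch below puts a %union between them. The type source_pos and the function trace_value are hypothetical names invented for this example; only the layout matters.

```yacc
%{
  /* First prologue: emitted before Bison's definition of YYSTYPE,
     so any type used inside %union must be declared here. */
  #include <stdio.h>
  typedef struct { int line; } source_pos;   /* hypothetical type */
%}

%union {
  long n;
  source_pos pos;
}

%{
  /* Second prologue: emitted after YYSTYPE, so it may mention it. */
  int yylex (void);
  void yyerror (char const *msg);
  void trace_value (YYSTYPE const *v);        /* hypothetical helper */
%}

%token <n> NUM

%%
input: NUM ;
```

Code that the %union depends on goes in the first prologue; code that depends on YYSTYPE goes in the second.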

3.1.2 Prologue alternatives

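Alternatively, you can use the %code directive, which lets you state explicitly where the code should be placed in the generated files.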
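The following sketch shows how the same kind of code might be arranged with the %code qualifiers top, requires, and provides, plus the unqualified form. It assumes a Bison version that supports %code (2.4 or later); source_pos and trace_value are hypothetical names used only for illustration.

```yacc
%code top {
  /* Placed near the very top of the parser implementation file. */
  #define _GNU_SOURCE
  #include <stdio.h>
}

%code requires {
  /* Dependencies of YYSTYPE; also copied into the header made by -d. */
  typedef struct { int line; } source_pos;   /* hypothetical type */
}

%union {
  long n;
  source_pos pos;
}

%code provides {
  /* Emitted after YYSTYPE and the token definitions, in both files. */
  void trace_value (YYSTYPE const *v);        /* hypothetical helper */
}

%code {
  /* Unqualified %code: implementation file only, after YYSTYPE. */
  int yylex (void);
  void yyerror (char const *msg);
}

%token <n> NUM

%%
input: NUM ;
```

Compared with %{ ... %}, the placement no longer depends on where the block happens to sit relative to %union.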

3.1.3 The bison declarations section

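The Bison declarations section declares the terminal and nonterminal symbols, specifies operator precedence, and so on. A very simple grammar may not need any declarations.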
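For orientation, here is what a declarations section for a small calculator might look like. This is only a fragment (no rules follow it), and the names NUM, exp, input, and val are assumptions made for the example.

```yacc
%union {
  double val;           /* semantic value of numbers and expressions */
}

%token <val> NUM        /* terminal symbol with a semantic type */
%type  <val> exp        /* nonterminal symbol with a semantic type */

%left '+' '-'           /* lowest precedence, left-associative */
%left '*' '/'           /* higher precedence, left-associative */
%right '^'              /* highest precedence, right-associative */

%start input            /* start symbol of the grammar */
```

Precedence declarations that appear later bind more tightly than earlier ones.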

3.1.4 The grammar rules section

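The grammar rules section contains one or more Bison grammar rules and nothing else.

There must always be at least one grammar rule, and the first %% (the one that precedes the grammar rules) may never be omitted, even if it is the first thing in the file.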

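3.1.5 The epilogue

The epilogue is copied verbatim to the end of the parser implementation file, just as the prologue is copied to the beginning.

If the epilogue is empty, you may omit the %% that separates it from the grammar rules.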
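Taken together, the rules about which parts may be omitted suggest that the smallest possible grammar file looks roughly like the sketch below: no prologue, no declarations, the mandatory first %%, one rule, and no second %% because the epilogue is empty. The symbol name input is an arbitrary choice.

```yacc
%%
input: /* empty */ ;
```

Bison can process such a file, but building a working program from it still requires yylex and yyerror to be supplied elsewhere.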

3.2 Symbols, Terminal and Nonterminal

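A nonterminal symbol is written as an identifier. Symbol names may contain letters, underscores, periods, and non-initial digits and dashes. Dashes in symbol names are a GNU extension and are incompatible with POSIX Yacc.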

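There are three ways to write terminal symbols in the grammar:

A named token type is written as an identifier. Each such name must be defined with a Bison declaration such as %token.
A character token type is written with the same syntax as a C character constant, for example '+'. It does not need to be declared unless you need to specify its semantic value data type, associativity, or precedence.
A literal string token is written like a C string constant, for example "<=". Likewise, it does not need to be declared unless you need to specify its semantic value data type, associativity, or precedence.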
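The three forms can be seen side by side in the small sketch below. The names NUM, LE, and exp are assumptions for the example; the character and string tokens are declared here only because precedence and associativity are wanted.

```yacc
%token NUM              /* named token type: must be declared */
%token LE "<="          /* literal string token, aliased to the name LE */
%nonassoc "<="          /* declared only to give it a precedence */
%left '+'               /* character token, declared for associativity */

%%
exp:
  NUM
| exp '+' exp
| exp "<=" exp
;
```

Giving the string token an alias such as LE also gives yylex a convenient name to return for it.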

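The value returned by yylex is always one of the terminal symbols, except that a zero or negative value signifies end of input.

However you write a token type in the grammar rules, write it the same way in the definition of yylex.

The numeric code of a character token type is simply the positive numeric code of the character, so yylex can return that same value. Each named token type becomes a C macro in the parser implementation file, so yylex can use the name to stand for the code.

If yylex is defined in a separate source file, the token type definitions must be available there. Run Bison with the -d option so that it writes them to a header file, name.tab.h, which the file defining yylex can then include.

The symbol error is a terminal symbol reserved for error recovery. You should not use it for any other purpose, and yylex should never return it.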
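To make the yylex conventions concrete, here is a minimal hand-written scanner placed in the epilogue of a tiny grammar. It is a sketch under simplifying assumptions: numbers carry no semantic value, input comes from stdin, and the grammar accepts a single line such as 1+2.

```yacc
%{
  #include <ctype.h>
  #include <stdio.h>
  int yylex (void);
  void yyerror (char const *msg);
%}

%token NUM
%left '+'

%%
input: exp '\n' ;
exp: NUM | exp '+' exp ;
%%

int
yylex (void)
{
  int c = getchar ();
  while (c == ' ' || c == '\t')          /* skip blanks */
    c = getchar ();
  if (c == EOF)
    return 0;                            /* zero signals end of input */
  if (isdigit (c))
    {
      while (isdigit (c = getchar ()))   /* consume the rest of the number */
        continue;
      ungetc (c, stdin);                 /* put back the first non-digit */
      return NUM;                        /* named token, via its C name */
    }
  return (unsigned char) c;              /* character token: its own code */
}

void yyerror (char const *msg) { fprintf (stderr, "%s\n", msg); }

int main (void) { return yyparse (); }
```

Here yylex sees NUM directly because it lives in the same file as the grammar. In a separate file it would instead include the header written by bison -d.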

3.3 Grammar rules

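3.3.1 Syntax of grammar rules

A Bison grammar rule has the following general form:

`result: components ...;`

Here result is the nonterminal symbol that the rule describes, and components are the terminal and nonterminal symbols that the rule puts together to form result.

For example:

`exp: exp '+' exp;`

This says that two groupings of type exp, with a '+' token between them, can be combined into a larger grouping of type exp.

Whitespace in a rule serves only to separate symbols and has no other meaning. You can add as much extra whitespace as you like.

Scattered among the components can be actions, which determine the semantics of the rule. An action looks like this: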

{ C statements }

Multiple rules for the same result can be written separately or joined with the vertical bar | as follows:

result:

rule1-components ...

| rule2-components ...

...

;
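As a small sketch of alternatives combined with actions, the fragment below gives each alternative of exp its own action. It assumes NUM is a declared token and that '+' and '-' have been given precedence elsewhere. In an action, $$ stands for the semantic value of result and $1, $2, ... for the values of the components.

```yacc
exp:
  NUM                { $$ = $1; }
| exp '+' exp        { $$ = $1 + $3; }
| exp '-' exp        { $$ = $1 - $3; }
| '(' exp ')'        { $$ = $2; }
;
```

Each alternative is an independent rule; the | only saves repeating the result name.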

3.3.2 Empty Rules

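A rule is said to be empty if its right-hand side (the components) is empty. This means that result can match the empty string.

For example, here is how to define an optional semicolon:

`semicolon.opt: | ";" ;`

An empty rule is easy to overlook, especially when | is used. The %empty directive lets you state explicitly that a rule is empty on purpose: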

semicolon.opt:

%empty

| ";"

;
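Note that %empty is a Bison extension (available since Bison 3.0) and is not accepted by POSIX Yacc. When portability matters, the customary substitute is a comment in the empty alternative, as in this sketch, which also uses a character token rather than a string literal:

```yacc
semicolon.opt:
  /* empty */
| ';'
;
```

The comment has no effect on the parser; it only makes the empty alternative visible to readers.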

3.3.3 Recursive Rules

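A rule is called recursive when its result nonterminal also appears on its right-hand side. Here is a left-recursive definition of a comma-separated sequence of one or more expressions: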

expseq1:

exp

| expseq1 ',' exp

;

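Any sequence can be defined with either left recursion or right recursion, but you should prefer left recursion: it can parse a sequence of any length using bounded stack space, whereas right recursion uses stack space in proportion to the number of elements.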
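For contrast, the same sequence written with right recursion would look like the sketch below. With this form the parser has to shift every element onto its stack before it can reduce the rule even once, which is where the extra stack usage comes from.

```yacc
expseq1:
  exp
| exp ',' expseq1
;
```

Both forms accept the same inputs; they differ in stack usage and in the order in which the reductions happen.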

3.4 Defining Language Semantics

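The grammar rules of a language define only its syntax. The semantics come from the semantic values associated with tokens and groupings, and from the actions run when groupings are recognized.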
