Posts

KTU Compiler Lab CSL411

 About Me Preamble: This course aims to offer students hands-on experience on compiler design concepts. Students will be able to familiarize with tools such as LEX and YACC and automate different phases of a compiler. This course helps the learners to enhance the capability to design and implement a compiler. Prerequisite: A sound knowledge in C programming, Data Structures, Formal languages and Automata Theory and Compiler design. How to Install  flex and Bison Introduction to lex/flex Format of lex input files Lex Regular Expressions Input Matching in Lex Actions in Lex File Start Conditions in Lex Values available to the user in action section Reading and writing files Interfacing flex with yacc Introduction to yacc Grammer Specification in yacc Writing yacc programs Precedence and conflict resolution in yacc

13.Precedence and conflict resolution in yacc

We had seen that , when you feed yacc a grammar that is not LALR(1), it shows a warning and resolve the conflicts automatically. For a shift/reduce conflict, yacc will choose the shift. In a reduce/reduce conflict, it will reduce using the rule declared first in the file. We can control what happens by explicitly declaring precedence and associativity for operators. Rather than re-writing the grammar to implicitly control the precedence and associativity with a bunch of intermediate non-terminals, we can directly indicate the precedence so that yacc will know how to break ties. In the definitions section, we can add any number of precedence levels, one per line, from lowest to highest, and indicate the associativity (either left, right, or nonassoc). Several terminals can be on the same line to assign them equal precedence. Example: %token T_Int %left '+' %left '*' %right '^' %% E : E '+' E |E '*'

12.Writing yacc programs

... definitions ... %% ... rules ... %% ... subroutines ... Input to yacc is divided into three sections. The definitions section consists of token declarations and C code bracketed by “%{“ and “%}”. The BNF grammar is placed in the rules section and user subroutines are added in the subroutines section. This is best illustrated by constructing a small calculator that can add and subtract numbers. We’ll begin by examining the linkage between lex and yacc. Here is the definitions section for the yacc input file: %token INTEGER This definition declares an INTEGER token. Yacc generates a parser in file y.tab.c and an include file, y.tab.h: Lex includes this file and utilizes the definitions for token values. To obtain tokens yacc calls yylex. Function yylex has a return type of int that returns a token. Values associated with the token are returned by lex in variable yylval. For example, [0-9]+  {              yylval = atoi(yytext);    

11.Grammar specification in yacc

Grammars for yacc are described using a variant of Backus Naur Form (BNF). This technique, pioneered by John Backus and Peter Naur, was used to describe ALGOL60. A BNF grammar can be used to express context-free languages. Most constructs in modern programming languages can be represented in BNF. For example, the grammar for an expression that multiplies and adds numbers is E -> E + E E -> E * E E -> id Three productions have been specified. Terms that appear on the left-hand side (lhs) of a production, such as E (expression) are nonterminals. Terms such as id (identifier) are terminals (tokens returned by lex) and only appear on the right-hand side (rhs) of a production. This grammar specifies that an expression may be the sum of two expressions, the product of two expressions, or an identifier. We can use this grammar to generate expressions: E -> E*E (r2) E->E*z (r3) E- >E+E*z (r1) E->E+y*z (r3) E->x+y*z (r

10.Introduction to Yacc

Image
Yacc is a parser generator. A parser generator is a program that takes as input a specification of a syntax and produces as output a procedure for recognizing that language. Historically, they are also called compiler-compilers YACC ( yet another compiler-compiler ) is an LALR(1) (LookAhead, Left-to-right,Rightmost derivation producer with 1 lookahead token) parser generator. YACC was originally designed for being complemented by Lex. An updated version, bison can also be used. yacc is designed for use with C code and generates a parser written in C. The parser is configured for use in conjunction with a lex -generated scanner and relies on standard shared features (token types, yylval , etc.) and calls the function yylex as a scanner co-routine. You provide a grammar specification file, which is traditionally named using a .y extension. You invoke yacc on the .y file and it creates the y.tab.h and y.tab.c files containing a thousand or so lines of intense C code that imp

9.Interfacing flex with yacc

One of the main uses of flex is as a companion to the yacc parser-generator. yacc parsers expect to call a routine named yylex() to find the next input token. The routine is supposed to return the type of the next token as well as putting any associated value in the global yylval. To use flex with yacc, one specifies the -d option to yacc to instruct it to generate the file y.tab.h containing definitions of all the %tokens appearing in the yacc input. This file is then included in the flex scanner. For example, if one of the tokens is "TOK_NUMBER", part of the scanner might look like: %{ #include "y.tab.h" %} %% [0-9]+ yylval = atoi( yytext ); return TOK_NUMBER;

8.Reading and Writing using Files with examples

By default yyin and yyout points to the stdin and stdout.We can set yyin to point to a different file so that flex will read the contents from the file pointed by yyin.Similarly yyout can point to output file. The following program will find number of lines,words and characters in a text file "text.dat"( similar to wc command in linux) %{       int nchar, nword, nline; %} %% \n             { nline++; nchar++; } [^  \t\n]+ { nword++, nchar += yyleng; } .               { nchar++; } %% int main(void) { yyin=fopen("text.dat","r"); yylex(); printf("%d\t%d\t%d\n",nline,nword,nchar); return 0; } The program below reads a text file passed as argument and display each line with line number. %{ int lineno; %} %% ^(.*)\n printf("%4d\t%s", ++lineno, yytext); %% int main(int argc, char *argv[]) { yyin = fopen(argv[1], "r"); yylex(); flose(yyin); } The program below reads a data f