From 40aba5857f58a75f61cdd25a673e02c23b6a8089 Mon Sep 17 00:00:00 2001 From: Mark Hollomon Date: Thu, 1 Oct 2020 10:44:50 -0400 Subject: [PATCH] doc updates --- README.md | 111 +++++++++++++++++++++++++++-------------------- RELEASE_NOTES.md | 11 +++++ 2 files changed, 76 insertions(+), 46 deletions(-) diff --git a/README.md b/README.md index 2f8f2bc..a9a0e9a 100644 --- a/README.md +++ b/README.md @@ -1,5 +1,5 @@ # Yet Another LR Parser Generator -## Yalr Release 0.2.0 +## Yalr Release 0.2.1 [![Github Releases](https://img.shields.io/github/release/mhhollomon/yalr.svg)](https://github.com/mhhollomon/yalr/releases) [![Build Status](https://api.cirrus-ci.com/github/mhhollomon/yalr.svg)](https://cirrus-ci.com/github/mhhollomon/yalr) [![Github Issues](https://img.shields.io/github/issues/mhhollomon/yalr.svg)](http://github.com/mhhollomon/yalr) @@ -7,16 +7,7 @@ ## Release Highlights -- Case insensitive lexer. Turn on case folding for the entire lexer or only - select terminals. - -- Precedence and associativity markers - you can now give rules and terminals - precedence in order to help resolve grammar ambiguities. - -- Better error messages - The error message system has been completely - revamped. The message should now be cleaner and easier to understand. - -- Autogenerated `main()`. Yalr will create a main for you, if you want. +- Small bug fixes and documentation clean ups. For more details, see below and the [Release Notes](RELEASE_NOTES.md) @@ -114,7 +105,7 @@ This statement may only appear once in the file. ### Option statements -A number settings can be changed via an option statement. The general syntax +A number of settings can be changed via the option statement. The general syntax is: ``` @@ -126,8 +117,7 @@ The available options are: option-id | setting ----------|--------- lexer.case| default case matching. Setting is `cfold` and `cmatch` -code.main | When set to true, will case the generator to include a simple -main() function (See below). 
+code.main | When set to true, will cause the generator to include a simple main() function (See below). ### Terminals @@ -136,7 +126,7 @@ There are two types of terminals - "parser" terminals and "lexer" terminals. #### Parser Terminals Parser Terminals are those terminals that are used to create the rules in -grammar. These are the terminals that are return by the lexer. +the grammar. These are the terminals that are returned by the lexer. Parser Terminals are defined by the `term` keyword. @@ -188,7 +178,7 @@ term PRINT_KEYWORD 'print' @cfold ; ``` The computation is given as an action encased in `<%{ ... }%>` . If an action -is given, then the normal terminating semi-colon is not required. +is given, then the normal terminating semi-colon is not allowed. ``` term INTEGER r:[-+]?[0-9]+ <%{ return std::stoi(lexeme); }%> @@ -216,28 +206,28 @@ how they treat case. prefix | behavior -------|--------- -`r:` | Default case behavior (currently case sensitive). +`r:` | Global "default" case behavior as potentially set using `option lexer.case` statement. `rm:` | Match case - ie case sensitive. `rf:` | Fold case - i.e. case insensitive. ##### @lexeme special type -The special type `@lexeme` can be used to give a short cut for the common +The special type `@lexeme` can be used as a shortcut for the common pattern of returning the parsed text as the semantic value. When a terminal is given the type `@lexeme`, this is transformed internally -into `std::string`. Additionally, the action set to return the lexeme. If the +into `std::string`. Additionally, the action is set to return the lexeme. If the terminal is given an action, this is an error. 
```yalr // This term <@lexeme> IDENT r:_*[a-zA-Z]+ ; -// becomes this: +// acts like: term IDENT r:_*[a-zA-Z]+ <%{ return std::move(lexeme); }%> // THIS is an ERROR -term <@lexeme IDENT r:_*[a-zA-Z]+ <%{ /* blah, blah */ }%> +term <@lexeme> IDENT r:_*[a-zA-Z]+ <%{ /* blah, blah */ }%> ``` ##### Terminal Precedence and Associativity @@ -258,7 +248,7 @@ term Mult '*' @assoc=left ; The `left` or `right` keyword must come directly after the flag. There can be no spaces btween the equal sign and the value. -Precedence is assgined to the terminal using the `@prec=` flag. It can be +Precedence is assigned to the terminal using the `@prec=` flag. It can be assigned as a positive integer value, or as the name or pattern of another terminal. The referenced terminal must have a precedence assigned. @@ -296,7 +286,7 @@ rule Foo { => WS ; } ### Associativity statement -Terms can be given an assoviativity setting using the `associativity` +Terms can be given an associativity setting using the `associativity` statement. This statment will also create single-quote style terminals "inline" ```yalr @@ -364,7 +354,7 @@ value. The semantic values of the items in the production are available to the actions in variables of the form `_v{n}` where `{n}` is the position of the item from the left numbered from 1. An item may also be given an alias. This alias will be used to create a reference variable that points to the -corresponding semantic avalue variable. If an represents a rule or terminal +corresponding semantic value variable. If an item represents a rule or terminal without a type, the corresponding semantic variable will not be defined. Giving an alias to such an item will result in an error. @@ -466,21 +456,21 @@ rule E { and the input `1 + 2 * 3`. 
-Would like it to parse it as `1 + ( 2 * 3)` - that is - use the second +We would like `yalr` to parse this as `1 + ( 2 * 3)` - that is - use the second production first and then use the first production to create the parse tree : ``` E(+ E(1) E(* E(2) E(3))) ``` -The important point is after it has seen (and shifted) '1' '+' '2' and is -deciding what to do with the `*`. it has a choice, it can shift it and delay +The critical point in the parse is after it has seen (and shifted) '1' '+' '2' and is +deciding what to do with the `*`. The system has a choice: it can shift the `*` and delay reducing until later (this is what we want it to do), or it can go ahead and -reduce by production 1. +reduce by production 1. -When there is a shift/reduce conflict like this thegenerator will compare the -precedence of the production (1) and the terminal(`*`). If the production is -greater, then the reduce will be done. If the terminal is higher precedence, -then the shift will done. +When there is a shift/reduce conflict like this, the generator will compare the +precedence of the production (1) and the terminal (`*`). If the precedence of +the production is greater, then the reduce will be done. If the terminal has higher precedence, +then the shift will be done. If the two have equal precedence, the associativity of the terminal will be consulted. If it is 'left' then reduce will be done. If it is 'right', then the @@ -492,12 +482,41 @@ It is also possible to have two rules come in conflict (reduce/reduce). The same rules apply. So, to make our example act as we want, we need to make `*` have a higher -precedence than production 1. By default it will have the precedence of the `+` -terminal. +precedence than production 1. + +There are several ways to do this. We could directly assign precedence to the rule +and the terminal. 
+``` +term P '+' ; +term M '*' @prec=200 ; + +rule E { + => E '+' E @prec=1; // production 1 + => E '*' E ; // production 2 + => number ; // production 3 +} +``` + +By default production 1 will have the precedence of the `+` terminal. +So, we could also set the precedence of '+'. + +``` +term P '+' @prec=1 ; +term M '*' @prec=200 ; + +rule E { + => E '+' E ; // production 1 + => E '*' E ; // production 2 + => number ; // production 3 +} +``` + +We could also set the associativity in order to invoke the second part of the +conflict resolution rules. ``` -term P '+' @prec=1 -term M '*' @prec=200 +associativity right '+' +associativity left '*' rule E { => E '+' E ; // production 1 @@ -581,14 +600,14 @@ Each state will have a block of descriptive information such as this sample --------- State 1 Items: - [ 3] statement => PRINT * expression - [ 5] expression => * expression '+' expression - [ 6] expression => * expression '-' expression - [ 7] expression => * expression '*' expression - [ 8] expression => * expression '/' expression - [ 9] expression => * NUMBER - [ 10] expression => * VARIABLE - [ 11] expression => * '(' expression ')' + [ 3] statement => PRINT |*| expression + [ 5] expression => |*| expression '+' expression + [ 6] expression => |*| expression '-' expression + [ 7] expression => |*| expression '*' expression + [ 8] expression => |*| expression '/' expression + [ 9] expression => |*| NUMBER + [ 10] expression => |*| VARIABLE + [ 11] expression => |*| '(' expression ')' Actions: VARIABLE => shift and move to state 8 @@ -602,7 +621,7 @@ Gotos: #### Items This section lists the current partial parses this state represents. The -unquted star is the pointer to where the parse currenty is in this state. +`|*|` symbol is the pointer to where the parse currently is in this state. #### Actions @@ -610,7 +629,7 @@ What do to for each possible token that could be received next. Any tokens not listed are considered errors and the parse will terminate. 
Possible actions are: -- shift - add the token and its value to the stack and move the the designated +- shift - Add the token and its value to the stack and move to the designated new state. - reduce - For the listed production, pull the correct number of items off the stack and run the action code associated with the production. Shift the token diff --git a/RELEASE_NOTES.md b/RELEASE_NOTES.md index a8e679d..19dbe4d 100644 --- a/RELEASE_NOTES.md +++ b/RELEASE_NOTES.md @@ -1,3 +1,14 @@ +## Release v0.2.1 + +### Functional Changes + +- bug #14 - make sure to fail if a void terminal is given an alias. + +### Non-functional Changes + +- doc clean up +- testing improvements + ## Release v0.2.0 ### Functional Changes