diff --git a/README.md b/README.md
index 7aae48e..a50d997 100644
--- a/README.md
+++ b/README.md
@@ -17,9 +17,7 @@
 ## Example
 
 ```rust
-use peg::parser;
-
-parser!{
+peg::parser!{
   grammar list_parser() for str {
     rule number() -> u32
       = n:$(['0'..='9']+) { n.parse().unwrap() }
diff --git a/peg-runtime/lib.rs b/peg-runtime/lib.rs
index 7e912f6..562d528 100644
--- a/peg-runtime/lib.rs
+++ b/peg-runtime/lib.rs
@@ -7,7 +7,7 @@ pub mod error;
 /// The result type used internally in the parser.
 ///
 /// You'll only need this if implementing the `Parse*` traits for a custom input
-/// type. The public API of a parser adapts errors to `std::Result`.
+/// type. The public API of a parser adapts errors to `std::result::Result`.
 #[derive(Clone)]
 pub enum RuleResult<T> {
     Matched(usize, T),
diff --git a/src/lib.rs b/src/lib.rs
index d123955..4e60b1e 100644
--- a/src/lib.rs
+++ b/src/lib.rs
@@ -4,10 +4,24 @@
 //!
 //! [wikipedia-peg]: https://en.wikipedia.org/wiki/Parsing_expression_grammar
 //!
-//! The `parser!{}` macro encloses a `grammar` definition containing a set of `rule`s which match
-//! components of your language. It expands to a Rust `mod` containing functions corresponding to
-//! each `rule` marked `pub`.
-//!
+//! The `parser!{}` macro encloses a `grammar` definition containing a set of
+//! `rule`s which match components of your language. The grammar is defined over
+//! an [input type, normally `str`](#custom-input-types), and expands to a Rust
+//! `mod`.
+//!
+//! Rules can accept parameters and optionally return a value when they match. A
+//! `rule` not marked `pub` can only be called from other rules within the
+//! grammar.
+//!
+//! Each `rule` marked `pub` expands to a function in the module which
+//! accepts a reference to an input sequence, followed by any additional
+//! parameters defined on the `rule`. It returns a `Result<T, ParseError>`
+//! carrying either the successfully parsed value, or a `ParseError` containing
+//! the failure position and the set of tokens expected there.
+//!
+//! The body of the rule, following the `=`, is a PEG expression, defining how
+//! the input is matched to produce a value.
+//!
 //! ```rust
 //! peg::parser!{
 //!   grammar list_parser() for str {
@@ -23,16 +37,31 @@
 //!     assert_eq!(list_parser::list("[1,1,2,3,5,8]"), Ok(vec![1, 1, 2, 3, 5, 8]));
 //! }
 //! ```
-//!
+//!
 //! ## Expressions
 //!
 //!   * `"keyword"` - _Literal:_ match a literal string.
 //!   * `['0'..='9']` - _Pattern:_ match a single element that matches a Rust `match`-style
 //!     pattern. [(details)](#match-expressions)
 //!   * `some_rule()` - _Rule:_ match a rule defined elsewhere in the grammar and return its
-//!     result.
+//!     result. Arguments in the parentheses are Rust expressions.
+//!   * `_` or `__` or `___`: _Rule (underscore):_ As a special case, rule names
+//!     consisting of underscores are invoked without parentheses. These are
+//!     conventionally used to match whitespace between tokens.
 //!   * `e1 e2 e3` - _Sequence:_ match expressions in sequence (`e1` followed by `e2` followed by
 //!     `e3`).
+//!   * `a:e1 e2 b:e3 c:e4 { rust }` - _Action:_ Match `e1`, `e2`, `e3`, `e4` in
+//!     sequence, like above. If they match successfully, run the Rust code in
+//!     the block and return its return value. The variable names before the
+//!     colons in the preceding sequence are bound to the results of the
+//!     corresponding expressions. It is important that the Rust code embedded
+//!     in the grammar is deterministic and free of side effects, as it may be
+//!     called multiple times.
+//!   * `a:e1 b:e2 c:e3 {? rust }` - _Conditional action:_ Like above, but the
+//!     Rust block returns a `Result<T, &str>` instead of a value directly. On
+//!     `Ok(v)`, it matches successfully and returns `v`. On `Err(e)`, the match
+//!     of the entire expression fails and it tries alternatives or reports a
+//!     parse error with the `&str` `e`.
 //!   * `e1 / e2 / e3` - _Ordered choice:_ try to match `e1`. If the match succeeds, return its
 //!     result, otherwise try `e2`, and so on.
 //!   * `expression?` - _Optional:_ match one or zero repetitions of `expression`. Returns an
@@ -49,14 +78,6 @@
 //!     without consuming any characters.
 //!   * `!expression` - _Negative lookahead:_ Match only if `expression` does not match at this
 //!     position, without consuming any characters.
-//!   * `a:e1 b:e2 c:e3 { rust }` - _Action:_ Match `e1`, `e2`, `e3` in sequence. If they match
-//!     successfully, run the Rust code in the block and return its return value. The variable
-//!     names before the colons in the preceding sequence are bound to the results of the
-//!     corresponding expressions.
-//!   * `a:e1 b:e2 c:e3 {? rust }` - Like above, but the Rust block returns a `Result<T, &str>`
-//!     instead of a value directly. On `Ok(v)`, it matches successfully and returns `v`. On
-//!     `Err(e)`, the match of the entire expression fails and it tries alternatives or reports a
-//!     parse error with the `&str` `e`.
 //!   * `$(e)` - _Slice:_ match the expression `e`, and return the `&str` slice of the input
 //!     corresponding to the match.
 //!   * `position!()` - return a `usize` representing the current offset into the input, and
@@ -110,6 +131,7 @@
 //!   x:@ "^" y:(@) { x.pow(y as u32) }
 //!   --
 //!   n:number() { n }
+//!   "(" e:arithmetic() ")" { e }
 //! }
 //! # }}
 //! # fn main() {}
@@ -119,22 +141,29 @@
 //! levels. The levels consist of one or more operator rules each followed by a Rust action
 //! expression.
 //!
-//! The `(@)` and `@` are the operands, and the parentheses indicate associativity. An operator
+//! The `(@)` and `@` are the operands, and the parentheses indicate associativity. An operator
 //! rule beginning and ending with `@` is an infix expression. Prefix and postfix rules have one
 //! `@` at the beginning or end, and atoms do not include `@`.
 //!
 //! ## Custom input types
 //!
 //! `rust-peg` handles input types through a series of traits, and comes with implementations for
-//! `str`, `[u8]`, and `[T]`.
+//! `str`, `[u8]`, and `[T]`. Define the traits below to use your own types as
+//! input to `peg` grammars:
 //!
-//!   * `Parse` is the base trait for all inputs. The others are only required to use the
+//!   * `Parse` is the base trait required for all inputs. The others are only required to use the
 //!     corresponding expressions.
 //!   * `ParseElem` implements the `[_]` pattern operator, with a method returning the next item of
 //!     the input to match.
 //!   * `ParseLiteral` implements matching against a `"string"` literal.
 //!   * `ParseSlice` implements the `$()` operator, returning a slice from a span of indexes.
 //!
+//! As a more complex example, the body of the `peg::parser!{}` macro itself is
+//! parsed with `peg`, using a [definition of these traits][gh-flat-token-tree]
+//! for a type that wraps Rust's `TokenTree`.
+//!
+//! [gh-flat-token-tree]: https://github.com/kevinmehall/rust-peg/blob/master/peg-macros/tokens.rs
+//!
 //! ### Error reporting
 //!
 //! When a match fails, position information is automatically recorded to report a set of
@@ -180,7 +209,7 @@
 //! ## Rustdoc comments
 //!
 //! `rustdoc` comments with `///` before a `grammar` or `pub rule` are propagated to the resulting
-//! function:
+//! module or function:
 //!
 //! ```rust,no_run
 //! # peg::parser!{grammar doc() for str {
@@ -195,8 +224,9 @@
 //!
 //! ## Tracing
 //!
-//! If you pass the `peg/trace` feature to Cargo when building your project, a trace of the parsing
-//! will be printed to stdout when parsing. For example,
+//! If you pass the `peg/trace` feature to Cargo when building your project, a
+//! trace of the rules attempted and matched will be printed to stdout when
+//! parsing. For example,
 //! ```sh
 //! $ cargo run --features peg/trace
 //! ...