Capture slice and value #283

Open
bruceiv opened this issue Jan 17, 2022 · 3 comments
Labels
feature Something not supported that perhaps should be

Comments


bruceiv commented Jan 17, 2022

I'd like a built-in expression that captures both the result of a parsing expression and the input slice it spans: like $(e), but producing a tuple of (slice spanning e, return value of e). (This is essentially the consumed combinator in nom; $$(e) might be a good syntax.)
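
For illustration, here is a minimal plain-Rust sketch of what such a combinator does, written outside of peg's grammar syntax. The `ParseResult` alias and the `number` parser are hypothetical names made up for this example, and the hand-written `consumed` function only mirrors the idea; it is not how peg or nom implement it.

```rust
/// Result of a hand-written parser: the value plus the position after the match.
/// (Hypothetical alias for this sketch; peg uses its own result type.)
type ParseResult<T> = Option<(T, usize)>;

/// Runs `parser` at `pos` and returns the matched slice together with the value.
fn consumed<'a, T>(
    input: &'a str,
    pos: usize,
    parser: impl Fn(&'a str, usize) -> ParseResult<T>,
) -> ParseResult<(&'a str, T)> {
    // The parser runs exactly once; the slice is recovered from the two positions.
    let (value, end) = parser(input, pos)?;
    Some(((&input[pos..end], value), end))
}

/// Hypothetical example parser: reads a run of ASCII digits as a `u32`.
fn number(input: &str, pos: usize) -> ParseResult<u32> {
    let len = input[pos..].bytes().take_while(|b| b.is_ascii_digit()).count();
    if len == 0 {
        return None;
    }
    let value = input[pos..pos + len].parse().ok()?;
    Some((value, pos + len))
}
```

Calling `consumed("x=123;", 2, number)` yields both the slice `"123"` and the parsed number from a single pass over the input.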

My motivation is that I'm working on a programming-language parser where each AST node has a location field that's basically a three-tuple: (start index, length, identifier for source file). I can't get the file identifier from the position!() expression, but I've set up the ParseSlice implementation for my file type to insert it. However, since the expressions available in peg give no direct access to the input object, the slice operator seems to be the only way to get this value, and if I use slice I don't get the return value of the expression. I think I can work around it for my use cases by pulling the location fields from some of the subexpressions, but the only fully general solution I can think of would be something like this, which parses the expression twice (I haven't tried it, and I'm not sure about the return value of &e):

rule cap_ret<T>(f: rule<T>) -> (Span, T)
= v:&f() p:$(f()) { (p, v) }
@kevinmehall
Owner

This would be pretty easy to add and would be generally useful. To keep grammars easy to read for new users, I'm trying to avoid adding more symbols, but $$ is close enough to $ that the difference could be easily documented.

For your use case, though, there's an existing feature that might be better than repurposing ParseSlice: the undocumented ##method() expression, which calls input.method(pos). Here's an example of it defined and used in rust-peg's own meta-grammar. It is undocumented because I plan to replace it with a more flexible expression that accepts a block of Rust code in the grammar with access to input and position (#284), once I can come up with a good syntax for it.

You could use this to make a customized replacement for position!() that returns the additional info you need. Then wrap it in a generic rule:

rule spanned<T>(inner: rule<T>) = start:##custom_position() v:inner() end:##custom_position() { ... }
...
spanned(<foo>)
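
To make the idea concrete, here is a rough plain-Rust sketch of the same shape: a custom position method on the input type that also reports the file identifier, and a generic wrapper that calls it before and after the inner parser. It is not peg grammar code, and `Location`, `SourceFile`, `custom_position` and `ident` are all hypothetical names chosen for this example; the exact signature that `##method()` expects is not shown here.

```rust
/// Location info in the shape described in the issue: start, length, file id.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct Location {
    start: usize,
    len: usize,
    file_id: u32,
}

/// Hypothetical input type that knows which source file it came from.
struct SourceFile<'a> {
    file_id: u32,
    text: &'a str,
}

impl<'a> SourceFile<'a> {
    /// Stand-in for a custom position method: pairs the offset with the file id.
    fn custom_position(&self, pos: usize) -> (u32, usize) {
        (self.file_id, pos)
    }
}

/// Wraps `inner`, recording the position before and after it runs.
fn spanned<'a, T>(
    input: &SourceFile<'a>,
    pos: usize,
    inner: impl Fn(&SourceFile<'a>, usize) -> Option<(T, usize)>,
) -> Option<((Location, T), usize)> {
    let (file_id, start) = input.custom_position(pos);
    let (value, end_pos) = inner(input, pos)?;
    let (_, end) = input.custom_position(end_pos);
    let location = Location { start, len: end - start, file_id };
    Some(((location, value), end_pos))
}

/// Hypothetical inner parser: an identifier made of ASCII letters.
fn ident<'a>(input: &SourceFile<'a>, pos: usize) -> Option<(&'a str, usize)> {
    let text: &'a str = input.text;
    let rest = text.get(pos..)?;
    let len = rest.bytes().take_while(|b| b.is_ascii_alphabetic()).count();
    if len == 0 {
        return None;
    }
    Some((&rest[..len], pos + len))
}
```

With this layout the inner parser runs only once, and the file identifier comes from the input object rather than from the slice.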

kevinmehall added the feature label Jan 17, 2022

Contributor

emk commented Oct 11, 2023

For the case where someone needs a value T plus the input slice, I tried doing this, as @bruceiv suggested:

        /// Return both the value and slice matched by the rule.
        rule with_slice<T>(r: rule<T>) -> (T, &'input str)
            = value:&r() input:$(r()) { (value, input) }

This passed all my tests in a moderately complex grammar.
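
One thing to keep in mind with this approach is that the inner rule is evaluated twice, once for the lookahead and once for the slice. The plain-Rust sketch below imitates that behaviour with a hand-written `with_slice` function and counts the runs of a hypothetical `word` parser; it only models the idea and is not peg's generated code.

```rust
use std::cell::Cell;

/// Emulates `value:&r() input:$(r())`: one run for the value, one for the slice.
fn with_slice<'a, T>(
    input: &'a str,
    pos: usize,
    r: impl Fn(&'a str, usize) -> Option<(T, usize)>,
) -> Option<((T, &'a str), usize)> {
    // Lookahead pass: keep the value, discard the end position.
    let (value, _) = r(input, pos)?;
    // Slice pass: keep the end position, discard the value.
    let (_, end) = r(input, pos)?;
    Some(((value, &input[pos..end]), end))
}

/// Runs `with_slice` over a hypothetical `word` parser and reports how many
/// times that parser was invoked.
fn count_runs(input: &str) -> (Option<((usize, &str), usize)>, u32) {
    let runs = Cell::new(0);
    // `word` matches a run of ASCII letters and returns its length as the value.
    let word = |s: &str, pos: usize| {
        runs.set(runs.get() + 1);
        let len = s[pos..].bytes().take_while(|b| b.is_ascii_alphabetic()).count();
        if len == 0 { None } else { Some((len, pos + len)) }
    };
    let result = with_slice(input, 0, word);
    (result, runs.get())
}
```

On a successful match the counter ends at 2, which is the cost a built-in operator that captures slice and value in one pass would avoid.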


chrysn commented Jan 12, 2025

It took me quite some time to figure out how with_slice is used (I had never needed the rule(<more rules>) syntax before), so here's an example:

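    rule my_rule() -> MyType<'input>
        = sliced:with_slice(<blank()* data:content() blank()* { data }>)
          // with_slice as defined above returns (value, slice):
          // .0 is the parsed data and .1 is the matched input slice.
          { MyType { data: sliced.0, slice: sliced.1 } }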
