Alternations optimization #141

Open
wants to merge 1 commit into
base: master
nitely (Owner) commented Apr 11, 2024

In theory, this speeds up regexes containing hundreds of alternations (see #138). How? The initial state contains one state per alternation. The optimization reduces the number of alternations (at the very least in the outer one), so the number of initial states is capped at one per possible first character: [a-zA-Z0-9] plus symbols in the case of ASCII. So 1K initial states get reduced to ~50. This matters a lot for find-all, because it tries to match the initial states against every input character.
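To make the idea concrete, here is a minimal sketch in Python. It is for illustration only and is not the code in this PR: the helper names `build_trie`, `trie_to_regex`, and `factor_prefixes` are hypothetical, and it handles only plain literal alternatives. It builds a trie from the alternatives and serializes it back, so the outer alternation ends up with one branch per distinct first character. Note that it preserves the set of strings matched, not necessarily which alternative a leftmost-first engine would prefer.

```python
import re


def build_trie(words):
    """Build a nested-dict trie; the key "" marks the end of a word."""
    root = {}
    for word in words:
        node = root
        for ch in word:
            node = node.setdefault(ch, {})
        node[""] = {}
    return root


def trie_to_regex(node):
    """Serialize a trie into a regex with common prefixes factored out."""
    branches = [
        re.escape(ch) + trie_to_regex(child)
        for ch, child in sorted(node.items())
        if ch != ""
    ]
    is_end = "" in node  # some word ends exactly at this node
    if not branches:
        return ""
    if len(branches) == 1 and not is_end:
        return branches[0]  # a single continuation needs no group
    group = "(?:" + "|".join(branches) + ")"
    # If a word ends here, everything after this point is optional.
    return group + "?" if is_end else group


def factor_prefixes(words):
    """Turn literal alternatives into a prefix-factored pattern."""
    return trie_to_regex(build_trie(words))


if __name__ == "__main__":
    print(factor_prefixes(["abc", "acb"]))
    # Number of branches left in the outer alternation:
    print(len(build_trie(["car", "cat", "bar", "a", "ab"])))
```

The number of top-level keys in the trie is the number of branches left in the outer alternation, which is the quantity the cap above refers to.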

The description is in this gist https://gist.github.com/nitely/745c8cabdf06ba2d37f8cf5cda3aea5f

This is just a PoC, though. I'm only optimizing the simplest case, since that should be enough to speed up #138.

It's also a (broken) WIP that cannot even be tested yet.

nitely (Owner, Author) commented May 23, 2024

We can go further than just literals. Consider (^abc|^acb) -> ^a(bc|cb). The same can be applied to suffixes: consider (car|bar) -> (b|c)ar. This would be useful for the literals optimization to kick in.
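The suffix case can be sketched the same way. This is a rough Python illustration under the assumption that every alternative is a plain literal; `factor_common_suffix` is a hypothetical helper, not something in this PR. It pulls out the suffix shared by all alternatives, which exposes a literal that a literals optimization could then search for.

```python
import os
import re


def factor_common_suffix(words):
    """Factor the suffix shared by all literal alternatives out of the group."""
    # The common suffix is the common prefix of the reversed words.
    reversed_words = [w[::-1] for w in words]
    suffix = os.path.commonprefix(reversed_words)[::-1]
    if not suffix:
        # Nothing shared: return the plain alternation.
        return "(?:" + "|".join(re.escape(w) for w in words) + ")"
    # What remains of each alternative once the shared suffix is cut off.
    heads = [w[: len(w) - len(suffix)] for w in words]
    return "(?:" + "|".join(re.escape(h) for h in heads) + ")" + re.escape(suffix)


if __name__ == "__main__":
    print(factor_common_suffix(["car", "bar"]))
```

The alternatives keep their original order inside the group, so only the position of the shared literal changes.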
