Quantified concatenation using < or > fails when not escaped, different from xfst & hfst #114

Open
snomos opened this issue May 25, 2021 · 0 comments

snomos commented May 25, 2021

In writing a URL parser, I have the following lexicon:

LEXICON realdomain
    < [ a | b | c | d | e | f | g | h | i | j | k
      | l | m | n | o | p | q | r | s | t | u | v
      | w | x | y | z | A | B | C | D | E | F | G
      | H | I | J | K | L | M | N | O | P | Q | R
      | S | T | U | V | W | X | Y | Z |%-
      |%0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 ]^>1 %. > topdomainlist ;
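For reference, the language this entry is meant to accept can be sketched with an ordinary Python regular expression. This assumes the xfst reading of `^>1` as "concatenated more than once", i.e. two or more symbols from the set, followed by a literal dot. The names `REALDOMAIN` and `matches_realdomain` are hypothetical, and the sketch only illustrates the expected matches; it says nothing about how Foma parses lexc.

```python
import re

# Hypothetical stand-in for the lexc regex  [ a | ... | Z | %- | %0 | ... | 9 ]^>1 %.
# Assumption: ^>1 means "more than one" repetition, so at least two symbols.
REALDOMAIN = re.compile(r"[A-Za-z0-9-]{2,}\.")


def matches_realdomain(s: str) -> bool:
    """Return True if s is two or more allowed symbols followed by a dot."""
    return REALDOMAIN.fullmatch(s) is not None
```

Under that assumption, `matches_realdomain("example.")` is true, while a single-symbol label such as `"a."` is rejected.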

This fails in Foma with the following error:

***Syntax error on line 49 column 616 at '>'

If I escape the quantifier as ^%>1, the regex compiles in Foma but fails in both Xfst:

*** Warning: regex_parse: Positive integer expeted, got 0. ***

and Hfst-xfst:

*** xre parsing failed: syntax error, unexpected LEXER_ERROR, expecting end of file
***    parsing […]
      |%_ |%? |%& |%= |%% |%@ |%. |%/ |%~ ]^%>1  [near ^] on line 27...
Unable to parse regular expression