Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fixing issue #2009 #2010

Closed
wants to merge 8 commits into from
Closed

Fixing issue #2009 #2010

wants to merge 8 commits into from

Conversation

jurgenvinju
Copy link
Member

@jurgenvinju jurgenvinju commented Jul 22, 2024

The SymbolFactory is/was still implementating the escaping and character encoding of SDF2 and not that of Rascal. This produces weird errors with unicode characters and escapes (\a, \u and \U).

@jurgenvinju jurgenvinju marked this pull request as ready for review July 22, 2024 08:38
@jurgenvinju jurgenvinju changed the title Fixing issue 2009 Fixing issue #2009 Jul 22, 2024
@jurgenvinju
Copy link
Member Author

jurgenvinju commented Jul 22, 2024

Issue #2009 is about character classes like [A-Z] but literals seem to need the same fixing:

// TODO: did we deal with all escapes here? probably not!

So let's add some more tests for that and fix them, and then merge the PR when that's done. I'm shifting back to draft.

So this now still breaks when programmers use escapes in literals that are part of types like { Exp "\a10"}+, and when they use unicode in literals as well, like: ("🍕" Id)*

@jurgenvinju jurgenvinju self-assigned this Jul 22, 2024
@jurgenvinju jurgenvinju marked this pull request as draft July 22, 2024 08:43
Copy link

codecov bot commented Jul 22, 2024

Codecov Report

Attention: Patch coverage is 30.00000% with 14 lines in your changes missing coverage. Please review.

Project coverage is 49%. Comparing base (6c2e1b9) to head (6cf019e).
Report is 7 commits behind head on main.

Files Patch % Lines
...org/rascalmpl/values/parsetrees/SymbolFactory.java 30% 13 Missing and 1 partial ⚠️
Additional details and impacted files
@@           Coverage Diff           @@
##              main   #2010   +/-   ##
=======================================
  Coverage       49%     49%           
+ Complexity    6310    6302    -8     
=======================================
  Files          664     664           
  Lines        59601   59567   -34     
  Branches      8646    8641    -5     
=======================================
- Hits         29483   29479    -4     
+ Misses       27902   27870   -32     
- Partials      2216    2218    +2     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Copy link
Member

@DavyLandman DavyLandman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, great that you also added new tests 👍🏼

src/org/rascalmpl/values/parsetrees/SymbolFactory.java Outdated Show resolved Hide resolved
}
else if (s.startsWith("\\")) {
// builtin escape
int cha = s.codePointAt(1);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes 👍

@jurgenvinju jurgenvinju marked this pull request as ready for review July 22, 2024 09:24
@jurgenvinju
Copy link
Member Author

@DavyLandman I added more initially failing tests for the literals in types, such as {X "\a0A"}* and the unicode characters in literals and unicode escapes. Fixed easily by reusing the string notation of vallang (which is exactly equal by design). This really ready now. No hurry though.

try {
String lit = ((StringConstant.Lexical) sep).getString();
// this should be the exact notation for string literals in vallang
IValue string = new StandardTextReader().read(factory, new StringReader(lit));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

comparing the code, \\f is not in vallang StandardTextReader. the rest is there it seems.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've added a test case showing that with this change an \f in a character class is not translated to form feed char correctly.

@DavyLandman
Copy link
Member

@jurgenvinju I've fixed 2 bugs I found, and made an extra test case. But I think we have to bring back the escape code. as letting vallang take care breaks support for \f. Maybe we should add \f support in vallang?

@jurgenvinju
Copy link
Member Author

@jurgenvinju I've fixed 2 bugs I found, and made an extra test case. But I think we have to bring back the escape code. as letting vallang take care breaks support for \f. Maybe we should add \f support in vallang?

We have to add it to vallang then. In another PR. Let's merge this one because it fixes some serious issues?

@jurgenvinju
Copy link
Member Author

I merged this into #2002 because of overlap.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants