RegEx "negative lookahead" within XPath in scorecard #515

Closed
roederXC opened this issue Aug 1, 2024 · 5 comments
roederXC commented Aug 1, 2024

Regex negative lookahead does not work within an XPath expression in the scorecard configuration JSON.

For example, if you use this expression as a rule selector:
"sum(//transactions/transaction[matches(name, '^(?!.*Crawler).+$')]/count)"
you get this error in the report/scorecard:
Error evaluating rule 'visitsWithoutCrawlerCheck': [Check #0] Syntax error at char 2 in regular expression: No expression before quantifier. Found while atomizing the first argument of fn:sum() in {docOrder(((root/descendant::transactions)/transaction[fn:matches(...)])/count)} on line 1

To better reproduce and handle the issue, I am attaching my debug-scorecard-config.json.
debug-scorecard-config.json

h-arlt self-assigned this Aug 2, 2024
h-arlt (Contributor) commented Aug 2, 2024

Regular expressions in XPath do not support lookahead or lookbehind constructs. That's why it is reported as being erroneous.

See here for details on the regular expression syntax supported in XPath 3.1.
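
The reported error position fits this explanation: without lookaround support, the `?` right after the opening parenthesis is presumably read as a quantifier with nothing in front of it. As a rough illustration, a selector pattern could be checked for lookaround constructs before it goes into the scorecard configuration. The following is a minimal Python sketch, not part of the project; the helper name `uses_lookaround` is made up for this example, and the check is a plain text heuristic rather than a real regex parser.

```python
import re

# Matches the four lookaround openers: (?=  (?!  (?<=  (?<!
# Heuristic only: an escaped parenthesis such as \(?! would also be flagged.
LOOKAROUND = re.compile(r"\(\?(?:=|!|<=|<!)")


def uses_lookaround(pattern: str) -> bool:
    """Return True if the regex text contains a lookahead or lookbehind opener."""
    return LOOKAROUND.search(pattern) is not None


if __name__ == "__main__":
    # The pattern from the issue uses a negative lookahead.
    print(uses_lookaround("^(?!.*Crawler).+$"))
    # A pattern without lookaround constructs.
    print(uses_lookaround("^.*Crawler.*$"))
```

A pattern flagged this way would need to be rewritten with XPath functions such as contains() and not() instead.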

h-arlt closed this as not planned Aug 2, 2024
h-arlt added the invalid and wontfix labels Aug 2, 2024
h-arlt (Contributor) commented Aug 2, 2024

As a workaround, you can simply subtract the sum of all transaction counts whose names contain 'Crawler' from the total sum of all transaction counts: sum(//transactions/transaction/count)-sum(//transactions/transaction[contains(name, 'Crawler')]/count)
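
The arithmetic behind this workaround can be sketched in Python with the standard library. The sample XML below is invented for illustration; only the element names (transactions, transaction, name, count) are taken from the XPath in this issue, and the helper `sum_counts` is hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data; element names follow the XPath in the issue.
SAMPLE = """
<testreport>
  <transactions>
    <transaction><name>TVisit</name><count>10</count></transaction>
    <transaction><name>TCrawler</name><count>4</count></transaction>
    <transaction><name>TOrder</name><count>6</count></transaction>
  </transactions>
</testreport>
"""


def sum_counts(root, include):
    """Sum <count> over all transactions whose <name> passes the predicate."""
    total = 0
    for tx in root.findall(".//transactions/transaction"):
        name = tx.findtext("name", default="")
        if include(name):
            total += int(tx.findtext("count", default="0"))
    return total


root = ET.fromstring(SAMPLE)

# sum(//transactions/transaction/count)
total = sum_counts(root, lambda name: True)
# sum(//transactions/transaction[contains(name, 'Crawler')]/count)
crawler = sum_counts(root, lambda name: "Crawler" in name)

without_crawler = total - crawler
print(without_crawler)  # 20 - 4 = 16 for the sample data above
```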

rschwietzke (Contributor) commented

With #508 we will get a way to deal with that more easily because we label things upfront and make the queries for the scorecard simpler.

roederXC (Author) commented Aug 4, 2024

@h-arlt
Maybe the better workaround is to use plain XPath without a regex:
`"sum(//transactions/transaction[not(contains(name, 'Crawler'))]/count)"`

But maybe using a "?" in XPath here is still a known or ongoing issue.
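
To illustrate why filtering with not(contains(...)) gives the same number as subtracting the crawler sum from the total, here is a small Python sketch using the standard library. The sample XML is invented; only the element names come from the XPath above.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data; element names follow the XPath in the issue.
SAMPLE = """
<testreport>
  <transactions>
    <transaction><name>TVisit</name><count>10</count></transaction>
    <transaction><name>TCrawler</name><count>4</count></transaction>
    <transaction><name>TOrder</name><count>6</count></transaction>
  </transactions>
</testreport>
"""

root = ET.fromstring(SAMPLE)

# Collect (name, count) pairs for every transaction.
transactions = [
    (tx.findtext("name", default=""), int(tx.findtext("count", default="0")))
    for tx in root.findall(".//transactions/transaction")
]

# sum(//transactions/transaction[not(contains(name, 'Crawler'))]/count)
filtered_sum = sum(count for name, count in transactions if "Crawler" not in name)

# Total minus the crawler sum, i.e. the subtraction variant of the workaround.
subtracted_sum = sum(count for _, count in transactions) - sum(
    count for name, count in transactions if "Crawler" in name
)

print(filtered_sum, subtracted_sum)  # both are 16 for the sample data above
```

Both variants partition the transactions by the same contains() test, so they agree whenever every transaction has a numeric count.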

rschwietzke (Contributor) commented

This page explicitly says that negative lookahead is purposely not part of the spec:
https://www.web3d.org/specifications/X3dRegularExpressions.html

Negative Lookahead filters have potential to disqualify attributes that contain illegal values.

W3C Recommendation for XML Schema (XSD): Regular Expressions does not support this construct, perhaps to avoid the possibility of computational denial-of-service (DoS) attacks.

rschwietzke assigned rschwietzke and unassigned h-arlt Aug 4, 2024