Regex in replace_query_select is buggy #122
Comments
For the above mentioned Blazegraph exception see the SPARQL Specification 'Aggregate Projection Restrictions' |
Note: It is currently not possible to recognize the above query as invalid in a query sanity check using RDFLib, see #2960 |
Note: A better version of the pattern would be Since this is used with |
The new regex non-greedily matches everything after "select" and before "where". "[\s\S]*?" basically means ".*?" but is more general because it also matches linebreaks without the re.DOTALL flag. See https://docs.python.org/3/library/re.html#re.DOTALL. Fixes #122.
The tests run variations of the example given in #122. Every test implemented here fails without the fix introduced with this PR.
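To illustrate the point about line breaks, here is a minimal sketch comparing ".*?" and "[\s\S]*?" with and without re.DOTALL. The query string is a made-up example, not the one from #122.

```python
import re

# Hypothetical multi-line query; an assumption for illustration only.
text = "select ?s\n?p\nwhere { ?s ?p ?o }"

# "." does not match "\n" by default, so the lazy ".*?" cannot reach "where".
dot_default = re.search(r"select(.*?)where", text)

# "[\s\S]" matches any character, newlines included, with no flag required.
char_class = re.search(r"select([\s\S]*?)where", text)

# With re.DOTALL, "." also matches "\n", giving the same result.
dot_all = re.search(r"select(.*?)where", text, flags=re.DOTALL)

print(dot_default)                 # no match without the flag
print(repr(char_class.group(1)))   # text between "select" and "where"
print(dot_all.group(1) == char_class.group(1))
```

Using the character class keeps the pattern independent of whatever flags the caller passes.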
The regex used in `rdfproxy.utils.sparql_utils.replace_query_select_clause` is completely inappropriate for the task at hand. For example, `rdfproxy` currently cannot deal with query strings that contain newlines. In that case, the current regex `select\s.*` matches only up to the first `\n` and creates a query that is invalid SPARQL. This leads to an `org.openrdf.query.MalformedQueryException: Bad aggregate [...]` Java exception when targeting Blazegraph/Wikidata (or any other Jena-based triplestore).

Solution proposal

A much more apt regex for selecting the entire SELECT clause would be `(select\s+[\s\S]*?)(?=\s+where)`. This non-greedily matches the SELECT clause with all variable references and performs a lookahead for WHERE.
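The difference between the two patterns can be sketched as follows. This is not the actual `rdfproxy` implementation: the helper names, the use of `re.sub` with `count=1` and `re.IGNORECASE`, and the sample query are all assumptions made for illustration.

```python
import re

# Hypothetical query with a SELECT clause spanning two lines.
QUERY = "select ?s\n?p\nwhere { ?s ?p ?o }"

OLD_PATTERN = r"select\s.*"
NEW_PATTERN = r"(select\s+[\s\S]*?)(?=\s+where)"


def replace_select_old(query: str, repl: str) -> str:
    """Replace the SELECT clause using the current pattern (stops at a newline)."""
    return re.sub(OLD_PATTERN, repl, query, count=1, flags=re.IGNORECASE)


def replace_select_new(query: str, repl: str) -> str:
    """Replace the SELECT clause using the proposed pattern (spans newlines)."""
    return re.sub(NEW_PATTERN, repl, query, count=1, flags=re.IGNORECASE)


old_result = replace_select_old(QUERY, "select *")
new_result = replace_select_new(QUERY, "select *")

# The old pattern leaves the second line of the SELECT clause behind.
print(repr(old_result))
# The new pattern replaces everything up to (but not including) WHERE.
print(repr(new_result))
```

Because the lookahead does not consume the whitespace before WHERE, the original line break in front of the WHERE clause is preserved in the output.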