-
Notifications
You must be signed in to change notification settings - Fork 132
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Store Token position in the produces quads #377
Comments
This would be possible indeed, if the parser emits the context from the tokenizer in the quads. We have no plans to take this up, but a pull request that puts this functionality behind a flag would be welcome, provided it has no performance impact when switched off. |
This is excactly what I need too. I am currently developing an RDF editing extension for Visual Studio Code named Mentor. For this use case I frequently need to resolve URIs and blank nodes to parsed Tokens and this feature would be extremely helpful. I found a workaround for URIs which requires parsing the document again after loading and interpreting the Triples, but that only works for URIs and not for blank nodes. This currently blocks me from implementing SHACL support where blank node definitions of (property) shapes are quite common. Any idea how such source maps could be implemented? |
Luckily tokens emitted by the Lexer already contain information about the line and position of each token emitted by the lexer. In the Parser you could add this information property of this._subject = this._blankNode();
if (this._recordPosition) {
this._subject[POS] = { line: token.line, start: token.start }
}
this._saveContext('blank', this._graph,
this._subject, null, null); I would recommend making The caveat of this approach would be that it might cause a non-negligible performance hit even when the feature is disabled; but I suspect this is something you can perf. test and optimise once the feature is implemented. |
I did a POC some time ago. I added it as a use case in the RDF-Star working group. Maybe in the future we can use RDF-Start to define such source maps "externally" from the source turtle. My poc is using n3 parser and exposes the tokens in the quads (not rdf-star). |
@BenjaminHofstetter Did you create a patch for N3 and publish the code of the PoC somewhere? |
Perhaps change the issue title from — (At least, change |
Why do I need that:
After parsing a Turtle file, I lose all information about the source file. For better tooling support, I propose implementing some kind of "source maps" to trace back from quads to positions in the Turtle file.
For instance, in tools like https://shacl-playground.zazuko.com/, when encountering errors in SHACL validation reports, locating the error-causing triple requires human intervention. With source map information, editors could pinpoint the exact location in the Turtle file, aiding in error resolution. Implementing source maps would bridge the gap between parsed files and their source, enhancing tooling support. The tokenizer already generates tokens with line, start, and end information, laying the groundwork for this feature.
The text was updated successfully, but these errors were encountered: