Perf/better quad ids #318

Closed
wants to merge 29 commits into from
Changes from 20 commits
Commits
2f7d6e8
chore: add nested list test
jeswr Nov 23, 2022
37ab09c
fix: support quoted triples in list
jeswr Nov 23, 2022
f837b8d
breaking: drop support for quads in quoted triples as they are forbid…
jeswr Nov 23, 2022
da945e9
feat: support annotated triples
jeswr Nov 23, 2022
624e3e1
chore: error on quoted compound bnodes
jeswr Nov 23, 2022
d27b920
feat: turtle-star spec tests are passing
jeswr Nov 23, 2022
2af5e05
chore: fix lint and coverage errors
jeswr Nov 23, 2022
0ac4c46
chore: remove commented code
jeswr Nov 23, 2022
0337ed8
chore: rename RDF* -> RDF-star
jeswr Nov 23, 2022
99d6b72
chore: update RDF-star reference in readme
jeswr Nov 24, 2022
d258f22
chore: fix round trip on deeply nested rdfstar triples
jeswr Nov 24, 2022
56cc184
chore: add tests from https://github.com/rdfjs/N3.js/pull/303
jeswr Nov 26, 2022
2f0f57d
chore: describe quoted triple predicate parsing
jeswr Jan 4, 2023
0539be9
chore: clarify use of graph term in quoted quads
jeswr Jan 4, 2023
e7646d9
fix: allow a split between '|' and '}' (see https://github.com/rdfjs/…
jeswr Jan 4, 2023
6140e86
chore: remove doubling comment
jeswr Jan 4, 2023
bec8395
chore: add comment about nested parameter
jeswr Jan 4, 2023
8ce8428
fix: use describe for all shouldParse test suites
jeswr Jan 4, 2023
211bf07
fix: dont interpret }| as {|
jeswr Jan 4, 2023
9d8afd8
perf: mint quad ids using term ids
jeswr Jan 4, 2023
d422319
fix: fix broken N3Store tests
jeswr Jan 5, 2023
49e9fb6
chore: improve tests range
jeswr Jan 5, 2023
19ed891
chore: use _termToNumericId to convert Ids
jeswr Jan 5, 2023
4ef4fc0
chore: don't mint new ids unecessarily
jeswr Jan 5, 2023
78f5acf
feat: allow nested graph terms
jeswr Jan 5, 2023
6b25c80
Update src/N3Parser.js
jeswr Jan 5, 2023
bf957a8
Update src/N3Parser.js
jeswr Jan 5, 2023
a25f9e9
chore: add performance testing
jeswr Jan 5, 2023
a089fc7
chore: add performance test for limited annotations
jeswr Jan 5, 2023
10 changes: 5 additions & 5 deletions README.md
Expand Up @@ -12,14 +12,14 @@ It offers:
[TriG](https://www.w3.org/TR/trig/),
[N-Triples](https://www.w3.org/TR/n-triples/),
[N-Quads](https://www.w3.org/TR/n-quads/),
[RDF*](https://blog.liu.se/olafhartig/2019/01/10/position-statement-rdf-star-and-sparql-star/)
[RDF-star](https://www.w3.org/2021/12/rdf-star.html)
and [Notation3 (N3)](https://www.w3.org/TeamSubmission/n3/)
- [**Writing**](#writing) triples/quads to
[Turtle](https://www.w3.org/TR/turtle/),
[TriG](https://www.w3.org/TR/trig/),
[N-Triples](https://www.w3.org/TR/n-triples/),
[N-Quads](https://www.w3.org/TR/n-quads/)
and [RDF*](https://blog.liu.se/olafhartig/2019/01/10/position-statement-rdf-star-and-sparql-star/)
and [RDF-star](https://www.w3.org/2021/12/rdf-star.html)
- [**Storage**](#storing) of triples/quads in memory

Parsing and writing is:
Expand Down Expand Up @@ -358,16 +358,16 @@ The N3.js parser and writer is fully compatible with the following W3C specifica

In addition, the N3.js parser also supports [Notation3 (N3)](https://www.w3.org/TeamSubmission/n3/) (no official specification yet).

The N3.js parser and writer are also fully compatible with the RDF* variants
The N3.js parser and writer are also fully compatible with the RDF-star variants
of the W3C specifications.

The default mode is permissive
and allows a mixture of different syntaxes, including RDF*.
and allows a mixture of different syntaxes, including RDF-star.
Pass a `format` option to the constructor with the name or MIME type of a format
for strict, fault-intolerant behavior.
If a format string contains `star` or `*`
(e.g., `turtlestar` or `TriG*`),
RDF* support for that format will be enabled.
RDF-star support for that format will be enabled.

### Interface specifications
The N3.js submodules are compatible with the following [RDF.js](http://rdf.js.org) interfaces:
Expand Down
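The README rule about format names can be illustrated with a small sketch. This is not the N3.js implementation: `enablesRDFStar` is a hypothetical helper, and the case-insensitive match is an assumption made for the example. It only restates the documented behavior that the permissive default accepts RDF-star and that a format name containing `star` or `*` turns RDF-star support on.

```javascript
// Hypothetical helper, not part of the N3.js API.
// Mirrors the README rule: no format means the permissive default (RDF-star allowed);
// otherwise the format name must contain "star" or "*".
function enablesRDFStar(format) {
  if (!format) return true;
  // Assumption: the match is case-insensitive.
  return /star|\*/i.test(format);
}

console.log(enablesRDFStar('turtlestar')); // true
console.log(enablesRDFStar('TriG*'));      // true
console.log(enablesRDFStar('N-Triples'));  // false
console.log(enablesRDFStar(undefined));    // true
```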
530 changes: 175 additions & 355 deletions package-lock.json

Large diffs are not rendered by default.

10 changes: 8 additions & 2 deletions package.json
Expand Up @@ -42,7 +42,7 @@
"mocha": "^8.0.0",
"nyc": "^14.1.1",
"pre-commit": "^1.2.2",
"rdf-test-suite": "^1.19.2",
"rdf-test-suite": "^1.20.0",
"streamify-string": "^1.0.1",
"uglify-js": "^3.14.3"
},
Expand All @@ -54,7 +54,7 @@
"mocha": "mocha",
"lint": "eslint src perf test spec",
"prepare": "npm run build",
"spec": "npm run spec-turtle && npm run spec-ntriples && npm run spec-nquads && npm run spec-trig",
"spec": "npm run spec-turtle && npm run spec-ntriples && npm run spec-nquads && npm run spec-trig && npm run spec-rdf-star",
"spec-earl": "npm run spec-earl-turtle && npm run spec-earl-ntriples && npm run spec-earl-nquads && npm run spec-earl-trig",
"spec-ntriples": "rdf-test-suite spec/parser.js http://w3c.github.io/rdf-tests/ntriples/manifest.ttl -i '{ \"format\": \"n-triples\" }' -c .rdf-test-suite-cache/",
"spec-nquads": "rdf-test-suite spec/parser.js http://w3c.github.io/rdf-tests/nquads/manifest.ttl -i '{ \"format\": \"n-quads\" }' -c .rdf-test-suite-cache/",
Expand All @@ -64,6 +64,12 @@
"spec-earl-nquads": "rdf-test-suite spec/parser.js http://w3c.github.io/rdf-tests/nquads/manifest.ttl -i '{ \"format\": \"n-quads\" }' -c .rdf-test-suite-cache/ -o earl -p spec/earl-meta.json > spec/earl-nquads.ttl",
"spec-earl-turtle": "rdf-test-suite spec/parser.js http://w3c.github.io/rdf-tests/turtle/manifest.ttl -i '{ \"format\": \"turtle\" }' -c .rdf-test-suite-cache/ -o earl -p spec/earl-meta.json > spec/earl-turtle.ttl",
"spec-earl-trig": "rdf-test-suite spec/parser.js http://w3c.github.io/rdf-tests/trig/manifest.ttl -i '{ \"format\": \"trig\" }' -c .rdf-test-suite-cache/ -o earl -p spec/earl-meta.json > spec/earl-trig.ttl",
"spec-rdf-star": "npm run spec-trig-rdf-star && npm run spec-trig-eval-rdf-star && npm run spec-turtle-rdf-star && npm run spec-turtle-eval-rdf-star",
"spec-trig-rdf-star": "node ../rdf-test-suite.js/bin/Runner.js spec/parser.js https://w3c.github.io/rdf-star/tests/trig/syntax/manifest.jsonld -i '{ \"format\": \"trig-star\" }' -c .rdf-test-suite-cache/",
"spec-trig-eval-rdf-star": "node ../rdf-test-suite.js/bin/Runner.js spec/parser.js https://w3c.github.io/rdf-star/tests/trig/eval/manifest.jsonld -i '{ \"format\": \"trig-star\" }' -c .rdf-test-suite-cache/",
"spec-turtle-rdf-star": "node ../rdf-test-suite.js/bin/Runner.js spec/parser.js https://w3c.github.io/rdf-star/tests/turtle/syntax/manifest.jsonld -i '{ \"format\": \"turtle-star\" }' -c .rdf-test-suite-cache/",
"spec-turtle-eval-rdf-star": "node ../rdf-test-suite.js/bin/Runner.js spec/parser.js https://w3c.github.io/rdf-star/tests/turtle/eval/manifest.jsonld -i '{ \"format\": \"turtle-star\" }' -c .rdf-test-suite-cache/",
"spec-ntriples-rdf-star": "node ../rdf-test-suite.js/bin/Runner.js spec/parser.js https://w3c.github.io/rdf-star/tests/nt/syntax/manifest.jsonld -i '{ \"format\": \"n-quads-star\" }' -c .rdf-test-suite-cache/",
"spec-clean": "rm -r .rdf-test-suite-cache/",
"docs": "cd src && docco *.js -o ../docs && cd .."
},
Expand Down
55 changes: 31 additions & 24 deletions src/N3DataFactory.js
Expand Up @@ -10,7 +10,6 @@ let DEFAULTGRAPH;
let _blankNodeCounter = 0;

const escapedLiteral = /^"(.*".*)(?="[^"]*$)/;
const quadId = /^<<("(?:""|[^"])*"[^ ]*|[^ ]+) ("(?:""|[^"])*"[^ ]*|[^ ]+) ("(?:""|[^"])*"[^ ]*|[^ ]+) ?("(?:""|[^"])*"[^ ]*|[^ ]+)?>>$/;

// ## DataFactory singleton
const DataFactory = {
Expand Down Expand Up @@ -188,9 +187,12 @@ export class DefaultGraph extends Term {
// ## DefaultGraph singleton
DEFAULTGRAPH = new DefaultGraph();


// ### Constructs a term from the given internal string ID
export function termFromId(id, factory) {
// The third 'nested' parameter of this function is to aid
// with recursion over nested terms. It should not be used
// by consumers of this library.
// See https://github.com/rdfjs/N3.js/pull/311#discussion_r1061042725
export function termFromId(id, factory, nested) {
factory = factory || DataFactory;

// Falsy value or empty string indicate the default graph
Expand All @@ -215,21 +217,28 @@ export function termFromId(id, factory) {
return factory.literal(id.substr(1, endPos - 1),
id[endPos + 1] === '@' ? id.substr(endPos + 2)
: factory.namedNode(id.substr(endPos + 3)));
case '<':
const components = quadId.exec(id);
return factory.quad(
termFromId(unescapeQuotes(components[1]), factory),
termFromId(unescapeQuotes(components[2]), factory),
termFromId(unescapeQuotes(components[3]), factory),
components[4] && termFromId(unescapeQuotes(components[4]), factory)
);
case '[':
id = JSON.parse(id);
break;
default:
return factory.namedNode(id);
if (!nested || !Array.isArray(id)) {
return factory.namedNode(id);
}
}
return factory.quad(
termFromId(id[0], factory, true),
termFromId(id[1], factory, true),
termFromId(id[2], factory, true),
id[3] && termFromId(id[3], factory, true)
);
}

// ### Constructs an internal string ID from the given term or ID string
export function termToId(term) {
// The third 'nested' parameter of this function is to aid
// with recursion over nested terms. It should not be used
// by consumers of this library.
// See https://github.com/rdfjs/N3.js/pull/311#discussion_r1061042725
export function termToId(term, nested) {
if (typeof term === 'string')
return term;
if (term instanceof Term && term.termType !== 'Quad')
Expand All @@ -247,17 +256,15 @@ export function termToId(term) {
term.language ? `@${term.language}` :
(term.datatype && term.datatype.value !== xsd.string ? `^^${term.datatype.value}` : '')}`;
case 'Quad':
// To identify RDF* quad components, we escape quotes by doubling them.
// This avoids the overhead of backslash parsing of Turtle-like syntaxes.
return `<<${
escapeQuotes(termToId(term.subject))
} ${
escapeQuotes(termToId(term.predicate))
} ${
escapeQuotes(termToId(term.object))
}${
(isDefaultGraph(term.graph)) ? '' : ` ${termToId(term.graph)}`
}>>`;
const res = [
termToId(term.subject, true),
termToId(term.predicate, true),
termToId(term.object, true),
];
if (!isDefaultGraph(term.graph)) {
res.push(termToId(term.graph, true));
}
return nested ? res : JSON.stringify(res);
default: throw new Error(`Unexpected termType: ${term.termType}`);
}
}
Expand Down
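To make the new ID scheme easier to review, here is a minimal, self-contained sketch of the idea in this diff: a quad ID becomes a JSON array of component IDs, nested quads stay as plain arrays while recursing, and only the outermost call serializes to a string. The term constructors and the names `toId` and `fromId` are simplified stand-ins invented for the example. They are not the N3.js `Term` classes or the real `termToId`/`termFromId`, and literals and blank nodes are left out.

```javascript
// Simplified stand-ins for RDF/JS terms (assumption: plain objects, named nodes only).
const namedNode = value => ({ termType: 'NamedNode', value });
const defaultGraph = () => ({ termType: 'DefaultGraph', value: '' });
const quad = (subject, predicate, object, graph = defaultGraph()) =>
  ({ termType: 'Quad', subject, predicate, object, graph });

// Term -> ID. Nested quads are returned as arrays; only the top level is stringified.
function toId(term, nested = false) {
  switch (term.termType) {
    case 'NamedNode': return term.value;
    case 'DefaultGraph': return '';
    case 'Quad': {
      const parts = [
        toId(term.subject, true),
        toId(term.predicate, true),
        toId(term.object, true),
      ];
      // The graph component is omitted for the default graph
      if (term.graph.termType !== 'DefaultGraph')
        parts.push(toId(term.graph, true));
      return nested ? parts : JSON.stringify(parts);
    }
    default: throw new Error(`Unexpected termType: ${term.termType}`);
  }
}

// ID -> term. A top-level quad ID is a JSON string; nested quads arrive as arrays.
function fromId(id) {
  if (!id) return defaultGraph();
  let parts = null;
  if (Array.isArray(id)) parts = id;
  else if (id[0] === '[') parts = JSON.parse(id);
  if (parts === null) return namedNode(id);
  return quad(
    fromId(parts[0]),
    fromId(parts[1]),
    fromId(parts[2]),
    parts[3] ? fromId(parts[3]) : defaultGraph()
  );
}

const inner = quad(namedNode('http://ex/a'), namedNode('http://ex/b'), namedNode('http://ex/c'));
const outer = quad(inner, namedNode('http://ex/d'), namedNode('http://ex/e'));
console.log(toId(outer));
// [["http://ex/a","http://ex/b","http://ex/c"],"http://ex/d","http://ex/e"]
```

Compared with the removed `<<...>>` string format, this avoids both the quote-doubling escape step and the regular expression on the way back, since `JSON.parse` recovers the nesting in one pass.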
19 changes: 16 additions & 3 deletions src/N3Lexer.js
Expand Up @@ -294,20 +294,33 @@ export default class N3Lexer {
case '!':
if (!this._n3Mode)
break;
case '{':
// Note the input[0] === '{' is required as this could be a fall-through from the above case
if (input.length > 1 && input[0] === '{' && input[1] === '|') {
type = '{|', matchLength = 2;
break;
}
case ',':
case ';':
case '[':
case ']':
case '(':
case ')':
case '{':
case '}':
if (!this._lineMode) {
if (
!this._lineMode &&
// The token might actually be {| and we just have not encountered the pipe yet
(input !== '{' || input.length > 1)
) {
matchLength = 1;
type = firstChar;
}
break;

case '|':
if (input.length > 1 && input[1] === '}') {
type = '|}', matchLength = 2;
break;
}
default:
inconclusive = true;
}
Expand Down
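The lexer change above hinges on one streaming detail: a lone `{` at the end of the buffered input cannot be emitted yet, because the next chunk might start with `|`. The sketch below isolates that decision. `matchBraceToken` and its `endOfInput` parameter are hypothetical names for this illustration and do not exist in `N3Lexer`; the real lexer handles many more token types and modes.

```javascript
// Hypothetical sketch of the '{', '{|', '|}' and '}' decisions.
// Returns { type, length } for a recognized token, or null when the result is
// inconclusive (more input is needed) or the character is not handled here.
function matchBraceToken(input, endOfInput = false) {
  switch (input[0]) {
    case '{':
      if (input[1] === '|')
        return { type: '{|', length: 2 };
      // A lone '{' may still turn out to be '{|' once the next chunk arrives
      if (input.length === 1 && !endOfInput)
        return null;
      return { type: '{', length: 1 };
    case '|':
      if (input[1] === '}')
        return { type: '|}', length: 2 };
      return null;
    case '}':
      // '}|' must be read as '}' followed by '|', never as an annotation token
      return { type: '}', length: 1 };
    default:
      return null;
  }
}

console.log(matchBraceToken('{| :a :b |}')); // { type: '{|', length: 2 }
console.log(matchBraceToken('{'));           // null
console.log(matchBraceToken('}|'));          // { type: '}', length: 1 }
```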
86 changes: 70 additions & 16 deletions src/N3Parser.js
Expand Up @@ -197,6 +197,10 @@ export default class N3Parser {
this._subject = this._blankNode(), null, null);
return this._readBlankNodeHead;
case '(':
// Lists are not allowed inside quoted triples
if (this._contextStack.length > 0 && this._contextStack[this._contextStack.length - 1].type === '<<') {
return this._error('Unexpected list inside quoted triple', token);
}
// Start a new list
this._saveContext('list', this._graph, this.RDF_NIL, null, null);
this._subject = null;
Expand Down Expand Up @@ -239,7 +243,7 @@ export default class N3Parser {
break;
case '<<':
if (!this._supportsRDFStar)
return this._error('Unexpected RDF* syntax', token);
return this._error('Unexpected RDF-star syntax', token);
this._saveContext('<<', this._graph, null, null, null);
this._graph = null;
return this._readSubject;
Expand Down Expand Up @@ -315,6 +319,10 @@ export default class N3Parser {
this._subject = this._blankNode());
return this._readBlankNodeHead;
case '(':
// Lists are not allowed inside quoted triples
if (this._contextStack.length > 0 && this._contextStack[this._contextStack.length - 1].type === '<<') {
return this._error('Unexpected list inside quoted triple', token);
}
// Start a new list
this._saveContext('list', this._graph, this._subject, this._predicate, this.RDF_NIL);
this._subject = null;
Expand All @@ -328,7 +336,7 @@ export default class N3Parser {
return this._readSubject;
case '<<':
if (!this._supportsRDFStar)
return this._error('Unexpected RDF* syntax', token);
return this._error('Unexpected RDF-star syntax', token);
this._saveContext('<<', this._graph, this._subject, this._predicate, null);
this._graph = null;
return this._readSubject;
Expand Down Expand Up @@ -363,6 +371,9 @@ export default class N3Parser {
this._subject = null;
return this._readBlankNodeTail(token);
}
else if (this._contextStack.length > 1 && this._contextStack[this._contextStack.length - 2].type === '<<') {
return this._error('Compound blank node expressions not permitted within quoted triple', token);
}
else {
this._predicate = null;
return this._readPredicate(token);
Expand Down Expand Up @@ -473,6 +484,16 @@ export default class N3Parser {
this._saveContext('formula', this._graph, this._subject, this._predicate,
this._graph = this._blankNode());
return this._readSubject;
case '<<':
if (!this._supportsRDFStar)
return this._error('Unexpected RDF-star syntax', token);

this._saveContext('<<', this._graph, this._subject, null, null);
return this._readSubject;
case '>>':
item = this._graph;
this._graph = null;
break;
default:
if ((item = this._readEntity(token)) === undefined)
return;
Expand Down Expand Up @@ -614,6 +635,18 @@ export default class N3Parser {
case ',':
next = this._readObject;
break;
case '{|':
if (!this._supportsRDFStar)
return this._error('Unexpected RDF-star syntax', token);

this._saveContext('{|', this._graph, this._subject, this._predicate, this._object);

// As a convention, we set the graph term to the default graph in quads representing quoted triples;
// see https://github.com/rdfjs/N3.js/pull/311#discussion_r1061039556 for details
this._subject = this._quad(this._subject, this._predicate, this._object, this.DEFAULTGRAPH);
this._predicate = null;
this._object = null;
return this._readPredicate;
default:
// An entity means this is a quad (only allowed if not already inside a graph)
if (this._supportsQuads && this._graph === null && (graph = this._readEntity(token)) !== undefined) {
Expand Down Expand Up @@ -835,25 +868,21 @@ export default class N3Parser {
return this._readPath;
}

// ### `_readRDFStarTailOrGraph` reads the graph of a nested RDF* quad or the end of a nested RDF* triple
_readRDFStarTailOrGraph(token) {
if (token.type !== '>>') {
// An entity means this is a quad (only allowed if not already inside a graph)
if (this._supportsQuads && this._graph === null && (this._graph = this._readEntity(token)) !== undefined)
return this._readRDFStarTail;
return this._error(`Expected >> to follow "${this._object.id}"`, token);
}
return this._readRDFStarTail(token);
}

// ### `_readRDFStarTail` reads the end of a nested RDF* triple
// ### `_readRDFStarTail` reads the end of a nested RDF-star triple
_readRDFStarTail(token) {
if (token.type !== '>>')
return this._error(`Expected >> but got ${token.type}`, token);
return this._error(`Expected >> to follow "${this._object.id}" but got ${token.type}`, token);
// Read the quad and restore the previous context
const quad = this._quad(this._subject, this._predicate, this._object,
this._graph || this.DEFAULTGRAPH);
this._restoreContext('<<', token);

// If the triple is in a list then return to reading the remaining elements
if (this._contextStack.length > 0 && this._contextStack[this._contextStack.length - 1].type === 'list') {
this._graph = quad;
return this._readListItem(token);
}

// If the triple was the subject, continue by reading the predicate.
if (this._subject === null) {
this._subject = quad;
Expand All @@ -866,6 +895,29 @@ export default class N3Parser {
}
}

// ### `_readAnnotatedTail` reads the end of an annotation block (`{| ... |}`) on a triple
_readAnnotatedTail(token) {
if (token.type === '{|') {
this._saveContext('{|', this._graph, this._subject, this._predicate, this._object);

// As a convention, we set the graph term to the default graph in quads representing quoted triples;
// see https://github.com/rdfjs/N3.js/pull/311#discussion_r1061039556 for details
this._subject = this._quad(this._subject, this._predicate, this._object, this.DEFAULTGRAPH);
this._predicate = null;
this._object = null;
return this._readPredicate;
}
else {
this._emit(this._subject, this._predicate, this._object, this._graph);
}

// If the quoted triple is not finished, the next token must be a predicate
if (token.type !== '|}')
return this._readPredicate;
this._restoreContext('{|', token);
return this._getContextEndReader();
}

// ### `_getContextEndReader` gets the next reader function at the end of a context
_getContextEndReader() {
const contextStack = this._contextStack;
Expand All @@ -880,7 +932,9 @@ export default class N3Parser {
case 'formula':
return this._readFormulaTail;
case '<<':
return this._readRDFStarTailOrGraph;
return this._readRDFStarTail;
case '{|':
return this._readAnnotatedTail;
}
}

Expand Down
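For readers less familiar with the annotation syntax handled by the `{|` and `|}` cases, the following sketch shows the expansion the parser is expected to produce: `:s :p :o {| :a :b |}` asserts `:s :p :o` and additionally states `<< :s :p :o >> :a :b`. The function `expandAnnotation` and the plain-object quads are assumptions made for illustration and are not part of `N3Parser`. Following the convention noted in the diff, the quoted triple carries the default graph.

```javascript
// Marker for the default graph in this sketch (assumption: a plain string).
const DEFAULT_GRAPH = '';

// Hypothetical helper: expands an annotated triple into the quads it stands for.
// `annotations` is a list of [predicate, object] pairs from inside {| ... |}.
function expandAnnotation(subject, predicate, object, annotations, graph = DEFAULT_GRAPH) {
  // The annotated triple itself is asserted in the current graph
  const asserted = { subject, predicate, object, graph };
  // The quoted triple used as subject always uses the default graph
  const quoted = { subject, predicate, object, graph: DEFAULT_GRAPH };
  const annotationQuads = annotations.map(([p, o]) =>
    ({ subject: quoted, predicate: p, object: o, graph }));
  return [asserted, ...annotationQuads];
}

const quads = expandAnnotation(':s', ':p', ':o', [[':a', ':b']]);
console.log(quads.length);               // 2
console.log(quads[1].subject.predicate); // :p
console.log(quads[1].predicate);         // :a
```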