Perf/better quad ids #318

Closed
wants to merge 29 commits into from
29 commits
2f7d6e8
chore: add nested list test
jeswr Nov 23, 2022
37ab09c
fix: support quoted triples in list
jeswr Nov 23, 2022
f837b8d
breaking: drop support for quads in quoted triples as they are forbid…
jeswr Nov 23, 2022
da945e9
feat: support annotated triples
jeswr Nov 23, 2022
624e3e1
chore: error on quoted compound bnodes
jeswr Nov 23, 2022
d27b920
feat: turtle-star spec tests are passing
jeswr Nov 23, 2022
2af5e05
chore: fix lint and coverage errors
jeswr Nov 23, 2022
0ac4c46
chore: remove commented code
jeswr Nov 23, 2022
0337ed8
chore: rename RDF* -> RDF-star
jeswr Nov 23, 2022
99d6b72
chore: update RDF-star reference in readme
jeswr Nov 24, 2022
d258f22
chore: fix round trip on deeply nested rdfstar triples
jeswr Nov 24, 2022
56cc184
chore: add tests from https://github.com/rdfjs/N3.js/pull/303
jeswr Nov 26, 2022
2f0f57d
chore: describe quoted triple predicate parsing
jeswr Jan 4, 2023
0539be9
chore: clarify use of graph term in quoted quads
jeswr Jan 4, 2023
e7646d9
fix: allow a split between '|' and '}' (see https://github.com/rdfjs/…
jeswr Jan 4, 2023
6140e86
chore: remove doubling comment
jeswr Jan 4, 2023
bec8395
chore: add comment about nested parameter
jeswr Jan 4, 2023
8ce8428
fix: use describe for all shouldParse test suites
jeswr Jan 4, 2023
211bf07
fix: dont interpret }| as {|
jeswr Jan 4, 2023
9d8afd8
perf: mint quad ids using term ids
jeswr Jan 4, 2023
d422319
fix: fix broken N3Store tests
jeswr Jan 5, 2023
49e9fb6
chore: improve tests range
jeswr Jan 5, 2023
19ed891
chore: use _termToNumericId to convert Ids
jeswr Jan 5, 2023
4ef4fc0
chore: don't mint new ids unecessarily
jeswr Jan 5, 2023
78f5acf
feat: allow nested graph terms
jeswr Jan 5, 2023
6b25c80
Update src/N3Parser.js
jeswr Jan 5, 2023
bf957a8
Update src/N3Parser.js
jeswr Jan 5, 2023
a25f9e9
chore: add performance testing
jeswr Jan 5, 2023
a089fc7
chore: add performance test for limited annotations
jeswr Jan 5, 2023
10 changes: 5 additions & 5 deletions README.md
Expand Up @@ -12,14 +12,14 @@ It offers:
[TriG](https://www.w3.org/TR/trig/),
[N-Triples](https://www.w3.org/TR/n-triples/),
[N-Quads](https://www.w3.org/TR/n-quads/),
[RDF*](https://blog.liu.se/olafhartig/2019/01/10/position-statement-rdf-star-and-sparql-star/)
[RDF-star](https://www.w3.org/2021/12/rdf-star.html)
and [Notation3 (N3)](https://www.w3.org/TeamSubmission/n3/)
- [**Writing**](#writing) triples/quads to
[Turtle](https://www.w3.org/TR/turtle/),
[TriG](https://www.w3.org/TR/trig/),
[N-Triples](https://www.w3.org/TR/n-triples/),
[N-Quads](https://www.w3.org/TR/n-quads/)
and [RDF*](https://blog.liu.se/olafhartig/2019/01/10/position-statement-rdf-star-and-sparql-star/)
and [RDF-star](https://www.w3.org/2021/12/rdf-star.html)
- [**Storage**](#storing) of triples/quads in memory

Parsing and writing is:
Expand Down Expand Up @@ -358,16 +358,16 @@ The N3.js parser and writer is fully compatible with the following W3C specifica

In addition, the N3.js parser also supports [Notation3 (N3)](https://www.w3.org/TeamSubmission/n3/) (no official specification yet).

The N3.js parser and writer are also fully compatible with the RDF* variants
The N3.js parser and writer are also fully compatible with the RDF-star variants
of the W3C specifications.

The default mode is permissive
and allows a mixture of different syntaxes, including RDF*.
and allows a mixture of different syntaxes, including RDF-star.
Pass a `format` option to the constructor with the name or MIME type of a format
for strict, fault-intolerant behavior.
If a format string contains `star` or `*`
(e.g., `turtlestar` or `TriG*`),
RDF* support for that format will be enabled.
RDF-star support for that format will be enabled.
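
The format rule described in this README hunk can be sketched as a small predicate. This is a minimal illustration, not the parser's actual code: `enablesRdfStar` is a hypothetical helper name, and the regular expression is an assumption based only on the README wording ("contains `star` or `*`", permissive when no format is given).

```javascript
// Hypothetical helper (not part of the N3.js API): decides whether a
// format string opts in to RDF-star, following the README's description.
function enablesRdfStar(format) {
  // No format means permissive mode, which the README says includes RDF-star.
  if (!format) return true;
  // A format containing "star" or "*" enables RDF-star for that format.
  return /star|\*/i.test(format);
}

console.log(enablesRdfStar('turtlestar')); // true
console.log(enablesRdfStar('TriG*'));      // true
console.log(enablesRdfStar('N-Triples'));  // false
console.log(enablesRdfStar(undefined));    // true
```

With the library itself, the equivalent choice is made by passing `format` to the constructor, as the README text states.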

### Interface specifications
The N3.js submodules are compatible with the following [RDF.js](http://rdf.js.org) interfaces:
Expand Down
530 changes: 175 additions & 355 deletions package-lock.json

Large diffs are not rendered by default.

10 changes: 8 additions & 2 deletions package.json
Expand Up @@ -42,7 +42,7 @@
"mocha": "^8.0.0",
"nyc": "^14.1.1",
"pre-commit": "^1.2.2",
"rdf-test-suite": "^1.19.2",
"rdf-test-suite": "^1.20.0",
"streamify-string": "^1.0.1",
"uglify-js": "^3.14.3"
},
Expand All @@ -54,7 +54,7 @@
"mocha": "mocha",
"lint": "eslint src perf test spec",
"prepare": "npm run build",
"spec": "npm run spec-turtle && npm run spec-ntriples && npm run spec-nquads && npm run spec-trig",
"spec": "npm run spec-turtle && npm run spec-ntriples && npm run spec-nquads && npm run spec-trig && npm run spec-rdf-star",
"spec-earl": "npm run spec-earl-turtle && npm run spec-earl-ntriples && npm run spec-earl-nquads && npm run spec-earl-trig",
"spec-ntriples": "rdf-test-suite spec/parser.js http://w3c.github.io/rdf-tests/ntriples/manifest.ttl -i '{ \"format\": \"n-triples\" }' -c .rdf-test-suite-cache/",
"spec-nquads": "rdf-test-suite spec/parser.js http://w3c.github.io/rdf-tests/nquads/manifest.ttl -i '{ \"format\": \"n-quads\" }' -c .rdf-test-suite-cache/",
Expand All @@ -64,6 +64,12 @@
"spec-earl-nquads": "rdf-test-suite spec/parser.js http://w3c.github.io/rdf-tests/nquads/manifest.ttl -i '{ \"format\": \"n-quads\" }' -c .rdf-test-suite-cache/ -o earl -p spec/earl-meta.json > spec/earl-nquads.ttl",
"spec-earl-turtle": "rdf-test-suite spec/parser.js http://w3c.github.io/rdf-tests/turtle/manifest.ttl -i '{ \"format\": \"turtle\" }' -c .rdf-test-suite-cache/ -o earl -p spec/earl-meta.json > spec/earl-turtle.ttl",
"spec-earl-trig": "rdf-test-suite spec/parser.js http://w3c.github.io/rdf-tests/trig/manifest.ttl -i '{ \"format\": \"trig\" }' -c .rdf-test-suite-cache/ -o earl -p spec/earl-meta.json > spec/earl-trig.ttl",
"spec-rdf-star": "npm run spec-trig-rdf-star && npm run spec-trig-eval-rdf-star && npm run spec-turtle-rdf-star && npm run spec-turtle-eval-rdf-star",
"spec-trig-rdf-star": "node ../rdf-test-suite.js/bin/Runner.js spec/parser.js https://w3c.github.io/rdf-star/tests/trig/syntax/manifest.jsonld -i '{ \"format\": \"trig-star\" }' -c .rdf-test-suite-cache/",
"spec-trig-eval-rdf-star": "node ../rdf-test-suite.js/bin/Runner.js spec/parser.js https://w3c.github.io/rdf-star/tests/trig/eval/manifest.jsonld -i '{ \"format\": \"trig-star\" }' -c .rdf-test-suite-cache/",
"spec-turtle-rdf-star": "node ../rdf-test-suite.js/bin/Runner.js spec/parser.js https://w3c.github.io/rdf-star/tests/turtle/syntax/manifest.jsonld -i '{ \"format\": \"turtle-star\" }' -c .rdf-test-suite-cache/",
"spec-turtle-eval-rdf-star": "node ../rdf-test-suite.js/bin/Runner.js spec/parser.js https://w3c.github.io/rdf-star/tests/turtle/eval/manifest.jsonld -i '{ \"format\": \"turtle-star\" }' -c .rdf-test-suite-cache/",
"spec-ntriples-rdf-star": "node ../rdf-test-suite.js/bin/Runner.js spec/parser.js https://w3c.github.io/rdf-star/tests/nt/syntax/manifest.jsonld -i '{ \"format\": \"n-quads-star\" }' -c .rdf-test-suite-cache/",
"spec-clean": "rm -r .rdf-test-suite-cache/",
"docs": "cd src && docco *.js -o ../docs && cd .."
},
Expand Down
118 changes: 118 additions & 0 deletions perf/N3StoreStar-perf.js
@@ -0,0 +1,118 @@
#!/usr/bin/env node
const N3 = require('..');
const assert = require('assert');

console.log('N3Store performance test');

const prefix = 'http://example.org/#';

/* Test triples */
const dim = Number.parseInt(process.argv[2], 10) || 22;
const dimSquared = dim * dim;
const dimCubed = dimSquared * dim;
const dimToTheFour = dimCubed * dim;
const dimToTheFive = dimToTheFour * dim;

const store = new N3.Store();
let TEST = `- Adding ${dimToTheFive} triples to the default graph`;
console.time(TEST);
let i, j, k, l, m;
for (i = 0; i < dim; i++)
for (j = 0; j < dim; j++)
for (k = 0; k < dim; k++)
for (l = 0; l < dim; l++)
for (m = 0; m < dim; m++)
store.addQuad(
N3.DataFactory.quad(
N3.DataFactory.namedNode(prefix + i),
N3.DataFactory.namedNode(prefix + j),
N3.DataFactory.namedNode(prefix + k)
),
N3.DataFactory.namedNode(prefix + l),
N3.DataFactory.namedNode(prefix + m)
);
console.timeEnd(TEST);

console.log(`* Memory usage for triples: ${Math.round(process.memoryUsage().rss / 1024 / 1024)}MB`);

TEST = `- Finding all ${dimToTheFive} triples in the default graph ${dimSquared * 1} times (0 variables)`;
console.time(TEST);
for (i = 0; i < dim; i++)
for (j = 0; j < dim; j++)
for (k = 0; k < dim; k++)
for (l = 0; l < dim; l++)
for (m = 0; m < dim; m++)
assert.equal(store.getQuads(
N3.DataFactory.quad(
N3.DataFactory.namedNode(prefix + i),
N3.DataFactory.namedNode(prefix + j),
N3.DataFactory.namedNode(prefix + k)
),
N3.DataFactory.namedNode(prefix + l),
N3.DataFactory.namedNode(prefix + m)
).length, 1);
console.timeEnd(TEST);

TEST = `- Finding all ${dimCubed} triples in the default graph ${dimSquared * 2} times (1 variable subject)`;
console.time(TEST);
for (i = 0; i < dim; i++)
for (j = 0; j < dim; j++)
assert.equal(store.getQuads(null, N3.DataFactory.namedNode(prefix + i), N3.DataFactory.namedNode(prefix + j)).length, dimCubed);
console.timeEnd(TEST);

TEST = `- Finding all ${0} triples in the default graph ${dimSquared * 2} times (1 variable predicate)`;
console.time(TEST);
for (i = 0; i < dim; i++)
for (j = 0; j < dim; j++)
assert.equal(store.getQuads(N3.DataFactory.namedNode(prefix + i), null, N3.DataFactory.namedNode(prefix + j)).length, 0);
console.timeEnd(TEST);

TEST = `- Finding all ${dim} triples in the default graph ${dimSquared * 4} times (1 variable predicate)`;
console.time(TEST);
for (i = 0; i < dim; i++)
for (j = 0; j < dim; j++)
for (k = 0; k < dim; k++)
for (l = 0; l < dim; l++)
assert.equal(store.getQuads(N3.DataFactory.quad(
N3.DataFactory.namedNode(prefix + i),
N3.DataFactory.namedNode(prefix + j),
N3.DataFactory.namedNode(prefix + k)
), null, N3.DataFactory.namedNode(prefix + l)).length, dim);
console.timeEnd(TEST);

TEST = `- Finding all ${0} triples in the default graph ${dimSquared * 2} times (1 variable object)`;
console.time(TEST);
for (i = 0; i < dim; i++)
for (j = 0; j < dim; j++)
assert.equal(store.getQuads(N3.DataFactory.namedNode(prefix + i), N3.DataFactory.namedNode(prefix + j), null).length, 0);
console.timeEnd(TEST);

TEST = `- Finding all ${dim} triples in the default graph ${dimSquared * 4} times (1 variable object)`;
console.time(TEST);
for (i = 0; i < dim; i++)
for (j = 0; j < dim; j++)
for (k = 0; k < dim; k++)
for (l = 0; l < dim; l++)
assert.equal(store.getQuads(N3.DataFactory.quad(
N3.DataFactory.namedNode(prefix + i),
N3.DataFactory.namedNode(prefix + j),
N3.DataFactory.namedNode(prefix + k)
), N3.DataFactory.namedNode(prefix + l), null).length, dim);
console.timeEnd(TEST);

TEST = `- Finding all ${dimSquared} triples in the default graph ${dimSquared * 1} times (2 variables)`;
console.time(TEST);
for (i = 0; i < dim; i++)
for (j = 0; j < dim; j++)
for (k = 0; k < dim; k++)
assert.equal(store.getQuads(
N3.DataFactory.quad(
N3.DataFactory.namedNode(prefix + i),
N3.DataFactory.namedNode(prefix + j),
N3.DataFactory.namedNode(prefix + k)
),
null,
null
).length,
dimSquared);
console.timeEnd(TEST);
118 changes: 118 additions & 0 deletions perf/N3StoreStarViews-perf.js
@@ -0,0 +1,118 @@
#!/usr/bin/env node
const N3 = require('../lib');
const assert = require('assert');

console.log('N3Store performance test');

const prefix = 'http://example.org/#';

/* Test triples */
const dim = Number.parseInt(process.argv[2], 10) || 64;
const dimSquared = dim * dim;
const dimCubed = dimSquared * dim;
const dimToTheFour = dimCubed * dim;
const dimToTheFive = dimToTheFour * dim;

const store = new N3.Store();
let TEST = `- Adding ${dimCubed * 9} triples to the default graph`;
console.time(TEST);
let i, j, k, l, m;
for (i = 0; i < dim; i++)
for (j = 0; j < dim; j++)
for (k = 0; k < dim; k++)
for (l = 0; l < 3; l++)
for (m = 0; m < 3; m++)
store.addQuad(
N3.DataFactory.quad(
N3.DataFactory.namedNode(prefix + i),
N3.DataFactory.namedNode(prefix + j),
N3.DataFactory.namedNode(prefix + k)
),
N3.DataFactory.namedNode(prefix + l),
N3.DataFactory.namedNode(prefix + m)
);
console.timeEnd(TEST);

console.log(`* Memory usage for triples: ${Math.round(process.memoryUsage().rss / 1024 / 1024)}MB`);

TEST = `- Finding all ${dimCubed * 9} triples in the default graph ${dimSquared * 1} times (0 variables)`;
console.time(TEST);
for (i = 0; i < dim; i++)
for (j = 0; j < dim; j++)
for (k = 0; k < dim; k++)
for (l = 0; l < 3; l++)
for (m = 0; m < 3; m++)
assert.equal(store.getQuads(
N3.DataFactory.quad(
N3.DataFactory.namedNode(prefix + i),
N3.DataFactory.namedNode(prefix + j),
N3.DataFactory.namedNode(prefix + k)
),
N3.DataFactory.namedNode(prefix + l),
N3.DataFactory.namedNode(prefix + m)
).length, 1);
console.timeEnd(TEST);

TEST = `- Finding all ${dimCubed} triples in the default graph ${dimSquared * 2} times (1 variable subject)`;
console.time(TEST);
for (i = 0; i < 3; i++)
for (j = 0; j < 3; j++)
assert.equal(store.getQuads(null, N3.DataFactory.namedNode(prefix + i), N3.DataFactory.namedNode(prefix + j)).length, dimCubed);
console.timeEnd(TEST);

TEST = `- Finding all ${0} triples in the default graph ${dimSquared * 2} times (1 variable predicate)`;
console.time(TEST);
for (i = 0; i < dim; i++)
for (j = 0; j < dim; j++)
assert.equal(store.getQuads(N3.DataFactory.namedNode(prefix + i), null, N3.DataFactory.namedNode(prefix + j)).length, 0);
console.timeEnd(TEST);

TEST = `- Finding all ${3} triples in the default graph ${dimCubed * 3} times (1 variable predicate)`;
console.time(TEST);
for (i = 0; i < dim; i++)
for (j = 0; j < dim; j++)
for (k = 0; k < dim; k++)
for (l = 0; l < 3; l++)
assert.equal(store.getQuads(N3.DataFactory.quad(
N3.DataFactory.namedNode(prefix + i),
N3.DataFactory.namedNode(prefix + j),
N3.DataFactory.namedNode(prefix + k)
), null, N3.DataFactory.namedNode(prefix + l)).length, 3);
console.timeEnd(TEST);

TEST = `- Finding all ${0} triples in the default graph ${dimSquared * 2} times (1 variable object)`;
console.time(TEST);
for (i = 0; i < dim; i++)
for (j = 0; j < dim; j++)
assert.equal(store.getQuads(N3.DataFactory.namedNode(prefix + i), N3.DataFactory.namedNode(prefix + j), null).length, 0);
console.timeEnd(TEST);

TEST = `- Finding all ${3} triples in the default graph ${dimCubed * 3} times (1 variable object)`;
console.time(TEST);
for (i = 0; i < dim; i++)
for (j = 0; j < dim; j++)
for (k = 0; k < dim; k++)
for (l = 0; l < 3; l++)
assert.equal(store.getQuads(N3.DataFactory.quad(
N3.DataFactory.namedNode(prefix + i),
N3.DataFactory.namedNode(prefix + j),
N3.DataFactory.namedNode(prefix + k)
), N3.DataFactory.namedNode(prefix + l), null).length, 3);
console.timeEnd(TEST);

TEST = `- Finding all ${9} triples in the default graph ${dimCubed} times (2 variables)`;
console.time(TEST);
for (i = 0; i < dim; i++)
for (j = 0; j < dim; j++)
for (k = 0; k < dim; k++)
assert.equal(store.getQuads(
N3.DataFactory.quad(
N3.DataFactory.namedNode(prefix + i),
N3.DataFactory.namedNode(prefix + j),
N3.DataFactory.namedNode(prefix + k)
),
null,
null
).length,
9);
console.timeEnd(TEST);
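
The number of quads each perf script adds follows from its loop bounds. A minimal sketch of that arithmetic, using the default `dim` values from the two scripts (22 and 64); `iterations`, `starDim` and `viewsDim` are hypothetical names introduced here for illustration:

```javascript
// Multiplies the bounds of a loop nest to get the number of innermost iterations.
function iterations(bounds) {
  return bounds.reduce((total, bound) => total * bound, 1);
}

const starDim = 22;  // default dim in perf/N3StoreStar-perf.js
const viewsDim = 64; // default dim in perf/N3StoreStarViews-perf.js

// Five loops that all run to dim.
console.log(iterations([starDim, starDim, starDim, starDim, starDim])); // 5153632
// Three loops that run to dim, then two fixed loops of 3.
console.log(iterations([viewsDim, viewsDim, viewsDim, 3, 3])); // 2359296
// dim ** 5 for dim = 64, for comparison.
console.log(viewsDim ** 5); // 1073741824
```

So the second script stores roughly 2.4 million quads over 262,144 distinct quoted triples, with 9 annotations per quoted triple.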
55 changes: 31 additions & 24 deletions src/N3DataFactory.js
Expand Up @@ -10,7 +10,6 @@ let DEFAULTGRAPH;
let _blankNodeCounter = 0;

const escapedLiteral = /^"(.*".*)(?="[^"]*$)/;
const quadId = /^<<("(?:""|[^"])*"[^ ]*|[^ ]+) ("(?:""|[^"])*"[^ ]*|[^ ]+) ("(?:""|[^"])*"[^ ]*|[^ ]+) ?("(?:""|[^"])*"[^ ]*|[^ ]+)?>>$/;

// ## DataFactory singleton
const DataFactory = {
Expand Down Expand Up @@ -188,9 +187,12 @@ export class DefaultGraph extends Term {
// ## DefaultGraph singleton
DEFAULTGRAPH = new DefaultGraph();


// ### Constructs a term from the given internal string ID
export function termFromId(id, factory) {
// The third 'nested' parameter of this function is to aid
// with recursion over nested terms. It should not be used
// by consumers of this library.
// See https://github.com/rdfjs/N3.js/pull/311#discussion_r1061042725
export function termFromId(id, factory, nested) {
factory = factory || DataFactory;

// Falsy value or empty string indicate the default graph
Expand All @@ -215,21 +217,28 @@ export function termFromId(id, factory) {
return factory.literal(id.substr(1, endPos - 1),
id[endPos + 1] === '@' ? id.substr(endPos + 2)
: factory.namedNode(id.substr(endPos + 3)));
case '<':
const components = quadId.exec(id);
return factory.quad(
termFromId(unescapeQuotes(components[1]), factory),
termFromId(unescapeQuotes(components[2]), factory),
termFromId(unescapeQuotes(components[3]), factory),
components[4] && termFromId(unescapeQuotes(components[4]), factory)
);
case '[':
id = JSON.parse(id);
break;
default:
return factory.namedNode(id);
if (!nested || !Array.isArray(id)) {
return factory.namedNode(id);
}
}
return factory.quad(
termFromId(id[0], factory, true),
termFromId(id[1], factory, true),
termFromId(id[2], factory, true),
id[3] && termFromId(id[3], factory, true)
);
}
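
The decoding path in this hunk can be sketched in isolation. This is a simplified illustration under stated assumptions, not the library code: `sketchFactory` and `sketchTermFromId` are hypothetical names, terms are plain objects, and only named nodes and quads are handled (variables, blank nodes and literals are omitted).

```javascript
// Simplified stand-in for a data factory; the real code uses N3.DataFactory.
const sketchFactory = {
  namedNode: value => ({ termType: 'NamedNode', value }),
  quad: (subject, predicate, object, graph) => ({
    termType: 'Quad', subject, predicate, object,
    graph: graph || { termType: 'DefaultGraph', value: '' },
  }),
};

// Decodes an ID that is either a plain IRI string or a JSON array of
// component IDs; arrays nest when quoted triples contain quoted triples.
function sketchTermFromId(id, nested) {
  // Only a top-level string starting with '[' is treated as serialized JSON.
  if (!nested && typeof id === 'string' && id[0] === '[')
    id = JSON.parse(id);
  if (!Array.isArray(id))
    return sketchFactory.namedNode(id);
  // Nested components are already parsed, so recursion passes nested = true.
  return sketchFactory.quad(
    sketchTermFromId(id[0], true),
    sketchTermFromId(id[1], true),
    sketchTermFromId(id[2], true),
    id[3] && sketchTermFromId(id[3], true)
  );
}

const id = '[["http://ex.org/a","http://ex.org/b","http://ex.org/c"],"http://ex.org/p","http://ex.org/o"]';
const term = sketchTermFromId(id);
console.log(term.subject.termType);     // Quad
console.log(term.subject.object.value); // http://ex.org/c
```

Parsing once at the top level and then walking the resulting arrays is what makes the `nested` flag useful: inner components never need to be re-parsed as strings.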

// ### Constructs an internal string ID from the given term or ID string
export function termToId(term) {
// The third 'nested' parameter of this function is to aid
// with recursion over nested terms. It should not be used
// by consumers of this library.
// See https://github.com/rdfjs/N3.js/pull/311#discussion_r1061042725
export function termToId(term, nested) {
if (typeof term === 'string')
return term;
if (term instanceof Term && term.termType !== 'Quad')
Expand All @@ -247,17 +256,15 @@ export function termToId(term) {
term.language ? `@${term.language}` :
(term.datatype && term.datatype.value !== xsd.string ? `^^${term.datatype.value}` : '')}`;
case 'Quad':
// To identify RDF* quad components, we escape quotes by doubling them.
// This avoids the overhead of backslash parsing of Turtle-like syntaxes.
return `<<${
escapeQuotes(termToId(term.subject))
} ${
escapeQuotes(termToId(term.predicate))
} ${
escapeQuotes(termToId(term.object))
}${
(isDefaultGraph(term.graph)) ? '' : ` ${termToId(term.graph)}`
}>>`;
const res = [
termToId(term.subject, true),
termToId(term.predicate, true),
termToId(term.object, true),
];
if (!isDefaultGraph(term.graph)) {
res.push(termToId(term.graph, true));
}
return nested ? res : JSON.stringify(res);
default: throw new Error(`Unexpected termType: ${term.termType}`);
}
}
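
A minimal sketch of the encoding side shows why the quote-doubling step is no longer needed. It is an illustration under assumptions, not the library code: `sketchTermToId` and `node` are hypothetical names, terms are plain objects, literals carry no language or datatype, and the graph component is left out.

```javascript
// Simplified sketch: only NamedNode, Literal and Quad terms are handled.
function sketchTermToId(term, nested) {
  switch (term.termType) {
    case 'NamedNode': return term.value;
    case 'Literal': return `"${term.value}"`;
    case 'Quad': {
      const parts = [
        sketchTermToId(term.subject, true),
        sketchTermToId(term.predicate, true),
        sketchTermToId(term.object, true),
      ];
      // Nested calls hand back the raw array so that JSON.stringify runs
      // once, at the outermost quad, instead of once per nesting level.
      return nested ? parts : JSON.stringify(parts);
    }
    default: throw new Error(`Unexpected termType: ${term.termType}`);
  }
}

const node = value => ({ termType: 'NamedNode', value });
const inner = {
  termType: 'Quad',
  subject: node('http://ex.org/a'),
  predicate: node('http://ex.org/b'),
  object: { termType: 'Literal', value: 'say "hi"' },
};
const outer = {
  termType: 'Quad',
  subject: inner,
  predicate: node('http://ex.org/p'),
  object: node('http://ex.org/o'),
};

const id = sketchTermToId(outer);
// JSON escaping handles the embedded quotes, so parsing recovers them intact.
console.log(JSON.parse(id)[0][2]); // "say "hi""
```

Because JSON already defines how to escape quotes inside strings, the component IDs can be stored unmodified in the array.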
Expand Down