Load html #347
Some tests are still required for the XHR and Node documentLoaders. Some work is needed to optionally append text/html (and application/xhtml+xml) to the Accept header if we can parse HTML. Perhaps some abstracted parsing support to handle other HTML parsers? Otherwise, please see if you agree with the general direction of this work.
@@ -865,7 +871,10 @@ jsonld.get = async function(url, options) {
load = jsonld.documentLoader;
}

const remoteDoc = await load(url);
// FIXME: unescape frag?
Not sure if this is necessary; it's not needed to pass tests.
continue;
}
try {
remoteDoc.document.push(JSON.parse(script.textContent));
Note that textContent always decodes entities, so there are tests we can't pass. Any ideas?
FWLIW, I threw together a JSBin that uses example 10 to test parsing via jsonld.js 2.0.2:
https://jsbin.com/rewaxiquki/edit?html,console
There aren't any entity decoding issues, so I think this issue is probably a bug in xmldom's implementation.
Here's another one which works using jsdom:
https://runkit.com/embed/9qrv8dl5g2bs
FYI, I use htmlparser2 in my implementation, and it doesn't seem to be decoding entities, so it allows all tests to pass.
So I also think that this is a bug/feature in xmldom.
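For reference, the extraction step under discussion can be sketched as below. This is a minimal illustration, not the code in this PR: `extractJsonLd`, `fakeScript`, and `fakeDoc` are hypothetical names, and the document object is a stand-in so the sketch runs without xmldom, jsdom, or htmlparser2. It assumes `getElementsByTagName` returns an array-like value.

```javascript
// Sketch only: `extractJsonLd` is a hypothetical helper, not a jsonld.js API.
function extractJsonLd(doc) {
  const documents = [];
  // Assumes an array-like result (length plus numeric indices).
  const scripts = doc.getElementsByTagName('script');
  for(let i = 0; i < scripts.length; ++i) {
    const script = scripts[i];
    // Ignore media type parameters such as ";profile=..." when matching.
    const type = (script.getAttribute('type') || '')
      .split(';')[0].trim().toLowerCase();
    if(type !== 'application/ld+json') {
      continue;
    }
    try {
      documents.push(JSON.parse(script.textContent));
    } catch(e) {
      throw new Error('Invalid JSON-LD script element: ' + e.message);
    }
  }
  return documents;
}

// Stand-in for a parsed document so the sketch runs without a DOM library.
function fakeScript(type, textContent) {
  return {
    textContent,
    getAttribute: name => (name === 'type' ? type : null)
  };
}
const fakeDoc = {
  getElementsByTagName: () => [
    fakeScript('text/javascript', 'var x = 1;'),
    fakeScript('application/ld+json', '{"@id": "http://example.com/a"}')
  ]
};
const extracted = extractJsonLd(fakeDoc);
```

Whether entities inside the script body arrive decoded in `textContent` depends on the parser that built the document, which is the difference between xmldom, jsdom, and htmlparser2 reported in the comments above.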
lib/jsonld.js
Outdated
@@ -934,6 +995,20 @@ jsonld.documentLoaders = {};
jsonld.documentLoaders.node = require('./documentLoaders/node');
jsonld.documentLoaders.xhr = require('./documentLoaders/xhr');

// Optional DOM parser
try {
jsonld.domParser = require('xmldom').DOMParser;
This will support xmldom, if it's loaded. Are there other parsers that should be included? How to configure and test?
@@ -15,6 +15,7 @@ const REGEX_LINK_HEADER = /\s*<([^>]*?)>\s*(?:;\s*(.*))?/;
const REGEX_LINK_HEADER_PARAMS =
/(.*?)=(?:(?:"([^"]*?)")|([^"]*?))\s*(?:(?:;\s*)|$)/g;

// FIXME: conditionally support text/html
With HTML support, we should include text/html and application/ld+json, but not sure best way to do that. It will also mess up some documentLoader tests?
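One possible shape for the conditional Accept header is sketched below. `buildAcceptHeader` and its `htmlSupported` option are hypothetical names, and the media types and q-values are assumptions for illustration, not what the documentLoaders in this PR send.

```javascript
// Sketch only: hypothetical helper, not part of jsonld.js.
// The q-values are illustrative assumptions.
function buildAcceptHeader({htmlSupported = false} = {}) {
  const types = ['application/ld+json', 'application/json'];
  if(htmlSupported) {
    // Only advertise HTML when a DOM parser is available to handle it.
    types.push('text/html;q=0.8', 'application/xhtml+xml;q=0.8');
  }
  return types.join(', ');
}

const defaultAccept = buildAcceptHeader();
const htmlAccept = buildAcceptHeader({htmlSupported: true});
```

Keeping the default value unchanged when no parser is present would leave existing documentLoader tests that assert on the header unaffected.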
@@ -30,7 +34,7 @@ module.exports = ({
const queue = new RequestQueue();
return queue.wrapLoader(loader);

-async function loader(url) {
+async function loader(url, options) {
There don't seem to be standalone XHR documentLoader tests, as there are node tests.
Perhaps the webpack build is failing because it doesn't include xmldom? Should there be a way to condition tests on this? How?
@@ -934,6 +995,27 @@ jsonld.documentLoaders = {};
jsonld.documentLoaders.node = require('./documentLoaders/node');
jsonld.documentLoaders.xhr = require('./documentLoaders/xhr');

// Optional DOM parser
try {
jsonld.domParser = require('xmldom').DOMParser || class NoDOMParser {
@davidlehn -- can you comment here? I don't think we can easily support this pattern with webpack. Can you suggest an alternative path forward? Instead of a require here, the user may need to have installed another package themselves that registered a DOM parser with jsonld, similar to what we do with RDF parsers. If so -- we should copy that pattern since it's already used.
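A registration pattern along those lines might look like the sketch below. All names here (`registerDOMParser`, `parseHtml`, `StubParser`) are hypothetical and chosen for illustration; the only assumption about the parser is that it exposes `parseFromString(source, mimeType)`, as the browser `DOMParser` and xmldom's `DOMParser` do.

```javascript
// Sketch only: a hypothetical registry, not an existing jsonld.js API.
const domParsers = new Map();

function registerDOMParser(name, ParserClass) {
  if(typeof ParserClass !== 'function') {
    throw new TypeError('ParserClass must be a constructor.');
  }
  domParsers.set(name, ParserClass);
}

function parseHtml(html, name = 'default') {
  const ParserClass = domParsers.get(name);
  if(!ParserClass) {
    // No bundled require(): the caller decides which parser to install.
    throw new Error('No DOM parser registered under "' + name + '".');
  }
  return new ParserClass().parseFromString(html, 'text/html');
}

// Stub with the same method shape as DOMParser, so the sketch runs
// without any DOM library installed.
class StubParser {
  parseFromString(source, mimeType) {
    return {source, mimeType};
  }
}

registerDOMParser('default', StubParser);
const parsed = parseHtml('<html></html>');
```

Because the library never calls `require` on a parser package itself, a bundler such as webpack has nothing optional to resolve; an application would register its own parser class in place of the stub.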
Thanks @gkellogg! This is great. A few minor changes are needed, plus one architectural thing we need some feedback from @davidlehn on (regarding how to pull in a DOM parser).
c1c0221
to
d1c1290
Compare
b0f158d
to
a45ef6d
Compare
@davidlehn are we waiting for something else to merge this?
@gkellogg I was waiting on me to look into how to improve the way the |
* Adds options parameter to documentLoader
* Uses xmldom, if loaded.
* Adds util.ParseContentTypeHeader
* Adds documentLoader implementations for xhr and node (still requires tests).
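To illustrate the kind of parsing a Content-Type helper performs, here is a rough sketch. `parseContentType` is a hypothetical function written for this example; it is not the `util.ParseContentTypeHeader` added in this commit, whose signature and return shape are not shown here.

```javascript
// Sketch only: hypothetical helper, not the util function from this PR.
// Splits on ";" naively, so quoted parameter values that themselves
// contain ";" are not handled.
function parseContentType(header) {
  const [type, ...rest] = String(header || '').split(';');
  const params = {};
  for(const part of rest) {
    const idx = part.indexOf('=');
    if(idx === -1) {
      continue;
    }
    const key = part.slice(0, idx).trim().toLowerCase();
    // Strip one pair of surrounding double quotes, if present.
    const value = part.slice(idx + 1).trim().replace(/^"(.*)"$/, '$1');
    if(key) {
      params[key] = value;
    }
  }
  return {type: type.trim().toLowerCase(), params};
}

const parsedType = parseContentType('text/html; charset=UTF-8');
```

A loader could compare the resulting `type` against text/html or application/xhtml+xml to decide whether the response body needs HTML handling.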
Skip HTML tests if there is no DOMParser, or loading the module raises an exception. Allows Karma tests to pass.
Co-Authored-By: Dave Longley <[email protected]>
Co-Authored-By: David I. Lehn <[email protected]>
- Updated tests to switch from "invalid script element" to "loading document failed".
Load JSON-LD from HTML documents.