-
Notifications
You must be signed in to change notification settings - Fork 0
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
MET-6139 Move rdf convertion classes from metis-framework. #14
MET-6139 Move rdf convertion classes from metis-framework. #14
Conversation
This is a first iteration.
@stzanakis Just wondering about separation of concerns and all that. I think when it comes to the metis-schema module, we should try to keep it focused only on functionality that is specific to the RDF schema and converting to/from it. It seems that you're proposing to move media-processing specific code to the metis schema (e.g. extract the hasView, isShownAt, etc. from records). Reading back my TODO I believe my thoughts at the time was to transfer just the code that deserializes to a Specifically, in Bits and pieces like What do you think? |
private static final String OEMBED_NAMESPACE = "https://oembed.com/"; | ||
private static final String XPATH_OEMBED_SERVICES = | ||
SVCS_SERVICE + "[dcterms:conformsTo/@rdf:resource = \"" + OEMBED_NAMESPACE + "\"]"; | ||
private static final String XPATH_OEMBED_WEB_RESOURCES = EDM_WEBRESOURCE | ||
+ "[svcs:has_service/@rdf:resource = " + XPATH_OEMBED_SERVICES + "/@rdf:about]"; | ||
private static final String XPATH_IS_OEMBED_RESOURCE_CONDITION = "[. = " | ||
+ XPATH_OEMBED_WEB_RESOURCES + "/@rdf:about]"; | ||
private static final String OEMBED_XPATH_CONDITION_IS_SHOWN_BY = | ||
EDM_IS_SHOWN_BY + XPATH_IS_OEMBED_RESOURCE_CONDITION; | ||
private static final String OEMBED_XPATH_CONDITION_HAS_VIEW = | ||
EDM_HAS_VIEW + XPATH_IS_OEMBED_RESOURCE_CONDITION; | ||
|
||
private final XPathExpressionWrapper getObjectExpression = new XPathExpressionWrapper(xPath -> xPath.compile(EDM_OBJECT)); | ||
private final XPathExpressionWrapper getHasViewExpression = new XPathExpressionWrapper(xPath -> xPath.compile(EDM_HAS_VIEW)); | ||
private final XPathExpressionWrapper getIsShownAtExpression = new XPathExpressionWrapper( | ||
xPath -> xPath.compile(EDM_IS_SHOWN_AT)); | ||
private final XPathExpressionWrapper getIsShownByExpression = new XPathExpressionWrapper( | ||
xPath -> xPath.compile(EDM_IS_SHOWN_BY)); | ||
private final XPathExpressionWrapper getOEmbedExpression = new XPathExpressionWrapper( | ||
xPath -> xPath.compile(OEMBED_XPATH_CONDITION_HAS_VIEW + " | " + OEMBED_XPATH_CONDITION_IS_SHOWN_BY)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Media Processing specific code
public RdfResourceEntry getMainThumbnailResource(byte[] input) throws RdfDeserializationException { | ||
return performDeserialization(input, this::getMainThumbnailResource); | ||
} | ||
|
||
public RdfResourceEntry getMainThumbnailResource(InputStream inputStream) | ||
throws RdfDeserializationException { | ||
return getMainThumbnailResource(deserializeToDocument(inputStream)).orElse(null); | ||
} | ||
|
||
public Optional<RdfResourceEntry> getMainThumbnailResource(Document document) | ||
throws RdfDeserializationException { | ||
|
||
// Get the entries of the required types. | ||
final Map<String, ResourceInfo> resourceEntries = getResourceEntries(document, | ||
Collections.singleton(UrlType.URL_TYPE_FOR_MAIN_THUMBNAIL_RESOURCE)); | ||
|
||
// If there is not exactly one, we return an empty optional. | ||
if (resourceEntries.size() != 1) { | ||
return Optional.empty(); | ||
} | ||
|
||
// So there is exactly one. Convert and return. | ||
return Optional.of(convertToResourceEntries(resourceEntries).get(0)); | ||
} | ||
|
||
public List<RdfResourceEntry> convertToResourceEntries( | ||
Map<String, ResourceInfo> urlWithTypes) { | ||
return urlWithTypes.entrySet().stream().map(RdfDeserializer::convertToResourceEntry) | ||
.toList(); | ||
} | ||
|
||
private static RdfResourceEntry convertToResourceEntry(Map.Entry<String, ResourceInfo> entry) { | ||
return new RdfResourceEntry(entry.getKey(), entry.getValue().urlTypes(), | ||
entry.getValue().configuredForOembed()); | ||
} | ||
|
||
/** | ||
* Gets resource entries. | ||
* | ||
* @param document the document | ||
* @param allowedUrlTypes the allowed url types | ||
* @return the resource entries | ||
* @throws RdfDeserializationException the rdf deserialization exception | ||
*/ | ||
public Map<String, ResourceInfo> getResourceEntries(Document document, | ||
Set<UrlType> allowedUrlTypes) throws RdfDeserializationException { | ||
|
||
// Get the resources and their types. | ||
final Map<String, Set<UrlType>> urls = new HashMap<>(); | ||
for (UrlType type : allowedUrlTypes) { | ||
final Set<String> urlsForType = getUrls(document, type); | ||
for (String url : urlsForType) { | ||
urls.computeIfAbsent(url, k -> new HashSet<>()).add(type); | ||
} | ||
} | ||
|
||
// For each resource, check whether they are configured for oEmbed. | ||
final Map<String, ResourceInfo> result = HashMap.newHashMap(urls.size()); | ||
final Set<String> oEmbedUrls = getOEmbedUrls(document); | ||
for (Entry<String, Set<UrlType>> entry : urls.entrySet()) { | ||
boolean isConfiguredForOembed = oEmbedUrls.contains(entry.getKey()); | ||
result.put(entry.getKey(), new ResourceInfo(entry.getValue(), isConfiguredForOembed)); | ||
} | ||
|
||
// Done | ||
return result; | ||
} | ||
|
||
private Set<String> getUrls(Document document, UrlType type) throws RdfDeserializationException { | ||
|
||
// Determine the right expression to apply. | ||
final XPathExpressionWrapper expression = | ||
switch (type) { | ||
case OBJECT -> getObjectExpression; | ||
case HAS_VIEW -> getHasViewExpression; | ||
case IS_SHOWN_AT -> getIsShownAtExpression; | ||
case IS_SHOWN_BY -> getIsShownByExpression; | ||
}; | ||
|
||
// Evaluate the expression and convert the node list to a set of attribute values. | ||
final NodeList nodes = expression.evaluate(document); | ||
return IntStream.range(0, nodes.getLength()).mapToObj(nodes::item).map(Node::getNodeValue) | ||
.collect(Collectors.toSet()); | ||
} | ||
|
||
private Set<String> getOEmbedUrls(Document document) throws RdfDeserializationException { | ||
final NodeList oEmbedNodes = getOEmbedExpression.evaluate(document); | ||
return IntStream.range(0, oEmbedNodes.getLength()) | ||
.mapToObj(oEmbedNodes::item) | ||
.map(Node::getNodeValue) | ||
.collect(Collectors.toSet()); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Media Processing specific code.
And I think my TODO suggested something else. Namely, removing duplicate code between |
public RDF deserialize(InputStream inputStream) throws RdfDeserializationException { | ||
try { | ||
final IUnmarshallingContext context = rdfBindingFactory.createUnmarshallingContext(); | ||
return (RDF) context.unmarshalDocument(inputStream, UTF8); | ||
} catch (JiBXException e) { | ||
throw new RdfDeserializationException( | ||
"Something went wrong with converting to or from the RDF format.", e); | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Duplicate with RdfConversionUtils
public byte[] convertRdfToBytes(RDF rdf) throws RdfSerializationException { | ||
try { | ||
IMarshallingContext context = rdfBindingFactory.createMarshallingContext(); | ||
context.setIndent(INDENTATION_SPACE); | ||
ByteArrayOutputStream out = new ByteArrayOutputStream(); | ||
context.marshalDocument(rdf, UTF8, null, out); | ||
return out.toByteArray(); | ||
} catch (JiBXException e) { | ||
throw new RdfSerializationException( | ||
"Something went wrong with converting to or from the RDF format.", e); | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Duplicate with RdfConversionUtils
.
Okay. As I mentioned earlier, the TODO comment was clear I just thought taking it a step further and move more code that could be potentially used by other projects. That being said I was a bit hesitant of this impactful change. |
@jeortizquan
This is a first step of the code migration.
In this first iteration Tests(if any) are not migrated, packaging organization not optimized, reported issues not resolved.
With the first review, we just agree on the direction and then we continue the implementation on this PR.
@jochen-vermeulen In the end the todo I mentioned was actually from you in the end. So perhaps you have a say in this one as well?