Review file formats and ontology variants (can JSON replace OWL) #203

Open
brianraymor opened this issue May 2, 2024 · 0 comments

Questions to answer

Also see cell-science-platform.

Is it possible for COG to eliminate its dependency on owlready2 and transition to JSON?

Note: Early prototyping with cl-simple.json has demonstrated positive results in reproducing COG responses. Will add examples later.
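As a rough illustration of what reading an OBO Graphs JSON file without owlready2 could look like, here is a minimal sketch using only the standard library. It assumes the usual OBO Graphs layout (`graphs` → `nodes` with `id`/`lbl`/`type`, and `edges` with `sub`/`pred`/`obj`). The inline sample stands in for a file such as cl-simple.json, and the helper names `to_curie` and `index_graph` are hypothetical, not part of COG.

```python
import json

# Tiny inline stand-in for an OBO Graphs JSON file such as cl-simple.json.
# The single is_a edge is illustrative only, not copied from a CL release.
SAMPLE = """
{
  "graphs": [{
    "nodes": [
      {"id": "http://purl.obolibrary.org/obo/CL_0000000", "lbl": "cell", "type": "CLASS"},
      {"id": "http://purl.obolibrary.org/obo/CL_0000540", "lbl": "neuron", "type": "CLASS"}
    ],
    "edges": [
      {"sub": "http://purl.obolibrary.org/obo/CL_0000540",
       "pred": "is_a",
       "obj": "http://purl.obolibrary.org/obo/CL_0000000"}
    ]
  }]
}
"""

OBO_PREFIX = "http://purl.obolibrary.org/obo/"


def to_curie(iri):
    """Turn an OBO PURL into a CURIE, e.g. .../CL_0000540 -> CL:0000540."""
    if iri.startswith(OBO_PREFIX):
        return iri[len(OBO_PREFIX):].replace("_", ":", 1)
    return iri  # leave non-OBO identifiers untouched


def index_graph(doc):
    """Build label and direct-parent lookups from a parsed OBO Graphs document."""
    labels = {}
    parents = {}
    for graph in doc.get("graphs", []):
        for node in graph.get("nodes", []):
            if node.get("type") == "CLASS" and "lbl" in node:
                labels[to_curie(node["id"])] = node["lbl"]
        for edge in graph.get("edges", []):
            if edge.get("pred") == "is_a":
                child = to_curie(edge["sub"])
                parents.setdefault(child, set()).add(to_curie(edge["obj"]))
    return labels, parents


labels, parents = index_graph(json.loads(SAMPLE))
```

For a real file, `json.loads(SAMPLE)` would be replaced by `json.load` on the opened file. Whether these two lookups are enough to reproduce COG responses is exactly what the prototyping needs to confirm.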

Best practices

Tool developers developing tools that use the ontology (and do not need reasoners), such as database curation tools, web-browsers and similar, should typically use OBO graphs JSON and avoid using OBO format or any of the OWL focussed serialisations (Functional, Manchester or RDF/XML). OWL-focussed serialisations contain a huge deal of axiomatic content that make no sense to most users, and can lead to a variety of mistakes. We have seen it many times that software developers try to interpret OWL axioms somehow to extract relations. Do not do that! Work with the ontologies to ensure they provide the relationships you need in the appropriate form.

Also see developer-friendly JSON exchange format for ontologies

Current state of JSON support in required ontologies

| Ontology | JSON | Notes |
| --- | --- | --- |
| Cell Ontology | Y | |
| Experimental Factor Ontology | Y | |
| Human Ancestry Ontology | Y | |
| Human Developmental Stages | N | tracking in JSON versions of the ontologies |
| Mondo Disease Ontology | Y | |
| Mouse Developmental Stages | N | tracking in JSON versions of the ontologies |
| NCBI organismal classification | Y | |
| Phenotype And Trait Ontology | Y | |
| Uberon multi-species anatomy ontology | Y | |

robot convert

There is also the potential to generate missing JSON. See robot convert:

In the following example we convert an input ontology to OBOGraphs JSON, explicitly specifying the target format with --format:

robot convert -i ro-base.owl --format json -o results/ro-base.json
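If the conversion were scripted for the ontologies that lack a JSON release, a thin wrapper around the same command might look like the sketch below. It assumes `robot` is installed and on the PATH, uses only the flags shown above, and the function names and the `hsapdv.owl` / `hsapdv.json` file names are hypothetical.

```python
import shutil
import subprocess


def robot_convert_command(input_path, output_path):
    """Build the argument list for the robot convert call shown above."""
    return ["robot", "convert", "-i", input_path, "--format", "json", "-o", output_path]


def convert_to_json(input_path, output_path):
    """Run robot convert; raises if robot is missing or the conversion fails."""
    if shutil.which("robot") is None:
        raise RuntimeError("robot executable not found on PATH")
    subprocess.run(robot_convert_command(input_path, output_path), check=True)


# Hypothetical file names, used here only to show the resulting command.
cmd = robot_convert_command("hsapdv.owl", "hsapdv.json")
```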

Can a less complex variant of an ontology be specified?

Release Artifacts
Variants

* Simple: A version of the ontology that contains only a subset of the ontology (only the direct relations, see docs). The simple variant should be used by most users that build tools that use the ontology, especially when serialised as OBO graphs JSON. This variant should probably be avoided by power-users working with reasoners, as many of the axioms that drive reasoning are missing.

Can an ontology be partially processed?

We could partially process ontologies such as EFO or PATO, where only a subset of terms matters to CELLxGENE: specify a preferred root and parse only the terms beneath it, for example all terms under "experimental process" in EFO.

This would reduce processing time and library size.
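One way to sketch the "preferred root" idea is a breadth-first walk over the `is_a` edges of an OBO Graphs JSON document, keeping only terms reachable from the chosen root. The edge list below uses made-up `EX:` identifiers rather than real EFO IDs, and `descendants` is a hypothetical helper, not an existing COG function.

```python
from collections import defaultdict, deque

# Made-up edges in OBO Graphs form (sub is_a obj); the IDs are placeholders.
EDGES = [
    {"sub": "EX:assay", "pred": "is_a", "obj": "EX:process"},
    {"sub": "EX:sequencing_assay", "pred": "is_a", "obj": "EX:assay"},
    {"sub": "EX:material", "pred": "is_a", "obj": "EX:entity"},
    {"sub": "EX:assay", "pred": "part_of", "obj": "EX:study"},
]


def descendants(edges, root, predicates=("is_a",)):
    """Return every term reachable from root by following edges child-ward."""
    children = defaultdict(set)
    for edge in edges:
        if edge["pred"] in predicates:
            children[edge["obj"]].add(edge["sub"])

    seen = set()
    queue = deque([root])
    while queue:
        term = queue.popleft()
        for child in children.get(term, ()):
            if child not in seen:  # guards against revisiting shared subtrees
                seen.add(child)
                queue.append(child)
    return seen


subset = descendants(EDGES, "EX:process")
```

Only the terms in `subset` (plus the root itself, if wanted) would then be indexed, which is where the savings in processing time and library size would come from. Whether `is_a` alone is the right set of predicates to follow is an open design choice.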
