Evolve development_stage_ontology_term_id to support multiple species #1033

Open
brianraymor opened this issue Oct 10, 2024 · 3 comments
Labels
multispecies discovery (Adding new species to CELLxGENE), schema (CELLxGENE Discover dataset schema)

Comments

@brianraymor
Contributor

brianraymor commented Oct 10, 2024

Per the November 13, 2024 call with @ambrosejcarr, @jahilton, @BAevermann, and @SESDNA:

  1. The CELLxGENE schema will continue to use taxon-specific development stage ontologies when available; otherwise, it will default to UBERON.
  2. The STRONGLY RECOMMENDED development stage terms will be removed from the schema.
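
As a rough illustration of decision 1, the sketch below picks the expected ontology prefix for a development stage term from the organism term, falling back to UBERON when no taxon-specific ontology is registered. The function names and the two-entry mapping are assumptions made for this example, not part of the schema or any CELLxGENE validator; the mouse taxon ID and `MmusDv` prefix are added here for illustration and do not come from this thread.

```python
# Illustrative, partial mapping from organism term to the prefix of its
# taxon-specific development stage ontology. Only two entries are shown;
# a real table would need one entry per supported species.
TAXON_TO_DV_PREFIX = {
    "NCBITaxon:9606": "HsapDv",   # Homo sapiens
    "NCBITaxon:10090": "MmusDv",  # Mus musculus (assumption for this sketch)
}

# Used when no taxon-specific ontology is registered for the organism.
FALLBACK_PREFIX = "UBERON"


def expected_stage_prefix(organism_ontology_term_id: str) -> str:
    """Return the ontology prefix a development stage term should carry."""
    return TAXON_TO_DV_PREFIX.get(organism_ontology_term_id, FALLBACK_PREFIX)


def stage_prefix_is_valid(organism_ontology_term_id: str,
                          development_stage_ontology_term_id: str) -> bool:
    """Check only the prefix of the stage term, not its place in the hierarchy."""
    prefix, sep, local_id = development_stage_ontology_term_id.partition(":")
    if not sep or not local_id:
        return False  # not of the form PREFIX:ID
    return prefix == expected_stage_prefix(organism_ontology_term_id)
```

This checks the prefix only; whether the term is a sufficiently specific descendant of the ontology root is a separate question.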

Available from developmental-stage-ontologies

| prefix | namespace | format |
| --- | --- | --- |
| acardv | (AcarDv) lizard | owl |
| btaudv | (BtauDv) cow | owl |
| cfamdv | (CfamDv) dog | owl |
| cpordv | (CporDv) cavy (Caviidae) | owl |
| danadv | (DanaDv) Drosophila ananassae | obo |
| dmojdv | (DmojDv) Drosophila mojavensis | obo |
| dpsedv | (DpseDv) Drosophila pseudoobscura | obo |
| dsimdv | (DsimDv) Drosophila simulans | owl |
| dvirdv | (DvirDv) Drosophila virilis | obo |
| dyakdv | (DyakDv) Drosophila yakuba | obo |
| ecabdv | (EcabDv) Horse | owl |
| eeurdv | (EeurDv) Hedgehog | obo |
| fcatdv | (FcatDv) Cat | owl |
| ggaldv | (GgalDv) Chicken | owl |
| ggordv | (GgorDv) Gorilla | owl |
| mdomdv | (MdomDv) Opossum | owl |
| metadv | - | - |
| mmuldv | (MmulDv) Rhesus Macaque | owl |
| oanadv | (OanaDv) Platypus | owl |
| oaridv | (OariDv) Sheep | owl |
| ocundv | (OcunDv) Rabbit | owl |
| olatdv | (OlatDv) Medaka, adapted from MFO by Thorsten Henrich | owl |
| pdumdv | (PdumDv) Platynereis | owl |
| ppandv | (PpanDv) Bonobo | owl |
| ppygdv | (PpygDv) Orangutan | obo |
| ptrodv | (PtroDv) Chimpanzee | owl |
| rnordv | (RnorDv) Rat | owl |
| ssaldv | (SsalDv) Atlantic Salmon | obo |
| sscrdv | (SscrDv) Pig | owl |
| tnigdv | (TnigDv) Pufferfish | obo (not in release) |

brianraymor added the schema (CELLxGENE Discover dataset schema) and multispecies discovery (Adding new species to CELLxGENE) labels Oct 10, 2024
brianraymor self-assigned this Oct 10, 2024

@BAevermann

For cases where there is a species-specific development stage ontology, why not consider its usage as "REQUIRED"? I am specifically thinking about human and mouse, as the terms in UBERON are a clear downgrade compared to the curation currently available.

@brianraymor
Contributor Author

We depend on the kindness of curators to define the most accurate development stage terms. For example, the schema only requires:

If organism_ontology_term_id is "NCBITaxon:9606" for Homo sapiens, this MUST be the most accurate descendant of HsapDv:0000001 for life cycle with the following STRONGLY RECOMMENDED: ... followed by a list of HsapDv terms.

There's nothing preventing a submitter from selecting a high-level HsapDv term such as embryonic stage.

Further, the development stage ontologies duplicate the UBERON high-level hierarchical terms for stages such as blastula stage. For example, HsapDv vs UBERON.

The schema could certainly define tables per species with REQUIRED and STRONGLY RECOMMENDED UBERON and species-specific ontology terms.

| For | Use |
| --- | --- |
| UBERON stage | A term from the set of Carnegie stages 1-23 (up to 8 weeks after conception; e.g. HsapDv:0000003) |
| UBERON stage | A term from the set of 9 to 38 week post-fertilization human stages (9 weeks after conception and before birth; e.g. HsapDv:0000046) |

If @jahilton and @jychien believe that we could strengthen the requirements for development stages to block high-level stages, then that's another possibility, for example: MUST use a term from the set of Carnegie stages 1-23.
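
To make the difference concrete, here is a minimal sketch contrasting the current "descendant of the root" rule with a stricter variant that also rejects a configurable set of high-level terms. The hierarchy and every `TOY:` identifier are hypothetical placeholders, not real HsapDv or UBERON terms, and the function names are assumptions for this example only.

```python
# Hypothetical toy hierarchy, child -> list of parents. The TOY: IDs are
# placeholders and do not correspond to real ontology terms.
PARENTS = {
    "TOY:life_cycle": [],
    "TOY:embryonic_stage": ["TOY:life_cycle"],
    "TOY:carnegie_stage_01": ["TOY:embryonic_stage"],
}


def ancestors(term, parents):
    """Collect every term reachable by following parent links upward."""
    seen = set()
    stack = list(parents.get(term, []))
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents.get(current, []))
    return seen


def is_descendant(term, root, parents):
    """Current-style rule: the term only has to sit somewhere below the root."""
    return root in ancestors(term, parents)


def is_allowed(term, root, parents, blocked=frozenset()):
    """Stricter variant: also reject terms listed as too high-level."""
    return is_descendant(term, root, parents) and term not in blocked
```

With these toy terms, `TOY:embryonic_stage` passes `is_descendant`, which mirrors the gap described above, while `is_allowed` rejects it once it is added to `blocked`.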

Currently, we're in the middle of the multiple species and relaxed schema experiment - but if multiple species begin to surface in the CELLxGENE Discover UX, then I'd expect that @niknak33 and @hthomas-czi may prefer to simplify the Development Stages UX Filter to be species neutral and rely more on the UBERON terms. The current design was based on constraints that are no longer valid.

@jahilton
Collaborator

I would support requiring the species-specific Dv ontology to be used, like we currently do for human & mouse, "For cases where species specific development stages ontologies...exist". I don't see any reason to allow an UBERON term in those cases.
