Commit

Merge pull request #97 from joerivandervelde/main
Prepare for v1.1
joerivandervelde authored Jul 20, 2021
2 parents 2af4806 + df0998d commit 1c995b2
Showing 24 changed files with 986 additions and 966 deletions.
Binary file modified derived/pdf/fair-genomes.pdf
Binary file not shown.
6 changes: 3 additions & 3 deletions fair-genomes.yml
@@ -2,8 +2,8 @@
 name: FAIR Genomes metadata schema
 description: The FAIR Genomes semantic metadata schema to power reuse of NGS data in research and healthcare.
 version: 1.1
-releaseType: SNAPSHOT
-date: 2021-07-15
+releaseType: Minor
+date: 2021-07-20
 lookupGlobalOptions: lookups/NullFlavors.txt
 authors:
 - name: K. Joeri van der Velde
@@ -355,7 +355,7 @@ modules:
 - name: Biospecimen type
 description: The type of material taken from a biological entity for testing, diagnostic, propagation, treatment or research purposes.
 ontology: NCIT:C70713 [http://purl.obolibrary.org/obo/NCIT_C70713]
-values: LookupOne [lookups/MaterialTypes.txt], ofType [http://purl.obolibrary.org/obo/NCIT_C70699]
+values: LookupOne [lookups/BiospecimenTypes.txt], ofType [http://purl.obolibrary.org/obo/NCIT_C70699]
 - name: Anatomical source
 description: Biological entity that constitutes the structural organization of an individual member of a biological species from which this material was taken.
 ontology: NCIT:C103264 [http://purl.obolibrary.org/obo/NCIT_C103264]
1,890 changes: 947 additions & 943 deletions generated/art-decor/fair-genomes_en-US.xml

Large diffs are not rendered by default.

10 changes: 5 additions & 5 deletions generated/latex/fair-genomes.tex
@@ -6,7 +6,7 @@
\textbf{FAIR Genomes semantic metadata schema}
\newline

-The FAIR Genomes semantic metadata schema to power reuse of NGS data in research and healthcare. Version 1.1-SNAPSHOT, 2021-07-15. This model consists of 9 modules that contain 110 metadata elements and 85305 lookups in total (excluding null flavors).
+The FAIR Genomes semantic metadata schema to power reuse of NGS data in research and healthcare. Version 1.1-Minor, 2021-07-20. This model consists of 9 modules that contain 110 metadata elements and 85307 lookups in total (excluding null flavors).

\begin{table}[htb]
\begin{tabular}{lll}
@@ -23,7 +23,7 @@
Analysis & EDAM:operation\_2945 & 11 \\
\hline
\end{tabular}
-\caption[Module overview]{\label{table:table1} FAIR Genomes v1.1-SNAPSHOT overview of all modules.}
+\caption[Module overview]{\label{table:table1} FAIR Genomes v1.1-Minor overview of all modules.}
\end{table}

\begin{table}[htb]
@@ -145,7 +145,7 @@
Sampling protocol & EFO:0005518 & Text \\
Sampling protocol deviation & NCIT:C50996 & String \\
Reason for sampling protocol deviation & NCIT:C93529 & String \\
-Biospecimen type & NCIT:C70713 & MaterialTypes lookup (403 choices) \\
+Biospecimen type & NCIT:C70713 & BiospecimenTypes lookup (403 choices) \\
Anatomical source & NCIT:C103264 & AnatomicalSources lookup (13827 choices) \\
Pathological state & GO:0001894 & PathologicalState lookup (4 choices) \\
Storage conditions & NCIT:C96145 & StorageConditions lookup (26 choices) \\
@@ -165,9 +165,9 @@
Sampleprep identifier & NCIT:C132299 & UniqueID \\
Belongs to material & NCIT:C25683 & Reference to Material \\
Input amount & AFRL:0000010 & Integer \\
-Library preparation kit & GENEPIO:0000085 & NGSKits lookup (615 choices) \\
+Library preparation kit & GENEPIO:0000085 & NGSKits lookup (616 choices) \\
PCR free & NCIT:C17003 & Boolean \\
-Target enrichment kit & NCIT:C154307 & NGSKits lookup (615 choices) \\
+Target enrichment kit & NCIT:C154307 & NGSKits lookup (616 choices) \\
UMIs present & EFO:0010199 & Boolean \\
Intended insert size & FG:0000001 & Integer \\
Intended read length & NCIT:C153362 & Integer \\
8 changes: 4 additions & 4 deletions generated/markdown/fairgenomes-semantic-model.md
@@ -1,6 +1,6 @@
# FAIR Genomes semantic metadata schema

-The FAIR Genomes semantic metadata schema to power reuse of NGS data in research and healthcare. Version 1.1-SNAPSHOT, 2021-07-15. This model consists of __9 modules__ that contain __110 metadata elements__ and __85305 lookups__ in total (excluding null flavors).
+The FAIR Genomes semantic metadata schema to power reuse of NGS data in research and healthcare. Version 1.1-Minor, 2021-07-20. This model consists of __9 modules__ that contain __110 metadata elements__ and __85307 lookups__ in total (excluding null flavors).

## Module overview

@@ -121,7 +121,7 @@ A natural substance derived from living organisms such as cells, tissues, protei
| Sampling protocol | The procedure whereby this material was sampled for an analysis. | [EFO:0005518](http://www.ebi.ac.uk/efo/EFO_0005518) | Text |
| Sampling protocol deviation | A variation from processes or procedures defined in the sampling protocol. Deviations usually do not preclude the overall evaluability of subject data for either efficacy or safety, and are often acknowledged and accepted in advance by the sponsor. | [NCIT:C50996](http://purl.obolibrary.org/obo/NCIT_C50996) | String |
| Reason for sampling protocol deviation | The rationale for why a deviation from the sampling protocol has occurred. | [NCIT:C93529](http://purl.obolibrary.org/obo/NCIT_C93529) | String |
-| Biospecimen type | The type of material taken from a biological entity for testing, diagnostic, propagation, treatment or research purposes. | [NCIT:C70713](http://purl.obolibrary.org/obo/NCIT_C70713) | [MaterialTypes](../../lookups/MaterialTypes.txt) lookup (403 choices [of type](http://purl.obolibrary.org/obo/NCIT_C70699)) |
+| Biospecimen type | The type of material taken from a biological entity for testing, diagnostic, propagation, treatment or research purposes. | [NCIT:C70713](http://purl.obolibrary.org/obo/NCIT_C70713) | [BiospecimenTypes](../../lookups/BiospecimenTypes.txt) lookup (403 choices [of type](http://purl.obolibrary.org/obo/NCIT_C70699)) |
| Anatomical source | Biological entity that constitutes the structural organization of an individual member of a biological species from which this material was taken. | [NCIT:C103264](http://purl.obolibrary.org/obo/NCIT_C103264) | [AnatomicalSources](../../lookups/AnatomicalSources.txt) lookup (13827 choices [of type](http://purl.obolibrary.org/obo/UBERON_0001062)) |
| Pathological state | The pathological state of the tissue from which this material was derived. | [GO:0001894](http://purl.obolibrary.org/obo/GO_0001894) | [PathologicalState](../../lookups/PathologicalState.txt) lookup (4 choices [of type](http://purl.obolibrary.org/obo/NCIT_C164617)) |
| Storage conditions | The conditions under which this biological material was stored. | [NCIT:C96145](http://purl.obolibrary.org/obo/NCIT_C96145) | [StorageConditions](../../lookups/StorageConditions.txt) lookup (26 choices [of type](http://purl.obolibrary.org/obo/NCIT_C96145)) |
@@ -138,9 +138,9 @@ A sample preparation for a nucleic acids sequencing assay. Ontology: [OBI:000190
| Sampleprep identifier | A unique proper name or character sequence that identifies this particular sample preparation. | [NCIT:C132299](http://purl.obolibrary.org/obo/NCIT_C132299) | UniqueID |
| Belongs to material | Reference to the source material from which this sample was prepared. | [NCIT:C25683](http://purl.obolibrary.org/obo/NCIT_C25683) | Reference to instances of Material |
| Input amount | Amount of input material in nanogram (ng). | [AFRL:0000010](http://purl.allotrope.org/ontologies/role#AFRL_0000010) | Integer |
-| Library preparation kit | Pre-filled, ready-to-use reagent cartridges intented to improve chemistry, cluster density and read length as well as improve quality (Q) scores for this sample. Reagent components are encoded to interact with the sequencing system to validate compatibility with user-defined applications. | [GENEPIO:0000085](http://purl.obolibrary.org/obo/GENEPIO_0000085) | [NGSKits](../../lookups/NGSKits.txt) lookup (615 choices [of type](http://purl.obolibrary.org/obo/GENEPIO_0000081)) |
+| Library preparation kit | Pre-filled, ready-to-use reagent cartridges intented to improve chemistry, cluster density and read length as well as improve quality (Q) scores for this sample. Reagent components are encoded to interact with the sequencing system to validate compatibility with user-defined applications. | [GENEPIO:0000085](http://purl.obolibrary.org/obo/GENEPIO_0000085) | [NGSKits](../../lookups/NGSKits.txt) lookup (616 choices [of type](http://purl.obolibrary.org/obo/GENEPIO_0000081)) |
| PCR free | Indicates whether a polymerase chain reaction (PCR) was used to prepare this sample. PCR is a method for amplifying a DNA base sequence using multiple rounds of heat denaturation of the DNA and annealing of oligonucleotide primers complementary to flanking regions in the presence of a heat-stable polymerase. | [NCIT:C17003](http://purl.obolibrary.org/obo/NCIT_C17003) | Boolean |
-| Target enrichment kit | Indicates which target enrichment kit was used to prepare this sample. Target enrichment is a pre-sequencing DNA preparation step where DNA sequences are either directly amplified (amplicon or multiplex PCR-based) or captured (hybrid capture-based) in order to only focus on specific regions of a genome or DNA sample. | [NCIT:C154307](http://purl.obolibrary.org/obo/NCIT_C154307) | [NGSKits](../../lookups/NGSKits.txt) lookup (615 choices [of type](http://purl.obolibrary.org/obo/GENEPIO_0000081)) |
+| Target enrichment kit | Indicates which target enrichment kit was used to prepare this sample. Target enrichment is a pre-sequencing DNA preparation step where DNA sequences are either directly amplified (amplicon or multiplex PCR-based) or captured (hybrid capture-based) in order to only focus on specific regions of a genome or DNA sample. | [NCIT:C154307](http://purl.obolibrary.org/obo/NCIT_C154307) | [NGSKits](../../lookups/NGSKits.txt) lookup (616 choices [of type](http://purl.obolibrary.org/obo/GENEPIO_0000081)) |
| UMIs present | Indicates whether any unique molecular identifiers (UMIs) are present. An UMI barcode is a short nucleotide sequence that is used to identify reads originating from an individual mRNA molecule. | [EFO:0010199](http://www.ebi.ac.uk/efo/EFO_0010199) | Boolean |
| Intended insert size | In paired-end sequencing, the DNA between the adapter sequences is the insert. The length of this sequence is known as the insert size, not to be confused with the inner distance between reads. So, fragment length equals read adapter length (2x) plus insert size, and insert size equals read lenght (2x) plus inner distance. | [FG:0000001](https://w3id.org/fair-genomes/resource/FG_0000001) | Integer |
| Intended read length | The number of nucleotides intended to be ordered from each side of a nucleic acid fragment obtained after the completion of a sequencing process. | [NCIT:C153362](http://purl.obolibrary.org/obo/NCIT_C153362) | Integer |
@@ -614,6 +614,7 @@ ClearSeq Inherited Disease Plus, 16, XT2 by Agilent Technologies ClearSeq Inheri
ClearSeq Inherited Disease Plus, 96, XT by Agilent Technologies ClearSeq Inherited Disease Plus, 96, XT by Agilent Technologies sourced from https://www.biocompare.com, accessed May 2020. FG 0000737 https://w3id.org/fair-genomes/resource/FG_0000737
ClearSeq Inherited Disease Plus, 96, XT2 by Agilent Technologies ClearSeq Inherited Disease Plus, 96, XT2 by Agilent Technologies sourced from https://www.biocompare.com, accessed May 2020. FG 0000738 https://w3id.org/fair-genomes/resource/FG_0000738
ClearSeq Inherited Disease, 16, XT by Agilent Technologies ClearSeq Inherited Disease, 16, XT by Agilent Technologies sourced from https://www.biocompare.com, accessed May 2020. FG 0000739 https://w3id.org/fair-genomes/resource/FG_0000739
+Illumina TruSeq DNA PCR-Free Illumina TruSeq DNA PCR-Free provides simple, all-inclusive library preparation for whole-genome sequencing applications. Illumina truseq-dna-pcr-free https://www.illumina.com/products/by-type/sequencing-kits/library-prep-kits/truseq-dna-pcr-free.html
NoInformation (NI, nullflavor) The value is exceptional (missing, omitted, incomplete, improper). No information as to the reason for being an exceptional value is provided. This is the most general exceptional value. It is also the default exceptional value. HL7 NI http://terminology.hl7.org/CodeSystem/v3-NullFlavor#NI
Invalid (INV, nullflavor) The value as represented in the instance is not a member of the set of permitted data values in the constrained value domain of a variable. HL7 INV http://terminology.hl7.org/CodeSystem/v3-NullFlavor#INV
Derived (DER, nullflavor) An actual value may exist, but it must be derived from the provided information (usually an EXPR generic data type extension will be used to convey the derivation expression . HL7 DER http://terminology.hl7.org/CodeSystem/v3-NullFlavor#DER
@@ -614,6 +614,7 @@ ClearSeq Inherited Disease Plus, 16, XT2 by Agilent Technologies ClearSeq Inheri
ClearSeq Inherited Disease Plus, 96, XT by Agilent Technologies ClearSeq Inherited Disease Plus, 96, XT by Agilent Technologies sourced from https://www.biocompare.com, accessed May 2020. FG 0000737 https://w3id.org/fair-genomes/resource/FG_0000737
ClearSeq Inherited Disease Plus, 96, XT2 by Agilent Technologies ClearSeq Inherited Disease Plus, 96, XT2 by Agilent Technologies sourced from https://www.biocompare.com, accessed May 2020. FG 0000738 https://w3id.org/fair-genomes/resource/FG_0000738
ClearSeq Inherited Disease, 16, XT by Agilent Technologies ClearSeq Inherited Disease, 16, XT by Agilent Technologies sourced from https://www.biocompare.com, accessed May 2020. FG 0000739 https://w3id.org/fair-genomes/resource/FG_0000739
+Illumina TruSeq DNA PCR-Free Illumina TruSeq DNA PCR-Free provides simple, all-inclusive library preparation for whole-genome sequencing applications. Illumina truseq-dna-pcr-free https://www.illumina.com/products/by-type/sequencing-kits/library-prep-kits/truseq-dna-pcr-free.html
NoInformation (NI, nullflavor) The value is exceptional (missing, omitted, incomplete, improper). No information as to the reason for being an exceptional value is provided. This is the most general exceptional value. It is also the default exceptional value. HL7 NI http://terminology.hl7.org/CodeSystem/v3-NullFlavor#NI
Invalid (INV, nullflavor) The value as represented in the instance is not a member of the set of permitted data values in the constrained value domain of a variable. HL7 INV http://terminology.hl7.org/CodeSystem/v3-NullFlavor#INV
Derived (DER, nullflavor) An actual value may exist, but it must be derived from the provided information (usually an EXPR generic data type extension will be used to convey the derivation expression . HL7 DER http://terminology.hl7.org/CodeSystem/v3-NullFlavor#DER
2 changes: 1 addition & 1 deletion generated/molgenis-emx/sys_md_Package.tsv
@@ -1,2 +1,2 @@
id label description
-fair-genomes FAIR Genomes metadata schema The FAIR Genomes semantic metadata schema to power reuse of NGS data in research and healthcare. Version 1.1-SNAPSHOT (2021-07-15)
+fair-genomes FAIR Genomes metadata schema The FAIR Genomes semantic metadata schema to power reuse of NGS data in research and healthcare. Version 1.1-Minor (2021-07-20)
1 change: 1 addition & 0 deletions generated/molgenis-emx2/librarypreparationkit.csv
@@ -614,6 +614,7 @@ value,description,codesystem,code,iri
"ClearSeq Inherited Disease Plus, 96, XT by Agilent Technologies","ClearSeq Inherited Disease Plus, 96, XT by Agilent Technologies sourced from https://www.biocompare.com, accessed May 2020.","FG","0000737","https://w3id.org/fair-genomes/resource/FG_0000737"
"ClearSeq Inherited Disease Plus, 96, XT2 by Agilent Technologies","ClearSeq Inherited Disease Plus, 96, XT2 by Agilent Technologies sourced from https://www.biocompare.com, accessed May 2020.","FG","0000738","https://w3id.org/fair-genomes/resource/FG_0000738"
"ClearSeq Inherited Disease, 16, XT by Agilent Technologies","ClearSeq Inherited Disease, 16, XT by Agilent Technologies sourced from https://www.biocompare.com, accessed May 2020.","FG","0000739","https://w3id.org/fair-genomes/resource/FG_0000739"
+"Illumina TruSeq DNA PCR-Free","Illumina TruSeq DNA PCR-Free provides simple, all-inclusive library preparation for whole-genome sequencing applications.","Illumina","truseq-dna-pcr-free","https://www.illumina.com/products/by-type/sequencing-kits/library-prep-kits/truseq-dna-pcr-free.html"
"NoInformation (NI, nullflavor)","The value is exceptional (missing, omitted, incomplete, improper). No information as to the reason for being an exceptional value is provided. This is the most general exceptional value. It is also the default exceptional value.","HL7","NI","http://terminology.hl7.org/CodeSystem/v3-NullFlavor#NI"
"Invalid (INV, nullflavor)","The value as represented in the instance is not a member of the set of permitted data values in the constrained value domain of a variable.","HL7","INV","http://terminology.hl7.org/CodeSystem/v3-NullFlavor#INV"
"Derived (DER, nullflavor)","An actual value may exist, but it must be derived from the provided information (usually an EXPR generic data type extension will be used to convey the derivation expression .","HL7","DER","http://terminology.hl7.org/CodeSystem/v3-NullFlavor#DER"
1 change: 1 addition & 0 deletions generated/molgenis-emx2/targetenrichmentkit.csv
@@ -614,6 +614,7 @@ value,description,codesystem,code,iri
"ClearSeq Inherited Disease Plus, 96, XT by Agilent Technologies","ClearSeq Inherited Disease Plus, 96, XT by Agilent Technologies sourced from https://www.biocompare.com, accessed May 2020.","FG","0000737","https://w3id.org/fair-genomes/resource/FG_0000737"
"ClearSeq Inherited Disease Plus, 96, XT2 by Agilent Technologies","ClearSeq Inherited Disease Plus, 96, XT2 by Agilent Technologies sourced from https://www.biocompare.com, accessed May 2020.","FG","0000738","https://w3id.org/fair-genomes/resource/FG_0000738"
"ClearSeq Inherited Disease, 16, XT by Agilent Technologies","ClearSeq Inherited Disease, 16, XT by Agilent Technologies sourced from https://www.biocompare.com, accessed May 2020.","FG","0000739","https://w3id.org/fair-genomes/resource/FG_0000739"
+"Illumina TruSeq DNA PCR-Free","Illumina TruSeq DNA PCR-Free provides simple, all-inclusive library preparation for whole-genome sequencing applications.","Illumina","truseq-dna-pcr-free","https://www.illumina.com/products/by-type/sequencing-kits/library-prep-kits/truseq-dna-pcr-free.html"
"NoInformation (NI, nullflavor)","The value is exceptional (missing, omitted, incomplete, improper). No information as to the reason for being an exceptional value is provided. This is the most general exceptional value. It is also the default exceptional value.","HL7","NI","http://terminology.hl7.org/CodeSystem/v3-NullFlavor#NI"
"Invalid (INV, nullflavor)","The value as represented in the instance is not a member of the set of permitted data values in the constrained value domain of a variable.","HL7","INV","http://terminology.hl7.org/CodeSystem/v3-NullFlavor#INV"
"Derived (DER, nullflavor)","An actual value may exist, but it must be derived from the provided information (usually an EXPR generic data type extension will be used to convey the derivation expression .","HL7","DER","http://terminology.hl7.org/CodeSystem/v3-NullFlavor#DER"
10 changes: 10 additions & 0 deletions generated/ontology/fair-genomes-ngskits.ttl
@@ -3687,6 +3687,11 @@ fg:Sample_preparation_Library_preparation_kit_HaloPlex_HS_Prepack_Reagents_ILMN_
dc:description "ClearSeq Inherited Disease, 16, XT by Agilent Technologies sourced from https://www.biocompare.com, accessed May 2020.";
rdfs:isDefinedBy <https://w3id.org/fair-genomes/resource/FG_0000739> .

+fg:Sample_preparation_Library_preparation_kit_Illumina_TruSeq_DNA_PCR-Free a obo:GENEPIO_0000081;
+rdfs:label "Illumina TruSeq DNA PCR-Free";
+dc:description "Illumina TruSeq DNA PCR-Free provides simple, all-inclusive library preparation for whole-genome sequencing applications.";
+rdfs:isDefinedBy <https://www.illumina.com/products/by-type/sequencing-kits/library-prep-kits/truseq-dna-pcr-free.html> .

<https://w3id.org/fair-genomes/ontology/Sample_preparation_Library_preparation_kit_NoInformation_(NI,_nullflavor)>
a obo:GENEPIO_0000081;
rdfs:label "NoInformation (NI, nullflavor)";
@@ -7460,6 +7465,11 @@ fg:Sample_preparation_Target_enrichment_kit_HaloPlex_HS_Prepack_Reagents_ILMN_16
dc:description "ClearSeq Inherited Disease, 16, XT by Agilent Technologies sourced from https://www.biocompare.com, accessed May 2020.";
rdfs:isDefinedBy <https://w3id.org/fair-genomes/resource/FG_0000739> .

+fg:Sample_preparation_Target_enrichment_kit_Illumina_TruSeq_DNA_PCR-Free a obo:GENEPIO_0000081;
+rdfs:label "Illumina TruSeq DNA PCR-Free";
+dc:description "Illumina TruSeq DNA PCR-Free provides simple, all-inclusive library preparation for whole-genome sequencing applications.";
+rdfs:isDefinedBy <https://www.illumina.com/products/by-type/sequencing-kits/library-prep-kits/truseq-dna-pcr-free.html> .

<https://w3id.org/fair-genomes/ontology/Sample_preparation_Target_enrichment_kit_NoInformation_(NI,_nullflavor)>
a obo:GENEPIO_0000081;
rdfs:label "NoInformation (NI, nullflavor)";