LinkML schemas for the BICAN project

This folder contains all the original LinkML schemas written in YAML format. You can learn more about the LinkML language here.

Some of the models are written directly in YAML, while others are automatically generated from Google Sheets using the LinkML tool schemasheets and the schema2model tool from the bkbit package.

Main models

The list below contains the main models, which are exported to different formats such as JSON Schema, Pydantic models, and JSON-LD context. There are also auxiliary models that factor out the core types used by the main models.

The Anatomical Structure schema is designed to represent types and relationships between anatomical brain structures.

Updates

The model has been created directly in the YAML format and all the updates can be done by editing the file directly.

The Assertion Evidence schema is designed to represent types and relationships between assertions and evidence items.

Updates

The model has been created from a Google Sheet; the Google Sheet ID and the IDs of the specific tabs are recorded in the settings file. The source_assertion_evidence/gsheet_output folder contains the CSV files generated from the Google Sheet at the time of the model creation.

In order to update the model, the Google sheet has to be edited, and the generate_yaml_model workflow has to be triggered manually.

The Genome Annotation schema is designed to represent types and relationships between entities that constitute an organism's annotated genome.

Updates

The model has been created directly in the YAML format, and all the updates can be done by editing the file directly.

The Library Generation schema is designed to represent types and relationships between samples and digital data assets generated during processes that generate multimodal genomic data.

Updates

The model has been created from a Google Sheet; the Google Sheet ID and the IDs of the specific tabs are recorded in the settings file. The source_library_generation/gsheet_output folder contains the CSV files generated from the Google Sheet at the time of the model creation.

In order to update the model, the Google sheet has to be edited, and the generate_yaml_model workflow has to be triggered manually.

Auxiliary models

These models factor out the core types used by the main models. The main models pull them in through their imports sections.

Contains the core types used in the Anatomical Structure Schema.

Updates

The model has been created directly in the YAML format, and all the updates can be done by editing the file directly.

The model contains a subset of classes from the Biolink Model with some modifications to fit the needs of the BICAN project (currently only the category slot is modified). The model is created using the LinkML Schema Trimmer from the bkbit package. The Biolink Model was trimmed to contain these classes: 'gene', 'genome', 'organism taxon', 'thing with taxon', 'material sample', 'procedure', 'entity', 'activity', 'named thing'; as well as respective dependency classes, slots, and enums to create BICAN Biolink.

Updates

The YAML file can be recreated by running the LinkML Schema Trimmer from the bkbit package:

$ bkbit linkml-trimmer --classes "gene, genome, organism taxon, thing with taxon, material sample, procedure, entity, activity, named thing" biolink.yaml > bican_biolink.yaml

To adjust the category slot, you can then run:

python ../utils/bican_biolink_edit.py bican_biolink.yaml
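
The two steps above can also be chained from a single script. The following is only a sketch, not part of the repository: the helper names (`trimmer_command`, `edit_command`, `regenerate`) are hypothetical, and it assumes that bkbit is installed, that `biolink.yaml` is in the working directory, and that the trimmed output should be written to `bican_biolink.yaml` (the file name the edit script is given above).

```python
import subprocess
import sys
from pathlib import Path

# Class names copied from the trimmer command in this README.
BIOLINK_CLASSES = [
    "gene", "genome", "organism taxon", "thing with taxon",
    "material sample", "procedure", "entity", "activity", "named thing",
]


def trimmer_command(classes, source="biolink.yaml"):
    """Build the argument list for `bkbit linkml-trimmer` (hypothetical helper)."""
    # The class names are passed as one comma-separated string, as in the README.
    return ["bkbit", "linkml-trimmer", "--classes", ", ".join(classes), source]


def edit_command(target, script="../utils/bican_biolink_edit.py"):
    """Build the argument list for the category-slot edit script (hypothetical helper)."""
    return [sys.executable, script, str(target)]


def regenerate(target=Path("bican_biolink.yaml")):
    """Run both steps in order; the target file name is an assumption."""
    # Step 1: trim the Biolink Model, writing the trimmer's stdout to the target file.
    with open(target, "w") as out:
        subprocess.run(trimmer_command(BIOLINK_CLASSES), stdout=out, check=True)
    # Step 2: adjust the category slot in the trimmed file.
    subprocess.run(edit_command(target), check=True)
```

Calling `regenerate()` from this folder would run the trimmer and then the edit script. With `check=True`, a failure in the first step raises an exception before the second step runs.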

The BICAN Core schema is designed to represent classes, slots, and enums that are frequently used in BICAN schemas.

Updates

The model has been created directly in the YAML format, and all the updates can be done by editing the file directly.

The BICAN Prov schema contains a subset of classes from the Prov Data Model (PROV-DM) that are frequently used in BICAN schemas.

Updates

The model has been created directly in the YAML format, and all the updates can be done by editing the file directly.

Deprecated models

These are models that are no longer used, but are kept for reference.

A deprecated model: an initial attempt to convert a CCN2 model to LinkML.

A deprecated model: an initial attempt to provide a schema for the data presented in Figure 1 of Yao, Z. et al., Nature 624 (2023).