From 0ca2c94caa6e2d9b037823a30c23c53423f2d79a Mon Sep 17 00:00:00 2001
From: ben stear
Date: Sat, 25 Nov 2023 19:12:43 -0500
Subject: [PATCH] Update data_dict.md

---
 petagraph/data_dict.md | 34 ++++++++++++++++------------------
 1 file changed, 16 insertions(+), 18 deletions(-)

diff --git a/petagraph/data_dict.md b/petagraph/data_dict.md
index a5ac720..4f8bbf7 100644
--- a/petagraph/data_dict.md
+++ b/petagraph/data_dict.md
@@ -24,21 +24,18 @@ For clarity, all schema figures in this document follow this node color format:
 [GTEx, Expression data (GTEXEXP)](https://github.com/TaylorResearchLab/Petagraph/blob/main/petagraph/data_dict.md#gtex-expression-data-gtexexp )
 [GTEx, eQTL data (GTEXEQTL)](https://github.com/TaylorResearchLab/Petagraph/blob/main/petagraph/data_dict.md#gtex-eqtl-data-gtexeqtl)
-[GTEx Coexpression data (GTEXCOEXP)]()
-[GlyGen (GLYGEN)](https://github.com/TaylorResearchLab/Petagraph/blob/main/petagraph/data_dict.md#glygen-computational-and-informatics-resources-for-glycoscience-glygen)
+[GTEx, Coexpression data (GTEXCOEXP)]()
+[GlyGen selected datasets (GLYGEN)](https://github.com/TaylorResearchLab/Petagraph/blob/main/petagraph/data_dict.md#glygen-computational-and-informatics-resources-for-glycoscience-glygen)
 [Homo Sapiens Chromosomal Location Ontology (HSCLO)](https://github.com/TaylorResearchLab/Petagraph/blob/main/petagraph/data_dict.md#homo-sapiens-chromosomal-location-ontology-hsclo)
-[Human gene-phenotype mappings (HGNCHPO)](https://github.com/TaylorResearchLab/Petagraph/blob/main/petagraph/data_dict.md#human-gene-phenotype-hgnchpo)
-[Human-Mouse Orthologs (HGNCHCOP)](https://github.com/TaylorResearchLab/Petagraph/blob/main/petagraph/data_dict.md#human-mouse-orthologs-hgnchcop)
-[Mouse gene-phenotype (HCOPMP)](https://github.com/TaylorResearchLab/Petagraph/blob/main/petagraph/data_dict.md#mouse-gene-phenotype-hcopmp)
-[Human Phenotype Ontology to Mouse Phenotype mappings (HPOMP)](https://github.com/TaylorResearchLab/Petagraph/blob/main/petagraph/data_dict.md#human-phenotype-ontology-to-mouse-phenotype-mappings-hpomp)
-[Human-Rat ENSEMBL orthologs (RATHCOP)](https://github.com/TaylorResearchLab/Petagraph/blob/main/petagraph/data_dict.md#human-rat-ensembl-orthology-rathcop)
-
-[LINCS L1000 Gene-Perturbagen Associations (LINCS)](https://github.com/TaylorResearchLab/Petagraph/blob/main/petagraph/data_dict.md#lincs-l1000-gene-perturbagen-associations-lincs)
-
-[Molecular Signatures Database (MSIGDB)](https://github.com/TaylorResearchLab/Petagraph/blob/main/petagraph/data_dict.md#molecular-signatures-database-msigdb)
-
-[Protein - Protein Interactions (STRING)](https://github.com/TaylorResearchLab/Petagraph/blob/main/petagraph/data_dict.md#protein---protein-interactions-string)
-[Single Cell Fetal Heart expression data (ASP2019)](https://github.com/TaylorResearchLab/Petagraph/blob/main/petagraph/data_dict.md#single-cell-fetal-heart-expression-data-asp2019)
+[Human gene-to-phenotype mappings (HGNCHPO)](https://github.com/TaylorResearchLab/Petagraph/blob/main/petagraph/data_dict.md#human-gene-phenotype-hgnchpo)
+[Human-to-mouse ortholog mappings (HGNCHCOP)](https://github.com/TaylorResearchLab/Petagraph/blob/main/petagraph/data_dict.md#human-mouse-orthologs-hgnchcop)
+[Human-to-mouse phenotype mappings (HPOMP)](https://github.com/TaylorResearchLab/Petagraph/blob/main/petagraph/data_dict.md#human-phenotype-ontology-to-mouse-phenotype-mappings-hpomp)
+[Human-to-rat ENSEMBL mappings (RATHCOP)](https://github.com/TaylorResearchLab/Petagraph/blob/main/petagraph/data_dict.md#human-rat-ensembl-orthology-rathcop)
+[LINCS L1000 (LINCS)](https://github.com/TaylorResearchLab/Petagraph/blob/main/petagraph/data_dict.md#lincs-l1000-lincs)
+[Molecular Signatures Database (MSIGDB)]()
+[Mouse gene-to-phenotype (HCOPMP)](https://github.com/TaylorResearchLab/Petagraph/blob/main/petagraph/data_dict.md#mouse-gene-phenotype-hcopmp)
+[Single Cell Fetal Heart data (ASP2019)](https://github.com/TaylorResearchLab/Petagraph/blob/main/petagraph/data_dict.md#single-cell-fetal-heart-expression-data-asp2019)
+[STRING (STRING)](https://github.com/TaylorResearchLab/Petagraph/blob/main/petagraph/data_dict.md#string)
@@ -180,7 +177,8 @@ return * limit 1
 ```
 ---
-## GTEx Coexpression data (GTEXCOEXP)
+## GTEx, Coexpression data (GTEXCOEXP)
+
 **Source**: The source of this data is the `GTEx_Analysis_2017-06-05_v8_RNASeQCv1.1.9_gene_median_tpm.gct` from the GTEx Expression dataset above.
 **Preproccessing**: Co-expression of genes was computed using Pearson’s correlation. Gene pairs were included if the Pearson correlation coefficient was greater than 0.99. Computing co-expression pairs for all genes in all tissues resulted in many pairs even after filtering for pairs with a score above 0.99. To reduce the size of the data we included only gene co-expression pairs that are highly co-expressed in at least 5 tissues.
@@ -232,7 +230,7 @@ RETURN * LIMIT 1
 ```
 ---
-## Human gene-phenotype mappings (HGNCHPO)
+## Human gene-to-phenotype mappings (HGNCHPO)
 **Source**: We use the Human Phenotype (HPO) Ontology mappings for `genes_to_phenotype.txt` and `phenotype_to_genes.txt`. The HPO annotations can be found here: [https://hpo.jax.org/app/data/annotations](https://hpo.jax.org/app/data/annotations). These data are generated by the HPO group using OMIM disease-gene associations to map HPO phenotypes to genes. These data contain 4,545 genes mapped to at least one phenotype and 10,896 phenotypes mapped to at least one gene
@@ -249,7 +247,7 @@ return * limit 1
 ```
 ---
-## Human-Mouse Ortholog mappings (HGNCHCOP)
+## Human-to-mouse ortholog mappings (HGNCHCOP)
 **Source**: Mouse genes were downloaded from HGNC Comparisons of Orthology Predictions (HCOP) [https://www.genenames.org/tools/hcop/](https://www.genenames.org/tools/hcop/) (scroll to the bottom, under Bulk Downloads. Select Human - Mouse ortholog data) The human to mouse orthology mapping data were also obtained in April 2023 from the HGNC HCOP tool.
@@ -282,7 +280,7 @@ return * limit 1
 ```
 ---
-## Human-Rat ENSEMBL orthologs (RATHCOP)
+## Human-to-rat ENSEMBL mappings (RATHCOP)
 **Source**: The source of the human ENSEMBL to rat ENSEMBL orthologs is the HGNC Comparisons of Orthology Predictions tool. Go to https://www.genenames.org/tools/hcop/, scroll to the Bulk Downloads section at bottom of the page, select `Rat` in the first drop down menu and `15 columns` and download the data.
 **Preproccessing**: No preprocessing was needed on these mappings, we simply selected the `human_ensembl_gene` and `rat_ensembl_gene` columns from the dataset.