Release 0.4.1 (#1)
Release 0.4.1 squashed commit. Refer to release notes and pull request.
brendanreardon authored Mar 1, 2021
1 parent 714649f commit d0140b3
Showing 53 changed files with 434,497 additions and 445,495 deletions.
6 changes: 4 additions & 2 deletions Dockerfile
@@ -7,13 +7,14 @@ RUN pip install -r requirements.txt

COPY example_data/ /example_data/
COPY example_output/ /example_output/
COPY test/ /test/

RUN mkdir /moalmanac/
RUN mkdir /moalmanac/datasources/
RUN mkdir /docs/

COPY moalmanac/test/ moalmanac/test/
COPY moalmanac/datasources/acmg/ /moalmanac/datasources/acmg/
COPY moalmanac/datasources/almanac/ /moalmanac/datasources/almanac/
COPY moalmanac/datasources/moalmanac/ /moalmanac/datasources/moalmanac/
COPY moalmanac/datasources/cancergenecensus/ /moalmanac/datasources/cancergenecensus/
COPY moalmanac/datasources/cancerhotspots/ /moalmanac/datasources/cancerhotspots/
COPY moalmanac/datasources/clinvar/ /moalmanac/datasources/clinvar/
@@ -38,6 +39,7 @@ COPY moalmanac/templates/ /moalmanac/templates/
COPY moalmanac/wrapper_deconstructsigs.sh moalmanac/run_deconstructsigs.R /moalmanac/
COPY moalmanac/*.py moalmanac/*.ini /moalmanac/

COPY docs/* /docs/
COPY README.md /
COPY LICENSE /
COPY Dockerfile /
1 change: 1 addition & 0 deletions README.md
@@ -59,6 +59,7 @@ Optional arguments:
--snv_handle <string> handle for MAF file of somatic single nucleotide variants
--indel_handle <string> handle for MAF file of somatic insertions and deletions
--bases_covered_handle <string> handle for text file which contains the number of callable somatic bases
--called_cn_handle <string> handle for text file which contains genes and copy number calls, will be used over `--cnv_handle`
--cnv_handle <string> handle for annotated seg file for somatic copy number
--fusion_handle <string> handle for STAR fusion output, .final.abridged
--germline_handle <string> handle for MAF file of germline single nucleotide variants and insertions and deletions
24 changes: 21 additions & 3 deletions docs/description-of-inputs.md
@@ -101,6 +101,24 @@ Required fields can be changed from their default expectations by editing the ap
### Required fields
This input is looking for an integer value.

## Called copy number alterations
`--called_cn_handle` anticipates a tab delimited file which contains one column for gene name and a second for the copy number call. For the latter, only the values `Amplification` and `Deletion` will be used by the Molecular Oncology Almanac.

### Example
|gene|call|
|-|-|
|TP53|Deletion|
|CDKN2A|Deletion|
|BRAF|Baseline|
|EGFR|Amplification|

The rows associated with _TP53_, _CDKN2A_, and _EGFR_ will be interpreted and scored by Molecular Oncology Almanac while _BRAF_ will be filtered.

### Required fields
Required fields can be changed from their default expectations by editing the appropriate section of [colnames.ini](https://github.com/vanallenlab/moalmanac/blob/main/moalmanac/colnames.ini). Column names are case sensitive.
- `gene`, gene symbol associated with the copy number alteration
- `call`, copy number event of the gene. `Amplification` and `Deletion` are accepted and all other values will be filtered.
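The filtering rule described above can be sketched in a few lines. This is only an illustration under stated assumptions, not the Molecular Oncology Almanac's own implementation: the function name `filter_called_cn` and the constant `ACCEPTED_CALLS` are hypothetical, and the default column names simply mirror the `gene` and `call` fields listed above.

```python
import csv
import io

# Only these two values are used; every other call is filtered out.
ACCEPTED_CALLS = {"Amplification", "Deletion"}


def filter_called_cn(handle, gene_column="gene", call_column="call"):
    """Return (gene, call) pairs whose call is an accepted copy number event.

    Hypothetical helper for illustration; column names are case sensitive.
    """
    reader = csv.DictReader(handle, delimiter="\t")
    return [
        (row[gene_column], row[call_column])
        for row in reader
        if row[call_column] in ACCEPTED_CALLS
    ]


# The example table from this section, written as tab delimited text.
example = (
    "gene\tcall\n"
    "TP53\tDeletion\n"
    "CDKN2A\tDeletion\n"
    "BRAF\tBaseline\n"
    "EGFR\tAmplification\n"
)
kept = filter_called_cn(io.StringIO(example))
```

Here `kept` holds the rows for _TP53_, _CDKN2A_, and _EGFR_, while the `Baseline` row for _BRAF_ is dropped.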

## Copy number alterations
`--cnv_handle` anticipates a tab delimited file which contains total copy number from a source such as GATK CNV or ReCapSeg; support for allele specific copy number is in progress. This file should have genes associated with segments. Amplifications are called from the top 2.5% of all unique segments and deletions are called from the bottom 2.5% of all unique segments.
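As a rough sketch of the percentile rule described above, the snippet below labels unique segment values that fall in the bottom or top 2.5%. The function name `call_segments`, the use of `statistics.quantiles` with the `inclusive` method, and the treatment of values exactly at a threshold are all assumptions made for illustration; the tool's actual percentile calculation may differ.

```python
import statistics


def call_segments(segment_means):
    """Map each unique segment value to a call or None (hypothetical helper).

    Thresholds are the 2.5th and 97.5th percentiles of the unique values.
    """
    unique = sorted(set(segment_means))
    # n=40 yields cut points every 2.5%; the first and last are the ones needed.
    cuts = statistics.quantiles(unique, n=40, method="inclusive")
    low, high = cuts[0], cuts[-1]
    calls = {}
    for value in unique:
        if value <= low:
            calls[value] = "Deletion"
        elif value >= high:
            calls[value] = "Amplification"
        else:
            calls[value] = None  # neither tail, so no call is made
    return calls


# Toy input: 101 evenly spaced values standing in for segment means.
calls = call_segments([float(x) for x in range(101)])
```

Duplicated segment values are collapsed before the thresholds are computed, which follows the "unique segments" wording above.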

@@ -124,16 +142,16 @@ Required fields can be changed from their default expectations by editing the ap
`--fusion_handle` anticipates a tab delimited file which contains fusions, specifically in the format of STAR Fusion.

### Example
|#fusion_name|SpanningFrags|LeftBreakpoint|RightBreakpoint|
|#FusionName|SpanningFragCount|LeftBreakpoint|RightBreakpoint|
|-|-|-|-|
|EML4--ALK|0|6:47471176|11:66563752|
|COL1A2--APBA3|6|9:35657873|21:46320255|
|POLR2A--AP2M1|12|17:7406801|3:183898675|

### Required fields
Required fields can be changed from their default expectations by editing the appropriate section of [colnames.ini](https://github.com/vanallenlab/moalmanac/blob/main/moalmanac/colnames.ini). Column names are case sensitive.
- `#fusion_name`, gene symbols associated with the fusion separated by `--`. Genes are labeled from 5' to 3'.
- `SpanningFrags`, counts of RNA-seq fragments supporting the fusion
- `#FusionName`, gene symbols associated with the fusion separated by `--`. Genes are labeled from 5' to 3'.
- `SpanningFragCount`, counts of RNA-seq fragments supporting the fusion
- `LeftBreakpoint`, genomic position of the fusion's left breakpoint
- `RightBreakpoint`, genomic position of the fusion's right breakpoint
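The field conventions above can be illustrated with a small parsing sketch. The helpers `split_fusion_name` and `parse_breakpoint` are hypothetical names introduced here, and the `chromosome:position` breakpoint layout is inferred only from the example table in this section.

```python
def split_fusion_name(fusion_name):
    """Split a fusion name such as 'EML4--ALK' into (5' gene, 3' gene)."""
    five_prime, three_prime = fusion_name.split("--")
    return five_prime, three_prime


def parse_breakpoint(breakpoint):
    """Split a breakpoint such as '6:47471176' into (chromosome, position)."""
    chromosome, position = breakpoint.split(":")
    return chromosome, int(position)


# First row of the example table above.
genes = split_fusion_name("EML4--ALK")
left = parse_breakpoint("6:47471176")
```

Keeping the chromosome as a string avoids special handling for non-numeric chromosomes such as X or Y.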

16 changes: 7 additions & 9 deletions docs/description-of-outputs.md
@@ -169,24 +169,22 @@ Each molecular feature will also receive a label in the `score_bin` column based
* Somatic and germline variants - Gene, variant classification, and protein change match a catalogued variant
* Copy number alterations - Gene and copy number direction match a catalogued event
* Fusions - Both genes involved in a fusion event match a catalogued event
* `Investigate Actionability - High`
* Somatic and germline variants - Gene and variant classification match a catalogued event but not a specific protein change
* `Investigate Actionability - Low`
* Somatic and germline variants - Gene and feature type match a catalogued variant but not variant classification
* `Investigate Actionability`
* Somatic and germline variants - Gene and feature type match a catalogued variant but not variant classification or a specific protein change
* Copy number alterations - Gene and feature type match a catalogued copy number alteration but not direction
* Fusions - One gene fusion partner is catalogued as a fusion in Molecular Oncology Almanac but not both
* `Biologically Relevant`
* The gene(s) associated with the molecular feature is present in Molecular Oncology Almanac but under a different feature type
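The matching hierarchy for copy number alterations can be sketched as a small decision function. This is a minimal illustration, not the project's code: the function `score_copy_number`, the `"Copy Number"` and `"Somatic Variant"` feature type keys, and the nested dictionary used as the catalogue are all hypothetical.

```python
def score_copy_number(gene, direction, catalogue):
    """Return a score_bin label for a copy number alteration, or None.

    `catalogue` is a hypothetical mapping of gene -> feature type -> set of
    catalogued values (here, copy number directions).
    """
    entry = catalogue.get(gene)
    if entry is None:
        return None  # gene is not catalogued at all
    directions = entry.get("Copy Number")
    if directions is None:
        # Gene is catalogued, but only under a different feature type.
        return "Biologically Relevant"
    if direction in directions:
        # Gene and copy number direction both match.
        return "Putatively Actionable"
    # Gene and feature type match, but the direction does not.
    return "Investigate Actionability"


# Toy catalogue, invented purely for this example.
toy_catalogue = {
    "CDKN2A": {"Copy Number": {"Deletion"}},
    "BRAF": {"Somatic Variant": {"p.V600E"}},
}
label = score_copy_number("CDKN2A", "Deletion", toy_catalogue)
```

The checks run from most to least specific, so a feature always receives the strongest label it qualifies for.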

The following second-order molecular features are evaluated in `score_bin` as follows:
* High mutational burden is labeled as `Investigate Actionability - High`
* MSI-High is labeled as `Investigate Actionability - High`
* Whole-genome doubling is labeled as `Investigate Actionability - High`
* Mutational signatures catalogued by Molecular Oncology Almanac are labeled as `Investigate Actionability - High` and otherwise labeled as `Biologically Relevant`
* High mutational burden is labeled as `Investigate Actionability`
* MSI-High is labeled as `Investigate Actionability`
* Whole-genome doubling is labeled as `Investigate Actionability`
* Mutational signatures catalogued by Molecular Oncology Almanac are labeled as `Investigate Actionability` and otherwise labeled as `Biologically Relevant`
* Variants associated with microsatellite instability are listed as "Supporting variants" as `Biologically Relevant`

### Evidence of clinical assertions
If a molecular feature matched as `Putatively Actionable`, `Investigate Actionability - High`, or `Investigate Actionability - Low` in Molecular Oncology Almanac, the molecular feature will be associated with clinical evidence. A molecular feature will be matched independently on catalogued events associated with therapeutic sensitivity, therapeutic resistance, and disease prognosis.
If a molecular feature matched as `Putatively Actionable` or `Investigate Actionability` in Molecular Oncology Almanac, the molecular feature will be associated with clinical evidence. A molecular feature will be matched independently on catalogued events associated with therapeutic sensitivity, therapeutic resistance, and disease prognosis.

#### Associated evidence, Predictive Implication
All catalogued events in Molecular Oncology Almanac are cited and have associated evidence. Evidence tiers are as follows:
22 changes: 11 additions & 11 deletions example_data/example_patient.capture.germline.maf
@@ -1,11 +1,11 @@
Hugo_Symbol NCBI_Build Chromosome Start_position End_position Variant_Classification Reference_Allele Tumor_Seq_Allele1 Tumor_Seq_Allele2 Tumor_Sample_Barcode Matched_Norm_Sample_Barcode Annotation_Transcript Protein_Change t_alt_count t_ref_count
PRDM2 37 1 14105122 14105122 In_Frame_Del GAA GAA - __UNKNOWN__ example_normal_profile p.E282del 50 50
ALK 37 2 29416572 29416572 Missense_Mutation T T C __UNKNOWN__ example_normal_profile p.I1461V 50 50
BIRC6 37 2 32667182 32667182 Missense_Mutation G G C __UNKNOWN__ example_normal_profile p.V1332L 50 50
MSH6 37 2 48010488 48010488 Missense_Mutation G G A __UNKNOWN__ example_normal_profile p.G39E 50 50
FGFR4 37 5 176520243 176520243 Missense_Mutation G G A __UNKNOWN__ example_normal_profile p.G388R 50 50
BRAF 37 7 140476881 140476881 Nonsense_Mutation G G A __UNKNOWN__ example_normal_profile p.R509* 50 50
TP53 37 17 7579472 7579472 Missense_Mutation G G C __UNKNOWN__ example_normal_profile p.P72R 50 50
BRCA2 37 13 32906729 32906729 Missense_Mutation A A C __UNKNOWN__ example_normal_profile p.N372H 50 50
BRCA2 37 13 32914438 32914438 Frame_Shift_Del T T - __UNKNOWN__ example_normal_profile p.S1982fs 50 50
BCR 37 22 23627369 23627369 Missense_Mutation A A G __UNKNOWN__ example_normal_profile p.N796S 50 50
Hugo_Symbol NCBI_Build Chromosome Start_position End_position Variant_Classification Reference_Allele Tumor_Seq_Allele1 Tumor_Seq_Allele2 Tumor_Sample_Barcode Matched_Norm_Sample_Barcode Annotation_Transcript Protein_Change t_alt_count t_ref_count
PRDM2 37 1 14105122 14105122 In_Frame_Del GAA GAA - example_tumor_profile example_normal_profile p.E282del 50 50
ALK 37 2 29416572 29416572 Missense_Mutation T T C example_tumor_profile example_normal_profile p.I1461V 50 50
BIRC6 37 2 32667182 32667182 Missense_Mutation G G C example_tumor_profile example_normal_profile p.V1332L 50 50
MSH6 37 2 48010488 48010488 Missense_Mutation G G A example_tumor_profile example_normal_profile p.G39E 50 50
FGFR4 37 5 176520243 176520243 Missense_Mutation G G A example_tumor_profile example_normal_profile p.G388R 50 50
BRAF 37 7 140476881 140476881 Nonsense_Mutation G G A example_tumor_profile example_normal_profile p.R509* 50 50
TP53 37 17 7579472 7579472 Missense_Mutation G G C example_tumor_profile example_normal_profile p.P72R 50 50
BRCA2 37 13 32906729 32906729 Missense_Mutation A A C example_tumor_profile example_normal_profile p.N372H 50 50
BRCA2 37 13 32914438 32914438 Frame_Shift_Del T T - example_tumor_profile example_normal_profile p.S1982fs 50 50
BCR 37 22 23627369 23627369 Missense_Mutation A A G example_tumor_profile example_normal_profile p.N796S 50 50
7 changes: 7 additions & 0 deletions example_data/example_patient.capture.somatic.called.cna.txt
@@ -0,0 +1,7 @@
gene call
TP53 Deletion
CDKN2A Deletion
BRAF Amplification
CDK4 Amplification
BLM Wild type
FGFR2 Deletion
4 changes: 3 additions & 1 deletion example_data/example_patient.capture.somatic.indels.maf
@@ -1 +1,3 @@
Hugo_Symbol NCBI_Build Chromosome Start_position End_position Variant_Classification Reference_Allele Tumor_Seq_Allele1 Tumor_Seq_Allele2 Tumor_Sample_Barcode Matched_Norm_Sample_Barcode Annotation_Transcript Protein_Change t_alt_count t_ref_countPMPCA 37 9 139312448 139312449 Intron - - G example_patient_tumor example_patient_normal ENST00000371717.3 31 92C10orf2 37 10 102748300 102748301 Frame_Shift_Del TC TC - example_patient_tumor example_patient_normal ENST00000370228.1 p.L112fs 28 294
Hugo_Symbol NCBI_Build Chromosome Start_position End_position Variant_Classification Reference_Allele Tumor_Seq_Allele1 Tumor_Seq_Allele2 Tumor_Sample_Barcode Matched_Norm_Sample_Barcode Annotation_Transcript Protein_Change t_alt_count t_ref_count
PMPCA 37 9 139312448 139312449 Intron - - G example_tumor_profile example_normal_profile ENST00000371717.3 31 92
C10orf2 37 10 102748300 102748301 Frame_Shift_Del TC TC - example_tumor_profile example_normal_profile ENST00000370228.1 p.L112fs 28 294