Releases: vanallenlab/moalmanac
Release 0.8.0
This release utilizes the v.2024-12-05 version of the underlying database and contains changes from several pull requests: #27, #28, #29, #30, #31.
Additions:
- A new output of the suffix .input-metadata.txt. This converts the patients dictionary, which contains input details such as string label, tumor purity, ploidy, and msi status, into a dataframe and outputs it as a table. From #30.
- A new output of the suffix .moalmanac-execution.json. This compiles all config settings set in config.ini, execution runtime, input file paths, input datasource paths, input metadata, as well as each tabular output into a single json file. From #30.
- A new script, moalmanac/run_4x_for_output_regression_test.py, to run moalmanac/run_example.py four times using different config settings and with and without input files. From #29.
Revisions:
- moalmanac/logger.py was revised to shut down at the end of moalmanac's main function, allowing for multiple processes to be executed in series within a single Python process. From #29.
- Annotating with ExAC, ExACExtended, and ClinVar will now skip rows that have missing data in any of the required columns. From #28.
- Docker now uses Python 3.12 and also pushes to the latest tag. From #27.
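
As a rough illustration of the row-skipping behavior described for ExAC, ExACExtended, and ClinVar annotation, the sketch below separates rows that have a missing value in any required column from rows that can be annotated. The column names and the helper names `is_missing` and `split_annotatable_rows` are hypothetical and not taken from moalmanac's code, and rows are plain dictionaries here rather than a dataframe.

```python
# Hypothetical required columns; the columns moalmanac actually checks may differ.
REQUIRED_COLUMNS = ("chromosome", "start_position", "reference_allele", "observed_allele")


def is_missing(value):
    """Treat None, blank strings, and NaN floats as missing."""
    if value is None:
        return True
    if isinstance(value, str):
        return value.strip() == ""
    if isinstance(value, float):
        return value != value  # NaN is the only float that is not equal to itself
    return False


def split_annotatable_rows(rows, required=REQUIRED_COLUMNS):
    """Return (complete, skipped) lists based on the required columns."""
    complete, skipped = [], []
    for row in rows:
        if any(is_missing(row.get(column)) for column in required):
            skipped.append(row)
        else:
            complete.append(row)
    return complete, skipped


rows = [
    {"chromosome": "7", "start_position": 140453136, "reference_allele": "A", "observed_allele": "T"},
    {"chromosome": "17", "start_position": None, "reference_allele": "C", "observed_allele": "T"},
    {"chromosome": "12", "start_position": 25398284, "reference_allele": "", "observed_allele": "A"},
]
complete, skipped = split_annotatable_rows(rows)
print(f"{len(complete)} row(s) to annotate, {len(skipped)} row(s) skipped")
```

Returning the skipped rows alongside the complete ones, rather than discarding them, makes it easy to log what was left unannotated.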
Bug fixes:
- Empty values within read count columns no longer cause the process to fail. From #28.
Release 0.7.2
This release utilizes the v.2024-10-03 release of the underlying MOAlmanac database and otherwise fixes two bugs present in Release 0.7.1. This release incorporates pull requests #23 and #24.
Additions:
- An example log output was added to the example_output/ folder in the root directory of this repository.
Revisions:
- datasources/moalmanac/molecular-oncology-almanac.json was updated to the 2024-10-03 release.
- docs/template-pull-request.md was renamed to docs/pull_request_template.md.
- The section heads for both moalmanac/annotation-databases.ini and moalmanac/preclinical-databases.ini were renamed to paths.
Bug fixes:
- moalmanac/moalmanac.py would attempt to access the value "preclinical" in the preclinical_db_paths dictionary, even if no .ini file was provided using the --preclinical-dbs argument.
- Templates to generate tables within the Frozen Flask report were modified to include an if statement to only generate tables if the underlying dataframe is not empty.
Release 0.7.1
This release utilizes the v.2024-04-11 release of the underlying MOAlmanac database and primarily focuses on adding logging to MOAlmanac and refactoring the codebase to remove the datasources/ directory from the application folder. It incorporates changes from two pull requests: #21 and #22.
Additions:
- A new output with the suffix .log is generated which details inputs provided, configuration settings, output locations, and MOAlmanac processes as they happen. This is detailed in pull request #22.
- Two new configuration files, annotation-databases.ini and preclinical-databases.ini, have been added to the application folder to detail datasource locations for both required datasources and datasources for the cancer cell line modules, respectively. These sections have been removed from config.ini.
- annotation-databases.ini and config.ini are now required arguments of moalmanac/moalmanac.py, and preclinical-databases.ini is an optional argument.
Revisions:
- The datasources folder, moalmanac/datasources/, was moved to the root directory of this repository.
- Input, output, and runtime README files have been updated to reflect the changes of these two pull requests.
- run_example.py was revised to add the date in ISO format to the default output directory.
- Several unit tests were updated to accommodate the datasources location and aforementioned changes to configuration files.
- The files moalmanac/run_deconstructSigs.R and moalmanac/wrapper_deconstructSigs.sh have been removed from the repository.
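
For readers unfamiliar with the ISO date format mentioned for run_example.py's default output directory, a minimal sketch follows. The function name `default_output_directory`, the `example_output` root, and the exact placement of the date in the path are assumptions for illustration; the actual layout used by run_example.py may differ.

```python
from datetime import date
from pathlib import Path


def default_output_directory(root="example_output", today=None):
    """Build an output directory path whose last component is an ISO 8601 date."""
    # Hypothetical layout: <root>/<YYYY-MM-DD>
    today = today or date.today()
    return Path(root) / today.isoformat()


print(default_output_directory(today=date(2024, 10, 3)))
```

ISO 8601 dates (YYYY-MM-DD) sort chronologically when sorted as text, which keeps dated output folders in order.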
Release 0.6.0
This release utilizes the v.2024-04-11 release of the underlying MOAlmanac database and primarily focuses on supporting Python 3.12. It incorporates changes from two pull requests: #18 and #19.
Additions:
- A new output regression test to hash and compare output text files from two folders, run_output_regression_test.py. This test is intended to be run on outputs generated from the main branch and a branch intended to be merged.
- The minimum coverage value is now taken for ONP variants when multiple coverage values are provided on a single row. This change was made to comply with pandas' change in behavior of object type downcasting.
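
The idea behind an output regression test of this kind can be sketched as follows: hash every text file in two folders and report the file names whose contents differ or that exist in only one folder. This is a minimal sketch under assumptions, not the contents of run_output_regression_test.py; the function names, the choice of SHA-256, and the `*.txt` pattern are all hypothetical.

```python
import hashlib
from pathlib import Path


def hash_file(path):
    """Return the SHA-256 hex digest of a file's bytes."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def hash_folder(folder, pattern="*.txt"):
    """Map each matching file name in a folder to its digest."""
    return {path.name: hash_file(path) for path in sorted(Path(folder).glob(pattern))}


def compare_folders(folder_a, folder_b, pattern="*.txt"):
    """Return sorted file names that differ in content or exist in only one folder."""
    hashes_a = hash_folder(folder_a, pattern)
    hashes_b = hash_folder(folder_b, pattern)
    names = set(hashes_a) | set(hashes_b)
    # A name missing from one folder yields None there, so it is reported as a difference.
    return sorted(name for name in names if hashes_a.get(name) != hashes_b.get(name))
```

Calling `compare_folders("outputs-main", "outputs-branch")` with two hypothetical folder names would return an empty list when every text output is byte-for-byte identical.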
Revisions:
- All dependencies in requirements.txt have been updated to their latest versions, and also alphabetized.
- moalmanac/datasources/moalmanac/create_almanac_db.py was revised to (i) take the config.ini file as an input argument and (ii) skip a record if the _deprecated field is set to False.
Bug fixes:
- investigator.py's populate_feature_dictionary function will no longer include samples with an NA sample name. Some samples have an NA for sample name because not all samples have an alias across all three naming conventions: Broad Institute, CCLE, and Sanger labels.
Release 0.5.0
This release utilizes the v.2023-11-09 release of the underlying MOAlmanac database and incorporates changes from three pull requests: #15: Added AIP to hereditary cancers gene list, #16: Update db nov 2023, and #17: Revise handling of COSMIC mutational signatures.
Additions:
- Added AIP to genes related to hereditary cancers gene list.
- All entries within the MOAlmanac database now have filled fields for publication_date of the source and when the record was last_updated.
- moalmanac/moalmanac.py now accepts --mutational_signatures as an input argument and an example input has been added.
Revisions:
- MOAlmanac no longer runs deconstructSigs as a subprocess and, as a result, R dependencies have been removed.
- The annotation to the MOAlmanac database was updated to sort by the evidence type, publication date of the source, and then when the database record was last updated.
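
The three-level sort described above can be sketched with Python's stable sort. The evidence labels, their ranking, the `evidence` field name, and the newest-first direction for both dates are assumptions for illustration; only publication_date and last_updated are field names mentioned in these notes.

```python
# Assumed ranking of evidence types, strongest first; the real ordering is defined in moalmanac.
EVIDENCE_ORDER = [
    "FDA-Approved",
    "Guideline",
    "Clinical trial",
    "Clinical evidence",
    "Preclinical",
    "Inferential",
]
EVIDENCE_RANK = {label: rank for rank, label in enumerate(EVIDENCE_ORDER)}


def sort_matches(records):
    """Sort by evidence rank, then newest publication_date, then newest last_updated."""
    # Python's sort is stable, so apply the lowest-priority key first.
    ordered = sorted(records, key=lambda r: r["last_updated"], reverse=True)
    ordered = sorted(ordered, key=lambda r: r["publication_date"], reverse=True)
    return sorted(ordered, key=lambda r: EVIDENCE_RANK[r["evidence"]])


records = [
    {"id": 1, "evidence": "Preclinical", "publication_date": "2021-05-01", "last_updated": "2023-01-05"},
    {"id": 2, "evidence": "FDA-Approved", "publication_date": "2019-03-10", "last_updated": "2022-06-01"},
    {"id": 3, "evidence": "FDA-Approved", "publication_date": "2020-11-20", "last_updated": "2021-02-04"},
    {"id": 4, "evidence": "FDA-Approved", "publication_date": "2020-11-20", "last_updated": "2023-11-09"},
]
print([record["id"] for record in sort_matches(records)])
```

Because the dates are ISO-formatted strings, comparing them as text gives the same order as comparing them as dates.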
Release 0.4.6
This release utilizes the v.2023-04-06 release of the underlying MOAlmanac database. It incorporates changes from two pull requests: #13 and #14.
Revisions:
- Python was updated to version 3.11 for MOAlmanac and all requirements were updated to their latest versions. The codebase was also tested using Python 3.8. More detailed changes can be read from the pull request.
Bug fixes:
- Report generation was failing if preclinical efficacy was not being calculated but present in the data structure.
Release 0.4.5
This release utilizes the v.2023-02-02 release of the underlying MOAlmanac database. It incorporates changes from three pull requests: changing the underlying MOAlmanac database from TinyDB to JSON (#9), adding a simplified input option to run MOAlmanac (#10), and updating datasources (#11).
Additions:
- A section called function_toggle has been added to moalmanac/config.ini to enable or disable several features of moalmanac, see here for more details.
- The script moalmanac/simplified_input.py has been added to allow users to input a tab delimited file of somatic variants, germline variants, copy number alterations, and rearrangements. Supporting functions have been added to moalmanac/annotator.py and moalmanac/features.py and example input data has been added to example_data/ to support this change. See here for more details.
Revisions:
- The underlying moalmanac database is held in memory using the json python library, rather than TinyDB. See here for more details.
- moalmanac/reporter.py has been refactored to produce the report based on an object.
- Flask has been upgraded to 2.0.0 in requirements.txt.
- Documentation has been updated for description of inputs, example outputs, and method usage.
- Datasources COSMIC and Cancer Gene Census have been updated to version 97, ACMG secondary findings has been updated to version 3, and OncoTree has been updated.
Bug fixes:
- Warnings are no longer produced when generating the actionability report.
- Drugs not present in almanac-gdsc-mappings.json no longer cause errors with execution.
- The output folder is now properly handled by the R subprocesses.
- Therapies that have not been matched to preclinical drugs produced an error. moalmanac/investigator.py has been revised to only consider therapies that have already been mapped. This was identified in #11.
Release 0.4.2
This release utilizes the v.2021-02-04 release of the underlying MOAlmanac database.
Revisions:
- Added a README to the moalmanac/datasources/ folder, specifying which datasources are available for use immediately from GitHub and which ones either need to be copied from Docker or processed from source.
- Added Oncotree code as an output column for assertions in the .actionable.txt output.
- Revised documentation for COSMIC, moalmanac/datasources/cosmic/, and ExAC, moalmanac/datasources/exac/, by expanding upon how to generate the datasources from source.
- Revised documentation for preclinical data leveraged by MOAlmanac. Previously, moalmanac/datasources/preclinical/ contained output tables from the paper's repository and referenced the repository for documentation. Now, all utilized tables can be generated within this repository from source.
- Revised README in the root directory to emphasize that not all datasources are immediately ready for use after pulling this repository from GitHub.
- Updated Dockerfile to compile moalmanac and preclinical datasources later, to hopefully reduce build time.
- Updated WDL to use the latest version of this method, increased default resource requirements for buffer, and preallocated additional matches json output.
Bug fixes:
- Annotation with ExAC now coerces integer columns from strings to floats and then integers, making the annotator resilient to some MAFs which have positions coded as floats instead of integers.
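
The string-to-float-to-integer coercion described above matters because Python's `int()` rejects a string such as "140453136.0" outright. A minimal sketch of the idea follows; the helper name `coerce_position` is hypothetical and this is not moalmanac's implementation, which operates on dataframe columns.

```python
def coerce_position(value):
    """Convert a position such as '140453136.0' to the integer 140453136."""
    # int("140453136.0") raises ValueError, so parse as a float first.
    number = float(value)
    if not number.is_integer():
        raise ValueError(f"Position is not a whole number: {value!r}")
    return int(number)


print(coerce_position("140453136.0"))
print(coerce_position("55249071"))
```

Rejecting values with a fractional part, rather than silently truncating them, is a design choice of this sketch and not something stated in the release notes.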
Release 0.4.1
This release utilizes the v.2021-02-04 release of the underlying MOAlmanac database.
New features:
- Pass called copy number alterations rather than segmentation files with the --called_cna input argument.
- A new output lists all therapeutic strategies highlighted by MOAlmanac and associated therapies, categorized by if the therapy was highlighted for sensitivity or resistance relationships. This is reported in the output of the suffix .therapeutic_strategies.txt.
- Added the docs/ folder, which includes detailed descriptions of inputs and outputs.
Revisions:
- Updated WDL to use the latest version of this method.
- Updated READMEs.
- Updated how mutational signatures are displayed; the version number of COSMIC signatures is now shown.
- Removed requirement for segment valued copy number alterations to be of at least |1| in value to be reported as Putatively Actionable.
- Reports now show "Prognosis and rationale" for assertions related to prognosis.
- Updated datasources modal in report.
- Fusion inputs now are appropriate for the latest output of STAR fusion.
- The y-axis for preclinical efficacy figures is now dynamically set based on values plotted rather than a static range.
- The actionability categories "Investigate Actionability - High" and "Investigate Actionability - Low" have been collapsed to a single label, "Investigate Actionability". Molecular features will still sort as before.
- The sort order of dataframes based on datasource match is now explicitly set in the Writer, specifying whether each datasource column should be sorted ascending or descending.
- Added a function to annotate somatic variants without ExAC, which can be imported for custom annotation.
- Updated unit tests to reflect code changes.
- Updated license to GPL 2.0.
- Updated preclinical datasources to reflect the recent revisions to the paper repository.
Bug fixes:
- Extended ExAC annotations (annotating with subpopulation information) now annotate as expected.
- The second gene in fusions was not being properly ranked in the somatic output.
bioRxiv
Release for bioRxiv!