Build a map of genes present across a set of microbial genomes
Note: The documentation below refers to
gig-map
versions starting with0.2.0
. To run the initial version with a single monolithic workflow entrypoint, use the flag0.1.0
to access that pre-release.
When trying to understand the biological context of a set of microbial protein-coding genes, it can be very helpful to understand the distribution of organisms which contain that coding sequence in their genome. One of the most useful characteristics used to differentiate those microbial genomes which do or do not contain a set of genes is their overall evolutionary history, for which taxonomic assignment is commonly used as a proxy. With this type of analysis it can become more easily evident when a set of genes are, for example, common to a cohesive taxonomic grouping (i.e. genus-, or species-specific), found sporadically across representatives within a taxonomic grouping (i.e. strain-specific), or found across disparate taxonomic groupings (i.e., horizontally transfered genes).
To implement this type of analysis, this repository contains code needed to align a set of genes against a set of genomes (e.g. BLAST), calculate the similarity of those genomes to each other (e.g. mashtree), and visualize the results alongside annotations of the genes and/or genomes.
For a more complete set of documentation of the workflow commands which can be used to organize and align microbial genes and genomes, please visit the Wiki.
To visualize the results of the gig-map alignment workflow, the output files may be
rendered using the interactive tool provided as app.py
. In order to use this tool:
- Clone this repository locally
- Create a python virtual environment (tested with Python 3.8 and higher)
- Install the python dependencies contained in
requirements.txt
- Run
python3 app.py --help
for detailed instructions on launching the visualization
Any contributions to this repository are more than welcome. Please start by opening an issue to discuss any potential contribution, bugfix, or improvement.
To test the gig-map
toolset, clone the repository locally and execute run_tests.sh
from test_data
.
You will need to have Python3, Nextflow, BATS, and Docker configured appropriately for tests to run.
Note: Due to unresolved permissions issues, the act
utility does not appear to work for testing.
Any suggestions for resolving this issue would be greatly appreciated.