Noah A. Legall, Liliana Salvador (advisor)
This GitHub repository accompanies the manuscript "Comparative genomic analysis of worldwide Mycobacterium bovis isolates reveal geographic and host associated evolution". Programs used to generate data for the analysis can be found in the 'scripts' directory. Other necessary files are located in the 'dependencies' file.
Mycobacterium bovis, a bacterial zoonotic pathogen responsible for the economically and agriculturally important livestock disease bovine tuberculosis (bTB), has shown characteristics of a generalist pathogen, infecting a broad range of mammalian species worldwide. These characteristics have led to bidirectional transmission events between livestock and wildlife species and to the formation of wildlife reservoirs, undermining the effectiveness of bTB control measures. Next Generation Sequencing (NGS) has transformed our ability to understand disease transmission events by tracking variant sites. However, the factors driving host adaptation following spillover, and the role of other genomic processes in M. bovis transmission, remain understudied. In this study, we analyzed three rich, publicly available datasets (700 isolates) of M. bovis samples collected from both livestock and wildlife (11 host species represented) in the United Kingdom, the United States of America, and New Zealand to investigate whether specific M. bovis genomic signatures (namely homologous recombination or sites of positive selection) are related to the emergence and evolution of M. bovis at the host-species, geographical, and/or sub-population levels. Regions of the M. bovis genome predicted to be impacted by homologous recombination were found to affect multiple genes involved in processes such as lipid metabolism, cell wall architecture, and virulence. Among these genes, homologous recombination in Mb3510c, Mb0403, and rpfA was found to be helpful in distinguishing between different geographic regions, host species, and population clusters. New Zealand isolates were also found to have highly significant selective sweep sites among their population clusters. The results of this study highlight the usefulness of comparing multiple host-associated isolates to understand genomic signatures that are linked to characteristics of a multi-host pathogen.