Feature Request: Add Function to Generate MSA FASTA File Using results_alleles.tsv #206

Open
vascokarla opened this issue Aug 28, 2024 · 3 comments
Assignees
Labels
Status: In Progress Has been assigned and is being worked on. Type: Enhancement

Comments

@vascokarla

Description

We are currently using chewBBACA for genomic surveillance, and one of our workflows involves running the JoinProfiles function to add new samples to the results of historic samples. This workflow generates the results_alleles.tsv file but does not produce all the output files required to run the AlleleCallEvaluator function.

In our use case, we often need to calculate SNP distances and build phylogenetic trees from a Multiple Sequence Alignment (MSA) of a subset of samples that ReporTree has identified as closely related, based on allelic distance cutoffs. Because AlleleCallEvaluator currently requires all the output files from the AlleleCall function, we cannot easily generate MSAs when only the results_alleles.tsv file is available.
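
To illustrate the subsetting step, here is a minimal Python sketch. It is not part of chewBBACA or ReporTree. It assumes the profile table is tab-separated, with a header row of locus names and the sample identifier in the first column. The function name `subset_profiles` and the sample and locus names are hypothetical.

```python
import csv
import io


def subset_profiles(tsv_text, keep_samples):
    """Return TSV text holding the header plus the rows for the requested samples.

    Assumption: the first column of every row is the sample identifier.
    """
    reader = csv.reader(io.StringIO(tsv_text), delimiter="\t")
    header = next(reader)
    keep = set(keep_samples)
    # Keep rows in their original order; skip blank lines.
    rows = [row for row in reader if row and row[0] in keep]

    out = io.StringIO()
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)
    return out.getvalue()


# Hypothetical profile table with three samples and two loci.
example = (
    "FILE\tlocus1\tlocus2\n"
    "sampleA\t1\t4\n"
    "sampleB\t2\tLNF\n"
    "sampleC\t1\tINF-7\n"
)

# Sample list such as one taken from a cluster of closely related samples.
subset = subset_profiles(example, ["sampleA", "sampleC"])
print(subset)
```

The sketch only filters rows. It does not reinterpret the allele calls, so special classifications in the table are passed through unchanged.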

Request

We would like to request the addition of a new function or an enhancement to the existing AlleleCallEvaluator that allows users to generate an MSA FASTA file directly from the results_alleles.tsv file. This feature would enable more flexible and efficient workflows, especially for users who are primarily working with the output of the JoinProfiles function.

Use Case

  • Surveillance Workflow: After running JoinProfiles, we often need to:
    • Generate a phylogenetic tree using the MSA.
    • Calculate SNP distances between samples.
  • Subset Analysis: Sometimes, we need to perform these analyses only on a fraction of the samples identified as closely related by ReporTree.
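
For context, the SNP distance step in the use case above could look roughly like the following Python sketch once an MSA FASTA file exists. It assumes a nucleotide alignment in which all sequences have the same length, and it ignores positions where either sequence has a gap (`-`) or an `N`. That choice is an assumption, not a chewBBACA convention. All function and sample names are hypothetical.

```python
from itertools import combinations


def parse_fasta(text):
    """Parse FASTA text into a dict mapping record name to sequence."""
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Use the first word of the header as the record name.
            name = line[1:].split()[0]
            records[name] = []
        elif name is not None:
            records[name].append(line.upper())
    return {n: "".join(parts) for n, parts in records.items()}


def snp_distance(seq_a, seq_b, ignore="-N"):
    """Count differing positions, skipping positions with a gap or N."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    return sum(
        1
        for a, b in zip(seq_a, seq_b)
        if a != b and a not in ignore and b not in ignore
    )


def pairwise_snp_distances(alignment):
    """Return {(name_a, name_b): distance} for every pair of samples."""
    return {
        (a, b): snp_distance(alignment[a], alignment[b])
        for a, b in combinations(sorted(alignment), 2)
    }


# Hypothetical three-sample alignment.
msa = ">sampleA\nACGTACGT\n>sampleB\nACGTTCGT\n>sampleC\nAC-TTCGA\n"
distances = pairwise_snp_distances(parse_fasta(msa))
for pair, dist in distances.items():
    print(pair, dist)
```

For large sample sets, a dedicated tool would likely be faster. The sketch is only meant to show what the distance calculation needs as input.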

Thank you for considering this request. chewBBACA is a valuable tool in our surveillance workflows, and this enhancement would significantly streamline our processes :)

rfm-targa self-assigned this Aug 28, 2024
rfm-targa added the Type: Enhancement and Status: In Progress (Has been assigned and is being worked on.) labels Aug 28, 2024
@ramirma
Member

ramirma commented Aug 30, 2024

@vascokarla thank you for getting back to us with this suggestion. We will definitely consider this.

@rfm-targa
Contributor

Greetings @vascokarla,

Adding a separate module to compute the MSA is a great idea. We plan to add several modules to perform smaller tasks currently included in larger modules. We can also remove some file requirements from the AlleleCallEvaluator module, which would allow the MSA to be computed just from the results_alleles.tsv file. That can be done sooner and would also address your feature request. We will look into it and let you know when it is done.

Best regards,

Rafael

@vascokarla
Author

Thank you so much @ramirma and @rfm-targa! This new feature would help us a lot.

Thanks again,
Karla
