SEDA (SEquence DAtaset builder) is an open-source application for processing FASTA files containing DNA and protein sequences. Please visit the official web page of the project for downloads, the complete online manual, and support.
Among other functions, SEDA allows you to:
- Filter sequences based on different criteria (including text patterns).
- Translate nucleic acid sequences into amino acid sequences.
- Edit sequence headers in different ways.
- Remove duplicated sequences.
- Remove isoforms.
- Sort, merge, split, or reformat FASTA files.
- Use BLAST to perform different types of queries.
- Use Clustal Omega to perform multiple sequence alignments.
- Perform gene annotation using different tools: Splign/Compart, ProSplign/ProCompart, Augustus (as implemented in SAPP), or the Conserved Genome Annotation (CGA) Pipeline.
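To make two of the operations above concrete, here is a minimal Python sketch of what pattern-based filtering and duplicate removal mean for a FASTA file. It is only an illustration and not SEDA's implementation: the function names (`parse_fasta`, `remove_duplicates`, `filter_by_header`) are hypothetical, and it assumes that duplicates are sequences that are identical when compared case-insensitively.

```python
import re
from typing import Iterable, Iterator, List, Tuple

# A record is (header without the leading '>', sequence as one string).
Record = Tuple[str, str]


def parse_fasta(lines: Iterable[str]) -> Iterator[Record]:
    """Yield (header, sequence) pairs from FASTA-formatted lines."""
    header = None
    chunks: List[str] = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)  # sequences may span several lines
    if header is not None:
        yield header, "".join(chunks)


def remove_duplicates(records: Iterable[Record]) -> List[Record]:
    """Keep the first record of each distinct sequence (assumption: case-insensitive match)."""
    seen = set()
    unique: List[Record] = []
    for header, sequence in records:
        key = sequence.upper()
        if key not in seen:
            seen.add(key)
            unique.append((header, sequence))
    return unique


def filter_by_header(records: Iterable[Record], pattern: str) -> List[Record]:
    """Keep the records whose header contains a match for the regular expression."""
    regex = re.compile(pattern)
    return [(h, s) for h, s in records if regex.search(h)]


fasta_text = """>seq1 Homo sapiens
ATGC
CC
>seq2 Mus musculus
atgccc
>seq3 Homo sapiens
GGGTTT
"""

records = list(parse_fasta(fasta_text.splitlines()))
unique = remove_duplicates(records)
human = filter_by_header(unique, r"Homo sapiens")
```

In this example, `seq2` is dropped as a duplicate of `seq1`, and both remaining records match the header pattern. SEDA itself offers more options for these operations; see the manual for details.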
If you need to see the commands that SEDA executes to run third-party software, run SEDA with the -Dseda.execution.showcommands=true option.
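As a rough sketch of how this option could be passed, the Python snippet below assembles a launch command. It assumes that SEDA is started from a JAR file with a `java` executable; `seda.jar` is a hypothetical file name, and the launchers shipped with SEDA may need the option passed in a different way. The only point it illustrates is that Java system properties (`-D...`) must come before `-jar`.

```python
import shlex
from typing import List, Sequence

SHOW_COMMANDS_OPTION = "-Dseda.execution.showcommands=true"


def build_seda_command(jar_path: str, extra_args: Sequence[str] = ()) -> List[str]:
    """Build a java command line that enables command logging.

    jar_path is a hypothetical location of the SEDA JAR file.
    The -D system property is placed before -jar so that the JVM,
    and not the application, receives it.
    """
    return ["java", SHOW_COMMANDS_OPTION, "-jar", jar_path, *extra_args]


command = build_seda_command("seda.jar")
# Print a shell-quoted version that can be pasted into a terminal.
print(shlex.join(command))
```

To actually launch the application from Python, the resulting list could be handed to `subprocess.run(command)`.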
Programmers can take advantage of the SEDA core to develop new operations to process FASTA files. In addition, SEDA has a plugin-based architecture, so new functions can be added to SEDA through plugins. Take a look at the manual for detailed information about this.
Please cite the following publications if you use SEDA:
- H. López-Fernández; P. Duque; N. Vázquez; F. Fdez-Riverola; M. Reboiro-Jato; C.P. Vieira; J. Vieira (2022) SEDA: a Desktop Tool Suite for FASTA Files Processing. IEEE/ACM Transactions on Computational Biology and Bioinformatics. Volume 19(3), pp. 1850-1860.
- H. López-Fernández; P. Duque; S. Henriques; N. Vázquez; F. Fdez-Riverola; C.P. Vieira; M. Reboiro-Jato; J. Vieira (2018) A bioinformatics protocol for quickly creating large-scale phylogenetic trees. 12th International Conference on Practical Applications of Computational Biology & Bioinformatics: PACBB 2018. Toledo, Spain. 20 - June
- H. López-Fernández; P. Duque; S. Henriques; N. Vázquez; F. Fdez-Riverola; C.P. Vieira; M. Reboiro-Jato; J. Vieira (2018) Bioinformatics Protocols for Quickly Obtaining Large-Scale Data Sets for Phylogenetic Inferences. Interdisciplinary Sciences: Computational Life Sciences
- H. López-Fernández; P. Duque; N. Vázquez; F. Fdez-Riverola; M. Reboiro-Jato; C.P. Vieira; J. Vieira (2019) Inferring Positive Selection in Large Viral Datasets. 13th International Conference on Practical Applications of Computational Biology & Bioinformatics: PACBB 2019. Ávila, Spain. 26 - June
The Command-Line Interface (CLI), available from SEDA v1.6.0, was developed by David Vila Fernández as a Master's project.