This repository contains a collection of Biopython-based tools and tutorials for handling various bioinformatics tasks, such as sequence manipulation, alignment, and structure analysis.
- data/: Contains example datasets like FASTA files (e.g., chimpanzee mRNA).
- notebooks/: Jupyter Notebooks for demonstrating projects and including examples.
- tools/: Python scripts implementing and tutoring bioinformatics tools using Biopython.
- Sequence Manipulation: Reverse complement, transcription, translation, GC content calculation.
- Sequence Alignment: Global and local alignment for nucleotide and protein sequences.
- Database Retrieval: Fetch sequences from NCBI databases (e.g., chimpanzee mRNA).
- Protein Structure Analysis: Retrieve and analyze protein structures from PDB.
- Sliding Window GC Content: Analyze GC content in sliding windows across large datasets.
- File Format Conversion: FASTA, GenBank, and other bioinformatics formats.
-
Chimpanzee mRNA GC Content Analysis:
- Retrieve chimpanzee mRNA sequences using Entrez.
- Perform GC content analysis on the large mRNA dataset.
-
Nucleotide Search and Retrieval:
- Use Entrez to search and retrieve specific nucleotide sequences from databases like NCBI.
- Using other Databases like Expassy and Swissprot.
-
Protein Structure Analysis:
- Fetch protein structures from the Protein Data Bank (PDB).
- Perform structure analysis and visualization using Biopython tools.
-
Pairwise Sequence Alignment:
- Perform global and local sequence alignments to analyze sequence similarity.
- Example alignment of nucleotide sequences (chimpanzee mRNA vs. human mRNA).
To install the required Python packages, run:
pip install -r requirements.txt