Skip to content

Lesson and tutorial on generating count tables from read-mapping data

Notifications You must be signed in to change notification settings

biovcnet/bvcn-binder-htseq

 
 

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

24 Commits
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Binder for BVCN Transcriptomics lesson 3

Initially forked from here. Thank you to the awesome binder team!

Binder

Part of the Bioinformatics Virtual Coordination Network :)

Walkthrough

Build bowtie2 index out of the Tremblaya genome

bowtie2-build -f Candidatus-Tremblaya-princeps_PCIT.fa Candidatus-Tremblaya-princeps_PCIT.fa

Map sample reads to the index using bowtie2

bowtie2 -x Candidatus-Tremblaya-princeps_PCIT.fa -U PCIT.sample.fastq -S Candidatus-Tremblaya-princeps_PCIT.sam --no-unal

Generate GFF file using Prokka

prokka Candidatus-Tremblaya-princeps_PCIT.fa

Extract CDS rows from the GFF file

grep -P '\tCDS' PROKKA_05192020.gff > cds_PROKKA_05192020.gff

Examine the feature type and GFF attribute to be used as feature ID.

Generate GFF file using Prodigal

prodigal -f gff -o Candidatus-Tremblaya-princeps_PCIT.prodigal.gff -i Candidatus-Tremblaya-princeps_PCIT.fa -a Candidatus-Tremblaya-princeps_PCIT.fa-proteins.faa -d Candidatus-Tremblaya-princeps_PCIT.fa-proteins.ffn -p meta

Run htseq-count

htseq-count Candidatus-Tremblaya-princeps_PCIT.sam PROKKA_05192020/cds_PROKKA_05192020.gff -t CDS -i ID -c Candidatus-Tremblaya-princeps_PCIT.reverse.counts --nonunique none

Run htseq-count, only counting the reads mapping to the reverse strand

htseq-count Candidatus-Tremblaya-princeps_PCIT.sam PROKKA_05192020/cds_PROKKA_05192020.gff -t CDS -i ID -c Candidatus-Tremblaya-princeps_PCIT.reverse.counts --nonunique none -s reverse

Inspect and compare the ouputs

run counts-to-tpm script to convert read counts to normalized value: transcripts per million (TPM)

About

Lesson and tutorial on generating count tables from read-mapping data

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages

  • Shell 61.9%
  • Python 38.1%