Transcript Assembly Visualization

Obi Griffith edited this page Jul 7, 2017 · 70 revisions

RNA-seq Flowchart - Module 5

4-v. Transcript Assembly Visualization (Splicing Visualization)

Visualizing Results at the Command Line

View the merged GTF file from the 'de_novo' mode. Remember this merged GTF file combines both UHR and HBR (GTFs for each individually were also produced earlier).

cd $RNA_HOME/expression/stringtie/de_novo/
head stringtie_merged.gtf
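Beyond `head`, a quick tally of the feature types in column 3 gives a sense of how many transcripts and exons the merged GTF contains. The following is a minimal sketch; the function name `gtf_feature_counts` is hypothetical, and it assumes a tab-delimited GTF whose header lines start with `#`.

```shell
# Tally the feature types (column 3) of a GTF file.
# Lines starting with '#' (header comments) are skipped.
gtf_feature_counts() {
    awk -F'\t' '!/^#/ && NF >= 9 { n[$3]++ }
                END { for (f in n) print f "\t" n[f] }' "$1" | LC_ALL=C sort
}
```

For example, `gtf_feature_counts stringtie_merged.gtf` prints one line per feature type followed by its count.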

For details on the format of these files, refer to the following links:

How many genes have at least one transcript assembled by StringTie in the 'de_novo' results?

cd $RNA_HOME/expression/stringtie/de_novo/
cat stringtie_merged.gtf | perl -ne 'if ($_ =~ /gene_name\s\"(\w+)\"/){print "$1\n"}' | sort | uniq | wc -l
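The command above counts distinct `gene_name` values, so any record without a `gene_name` attribute (which can happen for transcripts that have no reference match) is not counted. As a hedged alternative, the sketch below counts distinct values of an arbitrary attribute, such as `gene_id` or `transcript_id`, on the transcript records. The function name `gtf_count_ids` is hypothetical, and the sketch assumes the usual GTF attribute layout of `key "value";` pairs in column 9.

```shell
# Count distinct values of one attribute (e.g. gene_id) across transcript records.
# $1 = GTF file, $2 = attribute name
gtf_count_ids() {
    awk -F'\t' -v key="$2" '
        !/^#/ && $3 == "transcript" {
            # Column 9 holds attributes such as: gene_id "X"; transcript_id "Y";
            m = split($9, a, /; */)
            for (i = 1; i <= m; i++)
                if (a[i] ~ ("^" key " ")) ids[a[i]] = 1
        }
        END { c = 0; for (k in ids) c++; print c }' "$1"
}
```

For example, `gtf_count_ids stringtie_merged.gtf gene_id` reports the number of distinct gene loci, and `gtf_count_ids stringtie_merged.gtf transcript_id` the number of distinct transcripts.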

How many genes have at least one novel transcript assembled?

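Novel isoforms of known genes carry the class code 'j' in column 3 of the .tmap file. Match on that column rather than on the whole line, because a plain grep for "j" also matches any other line containing that letter (for example the header line).

awk -F'\t' '$3 == "j"' merged.stringtie_merged.gtf.tmap

awk -F'\t' '$3 == "j"' merged.stringtie_merged.gtf.tmap | cut -f 1 | sort | uniq | wc -l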
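To put the 'j' count in context, it can help to tally every class code in the .tmap file (for example '=' for an exact match to a reference transcript, or 'u' for an intergenic transcript). This is a minimal sketch; the function name `tmap_class_counts` is hypothetical, and it assumes the file has one header line and the class code in column 3.

```shell
# Tally the class codes (column 3) of a .tmap file, skipping the header line.
tmap_class_counts() {
    awk -F'\t' 'NR > 1 { n[$3]++ }
                END { for (c in n) print c "\t" n[c] }' "$1" | LC_ALL=C sort
}
```

For example, `tmap_class_counts merged.stringtie_merged.gtf.tmap` prints one line per class code followed by the number of assembled transcripts assigned to it.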
	

Visualizing Results in the IGV Browser

merged.gtf files:

  • View the merged GTF files (stringtie_merged.gtf) that were generated by each of the StringTie modes: 'ref_guided', 'de_novo'.
  • Note: For the 'ref_only' mode, only the supplied transcripts were considered. Therefore the GTF file from any individual (unmerged) StringTie run will contain the same transcripts and can serve for comparison.
  • The following can be loaded directly in IGV by URL:
  • http://YOUR_IP_ADDRESS/rnaseq/expression/stringtie/ref_only/HBR_Rep1/transcripts.gtf
  • http://YOUR_IP_ADDRESS/rnaseq/expression/stringtie/ref_guided/stringtie_merged.gtf
  • http://YOUR_IP_ADDRESS/rnaseq/expression/stringtie/de_novo/stringtie_merged.gtf

Load the BAM files at the same time as the junctions.bed and merged.gtf files:

  • The following can be loaded directly in IGV by URL:
  • http://YOUR_IP_ADDRESS/rnaseq/alignments/hisat2/UHR.bam
  • http://YOUR_IP_ADDRESS/rnaseq/alignments/hisat2/HBR.bam

Go to the following regions:

  • 22:45,334,669-45,342,395
  • 22:45,210,970-45,214,832
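The same regions can also be cross-checked at the command line by listing the assembled transcripts that overlap them in a merged GTF. The sketch below is one possible way to do this; the function name `gtf_transcripts_in_region` is hypothetical, and it assumes the chromosome is named `22` in the GTF, as in the IGV coordinates above.

```shell
# List transcript records that overlap a region.
# $1 = GTF file, $2 = chromosome, $3 = start, $4 = end (1-based, inclusive)
gtf_transcripts_in_region() {
    awk -F'\t' -v chr="$2" -v s="$3" -v e="$4" '
        !/^#/ && $3 == "transcript" && $1 == chr && $4 <= e && $5 >= s {
            # Print chromosome, start, end, strand and the attribute column
            print $1 "\t" $4 "\t" $5 "\t" $7 "\t" $9
        }' "$1"
}
```

For example, `gtf_transcripts_in_region stringtie_merged.gtf 22 45334669 45342395` lists the transcripts overlapping the first region. Running it on the 'ref_guided' and 'de_novo' files side by side shows which predictions differ before you look at them in IGV.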

Do you see evidence for any novel exons/transcripts that are found in 'de_novo' or 'ref_guided' modes but NOT found in 'ref_only' mode? Explore in IGV for other examples of novel or different transcript predictions from the different StringTie modes. Pay attention to how the predicted transcripts line up with known transcripts. Try loading the Ensembl transcripts track (File -> Load from Server).

NOTE: We have obviously just scratched the surface exploring these output files.

| Previous Section | This Section | Next Section |
|:---:|:---:|:---:|
| Differential Splicing | Splicing Visualization | Kallisto |