You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
I've run into an issue with the likely misclassification of the strandedness of certain splice junctions. I'm working with a strand-specific RNA-Seq library for A. thaliana and generated a BAM supplemented Whippet index using the STAR alignments of said library.
A. thaliana has various cases of overlapping genes on opposing strands, as can be seen in the IGV screenshot below. For this particular case, the STAR alignment identified what I presume to be the correct junctions for the AT4G36050 (on the - strand) and the AT4G36052 (on the + strand) genes. When I look for the specific junction (chr4 17052585 17052653) in the STAR SJ.out.tab, STAR reports it as a junction on the positive strand (which seems to be correct, seeing as it corresponds to the AT4G36052 gene on the plus strand). However, the same junction in the Whippet .jnc.gz output file is reported to be on the minus strand. Subsequently, an RI event using this junction is annotated by Whippet for AT4G36050 and for these particular samples gives a very strong dPSI of >0.6. However, I do believe this is an error, given the origin of the splice junction.
I would like to know how Whippet handles overlapping genes when building the (supplemented) index and when quantifying. Is there any way for me to parse out those events, or potentially fix their quantification?
I've used julia version 0.6.4 and whippet v0.11.1 for my analysis. The --bam-both-novel flag was used when generating the index.
Thanks
The text was updated successfully, but these errors were encountered:
# if is spliced process splice sitesifisspliced(rec) && strand ==strandpos(rec)
known =process_spliced_record!( novelacc, noveldon, rec, known, oneknown )
end
So the strand encoded in the fifth bit of the BAM flag (0x10) has to match the strand of the gene, where 0 is + and 1 is - in both. That's not to say there isn't potentially some other bug going on here somewhere, but it would be good if you could give a minimally sized example with reproducible inputs.
On another front, my initial benchmarking (in the paper supplement) suggests that using the --bam-both-novel flag increases the FDR considerably, so I wouldn't recommend this in general. Requiring one of the splice sites to match a known splice-site in the gene is a reasonable approach I think-- also good to note here that the set of "known" splice sites in a gene expands iteratively as new splice junctions are added.
I've run into an issue with the likely misclassification of the strandedness of certain splice junctions. I'm working with a strand-specific RNA-Seq library for A. thaliana and generated a BAM supplemented Whippet index using the STAR alignments of said library.
A. thaliana has various cases of overlapping genes on opposing strands, as can be seen in the IGV screenshot below. For this particular case, the STAR alignment identified what I presume to be the correct junctions for the AT4G36050 (on the - strand) and the AT4G36052 (on the + strand) genes. When I look for the specific junction (chr4 17052585 17052653) in the STAR SJ.out.tab, STAR reports it as a junction on the positive strand (which seems to be correct, seeing as it corresponds to the AT4G36052 gene on the plus strand). However, the same junction in the Whippet .jnc.gz output file is reported to be on the minus strand. Subsequently, an RI event using this junction is annotated by Whippet for AT4G36050 and for these particular samples gives a very strong dPSI of >0.6. However, I do believe this is an error, given the origin of the splice junction.
I would like to know how Whippet handles overlapping genes when building the (supplemented) index and when quantifying. Is there any way for me to parse out those events, or potentially fix their quantification?
I've used julia version 0.6.4 and whippet v0.11.1 for my analysis. The --bam-both-novel flag was used when generating the index.
Thanks
The text was updated successfully, but these errors were encountered: