Novel isoform not detected #62
Comments
You could add a transcript with that novel splice junction, or the full transcript that you see in IGV, to your .gtf file. Then ESPRESSO would treat the isoform as annotated and report its abundance.
You could try reducing
If you know the read IDs that you think are for that novel isoform, then you could check in the output files to see what ESPRESSO did with those reads. If you ran the Q step with
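As a rough illustration of the first suggestion (adding the transcript seen in IGV to the annotation), here is a minimal Python sketch that formats one transcript as GTF lines. The function name, the `custom` source label, and the coordinates and IDs in the usage example are all hypothetical placeholders, not values from this issue. Whether ESPRESSO needs the `transcript` line in addition to the `exon` lines is not confirmed here, so compare the output against the records in your existing GENCODE file before appending.

```python
def transcript_to_gtf_lines(chrom, strand, exons, gene_id, transcript_id, source="custom"):
    """Format one transcript as GTF text lines (hypothetical helper).

    exons: list of (start, end) tuples, 1-based and inclusive, as GTF expects.
    """
    exons = sorted(exons)
    attrs = f'gene_id "{gene_id}"; transcript_id "{transcript_id}";'

    def record(feature, start, end):
        # GTF has 9 tab-separated columns:
        # seqname, source, feature, start, end, score, strand, frame, attributes
        return "\t".join([chrom, source, feature, str(start), str(end), ".", strand, ".", attrs])

    # One transcript record spanning all exons, then one record per exon.
    lines = [record("transcript", exons[0][0], exons[-1][1])]
    lines.extend(record("exon", start, end) for start, end in exons)
    return lines


if __name__ == "__main__":
    # Placeholder coordinates and IDs, for illustration only.
    for line in transcript_to_gtf_lines(
        "chr1", "+", [(1000, 1200), (1500, 1804)], "GENE_X", "GENE_X.novel1"
    ):
        print(line)
```

The printed lines can be appended to a copy of the annotation file, keeping the original untouched so the two runs can be compared.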
Thank you Eric for your suggestions. I repeated the process with more samples, this time including the standard GENCODE v45 annotation file and no custom SJ annotations. This time the isoform in my first comment was detected. There are, however, two other problems:
I added the option In file
So the reads with the new SJ are not filtered out but corrected (even with
It's weird that ESPRESSO could identify the same intron retention when processing only a subset of the samples. The corresponding reads are marked "ISM" in
Thanks
For the splice junction that differs by 4 bases, it looks like the issue is the sequence at the junction. The original alignment is The annotated junction looks like it's GT AG. The other junction would be GT AA. ESPRESSO won't create a novel isoform unless all the junctions are either annotated or have the expected junction sequence.
For the intron retention, it looks like the intron is between the last 2 exons of the transcript. In that case ESPRESSO would treat the read as ending at the 2nd to last exon but with a different transcript endpoint. When ESPRESSO classifies a read as ISM, it's only looking at the splice junctions in the read. Since the intron is more than 1000 bases, even if those reads are ISM they should fail the endpoint check against the annotated isoform they share the junctions with. ESPRESSO could report it as a novel isoform, but only if there are no reads assigned to the annotated isoform that it is ISM in relation to.
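To make the GT AG versus GT AA comparison concrete, here is a small Python sketch that reads the two bases at each end of an intron from a reference sequence. It is not ESPRESSO's code: the function names and the toy sequence are made up for illustration, and treating only GT/AG as the expected motif is an assumption of this sketch.

```python
_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def revcomp(seq):
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def intron_motif(reference, intron_start, intron_end, strand="+"):
    """Return (donor, acceptor) dinucleotides of an intron.

    intron_start/intron_end are 0-based, half-open coordinates on `reference`.
    For '-' strand transcripts the intron is reverse complemented first.
    """
    intron = reference[intron_start:intron_end].upper()
    if strand == "-":
        intron = revcomp(intron)
    return intron[:2], intron[-2:]


def is_canonical(donor, acceptor):
    # Assumption of this sketch: only GT/AG counts as the expected motif.
    return (donor, acceptor) == ("GT", "AG")


if __name__ == "__main__":
    # Toy reference: 6 exon bases, a 14 base intron, 6 exon bases.
    toy = "AAACCC" + "GTTTTTCCAATTAG" + "GGGTTT"
    annotated = intron_motif(toy, 6, 20)  # intron ends at the AG
    shifted = intron_motif(toy, 6, 16)    # acceptor 4 bases upstream
    print(annotated, is_canonical(*annotated))
    print(shifted, is_canonical(*shifted))
```

In the toy sequence, moving the acceptor by 4 bases turns the motif from GT/AG into GT/AA, which is the same pattern described above for the read alignments.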
In my RNA-seq data there's a novel isoform (extended exon) which is not detected by ESPRESSO. (See BAM file with ESPRESSO gtf below). Only two reads are visible supporting the isoform, but there are actually many more.
Is there any option I can try to identify it?
Many thanks