Transcripts support different numbers of reads #66

liuxiaoning-wq · 2024-07-20T01:16:36Z

Hi there. Do I need full-length transcripts to use ESPRESSO? For example, if the reads are not full-length, do they need to be filtered out?
Why are different positions of the same transcript supported by different numbers of reads? For example, one of my images shows 20 at the beginning and 178 at the end, and another image shows 87.93 at the beginning and 100 at the end.

EricKutschera · 2024-07-22T15:54:42Z

ESPRESSO expects that some of the reads will cover all the splice junctions in the transcript that the read is from and that other reads will only cover some of the junctions in the transcript. If a read has a sequence of splice junctions that could have come from multiple different full-length transcripts, then ESPRESSO can assign a partial count for that read to each matching transcript.
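As a rough illustration of partial counts, here is a minimal Python sketch. It assumes each read is reduced to its ordered list of splice junctions and that an ambiguous read is split evenly among the transcripts it is compatible with. The even split, the function names, and the toy transcripts are assumptions made for illustration; this is not ESPRESSO's actual quantification code.

```python
from collections import defaultdict


def is_compatible(read_junctions, transcript_junctions):
    """True if the read's junctions form a contiguous run of the transcript's junctions."""
    n = len(read_junctions)
    if n == 0:
        return False
    return any(
        transcript_junctions[i:i + n] == read_junctions
        for i in range(len(transcript_junctions) - n + 1)
    )


def assign_partial_counts(reads, transcripts):
    """Split each read evenly among compatible transcripts (an assumed, simplified rule)."""
    counts = defaultdict(float)
    for read_junctions in reads:
        matches = [
            name
            for name, junctions in transcripts.items()
            if is_compatible(read_junctions, junctions)
        ]
        for name in matches:
            counts[name] += 1.0 / len(matches)
    return dict(counts)


# Hypothetical transcripts described by (donor, acceptor) junction coordinates.
transcripts = {
    "T1": [(100, 200), (300, 400), (500, 600)],
    "T2": [(100, 200), (300, 400)],
}
reads = [
    [(100, 200), (300, 400), (500, 600)],  # covers every junction of T1 only
    [(100, 200), (300, 400)],              # compatible with both T1 and T2
]
counts = assign_partial_counts(reads, transcripts)
print(counts)
```

In this toy case the second read contributes half a count to each transcript, which is one way non-integer values can end up in an abundance table.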

Those different numbers are likely from different transcripts, not different positions of the same transcript. If you zoom in more you may see more details about the transcripts.

liuxiaoning-wq · 2024-07-23T08:53:00Z

Thanks for your reply. Is there a correspondence between the N1 and N2 values for a gene_ID in the esp file and the values shown in IGV? If so, why are the values different? For example, for ENSG00000124713.6 the N1 value in the esp file is 1175.3, but N1 in IGV is 1461. Also, there are five transcripts in the esp file, but only two are shown in IGV. Why?

liuxiaoning-wq · 2024-07-23T09:04:10Z

Also, can the transcript ID be displayed in the IGV visualization?

EricKutschera · 2024-07-24T12:45:06Z

It looks like N1 and N2 are your sample names and you loaded the N1.bed and N2.bed files output from visualize.py. The image shown in the README doesn't load those sample level bed files. Instead it uses the transcript level bed files output under target_genes/: https://github.com/Xinglab/espresso/tree/v1.4.0?tab=readme-ov-file#igv

liuxiaoning-wq · 2024-07-25T02:00:28Z

Thank you very much. In this case, the target_genes directory of the visualization results contains only four bed files, covering each sample for one ENST transcript. However, there are five transcripts in the esp file, four of which are novel ESPRESSO transcripts. These four novel ESPRESSO transcripts are not shown in the target_genes files. Why is that?

EricKutschera · 2024-07-25T12:23:23Z

What was the command you ran? From https://github.com/Xinglab/espresso/tree/v1.4.0?tab=readme-ov-file#visualization-arguments

--target-gene TARGET_GENE
    the name of the gene to visualize. transcripts with
    name like {target-gene}-{number} or gene_id like
    {target-gene}.* will have output generated. Use the
    gene_id to match novel isoforms output by ESPRESSO
Based on that description, it seems like --target-gene GNMT would only create the bed files for ENST00000372808.4, since it has transcript name GNMT-201. If you run with --target-gene ENSG00000124713, then I think it should create output for the novel transcripts.
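To make the matching rule from the help text concrete, here is a small Python sketch. It is only a reading of the two patterns quoted above ({target-gene}-{number} for the transcript name, {target-gene}.* for the gene_id) and is not the code in visualize.py. The function name is hypothetical, and the "NA" transcript name used for the novel isoform is a placeholder assumption.

```python
import re


def matches_target_gene(target_gene, transcript_name, gene_id):
    """Apply the two patterns from the --target-gene help text (illustrative only)."""
    # Transcript name like GNMT-201: the gene name, a dash, then a number.
    name_pattern = re.compile(re.escape(target_gene) + r"-\d+")
    # gene_id like ENSG00000124713.6: the gene id, a dot, then anything.
    gene_id_pattern = re.compile(re.escape(target_gene) + r"\..*")
    return bool(
        name_pattern.fullmatch(transcript_name or "")
        or gene_id_pattern.fullmatch(gene_id or "")
    )


# Annotated transcript: matched by gene name through its transcript name.
print(matches_target_gene("GNMT", "GNMT-201", "ENSG00000124713.6"))
# Novel isoform without a GNMT-style name: not matched by gene name.
print(matches_target_gene("GNMT", "NA", "ENSG00000124713.6"))
# The same novel isoform is matched when the gene_id is used instead.
print(matches_target_gene("ENSG00000124713", "NA", "ENSG00000124713.6"))
```

Under this reading, passing the gene name only picks up transcripts that carry a name derived from it, while passing the gene_id picks up every transcript assigned to that gene.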

liuxiaoning-wq · 2024-07-26T03:10:20Z

Thanks again.
1. Can we just look at the numbers to determine whether there is a new transcript in this sample? If the number is zero, it means that the transcript does not exist, right?
2. Can these numbers represent the expression levels of these transcripts? Can we use these numbers to do differential analysis?

EricKutschera · 2024-07-26T13:24:08Z

Those numbers are from the abundance.esp file and they show the number of reads from that sample which ESPRESSO counted toward each isoform. If it's zero then ESPRESSO did not detect that transcript in that sample. Yes, you can use them for differential analysis (rMATS-long uses ESPRESSO output for differential analysis: https://github.com/Xinglab/rMATS-long)
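For checking which transcripts have a nonzero count in a given sample, a minimal Python sketch follows. It assumes abundance.esp is a tab-separated file with the columns transcript_ID, transcript_name and gene_ID followed by one column per sample; the helper name, the novel transcript ID, and all of the counts in the toy table are invented for illustration.

```python
import csv
import io


def detected_transcripts(esp_handle, sample):
    """Return {transcript_ID: count} for transcripts with a nonzero count in `sample`."""
    reader = csv.DictReader(esp_handle, delimiter="\t")
    detected = {}
    for row in reader:
        count = float(row[sample])
        if count > 0:
            detected[row["transcript_ID"]] = count
    return detected


# Toy table with made-up counts; a real run would open the abundance.esp file instead.
toy_esp = io.StringIO(
    "transcript_ID\ttranscript_name\tgene_ID\tN1\tN2\n"
    "ENST00000372808.4\tGNMT-201\tENSG00000124713.6\t10.5\t3\n"
    "NOVEL_EXAMPLE_1\tNA\tENSG00000124713.6\t0\t7.25\n"
)
in_n1 = detected_transcripts(toy_esp, "N1")
print(in_n1)
```

The counts are read as floats because reads shared between isoforms can contribute fractional values.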

liuxiaoning-wq · 2024-07-29T00:52:06Z

Thank you for your reply.

liuxiaoning-wq · 2024-07-30T02:42:24Z

Hello, these are three new transcripts of this gene. How can I obtain the sequences of these three new transcripts?

EricKutschera · 2024-07-30T12:17:16Z

The coordinates for those transcripts should be in the updated.gtf file. See this post for a way to get the sequence from the gtf and fasta: #48
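Without reproducing the method from that post, the general idea of going from exon coordinates to a transcript sequence can be sketched in Python as below. It assumes the exon coordinates are 1-based and inclusive, as in GTF, and that the chromosome sequence has already been loaded from the fasta into a string; the function name and the toy sequence are hypothetical.

```python
def transcript_sequence(chrom_seq, exons, strand):
    """Concatenate exon sequences; reverse complement for minus-strand transcripts.

    exons: list of (start, end) pairs in 1-based inclusive GTF coordinates.
    """
    ordered = sorted(exons)
    # Convert 1-based inclusive coordinates to Python's 0-based half-open slices.
    seq = "".join(chrom_seq[start - 1:end] for start, end in ordered)
    if strand == "-":
        complement = str.maketrans("ACGTNacgtn", "TGCANtgcan")
        seq = seq.translate(complement)[::-1]
    return seq


# Toy chromosome and exons, for illustration only.
chrom = "AAACCCGGGTTT"
exons = [(1, 3), (7, 9)]
plus_seq = transcript_sequence(chrom, exons, "+")
minus_seq = transcript_sequence(chrom, exons, "-")
print(plus_seq, minus_seq)
```

For real data the exon lines for each novel transcript_id would be taken from updated.gtf, and an existing tool is likely to be more convenient than hand-written slicing.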

liuxiaoning-wq · 2024-08-02T02:14:03Z

Okay, thank you.

liuxiaoning-wq · 2024-08-13T08:16:27Z

Hello, can I use ESPRESSO to analyze fusion genes? If so, how do I do it and where can I see the results?

EricKutschera · 2024-08-13T14:13:42Z

ESPRESSO doesn't specifically look for fusion genes, and it might filter out alignments for fusion genes. There is a filter for alignments with large insertions (the threshold defaults to 20 bp): https://github.com/Xinglab/espresso/blob/v1.5.0/src/ESPRESSO_S.pl#L924
Also, ESPRESSO will only use 1 alignment per read even if there are supplementary alignments.
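To show what a large-insertion filter could look like, here is a short Python sketch that inspects a SAM CIGAR string. It is not a translation of ESPRESSO_S.pl: the function names are hypothetical, and whether an insertion of exactly 20 bp passes or fails is an assumption made here.

```python
import re

# One CIGAR operation: a length followed by a SAM operation code.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def max_insertion_length(cigar):
    """Length of the longest insertion (I) operation in a CIGAR string, or 0 if none."""
    return max(
        (int(length) for length, op in CIGAR_OP.findall(cigar) if op == "I"),
        default=0,
    )


def passes_insertion_filter(cigar, max_insertion=20):
    """Keep an alignment only if no insertion exceeds max_insertion (assumed inclusive)."""
    return max_insertion_length(cigar) <= max_insertion


print(passes_insertion_filter("100M5I50M200N80M"))  # small insertion
print(passes_insertion_filter("100M25I50M"))        # insertion longer than 20 bp
```

An alignment spanning a fusion breakpoint can plausibly show up with a long insertion or soft clip in its primary alignment, which is one way a filter of this kind might drop it.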
