Exploring the transposition profile of specific LTRs #168

Open
shuren7 opened this issue Apr 8, 2024 · 2 comments

shuren7 commented Apr 8, 2024

Dear Professor Ou (@oushujun),
I'm sorry to bother you when you are busy, but I have a question I need to ask.

My main question is how to determine the number of transpositions of a specific LTR-RT. Quantification of some LTRs with TEtranscripts indicates that they are active, but I would like to know how many times they have transposed within a single genome. Please note that I am currently working with a single genome only.

This relates to the LTR_retriever output files. I don't quite understand what "LTRlib.redundant.fa" contains, and I can't find an explanation anywhere. Does "redundant" mean that it contains all identified LTRs, and does "LTRlib.fa" contain the representative LTRs after duplicates are removed?

I think this comes down to finding the parent copy of an LTR, because I am interested in the possible origin of specific LTRs and in how many copies each has formed in the genome, especially for highly active LTRs and their transposition patterns.

I tried using BLASTN to find the parent of individual LTRs (I'm not sure this is correct), searching against "LTRlib.redundant.fa". The results are shown below. For hits with identity greater than 99% and length coverage greater than 99%, is it possible to tell whether the matching LTR is the parent of the query LTR or a copy produced by transposition?
Or should I run BLASTN against the genomic DNA instead?
[image: BLASTN results]
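The identity and coverage thresholds mentioned above can be applied programmatically. The following is a minimal sketch, not part of LTR_retriever, that assumes BLASTN was run with a custom tabular format containing the query length (`qlen`); the column order, the function name `count_near_identical_hits`, and the example IDs `LTR_A` and `LTR_B` are all assumptions made for illustration.

```python
from collections import Counter

# Assumed column order, e.g. from a BLAST+ call such as:
#   blastn -query ltr.fa -subject genome.fa \
#     -outfmt "6 qseqid sseqid pident length qstart qend sstart send qlen"
# Adjust the indices below if your output uses a different layout.


def count_near_identical_hits(lines, min_identity=99.0, min_coverage=99.0):
    """Count hits per query that pass both the identity and coverage cutoffs."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        cols = line.split("\t")
        qseqid = cols[0]
        pident = float(cols[2])
        qstart, qend, qlen = int(cols[4]), int(cols[5]), int(cols[8])
        # Percentage of the query covered by this single alignment.
        coverage = 100.0 * (qend - qstart + 1) / qlen
        if pident >= min_identity and coverage >= min_coverage:
            counts[qseqid] += 1
    return counts


# Hypothetical example rows (tab-separated), not real data.
example = [
    "LTR_A\tchr1\t99.80\t1000\t1\t1000\t5000\t5999\t1000",  # passes both cutoffs
    "LTR_A\tchr2\t99.50\t995\t3\t997\t200\t1194\t1000",     # passes both cutoffs
    "LTR_A\tchr3\t95.00\t1000\t1\t1000\t10\t1009\t1000",    # identity too low
    "LTR_B\tchr1\t100.00\t400\t1\t400\t900\t1299\t800",     # coverage too low
]
counts = count_near_identical_hits(example)
print(counts["LTR_A"], counts["LTR_B"])
```

If the query was extracted from the same genome, the count includes the query's own locus. Identity between two copies is symmetric, so a count like this does not by itself say which copy is the parent.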

Many LTRs leave fragments behind after transposition, but I may not consider that case for now, though that may depend on your opinion. In short, I just want to know the copy number of specific LTRs.

What would you suggest? I would be very grateful for an answer; it is very important to me!
Sincerely,
Shuren

oushujun (Owner) commented

I see your questions were answered in zhangrengang/TEsorter#54 (comment). Copy numbers of TEs are generally difficult to estimate because of nested insertions and fragmentation. RepeatMasker will give you an estimate, but I don't think it is very accurate because of these challenges.
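As a rough illustration of a RepeatMasker-based estimate, the sketch below tallies annotations per repeat name from a RepeatMasker `.out` table. It assumes the usual whitespace-separated layout, with the repeat name in the 10th column and the annotation ID in the 15th, and it assumes that fragments RepeatMasker judged to belong to one interrupted element share an ID. The function name `count_copies_from_rm_out` and the example rows are hypothetical, and the caveats about nesting and fragmentation still apply.

```python
from collections import defaultdict


def count_copies_from_rm_out(lines):
    """Count distinct annotation IDs per repeat name in RepeatMasker .out lines."""
    ids_per_repeat = defaultdict(set)
    for line in lines:
        cols = line.split()
        # Data rows start with an integer score and have at least 15 columns;
        # this also skips the header lines and blank lines.
        if len(cols) < 15 or not cols[0].isdigit():
            continue
        repeat_name = cols[9]   # assumed: "matching repeat" column
        element_id = cols[14]   # assumed: "ID" column shared by joined fragments
        ids_per_repeat[repeat_name].add(element_id)
    return {name: len(ids) for name, ids in ids_per_repeat.items()}


# Hypothetical example rows, not real RepeatMasker output.
example_out = [
    "   SW   perc perc perc  query     position in query    matching  repeat        position in repeat",
    "score   div. del. ins.  sequence  begin end   (left)   repeat    class/family  begin end (left)  ID",
    "",
    " 5000  1.2  0.0  0.0  chr1  1000  1500  (9000)  +  LTR_A  LTR/Gypsy  1    500  (0)    1",
    " 3000  2.0  0.1  0.0  chr1  2500  2900  (7600)  +  LTR_A  LTR/Gypsy  1    400  (100)  2",
    " 2800  2.1  0.0  0.0  chr1  3400  3500  (7000)  +  LTR_A  LTR/Gypsy  401  500  (0)    2",
    " 4100  0.5  0.0  0.0  chr2  100   900   (5000)  C  LTR_B  LTR/Copia  (0)  800  1      3",
]
copies = count_copies_from_rm_out(example_out)
print(copies)
```

Counting IDs rather than rows keeps two fragments of one interrupted element from being counted as two copies, which is one reason a plain row count tends to overestimate.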

Please let me know if you still have questions.

Shujun


shuren7 commented Apr 12, 2024

Thanks for the reply, Shujun. Okay, I'll try RepeatMasker.
However, given my biological question, I may not need a very precise copy number for specific LTRs. I read your original LTR_retriever article, and the nucleotide sequences seem to differ greatly between LTRs. So I think using BLASTN to estimate the approximate copy number of specific LTRs should be reliable, just as many results in physics are obtained by approximation. My main goal is to find out, approximately, which LTRs in the set are frequently active and which are not. After thinking and exploring over the past few days, I believe I have found a few LTRs that transpose particularly frequently.
May I ask your opinion?
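One way to turn BLASTN hits against the genome into an approximate ranking is to merge overlapping hits into distinct loci per query and then sort queries by locus count. The sketch below assumes hits have already been filtered and reduced to `(query, subject, sstart, send)` tuples; the function names and example IDs are hypothetical, and the result is only a rough proxy for transposition activity.

```python
from collections import defaultdict


def count_distinct_loci(hits):
    """Merge overlapping subject intervals per (query, subject); count loci per query."""
    by_key = defaultdict(list)
    for query, subject, sstart, send in hits:
        # Minus-strand BLASTN hits report sstart > send, so normalise the order.
        lo, hi = sorted((sstart, send))
        by_key[(query, subject)].append((lo, hi))

    loci = defaultdict(int)
    for (query, _subject), intervals in by_key.items():
        intervals.sort()
        current_end = None
        for lo, hi in intervals:
            if current_end is None or lo > current_end:
                loci[query] += 1          # a new, non-overlapping locus starts here
                current_end = hi
            else:
                current_end = max(current_end, hi)  # extend the current locus
    return dict(loci)


def rank_by_copy_number(loci):
    """Sort queries by descending locus count, breaking ties by name."""
    return sorted(loci.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical example hits, not real data.
hits = [
    ("LTR_A", "chr1", 100, 600),
    ("LTR_A", "chr1", 550, 900),    # overlaps the previous hit: same locus
    ("LTR_A", "chr1", 5000, 4500),  # minus-strand hit at a separate locus
    ("LTR_A", "chr2", 10, 500),
    ("LTR_B", "chr1", 100, 600),
]
ranking = rank_by_copy_number(count_distinct_loci(hits))
print(ranking)
```

Merging overlapping hits avoids counting one locus several times when BLASTN splits it into multiple alignments; nested insertions and fragments can still distort the counts.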
