Comparing predictions made with NUG starts to those made with AUG-only starts, we see far more canonical_extended ORFs with NUG than expected. For example, there are usually more canonical_extended than canonical predictions.
(There is no strand bias, though.)
This seems to occur when there is a "close" upstream, in-frame NUG start. The attached image shows an example. Such upstream, in-frame NUG starts appear to be much more common than upstream, in-frame AUGs.
This is a result of the "select the longest ORF for each stop codon" postprocessing step. Thus, there is not really a simple fix for the behavior. A few ideas are:
Incorporate the ORF type in the model, where "canonical" is more likely to be translated than others.
Run both AUG and NUG and subtract out the AUG canonical results from the NUG predictions.
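To make the mechanism and the second idea concrete, here is a minimal sketch. It is not the pipeline's actual code: the `Orf` record, the coordinates, and the helper names `longest_per_stop` and `subtract_aug_canonical` are all hypothetical, and "subtract" is read here as "drop any NUG prediction whose stop codon already has a canonical call in the AUG-only run", which is only one possible interpretation.

```python
from collections import namedtuple

# Hypothetical record for illustration only; not the pipeline's data model.
Orf = namedtuple("Orf", ["seqname", "strand", "start", "stop", "orf_type"])


def stop_key(orf):
    """Identify the stop codon an ORF ends at."""
    return (orf.seqname, orf.strand, orf.stop)


def orf_length(orf):
    """Genomic span between start and stop (splicing is ignored here)."""
    return abs(orf.stop - orf.start)


def longest_per_stop(orfs):
    """Keep only the longest candidate ORF for each stop codon."""
    best = {}
    for orf in orfs:
        key = stop_key(orf)
        if key not in best or orf_length(orf) > orf_length(best[key]):
            best[key] = orf
    return list(best.values())


def subtract_aug_canonical(nug_predictions, aug_predictions):
    """Drop NUG predictions whose stop codon has a canonical AUG-only call."""
    canonical_stops = {
        stop_key(orf) for orf in aug_predictions if orf.orf_type == "canonical"
    }
    return [orf for orf in nug_predictions if stop_key(orf) not in canonical_stops]


# Made-up coordinates: an annotated AUG start at 100 and a "close" upstream,
# in-frame NUG start at 85 (15 nt upstream), both ending at the same stop.
annotated = Orf("chr1", "+", 100, 400, "canonical")
extended = Orf("chr1", "+", 85, 400, "canonical_extended")
# A second extended ORF whose stop codon has no canonical AUG-only call.
other = Orf("chr2", "+", 40, 250, "canonical_extended")

# AUG-only run: the upstream NUG is never a candidate, so canonical survives.
aug_predictions = longest_per_stop([annotated])

# NUG run: the extended ORF is longer, so it displaces the canonical one.
nug_predictions = longest_per_stop([annotated, extended, other])

# Idea 2: remove NUG calls that merely shadow a canonical AUG-only call.
remaining = subtract_aug_canonical(nug_predictions, aug_predictions)
```

In this toy case the NUG run reports only the extended ORF for the first stop codon, and the subtraction leaves just the ORF on chr2. The canonical calls themselves would then be taken from the AUG-only run.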
There is not an immediate plan to address this issue.
I see the same phenomenon, and looking at the ORF profiles it actually seems to be correct because of leaky scanning: some fraction of translation starts at each potential start codon, which is what should be expected. The only problem is the lack of annotation for these leaky-scanning-derived isoforms. Sometimes as little as 0.1% of translation initiates at a particular alternative start codon, yet the only isoform found in the final "filtered.prediction.orfs.bed" is the longest one. Could one work with the bayes_factors to delineate between possible starts?