
GFF output and Fasta-headers give different start-coordinates of rRNA-genes #53

Open
jvollme opened this issue Jan 13, 2021 · 0 comments

jvollme commented Jan 13, 2021

Barrnap v0.9 produces GFF output and, optionally, FASTA output. The FASTA headers contain the coordinates of each rRNA prediction but not its e-value; the GFF output contains both.

I noticed that the start positions in the FASTA headers differ from the start positions in the GFF output (usually by 1).
This is a problem for me: to catch every possible variant of rRNA genes in metagenomic bins, I run barrnap for all three kingdoms (bac, arc and euk) consecutively, then identify overlapping hits and keep only the highest-scoring hit (i.e. the one with the lowest e-value) in each group of overlapping hits.
This means I have to match the GFF records (to get the e-values) against the FASTA headers.
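
For reference, here is a minimal sketch of the overlap filtering described above, working on the GFF records alone. It assumes that the e-value sits in the score column (column 6) of the GFF output and that two hits count as overlapping when they share a seqid and their intervals intersect, regardless of strand. The names `Hit`, `parse_gff` and `best_non_overlapping` are made up for this illustration and are not part of barrnap.

```python
from typing import Iterable, List, NamedTuple


class Hit(NamedTuple):
    seqid: str
    start: int      # 1-based, inclusive (GFF convention)
    end: int        # 1-based, inclusive
    strand: str
    evalue: float   # assumption: taken from the GFF score column
    attributes: str


def parse_gff(lines: Iterable[str]) -> List[Hit]:
    """Parse tab-separated GFF feature lines, skipping comments and blanks."""
    hits = []
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 9:
            continue  # not a complete 9-column feature line
        hits.append(Hit(cols[0], int(cols[3]), int(cols[4]),
                        cols[6], float(cols[5]), cols[8]))
    return hits


def overlaps(a: Hit, b: Hit) -> bool:
    """True if both hits lie on the same sequence and their intervals intersect."""
    return a.seqid == b.seqid and a.start <= b.end and b.start <= a.end


def best_non_overlapping(hits: Iterable[Hit]) -> List[Hit]:
    """Greedy filter: visit hits from lowest to highest e-value and keep a hit
    only if it does not overlap any hit that was already kept."""
    kept: List[Hit] = []
    for hit in sorted(hits, key=lambda h: h.evalue):
        if not any(overlaps(hit, k) for k in kept):
            kept.append(hit)
    return sorted(kept, key=lambda h: (h.seqid, h.start))
```

The GFF lines from the bac, arc and euk runs would be concatenated before calling `parse_gff`. The greedy pass is one possible reading of "keep the best hit per overlap"; chains of partially overlapping hits may need a different rule.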

Is this difference a bug, or does it follow from the GFF specification?
Can I safely assume that the start is off by exactly 1 in ALL cases and correct for it, or could the difference be less regular than that?
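
One way to check this empirically is to tabulate the start offset over all records of a run. GFF coordinates are 1-based and inclusive; if the FASTA headers instead use 0-based, half-open coordinates (the BED convention), a constant start offset of 1 with identical end coordinates would be expected. That is an assumption here, not something I have confirmed. The sketch below also assumes headers of the form `>NAME::SEQID:START-END(STRAND)` and matches records on seqid, end and strand, so the regular expression may need adjusting to the real output; `start_offsets` is a hypothetical helper name.

```python
import re
from collections import Counter
from typing import Iterable

# Assumed header layout: >NAME::SEQID:START-END(STRAND)
HEADER_RE = re.compile(
    r"^>(?P<name>[^:]+)::(?P<seqid>.+):(?P<start>\d+)-(?P<end>\d+)"
    r"\((?P<strand>[+.-])\)"
)


def start_offsets(gff_lines: Iterable[str],
                  fasta_header_lines: Iterable[str]) -> Counter:
    """Count (GFF start - FASTA header start) for records that share
    seqid, end coordinate and strand."""
    gff_starts = {}
    for line in gff_lines:
        if not line.strip() or line.startswith("#"):
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 9:
            continue  # not a complete 9-column feature line
        gff_starts[(cols[0], int(cols[4]), cols[6])] = int(cols[3])

    offsets = Counter()
    for line in fasta_header_lines:
        match = HEADER_RE.match(line)
        if match is None:
            continue  # sequence line or header in an unexpected format
        key = (match["seqid"], int(match["end"]), match["strand"])
        if key in gff_starts:
            offsets[gff_starts[key] - int(match["start"])] += 1
    return offsets
```

If the resulting counter has the single key `1`, the offset is constant for that run; any other key points to records that need a closer look.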

Alternatively, it would be most helpful to either add the corresponding fasta seqid to the gff-output, or the evalue to the fasta-header.
