Currently, both wrappers require maker and unconditionally run the maker2zff script with default settings over the annotation gff, with the aim of retaining only high-quality annotations for training.
This approach is suboptimal in several ways:
The default behavior of maker2zff is to filter features based on their qi and aed attribute values, but only if the feature states maker in its source column. When the source is not maker, no filtering is performed.
All of this happens behind the scenes. The user is never told that maker and non-maker gffs are treated differently.
The built-in filtering means the user cannot decide to apply less strict criteria than the default ones (unless they know about the secret workaround to disable default filtering by removing maker from the gff source column).
If default filtering eliminates all features, augustus and snap training fail with hard-to-diagnose errors.
Augustus example: Error: training set file jwd05e/76659517/working/genome.gff3 has neither Genbank nor GFF nor FASTA format! From this message it is very hard to deduce that genome.gff3 is the filtered intermediate file and that it is simply empty.
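To illustrate the kind of feedback that is missing, here is a rough sketch of a pre-flight check. The function names are hypothetical and not part of either wrapper or of maker; the only assumption is the standard 9-column, tab-separated GFF3 layout with the source in column 2.

```python
from collections import Counter


def summarize_gff_sources(gff_text):
    """Count feature lines per source (column 2) in GFF3 text."""
    sources = Counter()
    for line in gff_text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines, comments and directives
        if line.startswith(">"):
            break  # embedded FASTA section: no more feature lines
        columns = line.split("\t")
        if len(columns) == 9:
            sources[columns[1]] += 1
    return sources


def check_training_gff(gff_text):
    """Return a readable status instead of letting training fail later."""
    sources = summarize_gff_sources(gff_text)
    if not sources:
        return "error: gff contains no features, training would fail"
    if "maker" in sources:
        return "note: maker features found, maker2zff default filtering applies"
    return "note: no maker features found, maker2zff will not filter"
```

Running something like this on the input gff and again on the filtered intermediate file would tell the user both whether filtering will happen and whether anything survived it.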
Suggestion:
Offer a separate wrapper for maker2zff with full control over its settings, and have the downstream tools merely suggest filtering the input gff.
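As a sketch of what user-controlled filtering could look like (this is not how maker2zff is implemented): the snippet below selects transcripts by an explicit, user-supplied AED cutoff. It assumes the AED is stored as an `_AED` key in column 9 of mRNA features; that key name, the function names and the cutoff values are assumptions for illustration only. It only returns transcript IDs; keeping the matching gene, exon and CDS lines is left out.

```python
def parse_attributes(column9):
    """Turn 'key=value;key=value' into a dict."""
    pairs = (item.split("=", 1) for item in column9.strip().split(";") if "=" in item)
    return {key.strip(): value.strip() for key, value in pairs}


def select_mrnas_by_aed(gff_lines, max_aed, aed_key="_AED"):
    """Return IDs of mRNA features whose AED is <= max_aed."""
    kept = []
    for line in gff_lines:
        if line.startswith("#"):
            continue  # comments and directives
        columns = line.rstrip("\n").split("\t")
        if len(columns) != 9 or columns[2] != "mRNA":
            continue
        attributes = parse_attributes(columns[8])
        if aed_key not in attributes:
            continue  # no AED recorded: skipped in this sketch
        if float(attributes[aed_key]) <= max_aed:
            kept.append(attributes.get("ID", ""))
    return kept
```

With the cutoff exposed like this, a user could deliberately choose a looser value when the strict one leaves too few transcripts for training.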
Yeah I think these training tools would need some love, I remember implementing it for use within a maker workflow, but something more agnostic would be much better.
I don't have much bandwidth at the moment to work on it, maybe Romane could help at some point. Of course if anyone proposes something I'd be happy to review!