Statistics

Extension

File Extension Count
gff 13
gff3 153

Headers

Header Key Value Percentage Using at least Once
# 59 17.47%
FASTA 2 1.20%
Index-subfeatures 2 1.20%
NOTE: 1 0.60%
Type 1 0.60%
attribute-ontology 1 0.60%
color-code 1 0.60%
date 2 1.20%
feature-ontology 6 3.61%
genome-build 2 1.20%
gff-version 3 3 1.81%
gff-version 3 1 0.60%
gff-version 2 1 0.60%
gff-version 3 164 83.13%
hdr 1 0.60%
history 3 0.60%
line-order 1 0.60%
max-num-mismatches 1 0.60%
max-read-length 1 0.60%
primer-base 1 0.60%
sequence-region 274 56.63%
solid-gff-version 1 0.60%
source 1 0.60%
source-ontology 1 0.60%
source-version 1 0.60%
species 8 4.82%
time 1 0.60%
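
A minimal sketch of how a table like the one above could be tallied, assuming each file is available as a list of lines. The function name `directive_usage` and the choice to key on the first token after `##` are illustrative assumptions, not the method used to produce these numbers; plain `#` comment lines are grouped under `#`.

```python
from collections import Counter

def directive_usage(files):
    """files: mapping of file name -> iterable of lines (hypothetical input shape)."""
    total = Counter()        # total occurrences of each header key
    files_using = Counter()  # number of files that contain the key at least once
    for lines in files.values():
        seen = set()
        for line in lines:
            if not line.startswith("#"):
                continue
            if line.startswith("##"):
                parts = line[2:].split(None, 1)
                key = parts[0] if parts else "#"
            else:
                key = "#"
            total[key] += 1
            seen.add(key)
        files_using.update(seen)
    n = len(files)
    # key -> (occurrences, percentage of files using the key at least once)
    return {k: (total[k], 100.0 * files_using[k] / n) for k in total}
```

With this definition the percentage can never exceed 100, because each file contributes at most once per key.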

Tabs

Tabs In Line Count
10 140
9 1286559
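
A short sketch of a per-line tab histogram. It counts tab characters on non-comment, non-blank lines; a feature line with the nine columns required by the GFF3 specification contains eight tab characters. Whether the table above counts tab characters or tab-separated fields is not stated, so the function name `tab_histogram` and the counting rule are assumptions.

```python
from collections import Counter

def tab_histogram(lines):
    """Map 'number of tab characters in a line' -> 'number of such lines'."""
    hist = Counter()
    for line in lines:
        # Skip directives, comments, and blank lines
        if line.startswith("#") or not line.strip():
            continue
        hist[line.count("\t")] += 1
    return hist
```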

Tools

Tools Count
feature 31
GenBank 22
. 19
CPT_ShineFind 8
CPT 8
annotation 7
RefSeq 6
maker 6
getOrfsOrCds 6
blast 6
blastn 6
est2genome 6
FlyBase 5
progressiveMauve 5
AU9 5
glimmer 4
TMHMM 4
DGRC_1 3
example 3
cpt.fixModel 3

Feature Types

Feature Count
match_part 644648
match 314926
orthologous_to 57863
Z_over_Input 50198
CDS 42052
exon 18843
TF_binding_site 15781
mRNA 15676
gene 11570
RNAi_reagent 9416
oligonucleotide 9257
paralogous_to 8610
oligo 8303
protein_coding_gene 7969
polypeptide_region 6555
Shine_Dalgarno_sequence 5813
Match 5635
transposable_element_insertion_site 4837
five_prime_UTR 4785
intron 4774
exon_junction 4709
three_prime_UTR 3722
polyA_site 3425
contig 2204
TSS 1990
golden_path 1817
region 1753
BAC_cloned_genomic_insert 1603
regulatory_region 1491
stop_codon 1425
start_codon 1423
orthologous_region 1382
protein 1105
polypeptide 1062
sgRNA 940
Topological domain 898
origin_of_replication 568
insulator 556
deletion 556
pcr_product 556
PCR_product 500
Transmembrane 454
Chain 453
point_mutation 396
chromosome_breakpoint 350
repeat_region 313
chromosome_band 304
terminator 260
breakpoint 234
ncRNA 233
delins 181
tRNA 132
protein_binding_site 126
protein_match 126
ncRNA_gene 115
read 112
sequence_feature 106
transposable_element 102
sequence_variant 91
HSP 88
insertion_site 83
pseudogene 72
tandem_repeat 58
pseudogenic_transcript 54
modified_RNA_base_feature 53
Beta strand 47
non_canonical_five_prime_splice_site 47
transcript 45
rescue_region 44
syntenic_region 44
non_canonical_three_prime_splice_site 41
mature_protein_region 38
rescue_fragment 34
experimental_result_region 34
long_terminal_repeat 32
remark 31
binding_site 29
ARS 26
noncoding_exon 26
insertion 25
expressed_sequence_match 24
Disulfide bond 22
miRNA 20
silencer 20
Lipobox 20
cds 18
nucleotide_to_protein_match 16
polypeptide_domain 15
pre_miRNA 14
MNV 11
rRNA 11
Helix 10
enhancer 10
my_feature 10
motif 10
Glycosylation 9
snoRNA 9
DNA 9
Peptide 8
misc_feature 8
SNP 8
transposable_element_gene 8
complex_substitution 7
snRNA 7
supercontig 7
regulatory 7
micro_array_oligo 7
nucleotide_match 7
Domain 6
Nucleotide binding 6
promoter 6
recombination_feature 6
loop 6
EST_match 6
Site 5
Turn 5
chromosome 5
Active site 4
stop_codon_read_through 4
peptide_helix 4
telomere 4
LTR_retrotransposon 4
sequence_alteration 3
mature_peptide 3
mobile_genetic_element 3
5'-UTR 3
gene_component_region 3
STS 3
sequence_difference 3
direct_repeat 3
Signal peptide 2
Propeptide 2
Non-terminal residue 2
BAC 2
coding 2
processed_transcript 2
3'-UTR 2
biological_region 2
protein_coding_primary_transcript 2
UTR 2
centromere 2
Region 1
DNA binding 1
Modified residue 1
chromosome_arm 1
right_end_read 1
left_end_read 1
trace 1
ultracontig 1
lincRNA_gene 1
lincRNA 1
signal_peptide 1
inverted_repeat 1
scaffold 1
clone_start 1
clone_end 1
Feature (Using SO term) Count

Scores?

Score Range Count
Does Not Use Scores 113
[0, 100] 30
[0, 10000] 13
[-2.8, 8728.0] 2
[102.0, 17680.0] 2
[-33.075, 76535.0] 1
[-4438.107454, -183.539619] 1
[-11.2071, 9.9974] 1
[-37.0, 336.0] 1
[-37.0, 207.0] 1
[0, 1000] 1
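
The score ranges above can be derived per file roughly as follows. This is a sketch under the assumption that the score is read from column 6 and that `.` means "no score"; the name `score_range` is hypothetical.

```python
def score_range(lines):
    """Return (min, max) of the numeric scores in column 6, or None if no line has a score."""
    scores = []
    for line in lines:
        if line.startswith("#"):
            continue
        cols = line.rstrip("\r\n").split("\t")
        if len(cols) < 6 or cols[5] == ".":
            continue  # malformed line or score not used
        try:
            scores.append(float(cols[5]))
        except ValueError:
            continue  # non-numeric score, ignored in this sketch
    if not scores:
        return None
    return (min(scores), max(scores))
```

A file for which this returns `None` would fall under "Does Not Use Scores".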

Tags

Tag Value Percentage Using at least Once
Name 1070347 322393.67%
Target 832972 250895.18%
target_type 808204 243434.94%
Parent 742440 223626.51%
ID 568467 171225.00%
program 307207 92532.23%
programversion 307207 92532.23%
sourcename 307207 92532.23%
Dbxref 127851 38509.34%
qseq 69585 20959.34%
sseq 69585 20959.34%
to_species 67899 20451.51%
to_name 66473 20021.99%
diopt_source 60884 18338.55%
__.__ 50273 15142.47%
Alias 39408 11869.88%
library 35910 10816.27%
description 18771 5653.92%
bound_moiety 17925 5399.10%
gene_id 16458 4957.23%
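
The percentages in this table exceed 100%, so they cannot be read as a share of files. A sketch of how a true "percentage of files using the tag at least once" could be computed is shown here; it assumes attributes sit in column 9 as `tag=value` pairs separated by `;`, and the name `tag_usage` is hypothetical.

```python
from collections import Counter

def tag_usage(files):
    """files: mapping of file name -> iterable of lines (hypothetical input shape)."""
    total = Counter()        # total occurrences of each tag
    files_using = Counter()  # number of files using the tag at least once
    for lines in files.values():
        seen = set()
        for line in lines:
            if line.startswith("#"):
                continue
            cols = line.rstrip("\r\n").split("\t")
            if len(cols) < 9:
                continue
            for pair in cols[8].split(";"):
                if "=" not in pair:
                    continue
                tag = pair.split("=", 1)[0].strip()
                total[tag] += 1
                seen.add(tag)
        files_using.update(seen)
    n = len(files)
    # tag -> (occurrences, percentage of files using the tag at least once)
    return {t: (total[t], 100.0 * files_using[t] / n) for t in total}
```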

Percent Encoding

Tag Value Count
%2C , 20561
%20 13404
%3D = 9337
%2A * 5069
%3B ; 2164
%2B + 1415
%7C |
%28 ( 979
%29 ) 979
%40 @ 856
%27 ' 596
%25 % 154
%26 & 145
%23 # 118
%C3 115
%C2 106
%2F / 74
%C4 26
%87 8
%80 6
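
A sketch of counting percent-encoded sequences in attribute values, using a regular expression and the standard library's `urllib.parse.unquote` for decoding. The name `percent_escapes` is hypothetical. Rows such as `%C3`, `%C2`, `%87`, and `%80` have no printable value on their own; they are most likely individual bytes of multi-byte UTF-8 characters, which only decode when taken together.

```python
import re
from collections import Counter
from urllib.parse import unquote

PERCENT_RE = re.compile(r"%[0-9A-Fa-f]{2}")

def percent_escapes(attribute_columns):
    """Count each %XX escape (normalized to upper case) across column-9 strings."""
    counts = Counter()
    for col in attribute_columns:
        counts.update(m.upper() for m in PERCENT_RE.findall(col))
    return counts

# Decoding: single-byte escapes map to one character,
# multi-byte UTF-8 escapes must be decoded as a group.
decoded_comma = unquote("a%2Cb")       # "a,b"
decoded_accent = unquote("caf%C3%A9")  # "café"
```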

Non-percent encoded values

Byte Count
0x20 2710255
0xa 1286558

Trailing Semicolon in field 9

File Count
random-gff3s-from-helena/yeast_chr1+2.gff3 616
random-gff3s-from-helena/Genus_species.gff3 116
random-gff3s-from-helena/annot_mapped.gff3 38
gffutils/glimmer_nokeyval.gff3 4
gffutils/gms2_example.gff3 4
gffutils/FBgn0031208.gff 3
gffutils/unsanitized.gff 1
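
A per-file count like the one above could be obtained with a check along these lines. It is a sketch that assumes a "trailing semicolon" means column 9 ends with `;` after stripping trailing whitespace; the name `count_trailing_semicolons` is hypothetical.

```python
def count_trailing_semicolons(lines):
    """Count feature lines whose column 9 ends with ';'."""
    n = 0
    for line in lines:
        if line.startswith("#"):
            continue
        cols = line.rstrip("\r\n").split("\t")
        if len(cols) >= 9 and cols[8].rstrip().endswith(";"):
            n += 1
    return n
```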

Top Level Features

Feature Count
match 313671
orthologous_to 57863
Z_over_Input 50198
TF_binding_site 15781
gene 11570
RNAi_reagent 9416
oligonucleotide 9257
paralogous_to 8610
oligo 8303
protein_coding_gene 7969
polypeptide_region 6555
Match 5635
CDS 5408
transposable_element_insertion_site 4837
exon_junction 4709
polyA_site 3414
TSS 1990
golden_path 1817
region 1743
BAC_cloned_genomic_insert 1603
regulatory_region 1491
orthologous_region 1382
protein 1104
polypeptide 1062
sgRNA 940
Shine_Dalgarno_sequence 860
origin_of_replication 568
insulator 556
deletion 556
pcr_product 556
PCR_product 500
Chain 453
point_mutation 396
chromosome_breakpoint 350
repeat_region 312
chromosome_band 304
terminator 260
breakpoint 229
delins 181
protein_binding_site 126
protein_match 126
ncRNA_gene 115
read 112
sequence_feature 106
transposable_element 102
insertion_site 83
mRNA 82
tandem_repeat 58
pseudogene 58
modified_RNA_base_feature 53
Beta strand 47
rescue_region 44
syntenic_region 44
transcript 42
mature_protein_region 36
rescue_fragment 34
experimental_result_region 34
long_terminal_repeat 32
remark 31
sequence_variant 30
ARS 26
insertion 25
expressed_sequence_match 24
Disulfide bond 22
miRNA 20
silencer 20
binding_site 19
contig 18
tRNA 18
nucleotide_to_protein_match 16
polypeptide_domain 15
MNV 11
Helix 10
enhancer 10
my_feature 10
motif 10
Glycosylation 9
DNA 9
Peptide 8
misc_feature 8
SNP 8
transposable_element_gene 8
complex_substitution 7
micro_array_oligo 7
Domain 6
Nucleotide binding 6
promoter 6
regulatory 6
loop 6
EST_match 6
Site 5
Turn 5
chromosome 5
Active site 4
peptide_helix 4
telomere 4
LTR_retrotransposon 4
sequence_alteration 3
mobile_genetic_element 3
recombination_feature 3
sequence_difference 3
direct_repeat 3
snoRNA 3
Signal peptide 2
Propeptide 2
Non-terminal residue 2
BAC 2
coding 2
processed_transcript 2
biological_region 2
protein_coding_primary_transcript 2
ncRNA 2
centromere 2
Region 1
DNA binding 1
Modified residue 1
chromosome_arm 1
trace 1
ultracontig 1
lincRNA_gene 1
STS 1
inverted_repeat 1
scaffold 1
supercontig 1
snRNA 1
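
A sketch of how top-level features might be counted by type, assuming "top level" means a feature line whose column 9 carries no `Parent` tag. That definition and the name `top_level_types` are assumptions; the source does not state the rule it used.

```python
from collections import Counter

def top_level_types(lines):
    """Count feature types (column 3) for lines without a Parent attribute."""
    counts = Counter()
    for line in lines:
        if line.startswith("#"):
            continue
        cols = line.rstrip("\r\n").split("\t")
        if len(cols) < 9:
            continue
        tags = {p.split("=", 1)[0].strip() for p in cols[8].split(";") if "=" in p}
        if "Parent" not in tags:
            counts[cols[2]] += 1
    return counts
```

Under this rule a type such as `CDS` can appear both here and in the full feature table with a smaller count here, since only parentless occurrences are included.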