Changelog.txt
## v5.0.0 August 26, 2017
-algorithm updates: an orf is kept only if its frame[0] score is > 0 and is the max across the first 3 reading frames (instead of all 6), and the orf with the highest frame[0] score is chosen, allowing for minimal overlap among selected predictions.
-option --single_best_only provides the single longest of the selected orfs per contig.
-long orfs unlikely to appear in random sequence are automatically selected as candidates, with the minimum long orf length set dynamically according to GC content.
-orf score and blast or pfam info are propagated to the gff3 output
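
The frame-score rule and the overlap-limited selection described in the first item above can be sketched roughly as follows. This is a minimal illustration, not TransDecoder's actual code: the `ScoredOrf` type, the function names, and the `max_overlap` parameter are all hypothetical, and coordinates are assumed to be 1-based and inclusive.

```python
from typing import List, NamedTuple, Sequence


class ScoredOrf(NamedTuple):
    """Hypothetical record: an orf with per-frame scores, own frame first."""
    orf_id: str
    start: int  # 1-based, inclusive (assumption)
    end: int    # 1-based, inclusive (assumption)
    frame_scores: Sequence[float]


def passes_frame_test(frame_scores: Sequence[float]) -> bool:
    """True if frame[0] is > 0 and is the max of the first 3 frames."""
    return frame_scores[0] > 0 and frame_scores[0] == max(frame_scores[:3])


def overlap_length(a: ScoredOrf, b: ScoredOrf) -> int:
    """Number of positions shared by two orfs (0 if they are disjoint)."""
    return max(0, min(a.end, b.end) - max(a.start, b.start) + 1)


def select_orfs(orfs: List[ScoredOrf], max_overlap: int = 0) -> List[ScoredOrf]:
    """Greedy pick: best frame[0] score first, skipping orfs that overlap
    an already chosen orf by more than max_overlap positions."""
    chosen: List[ScoredOrf] = []
    ranked = sorted(orfs, key=lambda o: o.frame_scores[0], reverse=True)
    for orf in ranked:
        if not passes_frame_test(orf.frame_scores):
            continue
        if all(overlap_length(orf, kept) <= max_overlap for kept in chosen):
            chosen.append(orf)
    return chosen
```

Only the three forward frames take part in the comparison, so a high score in frames 4 to 6 does not disqualify an orf in this sketch.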
## v4.1.0
-single best orf now selected by default. If more than the single best orf is wanted, use the --all_good_orfs parameter.
-start codon refinement is now done by default. To turn it off and get the original behavior of extending to the longest orf position, use parameter: --no_refine_starts
-cdhit has been removed and replaced by our own fast method for removing redundancies.
-selection of coding regions is strictly governed by Markov-based likelihood scores across reading frames. No auto-retention of long orfs by default, but can be activated by parameter: --retain_long_orfs_length
** all v4 releases pre-v4.1 were fairly quickly retracted due to bugs and insufficient benchmarking **
## v3.0.2 release Oct 31, 2016
minor bugfix release - when checking for required utilities to be installed, doesn't require ^/ in path
## v3.0.0 release April 26, 2016
TransDecoder v3.0.0 includes the following changes:
TransDecoder.LongOrfs now includes parameter '--gene_trans_map ' as a way to retain the gene identifier information. In the case of Cufflinks and Trinity, the gene identifiers will automatically be recognized and retained. For PASA and other inputs, it is necessary to provide the gene-to-transcript identifier mappings in order to generate isoform-clustered output files (gff3).
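
For inputs where gene identifiers are not recognized automatically, a mapping file can be produced with a few lines of Python. This is a sketch only: it assumes a two-column, tab-delimited layout with the gene identifier first and the transcript identifier second (the layout Trinity uses for its own gene-to-transcript map), and the function name and identifiers are made up for illustration.

```python
import io
from typing import Iterable, TextIO, Tuple


def write_gene_trans_map(pairs: Iterable[Tuple[str, str]], out: TextIO) -> None:
    """Write one 'gene_id<TAB>transcript_id' line per (gene, transcript) pair.

    The column order is an assumption; confirm it against the tool's usage
    message before relying on it.
    """
    for gene_id, transcript_id in pairs:
        out.write(f"{gene_id}\t{transcript_id}\n")


# Usage with hypothetical identifiers, writing to an in-memory buffer.
buffer = io.StringIO()
write_gene_trans_map([("geneA", "tx1"), ("geneA", "tx2"), ("geneB", "tx3")], buffer)
map_text = buffer.getvalue()
```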
TransDecoder.Predict now includes flag ' --single_best_orf ' to retain only the single 'best' ORF per transcript. ORFs are prioritized according to homology information (if given the blast and pfam results) and by sequence length, with longer ORFs preferred.
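
The prioritization described above (homology first, then length) amounts to a two-part sort key. The sketch below assumes each candidate is reduced to a length and a single yes/no homology flag; the `OrfCandidate` type and `pick_single_best` function are hypothetical names, and the real implementation may weigh blast and pfam evidence differently.

```python
from typing import List, NamedTuple, Optional


class OrfCandidate(NamedTuple):
    """Hypothetical summary of one ORF on a transcript."""
    orf_id: str
    length: int
    has_homology: bool  # True if a blast or pfam hit was supplied for this ORF


def pick_single_best(candidates: List[OrfCandidate]) -> Optional[OrfCandidate]:
    """Prefer ORFs with homology support; break ties by longer length."""
    if not candidates:
        return None
    return max(candidates, key=lambda c: (c.has_homology, c.length))
```

Because `True` sorts above `False`, a shorter ORF with homology support outranks a longer ORF without it in this sketch.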
Codon phase information is now included in the GFF3 output files.
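
In GFF3, the phase of a CDS segment is the number of bases to skip at the start of that segment to reach the first base of the next complete codon. A minimal sketch of the calculation follows, assuming the segment lengths are given in transcription order and the first segment begins on a codon boundary (which may not hold for 5'-partial ORFs); the function name is hypothetical.

```python
from typing import List, Sequence


def cds_phases(segment_lengths: Sequence[int]) -> List[int]:
    """Return the GFF3 phase (0, 1, or 2) for each CDS segment.

    Assumes segments are listed in transcription order and that the
    coding sequence starts at the first base of a codon.
    """
    phases: List[int] = []
    consumed = 0  # coding bases seen in earlier segments
    for length in segment_lengths:
        phases.append((3 - consumed % 3) % 3)
        consumed += length
    return phases
```

For example, a first segment of 100 bases leaves one base of an unfinished codon, so the next segment needs two more bases to complete it and gets phase 2.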
The .mRNA files that were generated by default for genome-free TransDecoder runs are now deprecated, but of course the .cds and .pep files are provided.
The sample data sets include examples for running TransDecoder in a few different contexts, including starting from Trinity, PASA, or Cufflinks data.
More useful logging information is provided so it's clearer how many orfs are being retained and which are being eliminated along the way.
## 2016-03-11 v2.1 release
-added cpu parameter to TransDecoder.Predict
-retaining gene identifier information from cufflinks output
-added sample data and examples for the various use-cases.
## 2015-01-26 v2.0 release
-overhauled the build
-removed the active searching of Pfam and all MPI-related functionality
-runs in 2 phases:
    -TransDecoder.LongOrfs : extracts the long ORFs
    -TransDecoder.Predict : predicts the likely coding regions among the ORFs
        -this step can use Pfam and blastp search results (blast support is a new addition)
        -run Pfam and/or BlastP searches directly or try using "HPC GridRunner" (http://HpcGridRunner.github.io)
-moved to github
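
The two-phase run introduced in v2.0 can be scripted along these lines. This is a sketch under stated assumptions: the `-t`, `--retain_pfam_hits`, and `--retain_blastp_hits` flags are not named in this changelog and should be checked against each script's usage message, the file names are placeholders, and `transdecoder_commands` is a hypothetical helper that only builds the command lines without running them.

```python
from typing import List, Optional


def transdecoder_commands(
    transcripts: str,
    pfam_hits: Optional[str] = None,
    blastp_hits: Optional[str] = None,
) -> List[List[str]]:
    """Build the argument lists for the two phases, in run order.

    Flag names are assumptions; verify them with the scripts' own help output.
    """
    long_orfs = ["TransDecoder.LongOrfs", "-t", transcripts]
    predict = ["TransDecoder.Predict", "-t", transcripts]
    if pfam_hits:
        # Optional Pfam search results produced separately by the user.
        predict += ["--retain_pfam_hits", pfam_hits]
    if blastp_hits:
        # Optional BlastP search results produced separately by the user.
        predict += ["--retain_blastp_hits", blastp_hits]
    return [long_orfs, predict]
```

Each returned list can be passed to `subprocess.run(cmd, check=True)` in order, with the Pfam and/or BlastP searches run between the two phases.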
## 2014-07-04
-added 'make simple' to build just the essential components involving parafly and cdhit
-removed the 'cds.' prefix from the pep and cds sequence accessions.