Package for creating and manipulating phylogenies
The package is registered in METADATA on Julia v0.6 and in the General registry on v0.7 and v1.0, so it can be installed with `add`. For example, on Julia v1.0:
(v1.0) pkg> add Phylo
Resolving package versions...
Updating `~/.julia/environments/v1.0/Project.toml`
[aea672f4] + Phylo v0.3.2
Updating `~/.julia/environments/v1.0/Manifest.toml`
[7d9fca2a] + Arpack v0.2.2
[9e28174c] + BinDeps v0.8.9
[31c24e10] + Distributions v0.16.2
[90014a1f] + PDMats v0.9.4
[aea672f4] + Phylo v0.3.2
[1fd47b50] + QuadGK v2.0.0
[79098fc4] + Rmath v0.5.0
[276daf66] + SpecialFunctions v0.7.0
[4c63d2b9] + StatsFuns v0.7.0
[0796e94c] + Tokenize v0.5.2
[30578b45] + URIParser v0.4.0
[4607b0f0] + SuiteSparse
(v1.0) pkg>
Note that some features are currently broken on the binary release of Julia v1.0.3 for Linux. Unfortunately, this appears to be a bug in the Julia release. The current workarounds are to remain on Julia v1.0.2 or to use the Julia v1.1.0-rc1 release candidate.
The package is tested against the current Julia v1.0 release, and also against the previous v0.6 and v0.7 versions, on Linux, macOS, and Windows.
Contributions are very welcome, as are feature requests and suggestions. Please open an issue if you encounter any problems or would just like to ask a question.
Phylo is a Julia package that provides functionality for generating phylogenetic trees to feed into our Diversity package to calculate phylogenetic diversity. Phylo is currently in alpha and is missing much functionality that people may want, so please raise an issue if/when you find problems or missing functionality - don't assume that I know! Currently the package can be used to make trees manually, to generate random trees using the framework from Distributions, and to read newick and nexus format trees from files. For instance, to construct a sampler for 5-tip non-ultrametric trees, and then generate one or two random trees of that type (the examples below are from the master branch, but work similarly on the current release):
julia> using Phylo
julia> nu = Nonultrametric(5);
julia> tree = rand(nu)
PolytomousTree{ManyRoots,DataFrames.DataFrame,Dict{String,Any}} with 5 tips, 9 nodes and 8 branches.
Leaf names are tip 1, tip 2, tip 3, tip 4 and tip 5
julia> trees = rand(nu, ["Tree 1", "Tree 2"])
TreeSet with 2 trees, each with 5 tips.
Tree names are Tree 2 and Tree 1
Tree 2: PolytomousTree{ManyRoots,DataFrames.DataFrame,Dict{String,Any}} with 5 tips, 9 nodes and 8 branches.
Leaf names are tip 1, tip 2, tip 3, tip 4 and tip 5
Tree 1: PolytomousTree{ManyRoots,DataFrames.DataFrame,Dict{String,Any}} with 5 tips, 9 nodes and 8 branches.
Leaf names are tip 1, tip 2, tip 3, tip 4 and tip 5
The package also provides iterators and filtered iterators over the branches, nodes, branch names and node names of a tree, though these may soon be superseded by a simpler strategy.
julia> collect(nodeiter(tree))
9-element Array{Node{ManyRoots,String,Branch{ManyRoots,String}},1}:
Node{ManyRoots,String,Branch{ManyRoots,String}}("tip 1", Branch{ManyRoots,String}(7, "Node 4", "tip 1", 1.1281538707050067), Branch{ManyRoots,String}[])
Node{ManyRoots,String,Branch{ManyRoots,String}}("tip 2", Branch{ManyRoots,String}(1, "Node 1", "tip 2", 1.4283209045962866), Branch{ManyRoots,String}[])
Node{ManyRoots,String,Branch{ManyRoots,String}}("tip 3", Branch{ManyRoots,String}(4, "Node 2", "tip 3", 0.6551342237894014), Branch{ManyRoots,String}[])
Node{ManyRoots,String,Branch{ManyRoots,String}}("tip 4", Branch{ManyRoots,String}(2, "Node 1", "tip 4", 0.0029623552238387534), Branch{ManyRoots,String}[])
Node{ManyRoots,String,Branch{ManyRoots,String}}("tip 5", Branch{ManyRoots,String}(3, "Node 2", "tip 5", 0.25029135145968845), Branch{ManyRoots,String}[])
Node{ManyRoots,String,Branch{ManyRoots,String}}("Node 1", Branch{ManyRoots,String}(5, "Node 3", "Node 1", 0.3763450182758717), Branch{ManyRoots,String}[Branch{ManyRoots,String}(1, "Node 1", "tip 2", 1.42832), Branch{ManyRoots,String}(2, "Node 1", "tip 4", 0.00296236)])
Node{ManyRoots,String,Branch{ManyRoots,String}}("Node 2", Branch{ManyRoots,String}(6, "Node 3", "Node 2", 0.20796611994615047), Branch{ManyRoots,String}[Branch{ManyRoots,String}(3, "Node 2", "tip 5", 0.250291), Branch{ManyRoots,String}(4, "Node 2", "tip 3", 0.655134)])
Node{ManyRoots,String,Branch{ManyRoots,String}}("Node 3", Branch{ManyRoots,String}(8, "Node 4", "Node 3", 3.5927792827310996), Branch{ManyRoots,String}[Branch{ManyRoots,String}(5, "Node 3", "Node 1", 0.376345), Branch{ManyRoots,String}(6, "Node 3", "Node 2", 0.207966)])
Node{ManyRoots,String,Branch{ManyRoots,String}}("Node 4", nothing, Branch{ManyRoots,String}[Branch{ManyRoots,String}(7, "Node 4", "tip 1", 1.12815), Branch{ManyRoots,String}(8, "Node 4", "Node 3", 3.59278)])
julia> collect(nodenamefilter(isroot, tree))
1-element Array{String,1}:
"Node 4"
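As a rough illustration of how these iterators can be combined, the sketch below uses only functions already shown above (`Nonultrametric`, `rand`, `nodeiter`, `nodenamefilter` and `isroot`). The helper name `summarise` is hypothetical and not part of Phylo. It also assumes that sampled trees are strictly bifurcating, which the outputs above suggest (5 tips, 9 nodes) but which is not stated explicitly.

```julia
using Phylo

# Hypothetical helper (not part of Phylo): count the nodes of a tree and
# collect the names of its root nodes using the iterators shown above.
function summarise(tree)
    nodes = collect(nodeiter(tree))                # every node object
    roots = collect(nodenamefilter(isroot, tree))  # names of root nodes only
    return (length(nodes), roots)
end

ntips = 5
tree = rand(Nonultrametric(ntips))
nnodes, roots = summarise(tree)

# Under the bifurcating assumption, a rooted tree with n tips should have
# 2n - 1 nodes and exactly one root.
println("nodes: ", nnodes, ", roots: ", roots)
```

The filter function is passed first so that other predicates can be swapped in without changing the rest of the helper.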
The main purpose of this package at present is to provide a framework for phylogenetics to use in our Diversity package. Both packages will be adapted as needed until they function as required (though they already work together reasonably successfully).
It can also read newick trees either from strings or files:
julia> using Phylo
julia> simpletree = parsenewick("((,Tip:1.0)Internal,)Root;")
PolytomousTree{ManyRoots,DataFrames.DataFrame,Dict{String,Any}} with 3 tips, 5 nodes and 4 branches.
Leaf names are Node 1, Tip and Node 2
julia> getbranches(simpletree)
Base.ValueIterator for a Dict{Int64,Branch{ManyRoots,String}} with 4 entries. Values:
Branch{ManyRoots,String}("Root", "Node 2", NaN)
Branch{ManyRoots,String}("Internal", "Node 1", NaN)
Branch{ManyRoots,String}("Root", "Internal", NaN)
Branch{ManyRoots,String}("Internal", "Tip", 1.0)
julia> tree = open(parsenewick, Phylo.path("H1N1.newick"))
PolytomousTree{ManyRoots,DataFrames.DataFrame,Dict{String,Any}} with 507 tips, 1013 nodes and 1012 branches.
Leaf names are 44, 429, 294, 295, 227, ... [501 omitted] ... and 418
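To see where the "3 tips, 5 nodes and 4 branches" summary for `simpletree` comes from, the counts can be worked out from the newick string itself. The sketch below is plain Julia and does not use Phylo. The function name `newick_counts` is hypothetical, and it assumes a single rooted tree with no quoted labels containing `(` or `,`.

```julia
# Hypothetical helper (not part of Phylo): derive tip, node and branch counts
# directly from a newick string. Assumes one rooted tree and no quoted labels
# containing '(' or ','.
function newick_counts(newick::AbstractString)
    ninternal = count(c -> c == '(', newick)  # each '(' opens one internal node
    ncommas = count(c -> c == ',', newick)    # a node with k children adds k - 1 commas
    ntips = ncommas + 1                       # so tips = commas + 1 over the whole tree
    nnodes = ntips + ninternal
    nbranches = nnodes - 1                    # every node except the root has a parent branch
    return (ntips, nnodes, nbranches)
end

println(newick_counts("((,Tip:1.0)Internal,)Root;"))  # prints (3, 5, 4)
```

The two unnamed tips in the string are the ones that Phylo reports as `Node 1` and `Node 2` in the output above.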
And it can read nexus trees from files too:
julia> ts = open(parsenexus, Phylo.path("H1N1.trees"))
[ Info: Created a tree called 'TREE1'
[ Info: Created a tree called 'TREE2'
TreeSet with 2 trees, each with 507 tips.
Tree names are TREE2 and TREE1
TREE2: PolytomousTree{ManyRoots,DataFrames.DataFrame,Dict{String,Any}} with 507 tips, 1013 nodes and 1012 branches.
Leaf names are H1N1_A_MIYAGI_3_2000, H1N1_A_PARMA_6_2008, H1N1_A_AKITA_86_2002, H1N1_A_DAKAR_14_1997, H1N1_A_EGYPT_84_2001, ... [501 omitted] ... and H1N1_A_HONGKONG_2070_1999
TREE1: PolytomousTree{ManyRoots,DataFrames.DataFrame,Dict{String,Any}} with 507 tips, 1013 nodes and 1012 branches.
Leaf names are H1N1_A_MIYAGI_3_2000, H1N1_A_PARMA_6_2008, H1N1_A_AKITA_86_2002, H1N1_A_DAKAR_14_1997, H1N1_A_EGYPT_84_2001, ... [501 omitted] ... and H1N1_A_HONGKONG_2070_1999
julia> ts["TREE1"]
PolytomousTree{ManyRoots,DataFrames.DataFrame,Dict{String,Any}} with 507 tips, 1013 nodes and 1012 branches.
Leaf names are H1N1_A_MIYAGI_3_2000, H1N1_A_PARMA_6_2008, H1N1_A_AKITA_86_2002, H1N1_A_DAKAR_14_1997, H1N1_A_EGYPT_84_2001, ... [501 omitted] ... and H1N1_A_HONGKONG_2070_1999
julia> gettreeinfo(ts)
Dict{String,Dict{String,Any}} with 2 entries:
"TREE2" => Dict{String,Any}("lnP"=>-1.0)
"TREE1" => Dict{String,Any}("lnP"=>1.0)
julia> gettreeinfo(ts, "TREE1")
Dict{String,Any} with 1 entry:
"lnP" => 1.0
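Since `gettreeinfo(ts)` returns an ordinary nested `Dict`, it can be queried with standard Julia code. The following sketch picks the tree name with the largest `lnP`. To keep it self-contained it uses a hand-written stand-in for the dictionary, copied from the output above, rather than calling Phylo. The helper name `best_tree` is hypothetical, and `argmax` assumes Julia v0.7 or later.

```julia
# Stand-in for the result of gettreeinfo(ts), copied from the output above.
info = Dict("TREE2" => Dict{String,Any}("lnP" => -1.0),
            "TREE1" => Dict{String,Any}("lnP" => 1.0))

# Hypothetical helper (not part of Phylo): return the name of the tree with
# the largest value stored under `key`.
function best_tree(info, key = "lnP")
    treenames = collect(keys(info))
    scores = [info[name][key] for name in treenames]
    return treenames[argmax(scores)]
end

println(best_tree(info))  # prints TREE1
```

With a real `TreeSet`, the returned name could then be used to index it, as in `ts["TREE1"]` above.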
And while we wait for me (or kind contributors!) to fill out the extensive functionality that many phylogenetics packages offer in other languages, the other important feature of the package is a fully(?)-functional interface to R. This allows any existing R library functions to be applied to Julia trees, and trees to be read from disk and written using R helper functions. Naturally, the medium-term plan is to fill in as many of these gaps as possible in Julia, so the R interface does not make RCall a dependency of the package (we use the Requires package to avoid dependencies). Instead, if you want to use the R interface, you just need to use both packages:
julia> using Phylo
julia> using RCall
Creating Phylo RCall interface...
R> library(ape)
You can then translate back and forth using `rcopy` on R `phylo` objects, and `RObject` constructors on Julia `NamedTree` types to keep them in Julia, or `@rput` to move the object into R:
julia> rt = rcall(:rtree, 10)
RObject{VecSxp}
Phylogenetic tree with 10 tips and 9 internal nodes.
Tip labels:
t3, t5, t8, t1, t10, t9, ...
Rooted; includes branch lengths.
julia> jt = rcopy(NamedTree, rt)
PolytomousTree{ManyRoots,DataFrames.DataFrame,Dict{String,Any}} with 10 tips, 19 nodes and 18 branches.
Leaf names are t3, t5, t8, t1, t10, ... [4 omitted] ... and t7
julia> rjt = RObject(jt); # manually translate it back to R
R> all.equal($rjt, $rt)
[1] TRUE
julia> @rput rt; # Or use macros to pass R object back to R
julia> @rput jt; # And automatically translate jt back to R
R> jt
Phylogenetic tree with 10 tips and 9 internal nodes.
Tip labels:
t3, t5, t8, t1, t10, t9, ...
Rooted; includes branch lengths.
R> all.equal(rt, jt) # check no damage in translations
[1] TRUE
For the time being the code will only work with rooted trees with named tips and branch lengths. If there's demand for other types of trees, I'll look into it.