Version: "v0.1.0"
Author: "Zijun Zhang"
Date: "2.17.2019"
The recommended way to install Darts_DNN is through Anaconda.
Because Darts_DNN currently works in Python 2.7, you may want to create a dedicated environment for it first:
conda create -n darts python=2.7 # optional
source activate darts
conda install -c darts-comp-bio darts_dnn
This lets conda do all the heavy lifting and is usually the easiest way to get started.
Alternatively, to install the Darts_DNN Python package from GitHub, navigate to this folder, then type:
> cd Darts_DNN
> make install
Darts_DNN requires a few deep-learning packages, including the popular high-level interface Keras.
To test whether you have successfully installed Darts_DNN, type the following command in your shell:
> Darts_DNN -h
usage: Darts_DNN [-h] [--version] {train,predict,build_feature,get_data} ...
Darts_DNN -- DARTS - Deep-learning Augmented RNA-seq analysis of Transcript
Splicing
positional arguments:
{train,predict,build_feature,get_data}
train Darts_DNN train: train a DNN model using Darts
Framework from scratch
predict Darts_DNN predict: make predictions on a built feature
sets in h5 format
build_feature Darts_DNN build_feature: build feature file given
required information
get_data Darts_DNN get_data: connects online to get Darts_DNN
data for the current version.
optional arguments:
-h, --help show this help message and exit
--version show program's version number and exit
For command line options of each sub-command, type: Darts_DNN COMMAND -h
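When the installation check above is part of a setup script, it can help to confirm first that the executable is on the PATH, for example after activating the conda environment. The following is a minimal sketch; `check_cmd` is a hypothetical helper name and not part of Darts_DNN.

```shell
# Hypothetical helper: report whether a command can be found on the PATH.
check_cmd() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "found: $1"
    else
        echo "missing: $1"
    fi
}

# Reports "missing" if the conda environment is not active
# or the installation did not complete.
check_cmd Darts_DNN
```

If the command is reported as missing, re-activating the environment (`source activate darts`) is a reasonable first thing to try.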
Most often you will use the predict subcommand to make predictions. Using the test_data downloaded from GitHub, you can test whether this function works properly:
cd test_data/
tar -xvzf A5SS.thymus_adipose.tgz
Darts_DNN predict -i darts_bht.flat.txt -e RBP_tpm.txt -o pred.txt -t A5SS
"A5SS.thymus_adipose.tgz" contains the Roadmap thymus-adipose tissue-specific alternative 5' splice site (A5SS) analysis results generated by Darts BHT. Note that all thymus-related comparisons, including this one, are held-out data and were never seen by the trained Darts_DNN model.
For more details, please refer to the documentation site on ReadTheDocs. Below we provide a minimal example.
In the simplest case, the predict function can be invoked by providing a labelled input file (generated from Darts_BHT bayes_infer) and a trans gene expression file.
If you have not installed Darts_DNN previously, you will need to download the cis-Features, trained model parameters, etc. through Darts_DNN get_data. The get_data function automatically resumes a previous run and checks the md5sum, so you do not need to worry about doubled storage space.
For the purpose of this walk-through tutorial, since our test data is A5SS, we only need to download the files for A5SS splicing events.
Darts_DNN get_data -d transFeature cisFeature trainedParam -t A5SS
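To illustrate what the md5sum check mentioned above involves, here is a minimal sketch of comparing a file's MD5 digest against an expected value. It is not the actual get_data implementation: `verify_md5` is a hypothetical helper, the demo file is a throwaway temporary file, and the sketch assumes GNU coreutils `md5sum` is available.

```shell
# Hypothetical helper: succeed only if FILE's MD5 digest equals EXPECTED.
# Assumes GNU coreutils `md5sum` is installed.
verify_md5() {
    file="$1"
    expected="$2"
    # md5sum prints "<digest>  <filename>"; keep only the digest.
    actual=$(md5sum "$file" | awk '{print $1}')
    [ "$actual" = "$expected" ]
}

# Demo on a throwaway file with known content, not on real Darts_DNN data.
tmpfile=$(mktemp)
printf 'hello\n' > "$tmpfile"

if verify_md5 "$tmpfile" "b1946ac92492d2347c6235b8b2611184"; then
    echo "checksum OK"
else
    echo "checksum mismatch"
fi

rm -f "$tmpfile"
```

A file whose digest already matches does not need to be fetched again, which is the general idea behind skipping completed downloads.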
Next, as an example, download the test_data from GitHub, then run:
wget https://github.com/zj-zhang/DARTS-BleedingEdge/raw/master/Darts_DNN/test_data/A5SS.thymus_adipose.tgz
tar -xvzf A5SS.thymus_adipose.tgz
Darts_DNN predict -i darts_bht.flat.txt -e RBP_tpm.txt -o pred.txt -t A5SS
In the screen log output, you should see something like:
2019-02-25 15:02:32,659 - Darts_DNN.predict - INFO -
AUROC=0.8686118716025868
2019-02-25 15:02:32,659 - Darts_DNN.predict - INFO -
AUPR=0.5410178835754661
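If you want to record these metrics from a script, one option is to save the screen log to a file and extract the values. The sketch below assumes the log was captured to a file (for example by piping the predict command through `2>&1 | tee predict.log`) and that each metric appears at the start of its own line as `NAME=value`, as in the output above. `extract_metric` is a hypothetical helper, not part of Darts_DNN.

```shell
# Hypothetical helper: print the value following "NAME=" in a log file.
extract_metric() {
    # $1 = metric name, $2 = log file
    sed -n "s/^$1=//p" "$2"
}

# Sample log copied from the expected output shown above.
logfile=$(mktemp)
cat > "$logfile" <<'EOF'
2019-02-25 15:02:32,659 - Darts_DNN.predict - INFO -
AUROC=0.8686118716025868
2019-02-25 15:02:32,659 - Darts_DNN.predict - INFO -
AUPR=0.5410178835754661
EOF

echo "AUROC: $(extract_metric AUROC "$logfile")"
echo "AUPR:  $(extract_metric AUPR "$logfile")"

rm -f "$logfile"
```

If your log places the metric on the same line as the timestamp, the `sed` pattern would need to be adjusted accordingly.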