I'm confused with DNN predicts #18

Open
kezhouqin opened this issue Apr 6, 2021 · 4 comments

Comments

@kezhouqin

Hi Dr. Zhang,
I'm running Darts_DNN and have several questions. I have run Darts_BHT and worked through Darts_DNN get_data and Darts_DNN build_feature, but I'm confused by Darts_DNN predict.

For Darts_BHT I used BAM files produced by STAR, but Darts_DNN build_feature requires kallisto. Can I use RSEM to obtain the expression data instead? Could you also tell me how to get the files ENCODE_sequenceFeature_absmax_normalized.h5 and RBP_tpm.txt? Is RBP_tpm.txt the expression file produced by kallisto? I have picked up some information from the other issues and will give it a try. One more question: should the BAM files I use be generated by STAR or by kallisto? Thank you very much!

Best wishes!

kezhou

@zj-zhang
Collaborator

zj-zhang commented Apr 6, 2021

Hi, to answer your questions:

  1. You can use RSEM for the RBP expression quantification, as long as the transcript mapping is the same as with kallisto; however, it is quite possible that they differ.
    The current transcript-to-gene mapping is hard-coded, which is far from extensible. Pull requests to fix this are welcome; they would make the expression quantification much easier to use.

  2. The files you mentioned are cis- and trans-features. They can be downloaded with:
    Darts_DNN get_data -d transFeature cisFeature trainedParam -t ${eventType}
    I would recommend running the test data first:
    https://darts-dnn.readthedocs.io/en/latest/#using-predict

  3. The bam files to run Darts_BHT should be generated by STAR, not kallisto.
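
As a rough sketch of the download step in item 2, the loop below prints one `get_data` command per event type. The flags are copied from the command above; the list of event types (SE, A5SS, A3SS, RI) and the helper name `get_data_cmd` are assumptions for illustration, so check the Darts_DNN documentation for the values it actually accepts.

```shell
# Print the download command for one event type.
# The flags are taken verbatim from the reply above; get_data_cmd is a hypothetical helper.
get_data_cmd() {
  printf 'Darts_DNN get_data -d transFeature cisFeature trainedParam -t %s\n' "$1"
}

# Assumed event types; adjust to whatever your Darts_DNN version supports.
for eventType in SE A5SS A3SS RI; do
  cmd=$(get_data_cmd "$eventType")
  echo "$cmd"
  # Uncomment to run the download once the printed command looks right:
  # $cmd
done
```

Printing the commands first makes it easy to confirm the event type before anything is downloaded.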

Hope this helps.

@kezhouqin
Author

Thanks for your quick reply. It works on the test data.
Some other questions: Darts BHT has a --readlength parameter. Does it refer to the sequence length without the primer/adapter (about 50bp), rather than the 150bp of the RNA-seq run? Also, as I understand it, Darts BHT depends on STAR, and Darts DNN requires kallisto to further support the AS events. Is that right?

@zj-zhang
Collaborator

zj-zhang commented Apr 6, 2021

I am not entirely sure what 50bp and 150bp mean in your particular case, or why the sequencing adapter would take up 100bp, which looks odd to me. In any case, -t is the read length in the BAM that your STAR was run against.
Right, Darts DNN requires the gene expression to improve the AS event.

@kezhouqin
Author

> I am not entirely sure what 50bp and 150bp mean in your particular case, or why the sequencing adapter would take up 100bp, which looks odd to me. In any case, -t is the read length in the BAM that your STAR was run against.
> Right, Darts DNN requires the gene expression to improve the AS event.

Oh, I'm sorry. I meant that 50bp is the shortest read length reported by FastQC, and 150bp is the read length of the Illumina sequencing platform.

The reads in the BAM file have different lengths, don't they? So how do I choose the length? Thanks!
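
One way to see which read lengths a BAM actually contains is to tally the length of the sequence column in its SAM records. The sketch below assumes `samtools` is installed; `sample.bam` and the function name `read_length_histogram` are placeholders. It only reports the distribution and does not settle which value Darts_BHT expects when lengths vary.

```shell
# Tally read lengths from SAM-formatted records on stdin.
# Column 10 of a SAM record is the read sequence (SEQ); "*" means no sequence was stored.
# Output: one line per length, "<length><TAB><count>", sorted by length.
read_length_histogram() {
  awk -F '\t' '$1 !~ /^@/ && $10 != "*" { n[length($10)]++ }
       END { for (len in n) print len "\t" n[len] }' | sort -k1,1n
}

# Usage on the first 100000 alignments (sample.bam is a placeholder file name):
# samtools view sample.bam | head -n 100000 | read_length_histogram
```

If the reads were not trimmed before alignment, the histogram typically shows a single length, since STAR soft-clips rather than shortens the stored sequence.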
