Incorrect Model Fit #62

mmilevskiy · 2017-06-20T06:48:42Z

I am having difficulty with the model fit in NucleoATAC. The model is fitted to all of my reads, both FR and RF. The insert-length pattern of the FR reads closely matches my ATAC Tn5 banding pattern and the expected output of this technique, whereas that of the RF reads does not. I am unsure how to exclude the RF reads. As a result, the model may be over-representing the NFR regions and under-representing the nucleosome reads.

Can you comment on how the model affects the output of NucleoATAC? If I use my own insert length file, based on the FR reads only, NucleoATAC finds fewer NFR regions and nucleosome positions.

AliciaSchep · 2017-06-21T15:18:42Z

Can you include some plots of the fragment size to show what you mean?

mmilevskiy · 2017-06-22T00:29:25Z

Hi @AliciaSchep. I have attached the insert length histogram from Picard, along with the NucleoATAC model fits produced with the default settings of "nucleoatac run" and with my own insert length text file based on the FR reads only.

ATAC_FACS_Pool_Merged_insert_size_histogram.pdf

nucleoATAC_Default.occ_fit.pdf
nucleoATAC_FR_Read_Ins_Leng.occ_fit.pdf

Could the problem be that the model built from my own insert length text file doesn't match the sequencing data? If so, removing the RF reads might solve the problem.

mmilevskiy · 2017-06-23T01:19:46Z

Could the RF reads come from fragments shorter than the read length (<75 bp; 75 bp PE sequencing on a NextSeq), where the two reads overlap and are mapped in the wrong orientation? If so, are they still reads that should be included in the analysis, in which case the NucleoATAC model would be correct?
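To make the read-length question concrete, here is a minimal arithmetic sketch of how many adapter bases end up in a read when the fragment is shorter than the read. The function name is hypothetical, and it assumes fixed-length reads with no prior quality trimming.

```python
def adapter_bases_in_read(read_length, fragment_length):
    """Number of bases at the 3' end of a read that come from adapter.

    Assumption: the read starts at the fragment's 5' end and runs for
    read_length cycles, so anything past the fragment end is adapter.
    """
    return max(0, read_length - fragment_length)


# With 75 bp reads, a 73 bp fragment leaves only 2 adapter bases per read,
# which is the kind of short overhang a trimmer can miss.
for fragment_length in (40, 73, 75, 200):
    print(fragment_length, adapter_bases_in_read(75, fragment_length))
```

Fragments at or above the read length carry no adapter, so any read-through effect should be confined to fragments below 75 bp in this setup.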

AliciaSchep · 2017-06-26T04:08:01Z

I would recommend re-examining how the adapter trimming and alignment are being performed. There should be no reason for the RF reads to have a different distribution from the FR reads, and the sharp cutoff suggests an alignment issue in which RF fragments longer than the read length are not getting mapped. There also seems to be a general dip in fragments around the read length for the FR reads, which suggests that the adapter trimming isn't fully effective at catching cases where only a couple of base pairs of adapter are included.
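One way to inspect the two distributions separately is to split fragment sizes by pair orientation directly from the SAM records. The sketch below is a simplified rule chosen for illustration, not necessarily what Picard or NucleoATAC do internally: it counts each pair once through its forward-strand read and calls the pair FR when that read's TLEN is positive, RF otherwise. The function names are hypothetical.

```python
from collections import Counter


def classify_pair(flag, tlen):
    """Return 'FR', 'RF', or None for one SAM record (simplified rule)."""
    if not flag & 0x1:                      # not a paired read
        return None
    if flag & 0x4 or flag & 0x8:            # read or mate unmapped
        return None
    if flag & 0x100 or flag & 0x800:        # secondary or supplementary
        return None
    read_reverse = bool(flag & 0x10)
    mate_reverse = bool(flag & 0x20)
    # Use only the forward-strand read of a forward/reverse pair, so each
    # pair is counted once and same-strand (tandem) pairs are skipped.
    if read_reverse or not mate_reverse:
        return None
    return "FR" if tlen > 0 else "RF"


def fragment_histograms(sam_lines):
    """Count absolute TLEN values per orientation from SAM text lines."""
    hists = {"FR": Counter(), "RF": Counter()}
    for line in sam_lines:
        if line.startswith("@"):            # header line
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9:
            continue
        flag, rname, rnext, tlen = int(fields[1]), fields[2], fields[6], int(fields[8])
        if rnext not in ("=", rname):       # mates on different references
            continue
        orientation = classify_pair(flag, tlen)
        if orientation is not None:
            hists[orientation][abs(tlen)] += 1
    return hists
```

Feeding it the output of `samtools view` (for example via a pipe into `sys.stdin`) gives two histograms that can be plotted side by side; a sharp RF cutoff at the read length would then be easy to confirm.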

In general, the fragment length distribution from NucleoATAC might be a little bit different than what you get from Picard because only fragments within the peaks are queried.
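As a rough illustration of why a peak-restricted distribution can differ from a genome-wide one, the sketch below counts fragment sizes with and without a peak filter. The overlap criterion (any overlap with a peak, half-open coordinates) and all function names are assumptions made for this example, not a description of NucleoATAC's actual logic.

```python
from bisect import bisect_left
from collections import Counter


def build_peak_index(peaks):
    """Merge (chrom, start, end) peaks into sorted, non-overlapping intervals."""
    by_chrom = {}
    for chrom, start, end in peaks:
        by_chrom.setdefault(chrom, []).append((start, end))
    index = {}
    for chrom, intervals in by_chrom.items():
        intervals.sort()
        merged = [list(intervals[0])]
        for start, end in intervals[1:]:
            if start <= merged[-1][1]:      # overlapping or adjacent: extend
                merged[-1][1] = max(merged[-1][1], end)
            else:
                merged.append([start, end])
        index[chrom] = ([m[0] for m in merged], [m[1] for m in merged])
    return index


def overlaps_peak(index, chrom, start, end):
    """True if the half-open fragment [start, end) overlaps any peak."""
    if chrom not in index:
        return False
    starts, ends = index[chrom]
    i = bisect_left(starts, end) - 1        # last peak starting before `end`
    return i >= 0 and ends[i] > start


def size_counts(fragments, index=None):
    """Count fragment sizes, optionally keeping only fragments in peaks."""
    counts = Counter()
    for chrom, start, end in fragments:
        if index is None or overlaps_peak(index, chrom, start, end):
            counts[end - start] += 1
    return counts
```

Comparing `size_counts(fragments)` with `size_counts(fragments, index)` on the same data shows how much of the difference from the Picard histogram is explained by the peak restriction alone.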

mmilevskiy · 2017-06-26T05:01:57Z

Hi Alicia. Thanks. I have been using trim_galore for the adapter trimming. Would you recommend the "--nextera" option or "-a" with the adapter sequence? Or is there another trimming tool you find works better?

I am getting 90-95% mapping for my libraries with Bowtie2 on default settings, apart from "-X 2000".
