
Running RRBS pipeline

Archana Raja edited this page Nov 3, 2022 · 17 revisions

Welcome to the motrpac-rrbs-pipeline wiki!

Steps

  • Optimized memory requirements for processing RRBS PASS samples

default parameters

memory = 40

disk_space = 150

threads = 6

preTrimFastQC

memory=memory,

disk_space=disk_space,

num_threads=1

attachUMI

memory=40,

disk_space=disk_space,

num_threads=8

trimgalore

memory=40,

disk_space=disk_space,

num_threads=1

Trim NuGen-specific diversity adapters

  `memory=memory,`

  `disk_space=disk_space,`

  `num_threads=1,`

  `num_preempt=num_preempt,`

Run FastQC on post-trimming reads

  `memory=memory,`

  `disk_space=disk_space,`

  `num_threads=1,`

  `num_preempt=num_preempt,`

multiQC

   `memory=memory,`

   `disk_space=disk_space,`

   `num_threads=1,`

Align trimmed reads to species of interest

      `memory=100,`

      `disk_space=200,`

      `num_threads=12,`

      `num_preempt=0,`

      `bismark_multicore=3,`

Align trimmed reads to lambda for control

   `memory=100`

   `disk_space=200`

   `num_threads=12`

   `num_preempt=0`

Tag UMI duplications in sample and spike-in

   `memory=memory,`

   `disk_space=disk_space,`

   `num_threads=6,`

Remove PCR Duplicates from sample

 `memory=30,`

 `disk_space=200,`

 `num_threads=1,`

Remove PCR Duplicates from Lambda phage spike-in

     `memory=memory,`

     `disk_space=disk_space,`

     `num_threads=1,`

Quantify Methylation for sample

  `memory=60,`

  `disk_space=200,`

  `num_threads=16,`

Quantify Methylation for Lambda control spike-in

  `memory=60,`

  `disk_space=200,`

  `num_threads=16,`

Align Trim Galore-trimmed reads to the PhiX genome using Bowtie

      `memory=40,`

      `disk_space=200,`

      `num_threads=10,`

      `num_preempt=0`

Compute % mapped to chromosomes and contigs

 `num_threads=8,`

  `memory=30,`

  `disk_space=200,`

  `num_preempt=0,`

Collect required QC Metrics from reports

      `memory=20,`

      `disk_space=50,`

      `num_threads=4,`

      `num_preempt=0`
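The settings above mix two kinds of values: steps that pass through the workflow defaults (`memory=memory`, `disk_space=disk_space`) and steps that hard-code their own numbers. A minimal Python sketch of how those resolve is shown below. It is an illustration only, not part of the pipeline: the step keys are hypothetical labels rather than the workflow's real task names, only a few steps from this page are included, and the units are left unstated because the page does not give them.

```python
# Workflow-level defaults as listed under "default parameters" on this page.
DEFAULTS = {"memory": 40, "disk_space": 150}

# Per-step settings copied from this page. A key that is absent means the step
# passes the workflow default through (e.g. memory=memory).
# The step names are hypothetical labels chosen for this sketch.
STEP_OVERRIDES = {
    "pre_trim_fastqc": {"num_threads": 1},
    "align_sample": {"memory": 100, "disk_space": 200, "num_threads": 12},
    "quantify_sample": {"memory": 60, "disk_space": 200, "num_threads": 16},
    "collect_qc": {"memory": 20, "disk_space": 50, "num_threads": 4},
}


def resolve(step):
    """Return the effective settings for a step: defaults, then overrides."""
    return {**DEFAULTS, **STEP_OVERRIDES[step]}


def peak(field):
    """Largest value of a field among the steps listed in STEP_OVERRIDES."""
    return max(resolve(step)[field] for step in STEP_OVERRIDES)


if __name__ == "__main__":
    for step in STEP_OVERRIDES:
        print(step, resolve(step))
    print("peak memory among listed steps:", peak("memory"))
```

Keeping the overrides sparse makes it easy to see which steps were tuned away from the defaults, which is the point of the list above.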

The RRBS pipeline needs to be rerun on all of the PASS1A datasets

  • Datasets to process
  1. gs://motrpac-portal-transfer-sinai/RRBS/PASS1A/batch1_20190528/fastq_raw

  2. gs://motrpac-portal-transfer-sinai/RRBS/PASS1A/batch2_20190706/fastq_raw

Output location

gs://rna-seq_araja/rrbs/PASS1A/reprocessed/
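As a rough planning aid, the two input batches and the output location can be paired up in Python as sketched here. The per-batch subfolder under the output location is an assumption made for this sketch; the page only gives the top-level output path, and the helper names `batch_name` and `output_prefix` are hypothetical.

```python
# Input batch locations and output root, copied from this page.
INPUT_BATCHES = [
    "gs://motrpac-portal-transfer-sinai/RRBS/PASS1A/batch1_20190528/fastq_raw",
    "gs://motrpac-portal-transfer-sinai/RRBS/PASS1A/batch2_20190706/fastq_raw",
]
OUTPUT_ROOT = "gs://rna-seq_araja/rrbs/PASS1A/reprocessed/"


def batch_name(uri):
    """Return the batch folder, i.e. the path component before fastq_raw."""
    return uri.rstrip("/").split("/")[-2]


def output_prefix(uri):
    """Assumed layout: one subfolder per batch under the output root."""
    return OUTPUT_ROOT + batch_name(uri) + "/"


# Map each input batch to the (assumed) output prefix for its results.
plan = {uri: output_prefix(uri) for uri in INPUT_BATCHES}

if __name__ == "__main__":
    for src, dst in plan.items():
        print(src, "->", dst)
```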

