rnateam.rnaseq.yaml
name: Galaxy RNA-workbench RNA-Seq Introduction Exercise
description: The Galaxy RNA-workbench RNA-Seq tour
title_default: "Galaxy RNA-workbench RNA-Seq Exercise"
tags:
- "RNA"
steps:
- title: "<b>Welcome to the Galaxy RNA-workbench RNA-Seq example tour!</b>"
content: "This tour will walk you through an exercise that shows you how to use the Galaxy RNA-workbench. <br><br>
Read and follow the instructions before clicking <b>'Next'</b>.<br><br>
Click <b>'Prev'</b> in case you missed a step."
backdrop: true
- title: "<b>Scenario</b>"
content: "In the study of <a href=\"http://genome.cshlp.org/content/21/2/193.long\" target=\"_blank\">Brooks et al. 2011</a>, the Pasilla (PS) gene, the *Drosophila* homologue of the human splicing regulators Nova-1 and Nova-2, was depleted in *Drosophila melanogaster* by RNAi. The authors wanted to identify exons that are regulated by the Pasilla gene using RNA sequencing data.<br>
Total RNA was isolated and used for preparing either single-end or paired-end RNA-seq libraries for treated (PS depleted) samples and untreated samples. These libraries were sequenced to obtain a collection of RNA sequencing reads for each sample. The effects of Pasilla gene depletion on splicing events can then be analyzed by comparison of RNA sequencing data of the treated (PS depleted) and the untreated samples.<br>
The genome of *Drosophila melanogaster* is known and assembled. It can be used as a reference genome to ease this analysis. In a reference-based RNA-seq data analysis, the reads are aligned (or mapped) against a reference genome, *Drosophila melanogaster* here, to significantly improve the ability to reconstruct transcripts and then identify differences in expression between conditions."
backdrop: true
- title: "<b>Goal</b>"
content: "The goal of this exercise is to <b>become familiar with basic RNA-Seq analysis</b>."
backdrop: true
- title: "<b>Disclaimer</b>"
content: "We are <b>not affiliated</b> with the authors of the paper and we don't make a statement about the relevance or quality of the paper. It is <b>just a fitting example</b> and nothing else.<br>"
backdrop: true
- title: "<b>Overview</b>"
content: "Together we will go through the following:<br>
<b>Pretreatments, Mapping and Analysis of differential expression</b>
<dir>
<li>Step 1: Create and name a new history</li>
<li>Step 2: Download data</li>
<li>Step 3: Quality control</li>
<li>Step 4: Mapping</li>
<li>Step 5: Inspection of TopHat results</li>
<li>Step 6: IGV</li>
<li>Step 7: Analysis of differential gene expression</li>
<li>Step 8: Count the number of reads per annotated gene</li>
<li>Step 9: Analysis of DGE</li>
<li>Step 10: Inspect DGE</li>
<li>Step 11: Visualize DGE</li>
<li>Step 12: Analysis of the functional enrichment among differentially expressed genes</li>
<li>Step 13: Inference of the differential exon usage</li>
<li>Step 14: Annotation of the result tables with gene information</li>
<li>Step 15: Make a workflow</li>
<li>Step 16: Share workflow</li>
<li>Conclusion</li>
</dir>"
backdrop: true
- title: "<b>Step 1: Create and name a new history</b>"
element: "#current-history-panel > div.controls > div.title > div"
intro: "Change the name of your history."
position: "bottom"
- title: "<b>Step 2: Download data</b>"
content: "We will now proceed to download data to Galaxy. The original data is available at NCBI Gene Expression Omnibus (GEO) under accession number <a href=\"http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE18508\" target=\"_blank\">GSE18508</a>. We will look at the first 7 samples: <br>
- 3 treated samples with Pasilla (PS) gene depletion: <a href=\"https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSM461179\" target=\"_blank\">GSM461179</a>, <a href=\"https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSM461180\" target=\"_blank\">GSM461180</a>, <a href=\"https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSM461181\" target=\"_blank\">GSM461181</a><br>
- 4 untreated samples: <a href=\"https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSM461176\" target=\"_blank\">GSM461176</a>, <a href=\"https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSM461177\" target=\"_blank\">GSM461177</a>, <a href=\"https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSM461178\" target=\"_blank\">GSM461178</a>, <a href=\"https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSM461182\" target=\"_blank\">GSM461182</a><br>
Each sample constitutes a separate biological replicate of the corresponding condition (treated or untreated). Moreover, two of the treated and two of the untreated samples are from a paired-end sequencing assay, while the remaining samples are from a single-end sequencing experiment.<br>
We have extracted sequences from the Sequence Read Archive (SRA) files to build FASTQ files."
backdrop: true
- title: "<b>Step 2: Download fastq files with RNA sequences</b>"
element: ".upload-button"
intro: "Use the upload button to upload the file to Galaxy.<br><br>
Click <b>'Next'</b> and the tour will take you to the Upload screen"
position: "right"
postclick:
- ".upload-button"
- title: "<b>Step 2: Download fastq files with RNA sequences</b>"
element: ".upload-text-content:first"
intro: "We now paste the links to a fastq dataset pair into the upload-box. Click next to do so."
preclick:
- ".upload-button"
- "button#btn-new"
textinsert: |
https://zenodo.org/record/61771/files/GSM461177_untreat_paired_subset_1.fastq
https://zenodo.org/record/61771/files/GSM461177_untreat_paired_subset_2.fastq
- title: "<b>Step 2: Download fastq files with RNA sequences</b>"
element: "button#btn-start"
intro: "Now that you've selected the files, select <b>'dm3'</b> as the genome
and fastqsanger as the file format.<br><br>
Click <b>'Next'</b> and the tour will <b>'Start'</b> the upload.<br>
Galaxy will automatically unpack the files."
position: "bottom"
postclick:
- "button#btn-start"
- "button#btn-close"
- title: "<b>Step 2: Download fastq files with RNA sequences</b>"
element: "#right"
intro: "This is your history!<br><br>
All <b>analysis steps will be recorded</b> and can be redone at any time.<br><br>
You should be able to see your uploaded files here in a few moments."
position: "left"
- title: "<b>Step 3: Quality Control</b>"
intro: "These files contain the first 100,000 paired-end reads of one sample. The sequences are raw sequences from the sequencing machine, without any pretreatment. Their quality needs to be checked.<br>
For quality control, we use tools similar to those described in the <a href=\"https://www.github.com/bgruening/training-material/NGS-QC\">NGS-QC tutorial</a>."
position: "right"
- title: "<b>Step 3: Quality Control</b>"
element: '#tool-search-query'
intro: "We now want to examine the quality of our RNA-Seq reads using <b>FastQC</b>.<br>
This Galaxy instance has FastQC already integrated, so we don't need to install it.<br>
<b>Note:</b> You can use 'tool search' to locate tools. Tools may take a couple of moments to load, please bear with us."
position: "right"
- title: "<b>Step 3: Quality Control</b>"
element: '#tool-search-query'
intro: "You can now type and select <b>'FastQC'</b>.<br><br>
<b>Follow these instructions once the tool has loaded:</b><br>
<dir>
<li>Select one of the samples from the paired dataset.</li>
<li>Keep the rest of the options at their default values.</li>
<li>Click the 'Execute' button and wait for the tool to finish.</li>
<li>Repeat with the other dataset.</li>
</dir>"
position: "right"
- title: "<b>Step 3: Quality Control</b>"
element: '#current-history-panel'
intro: "To inspect the results of the FastQC run:<br><br>
<dir>
<li>Click on the <b>eye icon</b> of the latest dataset and have a look at the output. What do you see? What is the read length? Is there anything you notice when you compare both datasets?</li>
</dir>"
position: "left"
- title: "<b>Step 3: Quality Control</b>"
element: '#current-history-panel'
intro: "You should notice the following:<br><br>
<dir>
<li>The read length is 37 bp</li>
<li>The report for GSM461177_untreat_paired_subset_1 is quite good compared to the one for GSM461177_untreat_paired_subset_2. For the latter, the per-base sequence quality is poor around the 25th bp (as is the per-base N content), because the quality in the 2nd tile is poor (possibly due to some event during sequencing). We need to process these samples according to this quality control while keeping the paired-end information in mind.</li>
</dir>"
position: "left"
- title: "<b>Step 3: Quality Control</b>"
element: '#current-history-panel'
intro: "We now process the samples according to the quality of their sequences by running <b>Trim Galore</b> on the paired-end datasets."
position: "left"
- title: "<b>Step 3: Quality Control</b>"
element: '#tool-search-query'
intro: "You can now type and select <b>'TrimGalore'</b>.<br><br>
<b>Follow these instructions once the tool has loaded:</b><br>
<ol>
<li>Select the samples from the paired dataset and set the sequencing type to paired-end.</li>
<li>Keep the rest of the options at their default values.</li>
<li>Click the 'Execute' button and wait for the tool to finish.</li>
</ol>"
position: "right"
- title: "<b>Step 3: Quality Control</b>"
element: '#tool-search-query'
intro: "You can now type and select <b>'FastQC'</b>.<br><br>
<b>Follow these instructions once the tool has loaded:</b><br>
<ol>
<li>Select one of the TrimGalore processed samples.</li>
<li>Keep the rest of the options at their default values.</li>
<li>Click the 'Execute' button and wait for the tool to finish.</li>
<li>Repeat with the other dataset.</li>
</ol>"
position: "right"
- title: "<b>Step 3: Quality Control</b>"
element: '#current-history-panel'
intro: "To inspect the results of the FastQC run again:<br><br>
<dir>
<li>Click on the <b>eye icon</b> of the latest dataset and have a look at the output. What do you see? What is the read length now? Is there anything you notice when you compare both datasets with their state before the quality processing?</li>
</dir>"
position: "left"
- title: "<b>Step 4: Mapping</b>"
intro: "Now that we have quality controlled our samples, we want to continue our analysis.<br>
As the genome of *Drosophila melanogaster* is known and assembled, we can use this information and map the sequences to this genome to identify the effects of Pasilla gene depletion on splicing events.<br>
To make sense of the reads, their positions within the *Drosophila melanogaster* genome must be determined. This process is known as aligning or 'mapping' the reads to the reference genome."
position: "center"
- title: "<b>Step 4: Mapping</b>"
intro: "Because in a eukaryotic transcriptome most reads originate from processed mRNAs lacking introns, they cannot simply be mapped back to the genome as we normally do for DNA data. Instead, the reads must be separated into two categories:<br>
<dir>
<li>Reads that map entirely within exons</li>
<li>Reads that cannot be mapped within an exon across their entire length because they span two or more exons</li>
</dir>"
position: "center"
- title: "<b>Step 4: Mapping</b>"
intro: "Spliced mappers have been developed to efficiently map transcript-derived reads against genomes.<br>
<a href=\"https://ccb.jhu.edu/software/tophat/index.shtml\">TopHat</a> was one of the first tools designed specifically to address this problem. It works in three steps:<br>
<dir>
<li>1. Identification of potential exons using reads that do map to the genome</li>
<li>2. Generation of possible splices between neighboring exons</li>
<li>3. Comparison of reads that did not initially map to the genome against these *in silico* created junctions</li>
</dir>"
position: "center"
- title: "<b>Step 4: Mapping</b>"
intro: "TopHat needs to know two important parameters about the sequencing library:<br>
<dir>
<li>The library type</li>
<li>The mean inner distance between the mate pairs for paired-end data</li>
</dir><br>
This information should usually come with your FASTQ files; ask your sequencing facility! If not, try to find it on the site where you downloaded the data or in the corresponding publication.<br>
Another option is to estimate these parameters with a *preliminary mapping* of a *downsampled* file and some analysis programs. Afterward, the actual mapping can be redone on the original files with the optimized parameters.<br>
To help find this information, and later to annotate the RNA sequences, we can take advantage of known reference gene annotations."
position: "center"
- title: "<b>Step 4: Mapping</b>"
element: ".upload-button"
intro: "Use the upload button to upload the dm3 reference genome annotation file to Galaxy.<br><br>
Click <b>'Next'</b> and the tour will take you to the Upload screen"
position: "right"
postclick:
- ".upload-button"
- title: "<b>Step 4: Mapping</b>"
element: ".upload-text-content:first"
intro: "We now paste the link to the Ensembl gene annotation GTF file into the upload box. Click next to do so."
preclick:
- ".upload-button"
- "button#btn-new"
textinsert: |
https://zenodo.org/record/61771/files/Drosophila_melanogaster.BDGP5.78.gtf
- title: "<b>Step 4: Mapping</b>"
element: "button#btn-start"
intro: "Now that you've selected the file, select <b>'dm3'</b> as the genome
and gtf as file format.<br><br>
Click <b>'Next'</b> and the tour will <b>'Start'</b> the upload.<br>
Galaxy will automatically unpack the file."
position: "bottom"
postclick:
- "button#btn-start"
- "button#btn-close"
- title: "<b>Step 4: Mapping</b>"
element: "#tool-search-query"
intro: "Now we will use TopHat to map our reads to the dm3 genome. You can now type and select <b>'TopHat'</b> and use the full parameter set to get the best mapping results.<br><br>
<b>Follow these instructions once the tool has loaded:</b><br>
<ol>
<li>Paired-end instead of single-end</li>
<li>TrimGalore output as input in correct order (forward and reverse reads)</li>
<li>Unstranded</li>
<li>'dm3' as reference genome</li>
<li>Mean inner distance to 112</li>
<li>Library type to</li>
<li>Minimum length of read segments to 18</li>
<li>'Yes' to use own junction data</li>
<li>'Yes' to use Gene Annotation Model</li>
<li>`Drosophila_melanogaster.BDGP5.78.gtf` as Gene Model Annotations (to enable transcriptome alignment)</li>
<li>'No (--coverage-search)' for the coverage-based search for junctions, as it takes a lot of time. Consider enabling this option for real-world data, though.<br>
The TopHat algorithm splits reads into segments to map the reads across splice junctions. Coverage-based search for junctions increases the sensitivity.</li>
<li>Keep the rest of the options at their default values.</li>
<li>Click button 'Execute' and wait for the tool to finish.</li>
</ol>"
position: "right"
- title: "<b>Step 5: Inspect TopHat Output</b>"
element: '#current-history-panel'
intro: "<b>To inspect the output of TopHat:</b><br>
<dir>
<li>Click on the <b>'eye icon'</b> of the corresponding dataset</li>
<li>Inspect the 'align summary' file</li>
<li>How many forward and reverse reads were mapped?</li>
<li>What is the 'overall read mapping rate' and the 'concordant pair alignment rate'?</li>
<li>Why do some reads have multiple alignments?</li>
</dir>"
position: "left"
- title: "<b>Step 5: Inspect TopHat Output</b>"
intro: "<b>You should see the following:</b><br>
<dir>
<li>90.7% of the forward reads were mapped and 85.8% of the reverse reads</li>
<li>The 'overall read mapping rate' is the rate of mapping when we take into account all reads (forward and reverse reads). Here it is 88.3%.<br>
The 'concordant pair alignment rate' is (number of aligned pairs - number of discordant alignments)/(number of paired reads). Here the value is 80.3%, which is quite good. The goal is to maximize this value.</li>
<li>The reads are short, and because of pseudogenes and other genuine genome duplications, a read can map to multiple locations</li>
</dir>"
position: "center"
- title: "<b>Step 5: Inspect TopHat Output</b>"
intro: "<b>TopHat generates a BAM file with the mapped reads and three BED files containing splice junctions, insertions and deletions.<br>
The datasets we used were a subset of the original data. They are therefore too small to give you a good impression of what real data looks like. So we have run TopHat for you on the real datasets. We extracted only the reads mapped to chromosome 4 of Drosophila, which we will now inspect using 'IGV'</b>"
position: "center"
- title: "<b>Step 6: IGV</b>"
element: "#current-history-panel > div.controls > div.title > div"
intro: "Create and name a new history"
position: "bottom"
- title: "<b>Step 6: IGV</b>"
element: ".upload-text-content:first"
intro: "We now paste the links to the new dataset into the upload-box. Click next to do so."
preclick:
- ".upload-button"
- "button#btn-new"
textinsert: |
https://zenodo.org/record/61771/files/GSM461177_untreat_paired_chr4.bam
https://zenodo.org/record/61771/files/GSM461177_untreat_paired_deletions_chr4.bed
https://zenodo.org/record/61771/files/GSM461177_untreat_paired_insertions_chr4.bed
https://zenodo.org/record/61771/files/GSM461177_untreat_paired_junctions_chr4.bed
- title: "<b>Step 6: IGV</b>"
element: "button#btn-start"
intro: "Now that you've selected the file, select <b>'dm3'</b> as the genome
and the corresponding file formats.<br><br>
Click <b>'Next'</b> and the tour will <b>'Start'</b> the upload.<br>
Galaxy will automatically unpack the files."
position: "bottom"
postclick:
- "button#btn-start"
- "button#btn-close"
- title: "<b>Step 6: IGV</b>"
element: '#current-history-panel'
intro: "Visualize this BAM file and the three BED files, particularly the region on chromosome 4 between 560 kb and 600 kb (`chr4:560,000-600,000`). Click on the 'IGV' symbol of the BAM dataset you just uploaded to Galaxy to start IGV.<br>
<dir>
<li>Open the dataset and click on 'display with IGV' and 'web current'</li>
<li>Open the file with a Java plugin (e.g., IcedTea)</li>
<li>Go to View, then Preferences, then Alignments, and set the visibility range to at least 50 kb</li>
<li>To inspect the region on chr4 between 560 kb and 600 kb, copy chr4:560000-600000 into the locus window and click Go</li>
<li>Now import the BED output into IGV: open each dataset and click on 'display with IGV' and 'local'</li>
<li>Inspect the results using a Sashimi plot (right-click on the BAM file and select Sashimi Plot from the context menu)</li>
</dir>"
position: "center"
- title: "<b>Step 6: IGV</b>"
intro: "Try to answer the following questions:<br>
Which information does the `GSM461177_untreat_paired_junctions_chr4.bed` BED file contain?<br>
How is this information represented in the BED file? And in IGV?<br>
Where is the 'JUNC00013368' junction located? What is its score?<br>
How many reads span the 'JUNC00013368' junction, visible when we zoom in on `chr4:568,476-571,814`? Can you relate that to the score?<br>
And how many span the 'JUNC00013369' junction?"
backdrop: true
- title: "<b>Step 6: IGV</b>"
intro: "Answers
<dir>
<li>The `GSM461177_untreat_paired_junctions_chr4.bed` BED file contains the splicing events, *i.e.* cases where at least one read is split across two exons in the alignment track</li>
<li>The BED file is a tabular file with the columns Chrom, Start, End, Name, Score, Strand, ThickStart, ThickEnd, ItemRGB, BlockCount, BlockSizes, BlockStart. In IGV, the junctions are represented by an arc from the beginning to the end of the junction. The color of the arc represents the strand on which the junction is found. The height of the arc, and its thickness, are proportional to the depth of read coverage.</li>
<li>The 'JUNC00013368' junction starts at 568,736 and ends at 569,905. It has a score of 6.</li>
<li>6 reads split across 'JUNC00013368', exactly the score</li>
<li>8 reads split across 'JUNC00013369'. 3 reads also map inside the region spanned by the junction: these reads are part of an exon and may be involved in a different splicing event.</li>
</dir>
"
- title: "<b>Step 6: IGV</b>"
intro: "Sashimi Plot<br>
In the IGV window, right-click on the BAM file and select <b>Sashimi Plot</b> from the context menu.<br>
<dir>
<li>What does the bar graph represent? And the numbered line?</li>
<li>What do the numbers mean?</li>
<li>What is the name of the junction where 10 reads split? What is its position on the genome?</li>
</dir>"
- title: "<b>Step 6: IGV</b>"
intro: "Sashimi Plot<br>
<dir>
<li>The coverage for each alignment track is plotted as a bar graph. Arcs represent splice junctions connecting exons</li>
<li>Arcs display the number of reads split across the junction (junction depth). </li>
<li>JUNC00013370 starts at 574338 and ends at 578091.</li>
</dir>"
- title: "<b>Step 7: Analysis of the differential gene expression</b>"
intro: "To identify exons that are regulated by the Pasilla gene, we need to identify genes and exons which are differentially expressed between samples with PS gene depletion and control samples.<br>
To compare the expression of single genes between different conditions (e.g. with or without PS depletion), an essential first step is to quantify the number of reads per gene. <a href=\"http://www-huber.embl.de/users/anders/HTSeq/doc/count.html\">HTSeq-count</a> is one of the most popular tools for gene expression quantification.<br>
To quantify the number of reads mapped to a gene, an annotation of the genomic features is needed. We already uploaded the <a href=\"https://zenodo.org/record/61771/files/Drosophila_melanogaster.BDGP5.78.gtf\">Drosophila_melanogaster.BDGP5.78.gtf</a> file with the Ensembl gene annotation for *Drosophila melanogaster* to Galaxy."
position: "center"
- title: "<b>Step 8: Count the number of reads per annotated gene</b>"
content: "In principle, counting the reads that overlap genomic features is a fairly simple task, but some details need to be decided. HTSeq-count offers 3 modes ('union', 'intersection_strict' and 'intersection_nonempty') to handle reads mapping to multiple locations, reads overlapping introns, or reads that overlap more than one genomic feature.<br>
The recommended mode is 'union', which counts overlaps even if a read only shares parts of its sequence with a genomic feature and disregards reads that overlap more than one feature."
- title: "<b>Step 8: Count the number of reads per annotated gene</b>"
element: '#current-history-panel'
intro: " Copy the `Drosophila_melanogaster.BDGP5.78.gtf` file from the first history<br>
Click on 'View all histories' in the top right <br>
Drag and drop the file you want to copy to your new history <br>
Click on 'Done' on the top left "
- title: "<b>Step 8: Count the number of reads per annotated gene</b>"
element: "#tool-search-query"
intro: "You can now type and select <b>'HTSeq-count'</b>.<br><br>
<b>Follow these instructions once the tool has loaded:</b><br>
<ol>
<li>Input is the sorted BAM file uploaded before.</li>
<li>`Drosophila_melanogaster.BDGP5.78.gtf` as 'GFF file'</li>
<li>The 'union' mode</li>
<li>A 'Minimum alignment quality' of 10</li>
<li>Click the 'Execute' button and wait for the tool to finish.</li>
</ol>"
position: "right"
- title: "<b>Step 8: Count the number of reads per annotated gene</b>"
intro: "Which feature has the most reads mapped on it?<br>
To display the feature with the most reads, we first need to sort the output file by the number of reads found for each feature. We do that with the Sort tool, sorting on the second column in descending order. This shows us that FBgn0017545 is the feature with the most reads mapped to it, with 4,030 reads."
- title: "<b>Step 9: Analysis of DGE</b>"
intro: "In the previous section, we counted only reads that mapped to chromosome 4 for only one sample. To be able to identify differential gene expression induced by PS depletion, all datasets (3 treated and 4 untreated) must be analyzed with the same procedure.<br>
You can export a workflow from the previous steps and rerun it on the 7 samples, whose raw sequences are available on <a href=\"http://dx.doi.org/10.5281/zenodo.61771\" target=\"_blank\">Zenodo</a>. To save time, we ran the previous steps for you and obtained 7 count files, also available on <a href=\"http://dx.doi.org/10.5281/zenodo.61771\" target=\"_blank\">Zenodo</a>.<br>
These files contain, for each gene, the number of reads mapped to it. We could compare the files directly to obtain the differential gene expression. However, the number of sequenced reads mapped to a gene depends on:<br>
<dir>
<li> Its own expression level</li>
<li> Its length</li>
<li> The sequencing depth</li>
<li> The expression of all other genes within the sample</li>
</dir>"
- title: "<b>Step 9: Analysis of DGE</b>"
content: "The counts need to be normalized for both within-sample and between-sample comparisons. We can then run Differential Gene Expression (DGE) analysis, which has two basic tasks:<br>
<dir>
<li> Estimate the biological variance using the replicates for each condition</li>
<li> Estimate the significance of expression differences between any two conditions</li>
</dir>
Expression is estimated from read counts, and the analysis attempts to correct for variability in the measurements using replicates, which are <b>absolutely essential</b> for accurate results. For your own analysis, we advise you to use at least 3, better even 5, biological replicates."
- title: "<b>Step 9: Analysis of DGE</b>"
content: "
<a href=\"https://bioconductor.org/packages/release/bioc/html/DESeq2.html\">DESeq2</a> is a great tool for DGE analysis. It takes read counts produced by <b>HTSeq-count</b> and applies size factor normalization:
<dir>
<li> Computation, for each gene, of the geometric mean of its read counts across all samples</li>
<li> Division of every gene count by that gene's geometric mean</li>
<li> Use of the median of these ratios within a sample as the sample's size factor for normalization</li>
</dir>
Multiple factors can then be incorporated in the analysis. In our example, we have samples with two varying factors:
<dir>
<li> Treatment (either treated or untreated)</li>
<li> Sequencing type (paired-end or single-end)</li>
</dir>
Here, treatment is the primary factor we are interested in. The sequencing type is further information that we know about the data and that might affect the analysis. This multi-factor analysis allows us to assess the effect of the treatment while also taking the sequencing type into account."
- title: "<b>Step 9: Analysis of DGE</b>"
element: "#current-history-panel > div.controls > div.title > div"
intro: "Create and name a new history"
position: "bottom"
- title: "<b>Step 9: Analysis of DGE</b>"
element: ".upload-button"
intro: "Use the upload button to upload the file to Galaxy.<br><br>
Click <b>'Next'</b> and the tour will take you to the Upload screen"
position: "right"
postclick:
- ".upload-button"
- title: "<b>Step 9: Analysis of DGE</b>"
element: ".upload-text-content:first"
intro: "We now paste the links to the new dataset into the upload-box. Click next to do so."
preclick:
- ".upload-button"
- "button#btn-new"
textinsert: |
https://zenodo.org/record/61771/files/GSM461176_untreat_single.counts
https://zenodo.org/record/61771/files/GSM461177_untreat_paired.counts
https://zenodo.org/record/61771/files/GSM461178_untreat_paired.counts
https://zenodo.org/record/61771/files/GSM461179_treat_single.counts
https://zenodo.org/record/61771/files/GSM461180_treat_paired.counts
https://zenodo.org/record/61771/files/GSM461181_treat_paired.counts
https://zenodo.org/record/61771/files/GSM461182_untreat_single.counts
- title: "<b>Step 9: Analysis of DGE</b>"
element: "button#btn-start"
intro: "Now that you've selected the file, select <b>'dm3'</b> as the genome.<br><br>
Click <b>'Next'</b> and the tour will <b>'Start'</b> the upload.<br>
Galaxy will automatically unpack the files."
position: "bottom"
postclick:
- "button#btn-start"
- "button#btn-close"
- title: "<b>Step 9: Analysis of DGE</b>"
element: "#tool-search-query"
intro: "You can now type and select <b>'DESeq2'</b>.<br><br>
<b>Follow these instructions once the tool has loaded:</b><br>
<ol>
<li>'Treatment' as first factor with 'treated' and 'untreated' as levels and selection of count files corresponding to both levels.</li>
<li>You can select several files by keeping the CTRL (or COMMAND) key pressed and clicking on the files of interest</li>
<li>'Sequencing' as second factor with 'PE' and 'SE' as levels and selection of count files corresponding to both levels</li>
<li>Keep the rest of the options at their default values.</li>
<li>Click the 'Execute' button and wait for the tool to finish.</li>
</ol>"
position: "right"
- title: "Step 10: Inspect DGE"
element: '#current-history-panel'
intro: "<b>To inspect the output of DESeq2:</b><br>
<dir>
<li>Click on the <b>'eye icon'</b> of the corresponding dataset</li>
<li>First inspect the tabular file. Its first columns are: gene identifiers; mean normalized counts, averaged over all samples from both conditions; logarithm (to base 2) of the fold change</li>
</dir>"
position: "left"
- title: "Step 10: Inspect DGE"
content: "The log2 fold changes are based on primary factor level 1 vs. factor level 2, so the order of the factor levels is important. For example, for the factor 'Treatment', DESeq2 computes fold changes of 'treated' samples against 'untreated', i.e. the values correspond to up- or downregulation of genes in treated samples.<br>
The remaining columns are:<br>
<dir>
<li>Standard error estimate for the log2 fold change estimate</li>
<li><a href=\"https://en.wikipedia.org/wiki/Wald_test\">Wald</a> statistic</li>
<li><i>p</i>-value for the statistical significance of this change</li>
<li><i>p</i>-value adjusted for multiple testing with the Benjamini-Hochberg procedure, which controls the false discovery rate (<a href=\"https://en.wikipedia.org/wiki/False_discovery_rate\">FDR</a>)</li>
</dir>"
position: "center"
- title: "Step 10: Inspect DGE"
content: "Run <b>Filter</b> to extract genes with a significant change in gene expression (adjusted <i>p</i>-value equal to or below 0.05) between treated and untreated samples.<br>
<b>Find out: How many genes have a significant change in gene expression between these conditions?</b>"
backdrop: true
- title: "Step 10: Inspect DGE"
content: "Run <b>Filter</b> to extract genes with a significant change in gene expression (adjusted <i>p</i>-value equal to or below 0.05) between treated and untreated samples.<br>
<b>Find out: How many genes have a significant change in gene expression between these conditions?</b><br>
Filter the DESeq2 result file for all genes with an adjusted <i>p</i>-value of 0.05 or below (Filter tool: condition c7&lt;=0.05). Please note that the output is already sorted by adjusted <i>p</i>-value.<br>
We get 751 genes (5.05%) with a significant change in gene expression between treated and untreated samples."
- title: "Step 10: Inspect DGE"
content: "The file with the independently filtered results can be used for further downstream analysis. It excludes genes with only a few read counts, because these genes will not be considered significantly differentially expressed."
- title: "Step 10: Inspect DGE"
element: '#current-history-panel'
intro: "Rename your filtered datasets for downstream analysis"
- title: "Step 10: Inspect DGE"
content: "<b>Are there more upregulated or downregulated genes in the treated samples?</b><br>
To obtain the up-regulated genes, we filter the previously generated file (with the significant change in gene expression) with the condition 'c3>0' (the log2 fold changes must be greater than 0). We obtain 331 genes (44.07% of the genes with a significant change in gene expression). For the down-regulated genes, we apply the inverse condition and find 420 genes (55.93% of the genes with a significant change in gene expression)."
- title: "Step 11: Visualize DGE"
element: '#current-history-panel'
intro: "In addition to the list of genes, <b>DESeq2</b> outputs a graphical summary of the results, useful to evaluate the quality of the experiment:<br>
To inspect the histogram of <i>p</i>-values for all tests:<br>
<dir>
<li>Click on the <b>'eye icon'</b> of the corresponding dataset</li>
</dir>"
position: "right"
- title: "Step 11: Visualize DGE"
element: '#current-history-panel'
intro: "<a href=\"https://en.wikipedia.org/wiki/MA_plot\">MA plot</a>: global view of the relationship between the expression change of conditions (log ratios, M), the average expression strength of the genes (average mean, A), and the ability of the algorithm to detect differential gene expression. The genes that passed the significance threshold (adjusted p-value < 0.1) are colored in red.<br>
To inspect the MA-plot:<br>
<dir>
<li>Click on the <b>'eye icon'</b> of the corresponding dataset</li>
</dir>"
position: "right"
- title: "Step 11: Visualize DGE"
element: '#current-history-panel'
intro: "Principal Component Analysis (<a href=\"https://en.wikipedia.org/wiki/Principal_component_analysis\">PCA</a>)<br>
Each replicate is plotted as an individual data point. This type of plot is useful for visualizing the overall effect of experimental covariates and batch effects.<br>
To inspect the PCA-plot:<br>
<dir>
<li>Click on the <b>'eye icon'</b> of the corresponding dataset</li>
</dir><br>
<b>What are the two axes separating?</b>"
position: "right"
backdrop: true
- title: "Step 11: Visualize DGE"
intro: "<dir>
<li>The first axis is separating the treated samples from the untreated samples, as defined when DESeq2 was launched</li>
<li>The second axis is separating the single-end datasets from the paired-end datasets</li>
</dir>"
- title: "Step 11: Visualize DGE"
element: '#current-history-panel'
intro: "Heatmap of sample-to-sample distance matrix: overview of similarities and dissimilarities between samples<br>
To inspect the Heatmap:<br>
<dir>
<li>Click on the <b>'eye icon'</b> of the corresponding dataset</li>
</dir><br>
<b>How are the samples grouped?</b>"
position: "right"
backdrop: true
- title: "Step 11: Visualize DGE"
intro: "They are first grouped depending on the treatment (the first factor) and then on the library type (the second factor), as defined when DESeq2 was launched"
position: "center"
- title: "Step 11: Visualize DGE"
element: '#current-history-panel'
intro: "Dispersion estimates: gene-wise estimates (black), the fitted values (red), and the final maximum a posteriori estimates used in testing (blue)<br>
To inspect the Dispersion estimates plot:<br>
<dir>
<li>Click on the <b>'eye icon'</b> of the corresponding dataset</li>
</dir><br>"
position: "right"
backdrop: true
- title: "Step 11: Visualize DGE"
intro: "This dispersion plot is typical, with the final estimates shrunk from the gene-wise estimates towards the fitted estimates. <br>
Some gene-wise estimates are flagged as outliers and not shrunk towards the fitted value.<br>
The amount of shrinkage can be more or less than seen here, depending on the sample size, the number of coefficients, the row mean and the variability of the gene-wise estimates.<br>
For more information about <b>DESeq2</b> and its outputs, you can have a look at <a href=\"https://www.bioconductor.org/packages/release/bioc/manuals/DESeq2/man/DESeq2.pdf\">DESeq2 documentation</a>"
position: "center"
backdrop: true
- title: "Step 12: Analysis of the functional enrichment among differentially expressed genes"
content: "We have extracted genes that are differentially expressed in treated (with PS gene depletion) samples compared to untreated samples. We would like to know the functional enrichment among the differentially expressed genes.<br>
The Database for Annotation, Visualization and Integrated Discovery (<a href=\"https://david.ncifcrf.gov/\">DAVID</a>) provides a comprehensive set of functional annotation tools for investigators to understand the biological meaning behind large lists of genes.<br>
We then use DAVID to identify functional annotations of the upregulated genes and the downregulated genes."
position: "center"
- title: "Step 12: Analysis of the functional enrichment among differentially expressed genes"
element: '#current-history-panel'
content: "Sort the two datasets generated previously (upregulated genes and downregulated genes) by log2 fold change: in descending order for the upregulated genes and in ascending order for the downregulated genes, so that the largest absolute log2 fold changes are at the top."
position: "right"
- title: "Step 12: Analysis of the functional enrichment among differentially expressed genes"
element: '#current-history-panel'
content: "Extract the first 100 lines of each sorted file, then run DAVID on the extracted files."
- title: "Step 12: Analysis of the functional enrichment among differentially expressed genes"
element: "#tool-search-query"
intro: "You can now type and select <b>'DAVID'</b>.<br><br>
<b>Follow this set of instructions once the tool has loaded:</b><br>
<dir>
<li>Input from first 100 lines of sorted files</li>
<li>First column as 'Column with identifiers'</li>
<li>'FLYBASE_GENE_ID' as 'Identifier type'</li>
<li>Click button 'Execute' and wait for the tool to finish.</li>
</dir>"
position: "right"
- title: "Step 12: Analysis of the functional enrichment among differentially expressed genes"
content: "The output of the <b>DAVID</b> tool is an HTML file with a link to the DAVID website.<br>
Inspect the Functional Annotation Chart.<br>
Which functional categories are the most represented?"
- title: "Step 12: Analysis of the functional enrichment among differentially expressed genes"
content: "Judging by the number of genes, the up-regulated genes are mostly related to membrane. For the down-regulated genes, the most represented functional categories are linked to signal and pathways.<br>
Now inspect the Functional Annotation Clustering.<br>
Which functional annotations are the first clusters related to?"
- title: "Step 12: Analysis of the functional enrichment among differentially expressed genes"
content: "For the up-regulated genes, the first cluster consists mostly of functions related to chaperone and stress response. The down-regulated genes are more linked to ligase activity."
- title: "Step 13: Inference of the differential exon usage"
content: "Now, we would like to know the differential exon usage between treated (PS depleted) and untreated samples using RNA-seq exon counts.<br>
We will therefore go back and work on the mapping results <a href=\"https://zenodo.org/record/61771/files/GSM461177_untreat_paired_chr4.bam\">GSM461177_untreat_paired_chr4.bam</a>.<br>
Copy the <code>Drosophila_melanogaster.BDGP5.78.gtf</code> file and the BAM file from the first and second history.<br>
We use <a href=\"http://www.bioconductor.org/packages/release/bioc/html/DEXSeq.html\">DEXSeq</a>, which detects with high sensitivity genes, and in many cases exons, that are subject to differential exon usage."
- title: "Step 13: Inference of the differential exon usage"
element: "#tool-search-query"
intro: "First, we need to count the number of reads mapping to the exons. This step is similar to counting the number of reads per annotated gene. Here, instead of HTSeq-count, we use DEXSeq-Count.<br>
You can now type and select <b>'DEXSeq-Count'</b>.<br><br>
<b>Follow this set of instructions once the tool has loaded:</b><br>
<dir>
<li>Run on Drosophila_melanogaster.BDGP5.78.gtf</li>
<li>'Prepare annotation' as 'Mode of operation'</li>
<li>Keep the rest of the options at their default values.</li>
<li>Click button 'Execute' and wait for the tool to finish.</li>
<li>The output is again a GTF file that is ready to use for counting.</li>
</dir>"
position: "right"
- title: "Step 13: Inference of the differential exon usage"
element: "#tool-search-query"
intro: "Now we count the reads.<br>
You can now type and select <b>'DEXSeq-Count'</b>.<br><br>
<b>Follow this set of instructions once the tool has loaded:</b><br>
<dir>
<li>Run on GSM461177_untreat_paired_chr4.bam</li>
<li>'Count Reads' as 'Mode of operation'</li>
<li>Keep the rest of the options at their default values.</li>
<li>Click button 'Execute' and wait for the tool to finish.</li>
</dir>"
position: "right"
- title: "Step 13: Inference of the differential exon usage"
element: '#current-history-panel'
intro: "Inspect the output<br>
<b>Which exon has the most reads mapped to it? Which gene does this exon belong to? Is this similar to the previous result with HTSeq-count?</b>"
position: "right"
- title: "Step 13: Inference of the differential exon usage"
intro: "FBgn0017545:004 is the exon with the most reads mapped to it. It is part of FBgn0017545, the feature with the most reads mapped according to HTSeq-count."
position: "center"
- title: "Step 13: Inference of the differential exon usage"
intro: "DEXSeq usage is similar to DESeq2. It uses similar statistics to find differentially used exons.<br>
As for DESeq2, in the previous step we counted only reads that mapped to exons on chromosome 4, and only for one sample. To identify differential exon usage induced by PS depletion, all datasets (3 treated and 4 untreated) must be analyzed with the same procedure.<br>
To save time, we did that for you. The results are available on <a href=\"http://dx.doi.org/10.5281/zenodo.61771\">Zenodo</a>; we will load them into your history using the same file upload procedure as before. You can now create a new history."
position: "center"
backdrop: true
- title: "Step 13: Inference of the differential exon usage"
element: ".upload-text-content:first"
intro: "We now paste the links to the new dataset into the upload-box. Click next to do so."
preclick:
- ".upload-button"
- "button#btn-new"
textinsert: |
https://zenodo.org/record/61771/files/dexseq.gtf
https://zenodo.org/record/61771/files/treated1_singlea.txt
https://zenodo.org/record/61771/files/treated2_paired.txt
https://zenodo.org/record/61771/files/treated3_paired.txt
https://zenodo.org/record/61771/files/untreated1_single.txt
https://zenodo.org/record/61771/files/untreated2_single.txt
https://zenodo.org/record/61771/files/untreated3_paired.txt
https://zenodo.org/record/61771/files/untreated4_paired.txt
- title: "Step 13: Inference of the differential exon usage"
element: "button#btn-start"
intro: "Now that you've selected the file, select <b>'dm3'</b> as the genome.<br><br>
Click <b>'Next'</b> and the tour will <b>'Start'</b> the upload.<br>
Galaxy will automatically unpack the files."
position: "bottom"
postclick:
- "button#btn-start"
- "button#btn-close"
- title: "Step 13: Inference of the differential exon usage"
element: "#tool-search-query"
intro: "You can now type and select <b>'DEXSeq'</b>.<br><br>
<b>Follow this set of instructions once the tool has loaded:</b><br>
<ol>
<li>'Condition' as first factor with 'treated' and 'untreated' as levels, and selection of the count files corresponding to each level.</li>
<li>You can select several files by keeping the CTRL (or COMMAND) key pressed while clicking on the files of interest.</li>
<li>Unlike DESeq2, DEXSeq does not allow flexible primary factor names. Always use 'condition' as the name of your primary factor.</li>
<li>'Sequencing' as second factor with 'PE' and 'SE' as levels, and selection of the count files corresponding to each level.</li>
<li>Keep the rest of the options at their default values.</li>
<li>Click the 'Execute' button and wait for the tool to finish.</li>
</ol>"
position: "right"
- title: "Step 13: Inference of the differential exon usage"
content: "Similarly to DESeq2, DEXSeq generates a table with:<br>
<dir>
<li>Exon identifiers</li>
<li>Gene identifiers</li>
<li>Exon identifiers in the Gene</li>
<li>Mean normalized counts, averaged over all samples from both conditions</li>
<li>Logarithm (base 2) of the fold change</li>
<li>The log2 fold changes are based on primary factor level 1 vs. factor level 2, so the order of the factor levels is important. For example, for the factor 'Condition', DEXSeq computes fold changes of 'treated' samples against 'untreated', i.e. the values correspond to up- or downregulation in treated samples.</li>
<li>Standard error estimate for the log2 fold change estimate</li>
<li>p-value for the statistical significance of this change</li>
<li>p-value adjusted for multiple testing with the Benjamini-Hochberg procedure which controls false discovery rate <a href=\"https://en.wikipedia.org/wiki/False_discovery_rate\">FDR</a></li>
</dir>"
position: "center"
- title: "Step 13: Inference of the differential exon usage"
content: "Run <b>Filter</b> to extract exons with a significant change in usage (adjusted <i>p</i>-value equal to or below 0.05) between treated and untreated samples.<br>
How many exons have a significant change in usage between these conditions?"
- title: "Step 13: Inference of the differential exon usage"
content: "We get 38 exons (12.38%) with a significant usage change between treated and untreated samples"
- title: "Step 14: Annotation of the result tables with gene information"
content: "Unfortunately, in the process of counting, we lose all the information about a gene except its identifier. To bring this information back into our final counting tables, we can use the tool 'Annotate DE(X)Seq result' to link each identifier to its annotation."
- title: "Step 14: Annotation of the result tables with gene information"
element: "#tool-search-query"
content: "Run <b>Annotate DE(X)Seq result</b> on a counting table (from DESeq2 or DEXSeq) using <code>Drosophila_melanogaster.BDGP5.78.gtf</code> as the annotation file."
- title: "<b>Step 15: Name your history</b>"
element: "#current-history-panel > div.controls > div.title > div"
intro: "Change the name of your history."
position: "bottom"
- title: "<b>Step 16: Make a workflow out of steps 5 to 9</b>"
element: '#history-options-button'
intro: "Please extract your history to a workflow.<br>
<b>(History options :: Extract workflow)</b><br><br>
<b>Do not include:</b> 'RNAplot'<br><br>
Click <b>'Create Workflow'</b>."
position: "left"
- title: "<b>Step 16: Make a workflow</b>"
element: '#history-options-button'
intro: "To make sure the workflow is correct, check it in the editor and make some small adjustments.<br><br>
<dir>
<li>Click on the name of your new workflow and select <b>'Edit'</b></li>
<li>The individual steps are displayed as boxes, their <b>outputs and inputs are connected through lines</b></li>
<li>When you click on a box you see the tool options on the right. Besides the tools, you should see two additional boxes titled <b>'Input dataset'</b>. These represent the data we want to feed into our workflow.</li>
</dir>"
position: "left"
- title: "<b>Step 16: Make a workflow</b>"
element: '#history-options-button'
intro: "To make sure our workflow is correct, we look at it in the editor and make some small adjustments.<br><br>
<dir>
<li>Although we have several inputs in the workflow, they are missing their connections to some of the tools we used, because we didn't carry over the intermediate steps</li>
<li><b>Connect</b> each input dataset to the Intersect tool by <b>dragging</b> the arrow pointing outwards on the right of its box (which denotes an output) to an arrow on the left of the Intersect box pointing inwards (which denotes an input). Connect each input dataset with a different input of Intersect</li>
<li>You can also <b>change the names</b> of the input datasets. Don't forget to save the workflow at the end by clicking on <b>'Options'</b> (top right) and selecting <b>'Save'</b></li>
</dir>"
position: "left"
- title: "<b>Step 17: Sharing workflow</b>"
element: 'a[href$="/workflow/list_for_run"]'
intro: "You can share your new workflow.<br>
<dir>
<li>Click on your workflow's name and select <b>'Share or publish'</b></li>
<li>Click <b>'Share with a user'</b></li>
<li>Enter the email address of the person you wish to share your workflow with (the same address they use to log in to Galaxy)</li>
<li>Hit <b>'Share'</b></li>
</dir>"
position: "top"
- title: "<b>Step 17: Sharing workflow</b>"
element: '#center-panel'
intro: "If a workflow has been shared with you, you can find it under <b>'Workflows shared with you by others'</b>:<br>
<dir>
<li>Click on a workflow name and select <b>'View'</b></li>
<li>You can compare the workflows of others with your workflow</li>
</dir>"
position: "right"
- title: "Concluding remarks"
content: "In this tutorial, we have analyzed real RNA sequencing data to extract useful information, such as which genes are up- or downregulated by depletion of the Pasilla gene and which genes are regulated by the Pasilla gene. To answer these questions, we analyzed RNA sequence datasets using a reference-based RNA-seq data analysis approach."
- title: "<b>Enjoy the Galaxy RNA-workbench</b>"
intro: "Thanks for taking this tour! Happy research with Galaxy!"