New Evaluating Reference Data for Bulk RNA Deconvolution tutorial #5549

Open

hexhowells wants to merge 72 commits into galaxyproject:main from hexhowells:deconvolution-evaluation-tutorial

Collaborator

hexhowells commented Nov 19, 2024

New tutorial on evaluating reference data for bulk RNA deconvolution, covering both the MuSiC and NNLS deconvolution tools within Galaxy.

hexhowells and others added 30 commits

October 5, 2024 08:18


          add initial tutorial

d64f9ca


          Merge branch 'galaxyproject:main' into deconvolution-evaluation-tutorial

471ecd7


          add introduction and agenda

ba0299a


          update processing single-cell data section

3517cdb


          Merge branch 'galaxyproject:main' into deconvolution-evaluation-tutorial

9cb86a0


          update visualise results section

4fd1de0


          update formatting for the list of fields parameters

d48d424


          remove newline in scatterplot tool metadata

6d7aacc


          add scatterplot tool instruction for NNLS

655ac6b


          re-order remove header tool before combine collections

68c1216


          Merge branch 'galaxyproject:main' into deconvolution-evaluation-tutorial

8776cfc


          update visualisation and metrics section

b9eea60


          Merge branch 'galaxyproject:main' into deconvolution-evaluation-tutorial

ee63844


          add equation diagrams

51cbbec


          Merge branch 'deconvolution-evaluation-tutorial' of https://github.co…

0c2c3a0

…m/hexhowells/training-material into deconvolution-evaluation-tutorial


          add compare scatterplots image

ba4a59d


          update tutorial

bac873c


          add answer key to scatterplot question

3aaf9f1


          fix advanced cut tool parameter

c0784f1


          add explanation of RMSE

6bcbb47


          add metric analysis

a867acb


          add violin plot tools

1cb54c0


          change scatter plot names

f4a1f70


          add keys to metric equations

ad5ecec


          add question box for violin plots

ea372ba


          update metrics workflow

f404f75


          minor updates to intro

ffb1d4b


          minor updates to get data section

5f900c6


          add inspect single-cell data section

1369f39


          standardise header capitalisation

dde437b

hexhowells and others added 15 commits

November 13, 2024 12:41


          update workflow links

fe9a7e0


          add embedded workflows

d454879


          minor updates/typo fixes

9d9a93d


          minor updates

cac4815


          minor updates

4f2e319


          minor updates

a63d832


          update second workflow

36de17e


          update results table

469d11a


          update scatter plot size

5ab221a


          update plot images

249b602


          update scatter plot questions

4abce18


          update the violin plot questions

09aca40


          minor update

fbcf9d4


          add punctuation

7c1aa84


          Merge branch 'galaxyproject:main' into deconvolution-evaluation-tutorial

hexhowells added new tutorial single-cell labels

hexhowells requested a review from a team as a code owner

November 19, 2024 12:26

hexhowells and others added 9 commits

November 20, 2024 11:22


          fix linting error

c06186e


          Merge branch 'deconvolution-evaluation-tutorial' of https://github.co…

27971e2

…m/hexhowells/training-material into deconvolution-evaluation-tutorial


          Merge branch 'main' into deconvolution-evaluation-tutorial

8f2f946


          possibly fix linting issue

068476d


          Merge branch 'deconvolution-evaluation-tutorial' of https://github.co…

39921ca

…m/hexhowells/training-material into deconvolution-evaluation-tutorial


          Merge branch 'main' into deconvolution-evaluation-tutorial

86cfc3a


          change parameter names for 1st workflow

21b16ab


          Merge branch 'deconvolution-evaluation-tutorial' of https://github.co…

793a88b

…m/hexhowells/training-material into deconvolution-evaluation-tutorial


          Merge branch 'main' into deconvolution-evaluation-tutorial

dd9cc6f

shiltemann reviewed

Member

shiltemann left a comment

Thanks @hexhowells! This looks great. A few minor comments below. I can't speak to the science, but perhaps @nomadscientist can have a look here as well?

topics/single-cell/tutorials/bulk-deconvolution-evaluate/tutorial.md

+              title: Evaluating Reference Data for Bulk RNA Deconvolution
+              subtopic: deconvo
+              priority: 3
+              zenodo_link: ''

Member

shiltemann Dec 5, 2024

Suggested change

      
            zenodo_link: ''
          
            zenodo_link: 'https://zenodo.org/records/5719228'

topics/single-cell/tutorials/bulk-deconvolution-evaluate/tutorial.md

+              >    - *"Delimited by"*: `Tab`
+              >    - *"How should the results be sorted?"*: `With the most common value first`
+              >
+              > 2. **Rename** {% icon galaxy-pencil %} output `Cell type counts`

Member

shiltemann Dec 5, 2024

Maybe add the FAQ for renaming a dataset here, the first time people are asked to do it?

topics/single-cell/tutorials/bulk-deconvolution-evaluate/tutorial.md

+              >    {% snippet faqs/galaxy/workflows_run.md %}
+              {: .hands_on}
+              <iframe title="Galaxy Workflow Embed" style="width: 100%; height: 700px; border: none;" src="https://usegalaxy.eu/published/workflow?id=76d3408d0d22ad05&embed=true&buttons=true&about=false&heading=false&minimap=true&zoom_controls=true&initialX=-20&initialY=-20&zoom=0.5"></iframe>

Member

shiltemann Dec 5, 2024

cool!

topics/single-cell/tutorials/bulk-deconvolution-evaluate/tutorial.md

+              title: Evaluating Reference Data for Bulk RNA Deconvolution
+              subtopic: deconvo
+              priority: 3

Member

shiltemann Dec 5, 2024

Did you mean for this tutorial to be 3rd in the subsection? It is second now because the other tutorials in the subsection have priorities 1 and 4 listed. So please double check that the order is how you want it.
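
The ordering point can be illustrated with a minimal sketch, assuming tutorials within a subtopic are listed in ascending `priority` order (as the comment implies). Only the priority values 1, 3 and 4 come from the thread; the other tutorial names are hypothetical placeholders.

```python
# Hypothetical tutorials in the `deconvo` subtopic. Only the priority
# values (1, 4 and this PR's 3) are taken from the review comment.
tutorials = [
    {"name": "existing-tutorial-a", "priority": 1},
    {"name": "existing-tutorial-b", "priority": 4},
    {"name": "bulk-deconvolution-evaluate", "priority": 3},
]

# Assumption: the subtopic listing is sorted by ascending priority value.
ordered = sorted(tutorials, key=lambda t: t["priority"])

# 1-based position of the new tutorial in the resulting list.
names = [t["name"] for t in ordered]
position = names.index("bulk-deconvolution-evaluate") + 1
print(position)
```

Under that assumption, a priority of 3 places the tutorial wherever 3 falls among the existing values, not necessarily in third place.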

topics/single-cell/tutorials/bulk-deconvolution-evaluate/tutorial.md


		Remember since we have a collection of 20 inputs, the output of this workflow will be a collection of 20 elements, each corresponding to the input elements. Each output will have its own random selection of 200 cells.

		> <comment-title>Inputting multiple datasets</comment-title>

Member

shiltemann Dec 5, 2024 (edited)

This could probably be an FAQ, or maybe you could use the existing "select multiple datasets" FAQ? And maybe enhance it with your screenshot?

topics/single-cell/tutorials/bulk-deconvolution-evaluate/tutorial.md

+              >    - Copy the URL (e.g. via right-click) of [this workflow](https://usegalaxy.eu/u/hexhowells/w/deconv-eval-stage-1) or download it to your computer.
+              >    - Import the workflow into Galaxy
+              >
+              >    {% snippet faqs/galaxy/workflows_run_trs.md path="topics/transcriptomics/tutorials/rna-seq-reads-to-counts/workflows/qc_report.ga" title="QC Report" %}

Member

shiltemann Dec 5, 2024

this is a tad confusing, as it is a hands-on box within a hands-on box. Are they meant to run this QC report workflow at this point? Or did you mean to replace the workflow with the one mentioned in step 1 here?

topics/single-cell/tutorials/bulk-deconvolution-evaluate/tutorial.md

+              > <hands-on-title>Run pseudo-bulk and actual proportions workflow</hands-on-title>
+              >
+              > 1. **Import the workflow** into Galaxy
+              >    - Copy the URL (e.g. via right-click) of [this workflow](https://usegalaxy.eu/u/hexhowells/w/deconv-eval-stage-1) or download it to your computer.

Member

shiltemann Dec 5, 2024

please include the workflow here in the GTN as well and refer to that in the link.

carloscheemendonca reviewed

topics/single-cell/tutorials/bulk-deconvolution-evaluate/tutorial.md

+              >    - {% icon param-file %} *"Input Dataset"*: `Transposed expression matrix`
+              >    - *"Size of output collection"*: `20`
+              >
+              > 4. **Rename** {% icon galaxy-pencil %} output `Expression data`

carloscheemendonca Dec 8, 2024

I would recommend keeping the terms consistent: later on, when the first workflow is run, these inputs are described with different terminology.

carloscheemendonca reviewed

topics/single-cell/tutorials/bulk-deconvolution-evaluate/tutorial.md

+              > In order to upload the input collections into the workflow, you first need to set the input type to **Multiple datasets** in the input file selection.
+              >
+              > ![Multiple Datasets](../../images/bulk-deconvolution-evaluate/batch-mode.png "Multiple Datasets button in Galaxy")
+              {: .comment}

carloscheemendonca Dec 8, 2024

I experienced some difficulty loading these collections into the workflow. They would not appear in the pull down menu, and I had to drag and drop the collections without visual confirmation that they had been uploaded.

topics/single-cell/tutorials/bulk-deconvolution-evaluate/tutorial.md

+              >    - {% icon param-collection %} *"Expression Data"*: `Expression Data`
+              >
+              >    {% snippet faqs/galaxy/workflows_run.md %}
+              > 3. Add a tag labelled `#A` to the first "Actual cell proportions" and "Pseudobulk" collections

carloscheemendonca Dec 8, 2024

I am not completely sure, but I feel like the term "actual cell proportions" might be a little misleading. The cell proportions indicated by proportional representation in the single-cell data are often different from the true in vivo cell type proportions, due to systematic dropout biases during data collection. This might be worth mentioning, or maybe a different term that doesn't use "actual" could be substituted.
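
To make the distinction concrete, here is a minimal sketch of how reference proportions of this kind are derived, assuming they are simply the fraction of sampled cells carrying each label. The function name and the labels are hypothetical, and the tutorial's workflow may compute them differently.

```python
from collections import Counter


def cell_type_proportions(labels):
    """Return the fraction of cells carrying each cell-type label."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {cell_type: n / total for cell_type, n in counts.items()}


# Hypothetical labels for a 10-cell subsample (the tutorial samples 200 cells).
labels = ["alpha"] * 5 + ["beta"] * 3 + ["delta"] * 2
reference = cell_type_proportions(labels)
print(reference)
```

The result describes only the cells that were captured and labelled, so any cell type that is systematically under-sampled during data collection is under-represented in it as well.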

topics/single-cell/tutorials/bulk-deconvolution-evaluate/tutorial.md

+              >    {% snippet faqs/galaxy/workflows_run_trs.md path="topics/transcriptomics/tutorials/rna-seq-reads-to-counts/workflows/qc_report.ga" title="QC Report" %}
+              >
+              > 2. Run **Workflow inferring cellular proportions** {% icon workflow %} using the following parameters:
+              >    - {% icon param-collection %} *"Pseudobulk - A"*: `expression data - A`

carloscheemendonca Dec 8, 2024

These collections also did not appear in the drop down menus, and rather had to be dragged and dropped.

topics/single-cell/tutorials/bulk-deconvolution-evaluate/tutorial.md

+              > >
+              > > ![Scatter plot comparison](../../images/bulk-deconvolution-evaluate/scatterplot-compare.png "Scatter plot comparison between Music and NNLS")
+              > >
+              > > 1. Comparing scatter plots, the MuSiC tool has the most accurate results since the points fall closer onto the x=y line

carloscheemendonca Dec 8, 2024

Imagine the case where the NNLS deconvolution more closely resembled the cell proportions in the real, biological context, while MuSiC more accurately recapitulated the proportions from the single-cell data. Which of these two methods is really more accurate, then?
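
A small sketch of this thought experiment, using RMSE (a metric the tutorial explains) as the accuracy measure. All numbers below are invented for illustration and are not results from the tutorial; the point is only that the ranking of two methods depends on which reference the error is measured against.

```python
import math


def rmse(predicted, reference):
    """Root mean squared error between two cell-type proportion dicts."""
    cell_types = sorted(reference)
    squared = [(predicted[c] - reference[c]) ** 2 for c in cell_types]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical proportions: one reference taken from single-cell labels,
# one standing in for the (usually unknown) in vivo composition.
single_cell_ref = {"alpha": 0.50, "beta": 0.30, "delta": 0.20}
in_vivo_ref = {"alpha": 0.30, "beta": 0.50, "delta": 0.20}

# Hypothetical predictions from the two deconvolution tools.
music_pred = {"alpha": 0.48, "beta": 0.32, "delta": 0.20}
nnls_pred = {"alpha": 0.33, "beta": 0.47, "delta": 0.20}

for ref_name, ref in [("single-cell", single_cell_ref), ("in vivo", in_vivo_ref)]:
    print(ref_name, "MuSiC:", round(rmse(music_pred, ref), 3),
          "NNLS:", round(rmse(nnls_pred, ref), 3))
```

With these made-up values MuSiC has the lower error against the single-cell reference and NNLS has the lower error against the in vivo one, so "most accurate" is only meaningful relative to the chosen reference.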


Labels

new tutorial single-cell