New Evaluating Reference Data for Bulk RNA Deconvolution tutorial #5549
…m/hexhowells/training-material into deconvolution-evaluation-tutorial
Thanks @hexhowells! This looks great. A few minor comments below. I can't speak to the science, but perhaps @nomadscientist can have a look here as well?
title: Evaluating Reference Data for Bulk RNA Deconvolution
subtopic: deconvo
priority: 3
zenodo_link: ''
Suggested change:
- zenodo_link: ''
+ zenodo_link: 'https://zenodo.org/records/5719228'
> - *"Delimited by"*: `Tab`
> - *"How should the results be sorted?"*: `With the most common value first`
>
> 2. **Rename** {% icon galaxy-pencil %} output `Cell type counts`
Maybe add the FAQ for renaming a dataset here, the first time people are asked to do it?
> {% snippet faqs/galaxy/workflows_run.md %}
{: .hands_on}

<iframe title="Galaxy Workflow Embed" style="width: 100%; height: 700px; border: none;" src="https://usegalaxy.eu/published/workflow?id=76d3408d0d22ad05&embed=true&buttons=true&about=false&heading=false&minimap=true&zoom_controls=true&initialX=-20&initialY=-20&zoom=0.5"></iframe>
cool!
title: Evaluating Reference Data for Bulk RNA Deconvolution
subtopic: deconvo
priority: 3
Did you mean for this tutorial to be third in the subsection? It is second now because the other tutorials in the subsection have priorities 1 and 4. Please double-check that the order is how you want it.
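To make the ordering point concrete, here is a minimal sketch. It assumes tutorials in a subtopic are listed in ascending `priority` order, which is what the behaviour described above suggests; the tutorial names are hypothetical placeholders, not the real ones.

```python
# Hypothetical tutorial names; only the priority values (1, 4, 3) come from
# the discussion above. Assumes ascending sort by priority.
tutorials = [
    ("existing-tutorial-a", 1),
    ("existing-tutorial-b", 4),
    ("deconvolution-evaluation", 3),
]

# Sort by the priority value and find where the new tutorial lands (1-based).
ordered = sorted(tutorials, key=lambda item: item[1])
position = [name for name, _ in ordered].index("deconvolution-evaluation") + 1
print(position)
```

Under that assumption, a priority of 3 only places the tutorial third if another tutorial in the subtopic has priority 2.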
**Remember** since we have a collection of 20 inputs, the output of this workflow will be a collection of 20 elements, each corresponding to the input elements. Each output will have its own random selection of 200 cells.

> <comment-title>Inputting multiple datasets</comment-title>
This could probably be an FAQ. Or maybe you could use the existing "select multiple datasets" FAQ, and enhance it with your screenshot?
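For readers following the "Remember" note in the diff above (20 collection elements, each with its own random selection of 200 cells), a rough sketch of what that amounts to is below. This is not the workflow's implementation: the function name, the cell identifiers, the seed, and sampling without replacement are all assumptions made for illustration.

```python
import random


def subsample_collection(cell_ids, n_elements=20, cells_per_element=200, seed=0):
    """Draw `n_elements` independent random subsets of `cells_per_element` cells.

    Sampling is without replacement within each subset (an assumption).
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    return [rng.sample(cell_ids, cells_per_element) for _ in range(n_elements)]


# Hypothetical cell barcodes standing in for the single-cell dataset.
cell_ids = [f"cell_{i}" for i in range(2000)]
collection = subsample_collection(cell_ids)
print(len(collection), len(collection[0]))
```

Each element of `collection` would then correspond to one pseudo-bulk sample and one set of reference proportions.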
>    - Copy the URL (e.g. via right-click) of [this workflow](https://usegalaxy.eu/u/hexhowells/w/deconv-eval-stage-1) or download it to your computer.
>    - Import the workflow into Galaxy
>
>    {% snippet faqs/galaxy/workflows_run_trs.md path="topics/transcriptomics/tutorials/rna-seq-reads-to-counts/workflows/qc_report.ga" title="QC Report" %}
This is a tad confusing, as it is a hands-on box within a hands-on box. Are they meant to run this QC report workflow at this point? Or did you mean to replace the workflow with the one mentioned in step 1 here?
> <hands-on-title>Run pseudo-bulk and actual proportions workflow</hands-on-title>
>
> 1. **Import the workflow** into Galaxy
>    - Copy the URL (e.g. via right-click) of [this workflow](https://usegalaxy.eu/u/hexhowells/w/deconv-eval-stage-1) or download it to your computer.
Please include the workflow here in the GTN as well and refer to that in the link.
>    - {% icon param-file %} *"Input Dataset"*: `Transposed expression matrix`
>    - *"Size of output collection"*: `20`
>
> 4. **Rename** {% icon galaxy-pencil %} output `Expression data`
I would recommend keeping the terms consistent: later on, when the first workflow is run, these inputs are described with different terminology.
> In order to upload the input collections into the workflow, you first need to set the input type to **Multiple datasets** in the input file selection.
>
> ![Multiple Datasets](../../images/bulk-deconvolution-evaluate/batch-mode.png "Multiple Datasets button in Galaxy")
{: .comment}
I had some difficulty loading these collections into the workflow. They did not appear in the drop-down menu, and I had to drag and drop the collections without any visual confirmation that they had been loaded.
>    - {% icon param-collection %} *"Expression Data"*: `Expression Data`
>
>    {% snippet faqs/galaxy/workflows_run.md %}
> 3. Add a tag labelled `#A` to the first "Actual cell proportions" and "Pseudobulk" collections
I am not completely sure, but I feel the term "actual cell proportions" might be a little misleading. The cell proportions indicated by proportional representation in the single-cell data often differ from the true in vivo cell type proportions because of systematic dropout biases during data collection. This might be worth mentioning, or a different term that doesn't use "actual" could be substituted.
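To illustrate what these proportions are, here is a minimal sketch of deriving them purely from per-cell annotations in a subsample. The function name, cell type labels, and counts are made up for illustration and are not taken from the tutorial data.

```python
from collections import Counter


def cell_type_proportions(labels):
    """Return {cell_type: fraction of cells} for a list of per-cell annotations."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {cell_type: n / total for cell_type, n in counts.items()}


# Hypothetical annotations for a 200-cell subsample (made-up counts).
labels = ["beta"] * 120 + ["alpha"] * 60 + ["delta"] * 20
proportions = cell_type_proportions(labels)
print(proportions)
```

The result describes the composition of the sampled cells only. Any capture bias in the single-cell experiment is carried straight into these numbers, which is why a label such as "reference proportions" may be safer than "actual".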
>    {% snippet faqs/galaxy/workflows_run_trs.md path="topics/transcriptomics/tutorials/rna-seq-reads-to-counts/workflows/qc_report.ga" title="QC Report" %}
>
> 2. Run **Workflow inferring cellular proportions** {% icon workflow %} using the following parameters:
>    - {% icon param-collection %} *"Pseudobulk - A"*: `expression data - A`
These collections also did not appear in the drop-down menus and instead had to be dragged and dropped.
> >
> > ![Scatter plot comparison](../../images/bulk-deconvolution-evaluate/scatterplot-compare.png "Scatter plot comparison between Music and NNLS")
> >
> > 1. Comparing scatter plots, the MuSiC tool has the most accurate results since the points fall closer onto the x=y line
Imagine the case where the NNLS deconvolution more closely resembled the cell proportions in the real, biological context, while MuSiC more accurately recapitulated the proportions from the single-cell data. Which of these two methods is really more accurate, then?
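A small sketch of why the answer depends on the reference: distance from the x=y line can be summarised as a root-mean-square error (RMSE) against whichever proportions are treated as ground truth. Every number below is invented for illustration, and `method_a` / `method_b` are placeholders rather than results from MuSiC or NNLS.

```python
import math


def rmse(estimated, reference):
    """Root-mean-square error over the cell types listed in `reference`."""
    squared = [(estimated.get(ct, 0.0) - p) ** 2 for ct, p in reference.items()]
    return math.sqrt(sum(squared) / len(squared))


# All values are hypothetical.
single_cell_ref = {"alpha": 0.30, "beta": 0.60, "delta": 0.10}  # from scRNA-seq labels
tissue_ref = {"alpha": 0.45, "beta": 0.45, "delta": 0.10}       # imagined in vivo truth
method_a = {"alpha": 0.31, "beta": 0.58, "delta": 0.11}
method_b = {"alpha": 0.43, "beta": 0.46, "delta": 0.11}

# Score both placeholder methods against each choice of reference.
for name, ref in [("single-cell", single_cell_ref), ("tissue", tissue_ref)]:
    print(name, round(rmse(method_a, ref), 3), round(rmse(method_b, ref), 3))
```

In this made-up case the ranking of the two methods flips with the choice of reference, so the tutorial's "most accurate" claim is best read as "closest to the single-cell-derived proportions".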
New tutorial on evaluating reference data for bulk RNA deconvolution tools, evaluating both MuSiC and NNLS deconvolution tools within Galaxy.