Merge pull request #4 from levitsky/tm/move_files

Move teachingmodule.qmd, add combined_protein.tsv, add SDRF section
hds-sandbox · Jan 13, 2025 · ea04245
2 parents 44eec0c + 948fab5
Showing 9 changed files with 12,619 additions and 97 deletions.
diff --git a/TeachingModule/AnalysisMSData_FragPipe.qmd b/TeachingModule/AnalysisMSData_FragPipe.qmd
@@ -1,4 +1,6 @@
-In this section of the teaching module, we will work with data from the paper. The first task is to download sample files from the paper, guided by the questions provided below:
+### Getting data and preparing the work environment
+
+In this section of the teaching module, we will be working with data from the paper. The first task is to download sample files from the paper, guided by the questions provided below:
 
 ::: {.question}
 Where can the data be found?
@@ -21,8 +23,9 @@ By examining the accession code for the data deposited on ProteomeXChange, we ca
 What is FTP, and what is its functionality?
 :::
 
+#### Starting the Proteomics Sandbox
 
-To download the data, we will use the **Proteomics Sandbox Application** on UCloud. This platform provides the necessary storage capacity and computational power to perform this process.
+To download and then process the data, we will use the **Proteomics Sandbox Application** on UCloud. This platform provides the necessary storage capacity and computational power to perform this process.
 
 The **Proteomics Sandbox Application** is a virtual environment that includes various software tools, such as **FragPipe**, for analyzing proteomics data.
 
@@ -47,18 +50,18 @@ You can access the **Proteomics Sandbox Application** on UCloud [here](https://c
 
 In UCloud, the settings should look like this:
 
-![](images/TeachingModuleInstructions/UCloud_settings.PNG){width=750 fig-align="center"}
+![](../images/TeachingModuleInstructions/UCloud_settings.PNG){width=750 fig-align="center"}
 
 Before submitting the job, it is highly recommended to create a personal folder to securely store both your data and the results generated by **FragPipe**. Follow the step-by-step guide below for an effortless setup:
 
 1. First, click on the vibrant blue `Add folder` button.
 2. Next, select the exact directory you wish to mount, as illustrated below:
 
-![](images/TeachingModuleInstructions/AddFolder1.png){width=750 fig-align="center"}
+![](../images/TeachingModuleInstructions/AddFolder1.png){width=750 fig-align="center"}
 
 Upon clicking, a window similar to the one below will appear. Here, you have the option to either create a specific folder within a particular drive in the workspace you’ve chosen or simply select the entire drive itself. In this example, the drive is labeled as `Home` and the workspace is `My workspace`.
 
-![](images/TeachingModuleInstructions/AddFolder2.png){width=750 fig-align="center"}
+![](../images/TeachingModuleInstructions/AddFolder2.png){width=750 fig-align="center"}
 
 
 ::: {.callout-caution}
@@ -96,7 +99,7 @@ To download one sample file from each of the Plex Sets, we will need these URLs
 ```
 
 ::: {.callout-tip collapse="true"}
-## Tip to Download of Data
+##### Tip to Download of Data
 You can also download this list [here](https://github.com/hds-sandbox/proteomics-sandbox/blob/webpage/TeachingModule/urls.txt).
 :::
 
@@ -116,7 +119,7 @@ Next, we can launch FragPipe, which is located on the desktop. In this tutorial,
 
 Now that FragPipe is launched, we need to configure the settings before running the analysis. To assist you in setting up the settings in FragPipe, we have provided some guiding questions:
 
-### Getting started with FragPipe
+### Analyzing data with FragPipe
 
 ::: {.callout-note}
 Some of the information you will need in this section can be found in **Supplementary Information** to the study. Open the **Supplementary Information**  and go to page 25, **Supplementary Methods**.
@@ -171,13 +174,12 @@ When all settings have been obtained, MSFragger should look something like this:
 What is MSFragger? What does it do?
 :::
 
-
 ::: {.question}
 How does MSFragger operate?
 :::
 
 ::: {.callout-note}
-You can also skip configuring MSFragger manually and just use this [parameter file](https://github.com/hds-sandbox/proteomics-sandbox/blob/webpage/fragger.params).
+You can also skip configuring MSFragger manually and just use this [parameter file](https://github.com/hds-sandbox/proteomics-sandbox/blob/webpage/TeachingModule/fragger.params).
 You will need to upload it to UCloud and then load it on the "MSFragger" tab in FragPipe.
 :::
 
@@ -189,12 +191,10 @@ This process might take some time, so make sure that you still have enough hours
 What are your expectations regarding the output results? Consider the implications of the number of files provided for this search in your response.
 :::
 
-
 ::: {.question}
 Can the output from this analysis be reliably used for downstream applications given the limited number of sample files? Justify your answer.
 :::
 
-
 ::: {.question}
 What does it signify that the sample tissues have been fractionated as described in **Supplementary Information**?
 
@@ -214,9 +214,7 @@ What types of output are generated by FragPipe?
 For the downstream analysis, we will use the output from the list of combined proteins, which we will explore further in the following section.
 
 
-### Interpretation and Analysis of FragPipe Results
-
-For this part, we will use output files based on a run with FragPipe using all sample files (i.e., 5x72 raw files). That file can be downloaded here??? 
+### Overview of FragPipe Results
 
 Now, we will look at the output from FragPipe, where we will use the file named `combined_proteins.tsv`. Initially, we will explore the contents of the file locally. Therefore, you should download the file from UCloud and view it locally in a file editor such as Excel.
 
@@ -226,16 +224,12 @@ You can download the file by clicking on the file in your output directory in th
 Provide a concise overview of the table's contents. What information is represented in the rows and columns?
 :::
 
-
-
-For the downstream analysis, we will use the columns containing the TMT intensities across the proteins identified.
-
-
+For a more in-depth guide to processing of quantitative results based on TMT intensities of all proteins in the dataset, go to [Part 4](teachingmodule.qmd#sec-data-analysis).
 For that we will use [OmicsQ](https://computproteomics.bmb.sdu.dk/app_direct/OmicsQ/), which is a toolkit for quantitative proteomics.
 OmicsQ can be used to facilitate the processing of quantitative data from Omics type experiments. Additionally, it also serves as an entrypoint for using apps like [PolySTest](https://computproteomics.bmb.sdu.dk/app_direct/PolySTest/) [SCHWAMMLE20201396] for statistical testing, [VSClust](https://computproteomics.bmb.sdu.dk/app_direct/VSClust/) for clustering and [ComplexBrowser](https://computproteomics.bmb.sdu.dk/app_direct/ComplexBrowser/) for the investigation of the behavior of protein complexes.
 
 <!--
 Guide and introduction to Omics Q
 Questions about Omics Q
 Questions about output and interpretations of results from OmicsQ
--->
+-->
diff --git a/TeachingModule/DataScreening_Multivariate.qmd b/TeachingModule/DataScreening_Multivariate.qmd
@@ -26,7 +26,10 @@ This directory will be within the folder where you cloned the material in your t
 
 You can now proceed with the multivariate analysis of the material.
 
+To analyze the whole dataset, you will need to use output files from FragPipe for all sample files (i.e., 5x72 raw files). That file can be downloaded [here](https://github.com/hds-sandbox/proteomics-sandbox/blob/webpage/TeachingModule/combined_proteins.tsv).
+
+
 <iframe src="https://nbviewer.jupyter.org/github/veitveit/training-quantitative-proteomics/blob/main/03_Cluster_Analysis/Multivariate%20analysis.ipynb" 
         width="100%" 
         height="1200px">
-</iframe>
+</iframe>