Merge pull request #602 from subinamehta/main
adding clinical metaproteomics workflows
mvdbeek authored Nov 22, 2024
2 parents 311a7b5 + 5382c4d commit 3b04202
Showing 5 changed files with 425 additions and 0 deletions.
@@ -0,0 +1,11 @@
version: 1.2
workflows:
  - name: main
    subclass: Galaxy
    publish: true
    primaryDescriptorPath: /iwc-clinicalmp-database-generation.ga
    testParameterFiles:
      - /iwc-clinicalmp-database-generation-tests.yml
    authors:
      - name: Subina Mehta
        orcid: 0000-0001-9818-0537
@@ -0,0 +1,4 @@
# Changelog

## [0.1] 2024-11-18
First release.
@@ -0,0 +1,29 @@
# Clinical Metaproteomics 1: Database Generation
Metaproteomics involves the large-scale identification and analysis of all proteins expressed by microbiota. However, analyzing clinical samples using metaproteomics is complicated by the presence of abundant human (host) proteins, which can obscure the detection of less abundant microbial proteins.

To overcome this challenge, we developed a metaproteomics workflow using tandem mass spectrometry (MS/MS) and bioinformatics tools on the Galaxy platform. This workflow enables the characterization of metaproteomes in clinical samples.

The first step in this workflow is database generation. The Galaxy-P team has created a workflow that compiles a large database by downloading protein sequences of known disease-causing microorganisms. From this large database, the MetaNovo tool then generates a compact, relevant database.

A Galaxy Training Network (GTN) tutorial is available for this workflow: [https://training.galaxyproject.org/training-material/topics/proteomics/tutorials/clinical-mp-1-database-generation/tutorial.html](https://training.galaxyproject.org/training-material/topics/proteomics/tutorials/clinical-mp-1-database-generation/tutorial.html)

## Input datasets

### Search Databases (FASTA) from [Zenodo](https://zenodo.org/records/14181725)
- `HUMAN SwissProt Protein_Database.fasta`
- `Species UniProt Protein Database FASTA.fasta`
- `Contaminants (cRAP) Protein Database.fasta`

### MS/MS files (MGF) from [Zenodo](https://zenodo.org/records/14181725)
- `PTRC_Skubitz_Plex2_F10_9Aug19_Rage_Rep-19-06-08.mgf`
- `PTRC_Skubitz_Plex2_F11_9Aug19_Rage_Rep-19-06-08.mgf`
- `PTRC_Skubitz_Plex2_F13_9Aug19_Rage_Rep-19-06-08.mgf`
- `PTRC_Skubitz_Plex2_F15_9Aug19_Rage_Rep-19-06-08.mgf`
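
For scripted downloads, the direct links can be built from the record ID and the file name. The following is a minimal sketch: the URL pattern is the one used in the test file of this commit, while `zenodo_file_url` is a hypothetical helper name introduced only for illustration.

```python
from urllib.parse import quote

# Record ID taken from the Zenodo links in this README.
ZENODO_RECORD = "14181725"

# MS/MS file names as listed above.
MGF_FILES = [
    "PTRC_Skubitz_Plex2_F10_9Aug19_Rage_Rep-19-06-08.mgf",
    "PTRC_Skubitz_Plex2_F11_9Aug19_Rage_Rep-19-06-08.mgf",
    "PTRC_Skubitz_Plex2_F13_9Aug19_Rage_Rep-19-06-08.mgf",
    "PTRC_Skubitz_Plex2_F15_9Aug19_Rage_Rep-19-06-08.mgf",
]


def zenodo_file_url(filename: str, record: str = ZENODO_RECORD) -> str:
    """Build a direct-download URL for one file of a Zenodo record."""
    # quote() percent-encodes characters that are not URL-safe;
    # the MGF names above pass through unchanged.
    return f"https://zenodo.org/records/{record}/files/{quote(filename)}?download=1"


for name in MGF_FILES:
    print(zenodo_file_url(name))
```

Each printed URL can be pasted into Galaxy's upload dialog or passed to a download tool of your choice.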

## Input Values
Parameters for MetaNovo:
- Peptide Length
- Variable modifications
- Labeled element

## Processing
- Merge all resulting FASTA files into a single database
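
The merge step can be pictured with a minimal Python sketch. It is not the Galaxy tool that the workflow runs. It assumes that merging means concatenating the records of all input files and keeping the first record seen for each accession. The function names and toy sequences are hypothetical.

```python
from typing import Iterable, Iterator, List, Tuple

Record = Tuple[str, str]  # (header without ">", sequence)


def read_fasta(lines: Iterable[str]) -> Iterator[Record]:
    """Yield (header, sequence) pairs from FASTA-formatted lines."""
    header = None
    chunks: List[str] = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def merge_fasta(sources: Iterable[Iterable[str]]) -> List[Record]:
    """Concatenate records, keeping the first record seen per accession."""
    seen = set()
    merged: List[Record] = []
    for lines in sources:
        for header, sequence in read_fasta(lines):
            accession = header.split(" ", 1)[0]  # text before the first space
            if accession not in seen:
                seen.add(accession)
                merged.append((header, sequence))
    return merged


def to_fasta(records: Iterable[Record]) -> str:
    """Serialize records back to FASTA text, one sequence line per record."""
    return "".join(f">{header}\n{sequence}\n" for header, sequence in records)


# Toy inputs (hypothetical entries, not taken from the Zenodo files).
human = [">sp|P00001|TOY1_HUMAN toy protein", "MKT", "AYI"]
microbial = [">sp|Q00002|TOY2_ECOLI toy protein", "MSEQ"]
contaminants = [
    ">sp|P00001|TOY1_HUMAN duplicate entry", "MKTAYI",
    ">sp|P00003|TOY3_BOVIN toy protein", "MAAA",
]

merged = merge_fasta([human, microbial, contaminants])
print(to_fasta(merged))
```

In practice each source would be an open file handle for one of the FASTA databases rather than an in-memory list.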
@@ -0,0 +1,47 @@
- doc: Test outline for iwc-clinicalmp-database-generation
  job:
    Human SwissProt Protein Database:
      class: File
      location: https://zenodo.org/records/14181725/files/HUMAN-SwissProt-Protein-Database.fasta?download=1
      filetype: fasta
    Species UniProt Protein Database:
      class: File
      location: https://zenodo.org/records/14181725/files/Species_UniProt_FASTA.fasta?download=1
      filetype: fasta
    Contaminants cRAP Protein Database:
      class: File
      location: https://zenodo.org/records/14181725/files/Contaminants(cRAP)-Protein-Database.fasta?download=1
      filetype: fasta
    Tandem Mass Spectrometry (MS/MS) datasets:
      class: Collection
      collection_type: list
      elements:
      - class: File
        identifier: PTRC_Skubitz_Plex2_F15_9Aug19_Rage_Rep-19-06-08.mgf
        location: https://zenodo.org/records/14181725/files/PTRC_Skubitz_Plex2_F15_9Aug19_Rage_Rep-19-06-08.mgf?download=1
      - class: File
        identifier: PTRC_Skubitz_Plex2_F13_9Aug19_Rage_Rep-19-06-08.mgf
        location: https://zenodo.org/records/14181725/files/PTRC_Skubitz_Plex2_F13_9Aug19_Rage_Rep-19-06-08.mgf?download=1
      - class: File
        identifier: PTRC_Skubitz_Plex2_F11_9Aug19_Rage_Rep-19-06-08.mgf
        location: https://zenodo.org/records/14181725/files/PTRC_Skubitz_Plex2_F11_9Aug19_Rage_Rep-19-06-08.mgf?download=1
      - class: File
        identifier: PTRC_Skubitz_Plex2_F10_9Aug19_Rage_Rep-19-06-08.mgf
        location: https://zenodo.org/records/14181725/files/PTRC_Skubitz_Plex2_F10_9Aug19_Rage_Rep-19-06-08.mgf?download=1
  outputs:
    Human UniProt Microbial Proteins cRAP for MetaNovo:
      asserts:
      - that: has_text
        text: ">sp|"
    Metanovo Compact database:
      asserts:
      - that: has_text
        text: ">sp|"
    Metanovo Compact CSV database:
      asserts:
      - that: has_text
        text: "index"
    Human UniProt Microbial Proteins from MetaNovo cRAP:
      asserts:
      - that: has_text
        text: ">sp|"
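
The `has_text` assertions above only require that a given substring occurs somewhere in an output file. The Python sketch below is a simplified stand-in for that check, not the implementation used by the Galaxy test machinery. `check_asserts` and the toy file contents are hypothetical.

```python
from typing import Dict, List


def check_asserts(content: str, asserts: List[Dict[str, str]]) -> List[str]:
    """Return a failure message for every has_text assertion that does not hold."""
    failures: List[str] = []
    for assertion in asserts:
        kind = assertion["that"]
        if kind != "has_text":
            # This sketch covers only the assertion type used in the test file.
            raise ValueError(f"unsupported assertion: {kind}")
        if assertion["text"] not in content:
            failures.append(f"expected text {assertion['text']!r} not found")
    return failures


# Assertions mirroring the test file: FASTA outputs must contain a
# SwissProt-style header prefix, the CSV output must contain "index".
fasta_asserts = [{"that": "has_text", "text": ">sp|"}]
csv_asserts = [{"that": "has_text", "text": "index"}]

# Toy outputs (hypothetical contents; the accession column is made up).
toy_fasta = ">sp|P00001|TOY1_HUMAN toy protein\nMKTAYI\n"
toy_csv = "index,accession\n0,P00001\n"

print(check_asserts(toy_fasta, fasta_asserts))
print(check_asserts(toy_csv, csv_asserts))
print(check_asserts("no FASTA headers here", fasta_asserts))
```

Such checks confirm that an output has the expected kind of content, but they do not validate the full database.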