Skip to content

Commit

Permalink
2.1.0 "Olive Tin Hamster"
Browse files Browse the repository at this point in the history
  • Loading branch information
marissaDubbelaar authored Dec 9, 2021
2 parents c95201d + df4729f commit 3dd74fb
Show file tree
Hide file tree
Showing 76 changed files with 1,993 additions and 1,191 deletions.
12 changes: 0 additions & 12 deletions .github/markdownlint.yml

This file was deleted.

16 changes: 7 additions & 9 deletions .github/workflows/awsfulltest.yml
Original file line number Diff line number Diff line change
Expand Up @@ -14,19 +14,17 @@ jobs:
runs-on: ubuntu-latest
steps:
- name: Launch workflow via tower
uses: nf-core/tower-action@master
uses: nf-core/tower-action@v2
with:
workspace_id: ${{ secrets.TOWER_WORKSPACE_ID }}
bearer_token: ${{ secrets.TOWER_BEARER_TOKEN }}
access_token: ${{ secrets.TOWER_ACCESS_TOKEN }}
compute_env: ${{ secrets.TOWER_COMPUTE_ENV }}
pipeline: ${{ github.repository }}
revision: ${{ github.sha }}
workdir: s3://${{ secrets.AWS_S3_BUCKET }}/mhcquant/work-${{ github.sha }}
# Add full size test data (but still relatively small datasets for few samples)
# on the `test_full.config` test runs with only one set of parameters
# Then specify `-profile test_full` instead of `-profile test` on the AWS batch command
revision: ${{ github.sha } }
workdir: s3://${{ secrets.AWS_S3_BUCKET }}/work/mhcquant/work-${{ github.sha }}
parameters: |
{
"outdir" : "s3://${{ secrets.AWS_S3_BUCKET }}/mhcquant/results-${{ github.sha }}",
"outdir": "s3://${{ secrets.AWS_S3_BUCKET }}/mhcquant/results-${{ github.sha }}"
}
profiles: '[ "test_full", "aws_tower" ]'
profiles: test,aws_tower
pre_run_script: 'export NXF_VER=21.10.3'
48 changes: 18 additions & 30 deletions .github/workflows/awstest.yml
Original file line number Diff line number Diff line change
@@ -1,40 +1,28 @@
name: nf-core AWS test
# This workflow is triggered on push to the master branch.
# It can be additionally triggered manually with GitHub actions workflow dispatch.
# It runs the -profile 'test' on AWS batch.
# This workflow can be triggered manually with the GitHub actions workflow dispatch button.
# It runs the -profile 'test' on AWS batch

on:
workflow_dispatch:

env:
AWS_ACCESS_KEY_ID: ${{ secrets.AWS_ACCESS_KEY_ID }}
AWS_SECRET_ACCESS_KEY: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
TOWER_ACCESS_TOKEN: ${{ secrets.AWS_TOWER_TOKEN }}
AWS_JOB_DEFINITION: ${{ secrets.AWS_JOB_DEFINITION }}
AWS_JOB_QUEUE: ${{ secrets.AWS_JOB_QUEUE }}
AWS_S3_BUCKET: ${{ secrets.AWS_S3_BUCKET }}

jobs:
run-awstest:
run-tower:
name: Run AWS tests
if: github.repository == 'nf-core/mhcquant'
runs-on: ubuntu-latest
steps:
- name: Setup Miniconda
uses: conda-incubator/setup-miniconda@v2
with:
auto-update-conda: true
python-version: 3.7
- name: Install awscli
run: conda install -c conda-forge awscli
- name: Start AWS batch job
- name: Launch workflow via tower
uses: nf-core/tower-action@v2

# For example: adding multiple test runs with different parameters
# Remember that you can parallelise this by using strategy.matrix
run: |
aws batch submit-job \
--region eu-west-1 \
--job-name nf-core-mhcquant \
--job-queue $AWS_JOB_QUEUE \
--job-definition $AWS_JOB_DEFINITION \
--container-overrides '{"command": ["nf-core/mhcquant", "-r '"${GITHUB_SHA}"' -profile test --outdir s3://'"${AWS_S3_BUCKET}"'/mhcquant/results-'"${GITHUB_SHA}"' -w s3://'"${AWS_S3_BUCKET}"'/mhcquant/work-'"${GITHUB_SHA}"' -with-tower"], "environment": [{"name": "TOWER_ACCESS_TOKEN", "value": "'"$TOWER_ACCESS_TOKEN"'"}]}'
with:
workspace_id: ${{ secrets.TOWER_WORKSPACE_ID }}
access_token: ${{ secrets.TOWER_ACCESS_TOKEN }}
compute_env: ${{ secrets.TOWER_COMPUTE_ENV }}
pipeline: ${{ github.repository }}
revision: ${{ github.sha }}
workdir: s3://${{ secrets.AWS_S3_BUCKET }}/work/mhcquant/work-${{ github.sha }}
parameters: |
{
"outdir": "s3://${{ secrets.AWS_S3_BUCKET }}/mhcquant/results-${{ github.sha }}"
}
profiles: test,aws_tower
pre_run_script: 'export NXF_VER=21.10.3'
2 changes: 1 addition & 1 deletion .github/workflows/ci.yml
Original file line number Diff line number Diff line change
Expand Up @@ -23,7 +23,7 @@ jobs:
strategy:
matrix:
# Nextflow versions: check pipeline minimum and current latest
nxf_ver: ["21.04.0", ""]
nxf_ver: ['21.10.3', '']
steps:
- name: Check out pipeline code
uses: actions/checkout@v2
Expand Down
32 changes: 31 additions & 1 deletion CHANGELOG.md
Original file line number Diff line number Diff line change
@@ -1,5 +1,35 @@
# nf-core/mhcquant: Changelog

## v2.1.0 nf-core/mhcquant "Olive Tin Hamster" - 2021/12/09

### `Added`

- Inclusion of assets/schema_input.json
- Added the multiQC again to report the versions
- MHCquant parameters are now directly assigned to the argument of the

### `Fixed`

- Fixed typos
- [#165] - Raise memory requirements of FeatureFinderIdentification step
- [#176] - Pipeline crashes when setting the --skip_quantification flag

### `Dependencies`

Note, since the pipeline is now using Nextflow DSL2, each process will be run with its own [Biocontainer](https://biocontainers.pro/#/registry). This means that on occasion it is entirely possible for the pipeline to be using different versions of the same tool. However, the overall software dependency changes compared to the last release have been listed below for reference.

| Dependency | Old version | New version |
| --------------------- | ----------- | ----------- |
| `openms` | 2.5.0 | 2.6.0 |
| `openms-thirdparty` | 2.5.0 | 2.6.0 |
| `thermorawfileparser` | 1.2.3 | 1.3.4 |

> **NB:** Dependency has been **updated** if both old and new version information is present.
> **NB:** Dependency has been **added** if just the new version information is present.
> **NB:** Dependency has been **removed** if version information isn't present.
### `Deprecated`

## v2.0.0 nf-core/mhcquant "Steel Beagle" - 2021/09/03

### `Added`
Expand Down Expand Up @@ -60,7 +90,7 @@ DSL1 to DSL2 conversion

- raise OpenMS version to 2.5
- adapt workflow accoringly with new options
- remove specifying input as file dirs eg "data/*.mzML"
- remove specifying input as file dirs eg "data/\*.mzML"

### `Dependencies`

Expand Down
5 changes: 4 additions & 1 deletion CITATIONS.md
Original file line number Diff line number Diff line change
@@ -1,4 +1,4 @@
# nf-core/rnaseq: Citations
# nf-core/mhcquant: Citations

## [nf-core](https://pubmed.ncbi.nlm.nih.gov/32055031/)

Expand All @@ -22,6 +22,9 @@
* [OpenMS](https://pubmed.ncbi.nlm.nih.gov/27575624/)
> Röst H, Sachsenberg T, Aiche S, Bielow C, Weisser H, Aicheler F, Andreotti S, Ehrlich HC, Gutenbrunner P, Kenar E, Liang X, Nahnsen S, Nilse L, Pfeuffer J, Rosenberger G, Rurik M, Schmitt U, Veit J, Walzer M, Wojnar D, Wolski WE, Schilling O, Choudhary JS, Malmström L, Aebersold R, Reinert K, Kohlbacher O. OpenMS: a flexible open-source software platform for mass spectrometry data analysis. Nat Methods 13, 741–748 (2016). doi: 10.1038/nmeth.3959. PubMed PMID: 27575624
* [MultiQC](https://www.ncbi.nlm.nih.gov/pubmed/27312411/)
> Ewels P, Magnusson M, Lundin S, Käller M. MultiQC: summarize analysis results for multiple tools and samples in a single report. Bioinformatics. 2016 Oct 1;32(19):3047-8. doi: 10.1093/bioinformatics/btw354. Epub 2016 Jun 16. PubMed PMID: 27312411; PubMed Central PMCID: PMC5039924.
## Software packaging/containerisation tools

* [Anaconda](https://anaconda.com)
Expand Down
2 changes: 1 addition & 1 deletion LICENSE
Original file line number Diff line number Diff line change
@@ -1,6 +1,6 @@
MIT License

Copyright (c) Leon Bichmann
Copyright (c) Leon Bichmann, Marissa Dubbelaar

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
Expand Down
46 changes: 30 additions & 16 deletions README.md
Original file line number Diff line number Diff line change
@@ -1,50 +1,62 @@
# ![nf-core/mhcquant](docs/images/nf-core-mhcquant_logo.png)

**Identify and quantify peptides from mass spectrometry raw data**.
[![GitHub Actions CI Status](https://github.com/nf-core/mhcquant/workflows/nf-core%20CI/badge.svg)](https://github.com/nf-core/mhcquant/actions?query=workflow%3A%22nf-core+CI%22)
[![GitHub Actions Linting Status](https://github.com/nf-core/mhcquant/workflows/nf-core%20linting/badge.svg)](https://github.com/nf-core/mhcquant/actions?query=workflow%3A%22nf-core+linting%22)
[![AWS CI](https://img.shields.io/badge/CI%20tests-full%20size-FF9900?labelColor=000000&logo=Amazon%20AWS)](https://nf-co.re/mhcquant/results)
[![Cite with Zenodo](http://img.shields.io/badge/DOI-10.5281/zenodo.5407955-1073c8?labelColor=000000)](https://doi.org/10.5281/zenodo.5407955)

[![GitHub Actions CI Status](https://github.com/nf-core/mhcquant/workflows/nf-core%20CI/badge.svg)](https://github.com/nf-core/mhcquant/actions)
[![GitHub Actions Linting Status](https://github.com/nf-core/mhcquant/workflows/nf-core%20linting/badge.svg)](https://github.com/nf-core/mhcquant/actions)
[![Nextflow](https://img.shields.io/badge/nextflow-%E2%89%A521.04.0-brightgreen.svg)](https://www.nextflow.io/)
[![Nextflow](https://img.shields.io/badge/nextflow%20DSL2-%E2%89%A521.10.3-23aa62.svg?labelColor=000000)](https://www.nextflow.io/)
[![run with conda](http://img.shields.io/badge/run%20with-conda-3EB049?labelColor=000000&logo=anaconda)](https://docs.conda.io/en/latest/)
[![run with docker](https://img.shields.io/badge/run%20with-docker-0db7ed?labelColor=000000&logo=docker)](https://www.docker.com/)
[![run with singularity](https://img.shields.io/badge/run%20with-singularity-1d355c.svg?labelColor=000000)](https://sylabs.io/docs/)

[![install with bioconda](https://img.shields.io/badge/install%20with-bioconda-brightgreen.svg)](https://bioconda.github.io/)
[![Docker](https://img.shields.io/docker/automated/nfcore/mhcquant.svg)](https://hub.docker.com/r/nfcore/mhcquant)
[![Get help on Slack](http://img.shields.io/badge/slack-nf--core%20%23mhcquant-4A154B?logo=slack)](https://nfcore.slack.com/channels/mhcquant)
[![Get help on Slack](http://img.shields.io/badge/slack-nf--core%20%23mhcquant-4A154B?labelColor=000000&logo=slack)](https://nfcore.slack.com/channels/mhcquant)
[![Follow on Twitter](http://img.shields.io/badge/twitter-%40nf__core-1DA1F2?labelColor=000000&logo=twitter)](https://twitter.com/nf_core)
[![Watch on YouTube](http://img.shields.io/badge/youtube-nf--core-FF0000?labelColor=000000&logo=youtube)](https://www.youtube.com/c/nf-core)

## Introduction

nfcore/mhcquant is a bioinformatics analysis pipeline used for quantitative processing of data dependent (DDA) peptidomics data.
**nfcore/mhcquant** is a bioinformatics analysis pipeline used for quantitative processing of data dependent (DDA) peptidomics data.

It was specifically designed to analyse immunopeptidomics data, which deals with the analysis of affinity purified, unspecifically cleaved peptides that have recently been discussed intensively in [the context of cancer vaccines](https://www.nature.com/articles/ncomms13404).

The workflow is based on the OpenMS C++ framework for computational mass spectrometry. RAW files (mzML) serve as inputs and a database search (Comet) is performed based on a given input protein database. FDR rescoring is applied using Percolator based on a competitive target-decoy approach (reversed decoys). For label free quantification all input files undergo identification based retention time alignment (MapAlignerIdentification), and targeted feature extraction matching ids between runs (FeatureFinderIdentification). In addition, a variant calling file (vcf) can be specified to translate variants into proteins that will be included in the database search and binding predictions on specified alleles (alleles.tsv) using MHCFlurry (Class 1) or MHCNugget (Class 2) can be directly run on the output peptide lists. Moreover, if a vcf file was specified, neoepitopes will automatically be determined and binding predictions can also directly be predicted for them.

The pipeline is built using [Nextflow](https://www.nextflow.io), a workflow tool to run tasks across multiple compute infrastructures in a very portable manner. It comes with docker containers making installation trivial and results highly reproducible.
The pipeline is built using [Nextflow](https://www.nextflow.io), a workflow tool to run tasks across multiple compute infrastructures in a very portable manner. It uses Docker/Singularity containers making installation trivial and results highly reproducible. The [Nextflow DSL2](https://www.nextflow.io/docs/latest/dsl2.html) implementation of this pipeline uses one container per process which makes it much easier to maintain and update software dependencies. Where possible, these processes have been submitted to and installed from [nf-core/modules](https://github.com/nf-core/modules) in order to make them available to all nf-core pipelines, and to everyone within the Nextflow community!

On release, automated continuous integration tests run the pipeline on a full-sized dataset on the AWS cloud infrastructure. This ensures that the pipeline runs on AWS, has sensible resource allocation defaults set to run on real-world datasets, and permits the persistent storage of results to benchmark between pipeline releases and other analysis sources. The results obtained from the full-sized test can be viewed on the [nf-core website](https://nf-co.re/mhcquant/results).

## Pipeline summary

1. Present QC for raw reads ([`MultiQC`](http://multiqc.info/))

![overview](assets/MHCquant_scheme.png)
(This chart was created with the help of [Lucidchart](https://www.lucidchart.com))

## Quick Start

1. Install [`nextflow`](https://nf-co.re/usage/installation) (`>=21.04.0`)
1. Install [`Nextflow`](https://www.nextflow.io/docs/latest/getstarted.html#installation) (`>=21.10.3`)

2. Install any of [`Docker`](https://docs.docker.com/engine/installation/), [`Singularity`](https://www.sylabs.io/guides/3.0/user-guide/), [`Podman`](https://podman.io/), [`Shifter`](https://nersc.gitlab.io/development/shifter/how-to-use/) or [`Charliecloud`](https://hpc.github.io/charliecloud/) for full pipeline reproducibility _(please only use [`Conda`](https://conda.io/miniconda.html) as a last resort; see [docs](https://nf-co.re/usage/configuration#basic-configuration-profiles))_
2. Install any of [`Docker`](https://docs.docker.com/engine/installation/), [`Singularity`](https://www.sylabs.io/guides/3.0/user-guide/), [`Podman`](https://podman.io/), [`Shifter`](https://nersc.gitlab.io/development/shifter/how-to-use/) or [`Charliecloud`](https://hpc.github.io/charliecloud/) for full pipeline reproducibility _(please only use [`Conda`](https://conda.io/miniconda.html) as a last resort; see [docs](https://nf-co.re/usage/configuration#basic-configuration-profiles))_.

3. Download the pipeline and test it on a minimal dataset with a single command:

```bash
```console
nextflow run nf-core/mhcquant -profile test,<docker/singularity/podman/shifter/charliecloud/conda/institute>
```

> Please check [nf-core/configs](https://github.com/nf-core/configs#documentation) to see if a custom config file to run nf-core pipelines already exists for your Institute. If so, you can simply use `-profile <institute>` in your command. This will enable either `docker` or `singularity` and set the appropriate execution settings for your local compute environment.
> * Please check [nf-core/configs](https://github.com/nf-core/configs#documentation) to see if a custom config file to run nf-core pipelines already exists for your Institute. If so, you can simply use `-profile <institute>` in your command. This will enable either `docker` or `singularity` and set the appropriate execution settings for your local compute environment.
> * If you are using `singularity` then the pipeline will auto-detect this and attempt to download the Singularity images directly as opposed to performing a conversion from Docker images. If you are persistently observing issues downloading Singularity images directly due to timeout or network issues then please use the `--singularity_pull_docker_container` parameter to pull and convert the Docker image instead. Alternatively, it is highly recommended to use the [`nf-core download`](https://nf-co.re/tools/#downloading-pipelines-for-offline-use) command to pre-download all of the required containers before running the pipeline and to set the [`NXF_SINGULARITY_CACHEDIR` or `singularity.cacheDir`](https://www.nextflow.io/docs/latest/singularity.html?#singularity-docker-hub) Nextflow options to be able to store and re-use the images from a central location for future pipeline runs.
> * If you are using `conda`, it is highly recommended to use the [`NXF_CONDA_CACHEDIR` or `conda.cacheDir`](https://www.nextflow.io/docs/latest/conda.html) settings to store the environments in a central location for future pipeline runs.

4. Start running your own analysis!

```bash
nextflow run nf-core/mhcquant -profile test,<docker/singularity/podman/shifter/charliecloud/conda/institute>
--input 'samples.tsv'
--fasta 'SWISSPROT_2020.fasta'
--allele_sheet 'alleles.tsv'
--predict_class_1
--allele_sheet 'alleles.tsv'
--predict_class_1
--refine_fdr_on_predicted_subset
```

Expand Down Expand Up @@ -84,7 +96,7 @@ For further information or help, don't hesitate to get in touch on the [Slack `#

## Citations

If you use `nf-core/mhcquant` for your analysis, please cite:
If you use `nf-core/mhcquant` for your analysis, please cite it using the following doi: [10.5281/zenodo.5407955](https://doi.org/10.5281/zenodo.5407955) and the corresponding manuscript:

> **MHCquant: Automated and Reproducible Data Analysis for Immunopeptidomics**
>
Expand All @@ -93,6 +105,8 @@ If you use `nf-core/mhcquant` for your analysis, please cite:
> Journal of Proteome Research 2019 18 (11), 3876-3884
> DOI: 10.1021/acs.jproteome.9b00313

An extensive list of references for the tools used by the pipeline can be found in the [`CITATIONS.md`](CITATIONS.md) file.

You can cite the `nf-core` publication as follows:

> **The nf-core framework for community-curated bioinformatics pipelines.**
Expand Down
46 changes: 46 additions & 0 deletions assets/schema_input.json
Original file line number Diff line number Diff line change
@@ -0,0 +1,46 @@
{
"$schema": "http://json-schema.org/draft-07/schema",
"$id": "https://raw.githubusercontent.com/nf-core/mhcquant/master/assets/schema_input.json",
"title": "nf-core/mhcquant pipeline - params.input schema",
"description": "Schema for the file provided with params.input",
"type": "array",
"items": {
"type": "object",
"properties": {
"ID": {
"type": "integer",
"errorMessage": "Provide an unique identifier for the replicate, must be a numeric value"
},
"Sample": {
"type": "string",
"pattern": "^\\S+-?",
"errorMessage": "Sample name must be provided and cannot contain spaces"
},
"Condition": {
"type": "string",
"pattern": "^\\S+-?",
"errorMessage": "Sample condition must be provided and cannot contain spaces"
},
"ReplicateFileName": {
"type": "string",
"errorMessage": "MS file spaces and must have extension '.raw' or '.mzml'",
"anyOf": [
{
"type": "string",
"pattern": "^\\S+-?\\.raw$"
},
{
"type": "string",
"pattern": "^\\S+-?\\.mzml$"
}
]
}
},
"required": [
"ID",
"Sample",
"Condition",
"ReplicateFileName"
]
}
}
Loading

0 comments on commit 3dd74fb

Please sign in to comment.