Skip to content

Commit

Permalink
Merge branch 'dev' into nf-core-template-merge-2.13.1
Browse files Browse the repository at this point in the history
  • Loading branch information
LouisLeNezet committed Mar 4, 2024
2 parents df206fd + e55808a commit 62aca9f
Show file tree
Hide file tree
Showing 103 changed files with 10,049 additions and 35 deletions.
7 changes: 7 additions & 0 deletions .github/CONTRIBUTING.md
Original file line number Diff line number Diff line change
Expand Up @@ -16,11 +16,15 @@ Contributions to the code are even more welcome ;)

If you'd like to write some code for nf-core/phaseimpute, the standard workflow is as follows:

1. Check that there isn't already an issue about your idea in the [nf-core/phaseimpute issues](https://github.com/nf-core/phaseimpute/issues) to avoid duplicating work. If there isn't one already, please create one so that others know you're working on this
1. Check that there isn't already an issue about your idea in the [nf-core/phaseimpute issues](https://github.com/nf-core/phaseimpute/issues) to avoid duplicating work. If there isn't one already, please create one so that others know you're working on this
2. [Fork](https://help.github.com/en/github/getting-started-with-github/fork-a-repo) the [nf-core/phaseimpute repository](https://github.com/nf-core/phaseimpute) to your GitHub account
3. Make the necessary changes / additions within your forked repository following [Pipeline conventions](#pipeline-contribution-conventions)
4. Use `nf-core schema build` and add any new parameters to the pipeline JSON schema (requires [nf-core tools](https://github.com/nf-core/tools) >= 1.10).
5. Submit a Pull Request against the `dev` branch and wait for the code to be reviewed and merged
3. Make the necessary changes / additions within your forked repository following [Pipeline conventions](#pipeline-contribution-conventions)
4. Use `nf-core schema build` and add any new parameters to the pipeline JSON schema (requires [nf-core tools](https://github.com/nf-core/tools) >= 1.10).
5. Submit a Pull Request against the `dev` branch and wait for the code to be reviewed and merged

If you're not used to this workflow with git, you can start with some [docs from GitHub](https://help.github.com/en/github/collaborating-with-issues-and-pull-requests) or even their [excellent `git` resources](https://try.github.io/).

Expand All @@ -37,13 +41,15 @@ Typically, pull-requests are only fully reviewed when these tests are passing, t

There are typically two types of tests that run:

### Lint tests
### Lint tests

`nf-core` has a [set of guidelines](https://nf-co.re/developers/guidelines) which all pipelines must adhere to.
To enforce these and ensure that all pipelines stay in sync, we have developed a helper tool which runs checks on the pipeline code. This is in the [nf-core/tools repository](https://github.com/nf-core/tools) and once installed can be run locally with the `nf-core lint <pipeline-directory>` command.

If any failures or warnings are encountered, please follow the listed URL for more documentation.

### Pipeline tests
### Pipeline tests

Each `nf-core` pipeline should be set up with a minimal set of test-data.
Expand All @@ -53,6 +59,7 @@ These tests are run both with the latest available version of `Nextflow` and als

## Patch

:warning: Only in the unlikely and regretful event of a release happening with a bug.
:warning: Only in the unlikely and regretful event of a release happening with a bug.

- On your own fork, make a new branch `patch` based on `upstream/master`.
Expand Down
7 changes: 7 additions & 0 deletions .github/PULL_REQUEST_TEMPLATE.md
Original file line number Diff line number Diff line change
@@ -1,4 +1,5 @@
<!--
<!--
# nf-core/phaseimpute pull request
Many thanks for contributing to nf-core/phaseimpute!
Expand All @@ -11,8 +12,14 @@ Remember that PRs should be made against the dev branch, unless you're preparing
Learn more about contributing: [CONTRIBUTING.md](https://github.com/nf-core/phaseimpute/tree/master/.github/CONTRIBUTING.md)
-->

Remember that PRs should be made against the dev branch, unless you're preparing a pipeline release.

Learn more about contributing: [CONTRIBUTING.md](https://github.com/nf-core/phaseimpute/tree/master/.github/CONTRIBUTING.md)
-->

## PR checklist

- [ ] This comment contains a description of changes (with reason).
- [ ] This comment contains a description of changes (with reason).
- [ ] If you've fixed a bug or added code that should be tested, add tests!
- [ ] If you've added a new tool - have you followed the pipeline conventions in the [contribution docs](https://github.com/nf-core/phaseimpute/tree/master/.github/CONTRIBUTING.md)
Expand Down
5 changes: 5 additions & 0 deletions .github/workflows/branch.yml
Original file line number Diff line number Diff line change
Expand Up @@ -4,13 +4,18 @@ name: nf-core branch protection
on:
pull_request_target:
branches: [master]
pull_request_target:
branches: [master]

jobs:
test:
runs-on: ubuntu-latest
runs-on: ubuntu-latest
steps:
# PRs to the nf-core repo master branch are only ok if coming from the nf-core repo `dev` or any `patch` branches
# PRs to the nf-core repo master branch are only ok if coming from the nf-core repo `dev` or any `patch` branches
- name: Check PRs
if: github.repository == 'nf-core/phaseimpute'
if: github.repository == 'nf-core/phaseimpute'
run: |
{ [[ ${{github.event.pull_request.head.repo.full_name }} == nf-core/phaseimpute ]] && [[ $GITHUB_HEAD_REF == "dev" ]]; } || [[ $GITHUB_HEAD_REF == "patch" ]]
Expand Down
25 changes: 25 additions & 0 deletions .github/workflows/ci.yml
Original file line number Diff line number Diff line change
Expand Up @@ -11,12 +11,30 @@ on:
env:
NXF_ANSI_LOG: false

concurrency:
group: "${{ github.workflow }}-${{ github.event.pull_request.number || github.ref }}"
cancel-in-progress: true
# This workflow runs the pipeline with the minimal test dataset to check that it completes without any syntax errors
on:
push:
branches:
- dev
pull_request:
release:
types: [published]

env:
NXF_ANSI_LOG: false

concurrency:
group: "${{ github.workflow }}-${{ github.event.pull_request.number || github.ref }}"
cancel-in-progress: true

jobs:
test:
name: Run pipeline with test data
# Only run on push if this is the nf-core dev branch (merged PRs)
if: "${{ github.event_name != 'push' || (github.event_name == 'push' && github.repository == 'nf-core/phaseimpute') }}"
name: Run pipeline with test data
# Only run on push if this is the nf-core dev branch (merged PRs)
if: "${{ github.event_name != 'push' || (github.event_name == 'push' && github.repository == 'nf-core/phaseimpute') }}"
Expand All @@ -26,6 +44,9 @@ jobs:
NXF_VER:
- "23.04.0"
- "latest-everything"
NXF_VER:
- "23.04.0"
- "latest-everything"
steps:
- name: Check out pipeline code
uses: actions/checkout@b4ffde65f46336ab88eb53be808477a3936bae11 # v4
Expand All @@ -39,8 +60,12 @@ jobs:
uses: jlumbroso/free-disk-space@54081f138730dfa15788a46383842cd2f914a1be # v1.3.1

- name: Run pipeline with test data
# TODO nf-core: You can customise CI pipeline run tests as required
# For example: adding multiple test runs with different parameters
# Remember that you can parallelise this by using strategy.matrix
# TODO nf-core: You can customise CI pipeline run tests as required
# For example: adding multiple test runs with different parameters
# Remember that you can parallelise this by using strategy.matrix
run: |
nextflow run ${GITHUB_WORKSPACE} -profile test,docker --outdir ./results
nextflow run ${GITHUB_WORKSPACE} -profile test,docker --outdir ./results
5 changes: 5 additions & 0 deletions .github/workflows/linting.yml
Original file line number Diff line number Diff line change
Expand Up @@ -2,10 +2,14 @@ name: nf-core linting
# This workflow is triggered on pushes and PRs to the repository.
# It runs the `nf-core lint` and markdown lint tests to ensure
# that the code meets the nf-core guidelines.
# It runs the `nf-core lint` and markdown lint tests to ensure
# that the code meets the nf-core guidelines.
on:
push:
branches:
- dev
branches:
- dev
pull_request:
release:
types: [published]
Expand Down Expand Up @@ -47,6 +51,7 @@ jobs:
python -m pip install --upgrade pip
pip install nf-core
- name: Run nf-core lint
env:
GITHUB_COMMENTS_URL: ${{ github.event.pull_request.comments_url }}
Expand Down
1 change: 1 addition & 0 deletions .gitignore
Original file line number Diff line number Diff line change
Expand Up @@ -6,3 +6,4 @@ results/
testing/
testing*
*.pyc
*.code-workspace
2 changes: 2 additions & 0 deletions CODE_OF_CONDUCT.md
Original file line number Diff line number Diff line change
Expand Up @@ -45,8 +45,10 @@ Questions, concerns, or ideas on what we can include? Contact members of the Saf
## Our Responsibilities

Members of the Safety Team (the Safety Officers) are responsible for clarifying the standards of acceptable behavior and are expected to take appropriate and fair corrective action in response to any instances of unacceptable behaviour.
The safety officer is responsible for clarifying the standards of acceptable behavior and are expected to take appropriate and fair corrective action in response to any instances of unacceptable behaviour.

The Safety Team, in consultation with the nf-core core team, have the right and responsibility to remove, edit, or reject comments, commits, code, wiki edits, issues, and other contributions that are not aligned to this CoC, or to ban temporarily or permanently any contributor for other behaviors that they deem inappropriate, threatening, offensive, or harmful.
The safety officer in consultation with the nf-core core team have the right and responsibility to remove, edit, or reject comments, commits, code, wiki edits, issues, and other contributions that are not aligned to this Code of Conduct, or to ban temporarily or permanently any contributor for other behaviors that they deem inappropriate, threatening, offensive, or harmful.

Members of the core team or the Safety Team who violate the CoC will be required to recuse themselves pending investigation. They will not have access to any reports of the violations and will be subject to the same actions as others in violation of the CoC.

Expand Down
23 changes: 21 additions & 2 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -31,9 +31,13 @@
workflows use the "tube map" design for that. See https://nf-co.re/docs/contributing/design_guidelines#examples for examples. -->
<!-- TODO nf-core: Fill in short bullet-pointed list of the default steps in the pipeline -->

1. Read QC ([`FastQC`](https://www.bioinformatics.babraham.ac.uk/projects/fastqc/))
2. Present QC for raw reads ([`MultiQC`](http://multiqc.info/))
### Main steps of the pipeline

The **phaseimpute** pipeline is constituted of 5 main steps:

| Metro map | Modes |
|-----------|-------------|
| <img src="docs/images/metro/MetroMap.png" alt="metromap" width="800"/> | - **Pre-processing**: Phasing, QC, variant filtering, variant annotation of the reference panel <br> - **Phase**: Phasing of the target dataset on the reference panel <br> - **Simulate**: Simulation of the target dataset from high quality target data <br> - **Concordance**: Concordance between the target dataset and a truth dataset <br> - **Post-processing**: Variant filtering based on their imputation quality |
## Usage

> [!NOTE]
Expand Down Expand Up @@ -72,6 +76,19 @@ nextflow run nf-core/phaseimpute \
For more details and further functionality, please refer to the [usage documentation](https://nf-co.re/phaseimpute/usage) and the [parameter documentation](https://nf-co.re/phaseimpute/parameters).

## Description of the different mode of the pipeline

Here is a short description of the different mode of the pipeline.
For more information please refer to the [documentation](https://nf-core.github.io/phaseimpute/usage/).

| Mode | Flow chart | Description |
|-------------------|-----------------------------------------------------------------|----------------------------------------------------------------------------------------|
| **Preprocessing** | <img src="docs/images/metro/PreProcessing.png" alt="phase_metro" width="600"/> | The preprocessing mode is responsible to the preparation of the multiple input file that will be used by the phasing process. <br> The main processes are : <br> - **Haplotypes phasing** of the reference panel using [**Shapeit5**](https://odelaneau.github.io/shapeit5/). <br> - **Filter** the reference panel to select only the necessary variants. <br> - **Chunking the reference panel** in a subset of region for all the chromosomes. <br> - **Extract** the positions where to perform the imputation.|
| **Phasing** | <img src="docs/images/metro/Phase.png" alt="phase_metro" width="600"/> | The phasing mode is the core mode of this pipeline. <br> It is constituted of 3 main steps: <br> - **Phasing**: Phasing of the target dataset on the reference panel using either: <br> &emsp; - [**Glimpse1**](https://odelaneau.github.io/GLIMPSE/glimpse1/index.html) <br> &emsp; It's come with the necessety to compute the genotype likelihoods of the target dataset. <br> &emsp; This step is done using [BCFTOOLS_mpileup](https://samtools.github.io/bcftools/bcftools.html#mpileup) <br> &emsp; - [**Glimpse2**](https://odelaneau.github.io/GLIMPSE/glimpse2/index.html) For this step the reference panel is transformed to binary chunks. <br> &emsp; - [**Stitch**](https://github.com/rwdavies/stitch) <br> &emsp; - [**Quilt**](https://github.com/rwdavies/QUILT) <br> - **Ligation**: all the different chunks are merged together. <br> - **Sampling** (optional) |
| **Simulate** | <img src="docs/images/metro/Simulate.png" alt="simulate_metro" width="600"/> | The simulation mode is used to create artificial low informative genetic information from high density data. This allow to compare the imputed result to a *truth* and therefore evaluate the quality of the imputation. <br> For the moment it is possible to simulate: <br> - Low-pass data by **downsample** BAM or CRAM using [SAMTOOLS_view -s]() at different depth <br> - Genotype data by **SNP selecting** the position used by a designated SNP chip. <br> The simulation mode will also compute the **Genotype likelihoods** of the high density data. |
| **Concordance** | <img src="docs/images/metro/Concordance.png" alt="concordance_metro" width="600"/> | This mode compare two vcf together to compute a summary of the differences between them. <br> To do so it use either: <br> - [**Glimpse1**](https://odelaneau.github.io/GLIMPSE/glimpse1/index.html) concordance process. <br> - [**Glimpse2**](https://odelaneau.github.io/GLIMPSE/glimpse2/index.html) concordance process <br> - Or convert the two vcf fill to `.zarr` using [**Scikit allele**](https://scikit-allel.readthedocs.io/en/stable/) and [**anndata**](https://anndata.readthedocs.io/en/latest/) before comparing the SNPs. |
| **Postprocessing**| <img src="docs/images/metro/PostProcessing.png" alt="postprocessing_metro" width="600"/> | This final process unable to loop the whole pipeline for increasing the performance of the imputation. To do so it filter out the best imputed position and rerun the analysis using this positions. |

## Pipeline output

To see the results of an example test run with a full size dataset refer to the [results](https://nf-co.re/phaseimpute/results) tab on the nf-core website pipeline page.
Expand All @@ -90,8 +107,10 @@ We thank the following people for their extensive assistance in the development

If you would like to contribute to this pipeline, please see the [contributing guidelines](.github/CONTRIBUTING.md).

For further information or help, don't hesitate to get in touch on the [Slack `#phaseimpute` channel](https://nfcore.slack.com/channels/phaseimpute) (you can join with [this invite](https://nf-co.re/join/slack)).
For further information or help, don't hesitate to get in touch on the [Slack `#phaseimpute` channel](https://nfcore.slack.com/channels/phaseimpute) (you can join with [this invite](https://nf-co.re/join/slack)).

## Citations
## Citations

<!-- TODO nf-core: Add citation for pipeline after first release. Uncomment lines below and update Zenodo doi and badge at the top of this file. -->
Expand Down
39 changes: 39 additions & 0 deletions assets/chr_rename.txt
Original file line number Diff line number Diff line change
@@ -0,0 +1,39 @@
chr1 1
chr2 2
chr3 3
chr4 4
chr5 5
chr6 6
chr7 7
chr8 8
chr9 9
chr10 10
chr11 11
chr12 12
chr13 13
chr14 14
chr15 15
chr16 16
chr17 17
chr18 18
chr19 19
chr20 20
chr21 21
chr22 22
chr23 23
chr24 24
chr25 25
chr26 26
chr27 27
chr28 28
chr29 29
chr30 30
chr31 31
chr32 32
chr33 33
chr34 34
chr35 35
chr36 36
chr37 37
chr38 38
chr39 X
2 changes: 2 additions & 0 deletions assets/regionsheet.csv
Original file line number Diff line number Diff line change
@@ -0,0 +1,2 @@
chr,start,end
20,20000000,2200000
6 changes: 3 additions & 3 deletions assets/samplesheet.csv
Original file line number Diff line number Diff line change
@@ -1,3 +1,3 @@
sample,fastq_1,fastq_2
SAMPLE_PAIRED_END,/path/to/fastq/files/AEG588A1_S1_L002_R1_001.fastq.gz,/path/to/fastq/files/AEG588A1_S1_L002_R2_001.fastq.gz
SAMPLE_SINGLE_END,/path/to/fastq/files/AEG588A4_S4_L003_R1_001.fastq.gz,
sample,BAM,BAI
1_BAM_1X,/path/to/.bam,/path/to/.bai
1_BAM_SNP,/path/to/.bam,/path/to/.bai
20 changes: 8 additions & 12 deletions assets/schema_input.json
Original file line number Diff line number Diff line change
@@ -1,7 +1,7 @@
{
"$schema": "http://json-schema.org/draft-07/schema",
"$id": "https://raw.githubusercontent.com/nf-core/phaseimpute/master/assets/schema_input.json",
"title": "nf-core/phaseimpute pipeline - params.input schema",
"title": "nf-core/phaseimpute pipeline - params.input",
"description": "Schema for the file provided with params.input",
"type": "array",
"items": {
Expand All @@ -13,21 +13,17 @@
"errorMessage": "Sample name must be provided and cannot contain spaces",
"meta": ["id"]
},
"fastq_1": {
"BAM": {
"type": "string",
"format": "file-path",
"exists": true,
"pattern": "^\\S+\\.f(ast)?q\\.gz$",
"errorMessage": "FastQ file for reads 1 must be provided, cannot contain spaces and must have extension '.fq.gz' or '.fastq.gz'"
"pattern": "^\\S+\\.bam$",
"errorMessage": "BAM file must be provided, cannot contain spaces and must have extension '.bam'"
},
"fastq_2": {
"BAI": {
"errorMessage": "BAI file must be provided, cannot contain spaces and must have extension '.bai'",
"type": "string",
"format": "file-path",
"exists": true,
"pattern": "^\\S+\\.f(ast)?q\\.gz$",
"errorMessage": "FastQ file for reads 2 cannot contain spaces and must have extension '.fq.gz' or '.fastq.gz'"
"pattern": "^\\S+\\.bai$"
}
},
"required": ["sample", "fastq_1"]
"required": ["sample", "BAM", "BAI"]
}
}
35 changes: 35 additions & 0 deletions assets/schema_input_region.json
Original file line number Diff line number Diff line change
@@ -0,0 +1,35 @@
{
"$schema": "http://json-schema.org/draft-07/schema",
"$id": "https://raw.githubusercontent.com/nf-core/phaseimpute/master/assets/schema_input.json",
"title": "nf-core/phaseimpute pipeline - params.input_region schema",
"description": "Schema for the file provided with params.input_region",
"type": "array",
"items": {
"type": "object",
"properties": {
"chr": {
"anyOf": [
{
"type": "string",
"pattern": "^\\S+$"
},
{
"type": "integer",
"pattern": "^\\d+$"
}
]
},
"start": {
"type": "integer",
"pattern": "^\\d+$",
"errorMessage": "Region start name must be provided, cannot contain spaces and must be numeric"
},
"end": {
"type": "integer",
"pattern": "^\\d+$",
"errorMessage": "Region end name must be provided, cannot contain spaces and must be numeric"
}
},
"required": ["chr", "start", "end"]
}
}
Loading

0 comments on commit 62aca9f

Please sign in to comment.