Proposal: DSL2+ #312

bentsherman · 2024-05-21T14:51:53Z

Spun off from #309 to showcase just the DSL2 parts:

Only use params in top-level workflow: can be done today. Pass params into processes and workflows as explicit inputs. I might have missed a few, but you get the idea.
Replace ext config with params and process inputs: can be done today, see Refactor ext config as params #308. Formalizes ext args as process inputs. If an args input needs to be configurable, it can be a param.
Replace publishDir with workflow output definition: available in 24.04 as a preview feature (Workflow output definition nextflow-io/nextflow#4784). Moves publish definition to workflow level by publishing channels. Combined with the ext refactor, removes the need for most module config and process selectors.
Make config comply with strict config parser: coming sometime this year as an opt-in feature (Config parser (and loader) nextflow-io/nextflow#4744). Restricts the config syntax to assignment / block / include, with the ability for values to be Groovy expressions. It mostly improves error reporting at runtime; few syntax changes are needed. Replace check_max() with the resourceLimits directive, coming in 24.04 (Add resourceLimits directive nextflow-io/nextflow#2911).
Use params schema as source of truth: Only define params in the schema instead of the config file. Convert the schema to YAML for better readability. Config profiles can still override param default values. The new config parser (above) will fix the issue with params resolution in config (Allow custom configs params to be parsed before nextflow.config nextflow-io/nextflow#2662). Incorporate params validation from nf-validation into core Nextflow.
Use eval output, topic channels to collect tool versions: can be done today, see 'versions' directive in process nextflow-io/nextflow#4386. Simplifies the collection of tool versions and removes a lot of boilerplate from processes and workflows.
Add workflow output schema: coming sometime this year in the final version of the workflow output definition. The output schema is essentially a collection of schemas for index files (like a samplesheet).

This schema can be used to launch a chain of pipelines in Seqera Platform -- when filling out the launch form for a downstream pipeline, you should be able to select "expected" outputs from the upstream pipeline as inputs, e.g. mapping an output samplesheet to an input samplesheet. The schema is used to verify whether an output and input can be connected.

For example, the schema for the fetchngs output samplesheet should somehow "match" the schema for the rnaseq input samplesheet, so that you can select it, then Seqera Platform should launch the rnaseq run immediately after the fetchngs run completes.
Make pipeline import-able: See https://github.com/bentsherman/fetchngs2rnaseq for a more complete example of this and some notes. The main thing missing from this PR for importability is to make sure that the SRA workflow can be used independently of any params or publishing. It should be possible to pass a channel of samples + metadata directly into a downstream workflow, rather than saving to and re-loading from a samplesheet.
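
To illustrate the params, ext, and tool-version items above, here is a minimal sketch. It is not code from this PR. The process `ECHO_SAMPLE`, the workflow `DEMO`, the param `echo_args`, and the sample IDs are all hypothetical, and `echo` / `bash --version` stand in for a real tool. It assumes Nextflow 24.04, where topic channels sit behind a preview flag.

```nextflow
// Assumption: Nextflow 24.04, where topic channels require this preview flag.
nextflow.preview.topic = true

// Hypothetical param, read only in the entry workflow below.
params.echo_args = '-e'

// Hypothetical process: `args` is an explicit input instead of task.ext.args.
process ECHO_SAMPLE {
    input:
    val sample_id
    val args

    output:
    tuple val(sample_id), stdout, emit: result
    // The eval output runs the command after the task script and captures its stdout.
    // The tuple is sent to the 'versions' topic rather than emitted through a channel
    // that every calling workflow has to forward.
    tuple val("${task.process}"), val('bash'), eval('bash --version | head -n 1'), topic: versions

    script:
    """
    echo ${args} ${sample_id}
    """
}

// Named workflow that never touches `params`: everything arrives via `take:`.
workflow DEMO {
    take:
    ch_samples
    args

    main:
    ECHO_SAMPLE(ch_samples, args)

    emit:
    result = ECHO_SAMPLE.out.result
}

// Entry workflow: the only place where `params` is read.
workflow {
    DEMO(channel.of('SRR000001', 'SRR000002'), params.echo_args)

    DEMO.out.result.view()

    // Collect every version tuple sent to the topic, from any process in the pipeline.
    channel.topic('versions').view()
}
```

In this sketch the named workflow can be imported and called without any params being set, `args` is configurable only because the entry workflow chooses to wire a param into it, and the versions are collected without a `versions` emit on each process and workflow.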

Signed-off-by: Ben Sherman <[email protected]>

bentsherman · 2024-11-03T19:49:46Z

Closing in favor of #309

Eval outputs and topic channels are being implemented in modules, see Use eval to collect version modules#5834
The nf-core pipeline template was updated in v3 to comply with strict syntax, which is also now defined in the docs and enforced by the language server
The parameter schema is now used by the language server, so it can be incorporated into Nextflow with less urgency, pending further design

Based on discussions with the community, I have concluded that the other proposed changes (use params only in entry workflow, remove ext config, workflow outputs) will be much easier to do with static types, so I'm not going to push for them very much in the meantime. IMO it makes more sense to wait for static types and refactor your pipeline once, rather than refactor now with suboptimal syntax and then refactor again in a year, which wouldn't bring much benefit, especially now with the language server.

I do encourage fetchngs to go ahead and update where appropriate, but we can pursue these changes in smaller pieces, in particular:

comply with strict syntax (mostly covered by template v3)
topic channels, eval outputs (once modules are updated)
workflow outputs

bentsherman added 8 commits April 26, 2024 19:52

Replace ext/publishDir with params/publish definition

f531c5d

Signed-off-by: Ben Sherman <[email protected]>

Update config to comply with strict parser

836ace2

Signed-off-by: Ben Sherman <[email protected]>

Use param schemas as source of truth, convert to YAML

25a1fb5

Signed-off-by: Ben Sherman <[email protected]>

Use eval output, topic channels to collect tool versions

505806a

Signed-off-by: Ben Sherman <[email protected]>

Refactor params as workflow inputs

4401d29

Signed-off-by: Ben Sherman <[email protected]>

Update workflow output definition

1b2ad00

Signed-off-by: Ben Sherman <[email protected]>

Update workflow params definition

4ab2ddc

Signed-off-by: Ben Sherman <[email protected]>

Add workflow output schema

39971b6

Signed-off-by: Ben Sherman <[email protected]>

bentsherman mentioned this pull request May 21, 2024

Proposal: Static types #309

Draft

bentsherman marked this pull request as ready for review May 21, 2024 14:53

This was referenced May 21, 2024

Refactor ext config as params #308

Closed

Workflow output definition #275

Closed

bentsherman added 3 commits June 10, 2024 14:33

Rename schema_params.yml to schema_inputs.yml

a05928c

Signed-off-by: Ben Sherman <[email protected]>

Remove trailing slashes from target names

081dbc0

Signed-off-by: Ben Sherman <[email protected]>

Add wrapper workflow for ASPERA_CLI

ff54921

Signed-off-by: Ben Sherman <[email protected]>

ewels mentioned this pull request Jun 19, 2024

Use eval to collect version nf-core/modules#5834

Open

bentsherman added 2 commits June 21, 2024 05:26

Initialize ch_fastq

b69bb74

Signed-off-by: Ben Sherman <[email protected]>

Remove import statements

6996724

Signed-off-by: Ben Sherman <[email protected]>

bentsherman mentioned this pull request Jul 2, 2024

Finalize workflow output definition nextflow-io/nextflow#5103

Closed

bentsherman mentioned this pull request Aug 7, 2024

Generate output schema from output definition nextflow-io/nextflow#5213

Open

bentsherman added 2 commits September 22, 2024 23:58

Fix warnings

faf3af4

Signed-off-by: Ben Sherman <[email protected]>

Update workflow outputs (second preview)

f9385f3

Signed-off-by: Ben Sherman <[email protected]>

This was referenced Sep 24, 2024

Refactor ext config as process inputs bentsherman/rnaseq#1

Closed

Workflow output definition (second preview) nextflow-io/nextflow#5185

Merged

MatthiasZepper mentioned this pull request Oct 29, 2024

Add subsampling nf-core/seqinspector#50

Merged

11 tasks

Refactor output targets to samples and versions

a65146e

Signed-off-by: Ben Sherman <[email protected]>

bentsherman closed this Nov 3, 2024

bentsherman deleted the dsl2-plus branch November 3, 2024 19:49
