Proposal: DSL2+ #312

Closed
wants to merge 16 commits into from

bentsherman commented May 21, 2024

Spun off from #309 to showcase just the DSL2 parts:

  • Only use params in top-level workflow: can be done today. Pass params into processes and workflows as explicit inputs. Might have missed a few, but you get the idea

  • Replace ext config with params and process inputs: can be done today, see "Refactor ext config as params" (#308). Formalizes ext args as process inputs. If an args input needs to be configurable, it can be a param.

  • Replace publishDir with workflow output definition: available in 24.04 as a preview feature ("Workflow output definition", nextflow-io/nextflow#4784). Moves the publish definition to the workflow level by publishing channels. Combined with the ext refactor, this removes the need for most module config and process selectors.

  • Make config comply with strict config parser: coming sometime this year as an opt-in feature ("Config parser (and loader)", nextflow-io/nextflow#4744). Restricts the config syntax to assignments, blocks, and includes, with values allowed to be Groovy expressions. Mostly improves error reporting at runtime; not many syntax changes are needed. Replace check_max() with the resourceLimits directive, coming in 24.04 ("Add resourceLimits directive", nextflow-io/nextflow#2911).

  • Use params schema as source of truth: define params only in the schema instead of the config file. Convert the schema to YAML for better readability. Config profiles can still override a param's default value. The new config parser (above) will fix the issue with params resolution in config ("Allow custom configs params to be parsed before nextflow.config", nextflow-io/nextflow#2662). Incorporate params validation from nf-validation into core Nextflow.

  • Use eval output, topic channels to collect tool versions: can be done today, see "'versions' directive in process" (nextflow-io/nextflow#4386). Simplifies the collection of tool versions and removes a lot of boilerplate from processes and workflows.

  • Add workflow output schema: coming sometime this year in the final version of the workflow output definition. The output schema is essentially a collection of schemas for index files (like a samplesheet).

    This schema can be used to launch a chain of pipelines in Seqera Platform -- when filling out the launch form for a downstream pipeline, you should be able to select "expected" outputs from the upstream pipeline as inputs, e.g. mapping an output samplesheet to an input samplesheet. The schema is used to verify whether an output and input can be connected.

    For example, the schema for the fetchngs output samplesheet should somehow "match" the schema for the rnaseq input samplesheet, so that you can select it, then Seqera Platform should launch the rnaseq run immediately after the fetchngs run completes.

  • Make pipeline import-able: See https://github.com/bentsherman/fetchngs2rnaseq for a more complete example of this and some notes. The main thing missing from this PR for importability is to make sure that the SRA workflow can be used independently of any params or publishing. It should be possible to pass a channel of samples + metadata directly into a downstream workflow, rather than saving to and re-loading from a samplesheet.
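
To make the first two items concrete, here is a minimal sketch of what "params only in the entry workflow" and "ext args as process inputs" could look like. The process, workflow, and parameter names (DOWNLOAD_READS, SRA, download_args) are hypothetical and not taken from this PR, and the command is a placeholder:

```nextflow
// Hypothetical default; in a real pipeline this would live in the config or schema.
params.download_args = ''

// Hypothetical module: names, inputs, and the command are illustrative only.
process DOWNLOAD_READS {
    input:
    val run_id      // e.g. a run accession
    val args        // extra CLI options, previously supplied through ext.args in config

    output:
    path "${run_id}.txt"

    script:
    """
    echo "fetching ${run_id} with options: ${args}" > ${run_id}.txt
    """
}

// Named workflow: receives everything it needs as inputs and never reads params,
// so it can be imported and called from another pipeline.
workflow SRA {
    take:
    ids
    download_args

    main:
    DOWNLOAD_READS(ids, download_args)

    emit:
    reads = DOWNLOAD_READS.out
}

// Entry workflow: the only place where params are referenced.
workflow {
    SRA(Channel.of('SRR000001', 'SRR000002'), params.download_args)
}
```

Because SRA takes its options as inputs, a downstream pipeline could pass a channel of samples straight into it without going through params or a samplesheet.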

bentsherman marked this pull request as ready for review May 21, 2024 14:53
Signed-off-by: Ben Sherman <[email protected]>
bentsherman (Author) commented

Closing in favor of #309

  • Eval outputs and topic channels are being implemented in modules, see "Use eval to collect version" (modules#5834)
  • The nf-core pipeline template was updated in v3 to comply with strict syntax, which is also now defined in the docs and enforced by the language server
  • The parameter schema is now used by the language server, so incorporating it into Nextflow is less urgent and is pending further design

Based on discussions with the community, I have concluded that the other proposed changes (use params only in entry workflow, remove ext config, workflow outputs) will be much easier to do with static types, so I'm not going to push for them very much in the meantime. IMO it makes more sense to wait for static types and refactor your pipeline once, rather than refactor now with suboptimal syntax and then refactor again in a year, which wouldn't bring much benefit, especially now with the language server.

I do encourage fetchngs to go ahead and update where appropriate, but we can pursue these changes in smaller pieces, in particular:

  • comply with strict syntax (mostly covered by template v3)
  • topic channels, eval outputs (once modules are updated)
  • workflow outputs
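
For the topic channel and eval output item, a rough sketch of the pattern follows. It assumes the 24.04 preview flag for topics; the process name and the bash version command are placeholders rather than anything from fetchngs or nf-core/modules:

```nextflow
// Topic channels were a preview feature in 24.04 and had to be enabled explicitly.
nextflow.preview.topic = true

// Hypothetical process used only to illustrate the output declaration.
process SHOW_BASH_VERSION {
    output:
    // eval() captures the stdout of a command run in the task environment;
    // topic: sends the tuple to the 'versions' topic, so no emit is needed
    // and nothing has to be mixed together by hand in the workflow.
    tuple val(task.process), val('bash'), eval('bash --version | head -n 1'), topic: versions

    script:
    """
    true
    """
}

workflow {
    SHOW_BASH_VERSION()

    // Every process that writes to the 'versions' topic shows up here.
    Channel.topic('versions').view()
}
```

The point of the pattern is that version collection no longer needs a versions.yml file per process or a chain of mix() calls through every subworkflow.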

bentsherman closed this Nov 3, 2024
bentsherman deleted the dsl2-plus branch November 3, 2024 19:49