Add cmd output type #4493

pditommaso · 2023-11-08T03:37:00Z

Implements support for a cmd output type to collect task tool versions and metadata, as described in #4386 (comment).

POC usage:

process foo {
  output:
  cmd 'git --version', emit: gitVersion
  
  '''
  echo Ciao
  '''
}

workflow {
  foo()
  foo.out.gitVersion | view
}

Signed-off-by: Paolo Di Tommaso <[email protected]>

netlify · 2023-11-08T03:37:06Z

✅ Deploy Preview for nextflow-docs-staging canceled.

Name	Link
🔨 Latest commit	`c4c6182`
🔍 Latest deploy log	https://app.netlify.com/sites/nextflow-docs-staging/deploys/65c7dbc1e12437000807d870

bentsherman · 2023-11-08T15:52:36Z

I kinda like the idea of overloading stdout for this purpose, since it can only be the stdout of the command. Otherwise we should use the full word command

pditommaso · 2023-11-08T15:55:05Z

I wanted to keep it simple, at least as a POC.

pditommaso · 2023-11-08T21:58:33Z

Next steps:

- Handle command failure (how?)
  - Ben: a command failure should trigger a process failure, unless marked as optional (still print a warning)
  - if the implementation is too complex, it may be worth just documenting a `|| true` syntax
- Handle multiline command output (likely currently breaks)
- Only works for Bash-based tasks
  - will be a constraint of the implementation
- Support for tuple composition
- Add + fix tests
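
The `|| true` workaround mentioned in the list above can be illustrated in plain Bash. This is only a sketch of the shell behavior, not of how Nextflow evaluates the command; `broken_tool` is a hypothetical stand-in for a version command that fails.

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for a version command that fails.
broken_tool() {
  echo "broken_tool: not installed" >&2
  return 127
}

# Without a guard, the exit status of the substitution reflects the failure.
strict_value=$(broken_tool 2>/dev/null) && strict_status=0 || strict_status=$?

# With '|| true', the substitution succeeds and the captured value is empty.
lenient_value=$(broken_tool 2>/dev/null || true) && lenient_status=0 || lenient_status=$?

echo "strict:  status=$strict_status value='$strict_value'"
echo "lenient: status=$lenient_status value='$lenient_value'"
```

The guarded form trades a hard failure for an empty value, which is why the discussion also suggests printing a warning when a failure is ignored.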

bentsherman · 2023-11-15T16:00:47Z

Is this ready to test with rnaseq? Even if it only supports single-line bash commands, should be enough.

pditommaso · 2023-11-15T16:13:49Z

it should

bentsherman · 2023-11-15T20:57:58Z

It doesn't seem to work with tuples, gonna need that for something like:

tuple val('gunzip'), cmd("gunzip --version")

bentsherman · 2023-11-15T21:22:38Z

I will add it

marcodelapierre · 2023-11-17T06:04:51Z

> I kinda like the idea of overloading stdout for this purpose, since it can only be the stdout of the command. Otherwise we should use the full word command

@pditommaso sounds tempting to me, too, on this same ground by Ben.

marcodelapierre · 2023-11-17T06:07:07Z

I think this comment is interesting: #4386 (comment)
Also, it relates to your point above, Paolo: "Only works for Bash based tasks".

If multi-line is allowed, then we might leverage the shebang to allow non-bash executions, including other shells, Groovy, and popular scripting languages such as Python... a bit like the current scoping for the script block, really.

Thoughts on this point?

marcodelapierre · 2023-11-17T06:07:54Z

> Handle command failure (how?)

I would say, if the command fails, the task should fail, i.e. consistent with the behaviour of the main script block.

pditommaso · 2023-11-17T09:55:57Z

> If multi-line is allowed, then we might leverage the shebang to allow non-bash executions

not sure it's possible

bentsherman · 2023-11-17T16:05:13Z

> I would say, if the command fails, the task should fail, i.e. consistent with the behaviour of the main script block.

On this point, I think I suggested that the user should be able to ignore the failure by specifying optional: true, although it would be helpful to still log a warning in that case

> If multi-line is allowed, then we might leverage the shebang to allow non-bash executions

Due to the compact nature of process inputs and outputs, it would be unwieldy to specify a multi-line script here. Instead, it should already be possible to reference a script from the bin directory in the cmd output. That would be useful anyway because, in rnaseq for example, many processes use the same tools and so the same version commands are duplicated.

Regarding the shebang, we could add a shell option to the command output and still support one-liners:

output:
cmd("print('Hello!')", shell: 'python')
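
The suggestion above to reference a shared script from the bin directory could look roughly like the following Bash helper. It is a sketch under assumptions: `first_line_version` and `fake_tool` are hypothetical names, and nothing here reflects an existing Nextflow or nf-core script.

```shell
#!/usr/bin/env bash
# Hypothetical helper that a pipeline might keep in its bin directory.
# Runs the given command and prints only the first line of its output.
first_line_version() {
  "$@" 2>&1 | head -n 1
}

# Stand-in for a real tool that prints a multi-line version banner.
fake_tool() {
  printf 'fake_tool 0.9.1\nbuilt for illustration only\n'
}

version=$(first_line_version fake_tool --version)
echo "$version"
```

Keeping the version logic in one helper would avoid repeating the same one-liner in every process that uses the same tool.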

modules/nextflow/src/main/groovy/nextflow/script/params/TupleInParam.groovy

bentsherman · 2023-11-17T16:42:50Z

> > I kinda like the idea of overloading stdout for this purpose, since it can only be the stdout of the command. Otherwise we should use the full word command
>
> @pditommaso sounds tempting to me, too, on this same ground by Ben.

I'm starting to lean back towards a separate command or cmd type... this command output feels different enough from stdout that it should have a different name, and it wouldn't make sense to extend stdin in the same way.

marcodelapierre · 2023-11-20T13:53:11Z

> > > I kinda like the idea of overloading stdout for this purpose, since it can only be the stdout of the command. Otherwise we should use the full word command
>
> > @pditommaso sounds tempting to me, too, on this same ground by Ben.
>
> I'm starting to lean back towards a separate command or cmd type... this command output feels different enough from stdout that it should have a different name, and it wouldn't make sense to extend stdin in the same way.

I agree on the leaning back, based on the optional and script attributes you mentioned above, #4493 (comment), both of which make sense to me.

bentsherman · 2023-11-27T20:52:59Z

> Handle multiline command output (likely currently breaks)

So the multi-line output does not fail, but the newlines are converted to spaces:

$ echo nxf_out_cmd_1=$(bash --version) > .command.env
$ cat .command.env
nxf_out_cmd_1=GNU bash, version 5.1.16(1)-release (x86_64-pc-linux-gnu) Copyright (C) 2020 Free Software Foundation, Inc. License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html> This is free software; you are free to change and redistribute it. There is NO WARRANTY, to the extent permitted by law.

The command itself is easily fixed with some quotes:

$ echo nxf_out_cmd_1="$(bash --version)" > .command.env
$ cat .command.env 
nxf_out_cmd_1=GNU bash, version 5.1.16(1)-release (x86_64-pc-linux-gnu)
Copyright (C) 2020 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>

This is free software; you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.

But that would break the .command.env file which expects a key-value pair on each line.

The same limitation already exists for env outputs, e.g. if I do FOO=$(bash --version) in the process script then the process output will be squeezed onto a single line. For this reason, and the fact that cmd outputs are intended for small things like metadata, while larger outputs can always be written to a file, I am fine with leaving the current behavior as it is.
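
The difference between the two forms above can be reproduced without depending on the local `bash --version` banner. In this sketch, `fake_version` is a hypothetical stand-in for a tool with multi-line version output, and the line counts show what a line-oriented key=value reader would see.

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for a tool that prints a multi-line version banner.
fake_version() {
  printf 'tool 1.2.3\nCopyright Example\n'
}

# Unquoted command substitution: word splitting turns newlines into spaces.
squeezed=$(echo nxf_out_cmd_1=$(fake_version))

# Quoted command substitution: newlines inside the value are preserved.
preserved=$(echo nxf_out_cmd_1="$(fake_version)")

# Count the lines that each form would occupy in an env-style file.
squeezed_lines=$(printf '%s\n' "$squeezed" | wc -l | tr -d ' ')
preserved_lines=$(printf '%s\n' "$preserved" | wc -l | tr -d ' ')

echo "unquoted: $squeezed_lines line(s)"
echo "quoted:   $preserved_lines line(s)"
```

The unquoted form stays on one line, so it fits a one-pair-per-line file; the quoted form spills onto a second line that has no key of its own.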

pditommaso · 2023-11-27T21:18:13Z

A single line is too weak an assumption. Note that nf-core uses a multi-line YAML snippet!

https://github.com/nf-core/rnaseq/blob/a10f41afa204538d5dcc89a5910c299d68f94f41/modules/nf-core/salmon/index/main.nf#L42-L45

bentsherman · 2023-11-27T21:21:48Z

That is only because they are eagerly formatting it to YAML. Instead in my proof-of-concept PR I use tuples and separate cmd outputs:

https://github.com/nf-core/rnaseq/pull/1115/files#diff-d81c6cc6d9c866f8162450ce077b2413b28ecd47881a70a5c24a1ec1f1b70205

marcodelapierre · 2024-01-08T14:36:06Z

@pditommaso any blocker for this to be merged? Conversation around the review seems pretty much settled

Signed-off-by: Paolo Di Tommaso <[email protected]>

pditommaso · 2024-02-11T09:25:53Z

Ok, I've cleaned up and refactored this a bit. The main change is that output: cmd becomes output: eval. The main reason is that command is too generic a term and it was overlapping with the task command. I think eval better expresses the overall idea of a small computation that should be evaluated to capture the output value.

ewels · 2024-02-11T12:37:40Z

Nice! I like eval, good call 👍🏻

docs/process.md

This PR introduces the output `eval` type that allows the definition of a script expression that needs to be computed in the script context to evaluate the output value to be emitted. An example would be:

```
process someTask {
    output:
    eval 'bash --version'

    '''
    some-command --here
    '''
}
```

Signed-off-by: Paolo Di Tommaso <[email protected]>
Signed-off-by: Ben Sherman <[email protected]>
Co-authored-by: Ben Sherman <[email protected]>
Co-authored-by: Dr Marco Claudio De La Pierre <[email protected]>
Signed-off-by: Niklas Schandry <[email protected]>


Add cmd output type

c5531a8

Signed-off-by: Paolo Di Tommaso <[email protected]>

pditommaso marked this pull request as draft November 8, 2023 03:37

bentsherman added the lang/processes label Nov 8, 2023

pditommaso mentioned this pull request Nov 8, 2023

Allow to add custom traces and use them as metadata #4425

Closed

bentsherman self-requested a review November 15, 2023 16:47

bentsherman mentioned this pull request Nov 15, 2023

Use eval output for tool versions nf-core/rnaseq#1115

Draft

7 tasks

marcodelapierre added this to the 23.11.0-edge milestone Nov 17, 2023

bentsherman reviewed Nov 17, 2023

modules/nextflow/src/main/groovy/nextflow/script/params/TupleInParam.groovy (outdated)

marcodelapierre modified the milestones: 23.11.0-edge, 23.12.0-edge Nov 21, 2023

pditommaso force-pushed the master branch from fd99141 to 19d2ccb Compare November 24, 2023 20:42

bentsherman mentioned this pull request Nov 27, 2023

Add example of dynamic directive with input file #4545

Open

Merge branch 'master' into output-cmd-type

a1a7f74

pditommaso force-pushed the master branch from 4e27468 to dfd7d09 Compare December 20, 2023 09:55

Merge branch 'master' into output-cmd-type

197dafa

marcodelapierre modified the milestones: 23.12.0-edge, 24.01.0-edge Jan 11, 2024

marcodelapierre and others added 9 commits January 15, 2024 20:42

Merge branch 'master' into output-cmd-type

3d609eb

Merge branch 'master' into output-cmd-type

48fc497

Merge branch 'master' into output-cmd-type

2c1109c

Merge branch 'master' into output-cmd-type

533d2a0

Merge branch 'master' into output-cmd-type

11b21db

Signed-off-by: Paolo Di Tommaso <[email protected]>

Merge branch 'master' into output-cmd-type

5a3f67c

Merge branch 'master' into output-cmd-type

605c262

Improvement and cleanup

1d97d49

Signed-off-by: Paolo Di Tommaso <[email protected]>

Fix failing tests

c4c6182

Signed-off-by: Paolo Di Tommaso <[email protected]>

pditommaso modified the milestones: 24.01.0-edge, 24.02.0-edge Feb 11, 2024

pditommaso merged commit df97811 into master Feb 11, 2024
22 checks passed

pditommaso deleted the output-cmd-type branch February 11, 2024 09:29

bentsherman reviewed Feb 11, 2024

docs/process.md

This was referenced Apr 19, 2024

Add a directive that allows the fetching of tool version meta information #879

Closed

'versions' directive in process #4386

Closed
