make workflow named outputs show up in the manifest #23
Comments
I think the original creator of nf-prov tried to associate published outputs with the process. But the problem with your request is that published outputs are not related to workflow emits at all. More fundamentally, I'm not sure that the provenance manifest is the best way to facilitate the chaining of pipelines. I think we need some kind of workflow output schema which can be easily matched to the input schema of a downstream workflow, which does not involve workflow emits at all. Alternatively, you could write a "meta-pipeline" which imports entire pipelines as modules and chains them together with regular dataflow logic. That would use the workflow takes/emits but not the input/output schemas, which in this case would be an unnecessary extra step. I am working on a proof-of-concept for this using fetchngs+rnaseq, and hope to finish it at the hackathon next week.
This should definitely be a thing. The main blockers on this (in nf-core at least) have been config-based, and @drpatelh's related plans should help.
Honestly, I am not really a big fan of the idea of writing "meta-pipelines", because it seems you would then have to write one for every combination of pipelines you want to chain together. I feel like this is the better approach;
(which feels related to nextflow-io/nextflow#4670). An idea floated elsewhere was some mechanism by which you could chain pipelines in a manner like this
The topic of 'pipeline chaining' per se is likely out of scope for this issue and repo; maybe it can be moved to some other location. But if "named outputs" were available in nf-prov (or elsewhere?) then at least we could more easily hack it together ourselves :) Feel free to close this issue if you think there's a better place for the discussion, thanks.
I see you have commented on nextflow-io/nextflow#4670, so let's move the discussion over there. Your feedback might help us finalize the design of the workflow output schema, which should be the easiest way to chain pipelines.
Right now the manifest JSON output looks something like this
However I am able to define my pipeline's main `workflow` section to have named outputs, like this
It would be really helpful if we could somehow keep the label such as `myfiles` associated with the published files, maybe something like this
This would be really helpful for downstream processing, so that you could parse the manifest JSON and identify specific files. For example, if you had an `emit` channel for MultiQC files, `multiqc_ch`, you would be able to identify all the files with the label `multiqc_ch` to more easily pass them in to some other process, like a chained post-processing workflow.
@pinin4fjords
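To make the downstream use case concrete, here is a rough sketch of how a consumer might group published files by label. Everything about the manifest layout here is an assumption for illustration: the `published` list with `source` and `target` keys is only my guess at the current shape, and the `label` key is the feature being requested, not something nf-prov is known to write today. The helper name `files_by_label` is also made up.

```python
import json
from collections import defaultdict

# Hypothetical manifest content. The "label" key is the requested feature,
# not an existing nf-prov field; paths and layout are illustrative only.
manifest_text = """
{
  "published": [
    {"source": "/work/ab/123/sample1.txt", "target": "/results/sample1.txt", "label": "myfiles"},
    {"source": "/work/cd/456/multiqc_report.html", "target": "/results/multiqc_report.html", "label": "multiqc_ch"},
    {"source": "/work/ef/789/sample2.txt", "target": "/results/sample2.txt", "label": "myfiles"}
  ]
}
"""


def files_by_label(manifest):
    """Group published target paths by their (hypothetical) label."""
    grouped = defaultdict(list)
    for entry in manifest.get("published", []):
        # Entries without a label are collected under the key None.
        grouped[entry.get("label")].append(entry["target"])
    return dict(grouped)


manifest = json.loads(manifest_text)
grouped = files_by_label(manifest)
print(grouped["multiqc_ch"])
```

With something like this, a post-processing step could pick up `grouped["multiqc_ch"]` directly instead of guessing which files are which from their file names.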
I noticed that under the `tasks` section of the manifest JSON, there is already an `emit` field in the `outputs` list for each task; however, in all my pipelines so far the value here is `null`. I'm not sure what this was meant to be used for, but it seems like this functionality might overlap?
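For anyone who wants to check their own manifest for this, here is a small sketch that lists the task outputs whose `emit` value is not null. The layout is assumed, not confirmed: I am treating `tasks` as a mapping from task ID to a task object with an `outputs` list, and the `name` and `value` keys, the sample data, and the helper name `outputs_with_emit` are all invented for the example.

```python
import json

# Assumed manifest layout for illustration only; real manifests may differ.
manifest_text = """
{
  "tasks": {
    "1": {"name": "FOO (1)", "outputs": [{"name": "out", "emit": null, "value": "/work/ab/123/a.txt"}]},
    "2": {"name": "MULTIQC", "outputs": [{"name": "report", "emit": "multiqc_ch", "value": "/work/cd/456/report.html"}]}
  }
}
"""


def outputs_with_emit(manifest):
    """Return (task_id, emit, value) for every task output with a non-null emit."""
    found = []
    for task_id, task in manifest.get("tasks", {}).items():
        for output in task.get("outputs", []):
            if output.get("emit") is not None:
                found.append((task_id, output["emit"], output.get("value")))
    return found


manifest = json.loads(manifest_text)
labelled = outputs_with_emit(manifest)
print(labelled)
```

An empty result on a real manifest would match what I am seeing, where every `emit` is `null`.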