
first release of RNAseq DE analysis, filtering and plotting workflow #582

Merged: 5 commits, Nov 12, 2024

Conversation

pavanvidem
Member

This workflow is a downstream analysis of the existing rnaseq-pe and rnaseq-sr workflows. It takes 2 collections and performs differential expression analysis, annotates the results table, filters it, and generates a volcano plot, a z-scores heatmap and a heatmap of normalized counts.

@lldelisle
Contributor

👍 This looks good.

@lldelisle
Contributor

Maybe you should state in the directory or in the workflow that it is only for 2 conditions.

@pavanvidem
Member Author

pavanvidem commented Oct 25, 2024

Maybe you should state in the directory or in the workflow that it is only for 2 conditions.

Sure! I will state it in the workflow description.

@lldelisle
Contributor

One of my colleagues proposed that the input could be a single collection plus a replicate assignment like 1,1,1,2,2,2 and a batch assignment like 1,2,3,1,2,3. I don't know whether this makes sense, or whether a samples plan like the one @wm75 proposed in #581 (with fewer columns) would be better.
Maybe we could just add 3 general workflows:

  • split collection in 2 by samples plan: one column is the identifier, the other column is the group (new collection name)
  • split collection in 2 by pattern: one collection contains the identifiers which contain this pattern, the other collection contains the identifiers which do not contain this pattern.
  • split collection in 2 by comma-separated values: the user input would be 1,1,1,2,2,2, but this one seems more error-prone to me.
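
As a rough illustration of the three splitting strategies listed above, here is a minimal Python sketch. It is not part of this PR or of any Galaxy tool: the function names, the sample identifiers, and the choice to raise an error unless exactly two groups result are all assumptions made for the example.

```python
import re
from typing import Dict, List, Sequence, Tuple


def _require_two_groups(groups: Dict[str, List[str]]) -> Dict[str, List[str]]:
    """Assumption: the downstream comparison needs exactly two conditions."""
    if len(groups) != 2:
        raise ValueError(f"expected exactly 2 groups, got {len(groups)}")
    return groups


def split_by_samples_plan(
    identifiers: Sequence[str], plan: Dict[str, str]
) -> Dict[str, List[str]]:
    """Group identifiers using an identifier -> group mapping (the samples plan)."""
    groups: Dict[str, List[str]] = {}
    for ident in identifiers:
        groups.setdefault(plan[ident], []).append(ident)
    return _require_two_groups(groups)


def split_by_pattern(
    identifiers: Sequence[str], pattern: str
) -> Tuple[List[str], List[str]]:
    """Return (identifiers matching the pattern, identifiers not matching it)."""
    regex = re.compile(pattern)
    matching = [i for i in identifiers if regex.search(i)]
    others = [i for i in identifiers if not regex.search(i)]
    return matching, others


def split_by_assignment(
    identifiers: Sequence[str], assignment: str
) -> Dict[str, List[str]]:
    """Group identifiers by a comma-separated label list such as '1,1,1,2,2,2'."""
    labels = [label.strip() for label in assignment.split(",")]
    if len(labels) != len(identifiers):
        raise ValueError("need exactly one label per identifier")
    groups: Dict[str, List[str]] = {}
    for ident, label in zip(identifiers, labels):
        groups.setdefault(label, []).append(ident)
    return _require_two_groups(groups)


# Hypothetical identifiers, in collection order.
samples = [
    "treated_1", "treated_2", "treated_3",
    "untreated_1", "untreated_2", "untreated_3",
]
```

Two things stand out in this sketch. An unanchored pattern such as `treated` would also match `untreated_1`, so the pattern variant needs care (here `^treated` would work). The comma-separated variant silently depends on the order of the collection, which is one reason it may be the most error-prone of the three.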

@pavanvidem
Member Author

Replicate assignments are fine because we know there are exactly 2 conditions. But I don't think it is possible to include batches unless we assume there is a fixed number of batches. Where should we use this batch information? In a multi-factor analysis?

@lldelisle
Contributor

I think we would need to rewrite the wrapper to allow setting the batches dynamically in workflows, so this is just for future development.

@wm75
Contributor

wm75 commented Oct 25, 2024

A simple two-condition comparison might be good enough for this version.

@wm75
Contributor

wm75 commented Oct 25, 2024

image

Can you arrange the inputs in a more meaningful way, @pavanvidem?
(The order depends on the Euclidean distance from the top-left corner 🙃)

@wm75
Contributor

wm75 commented Oct 25, 2024

Also 0.0 doesn't seem like a good default for either of the thresholds.

@lldelisle
Contributor

I think this version is fine. I am just saying that, to make this more usable, we should think about developing workflows to split the count collection into 2 (in another PR).

@pavanvidem
Member Author

Also 0.0 doesn't seem like a good default for either of the thresholds.

These are non-optional parameters of tools. I can only set a default if I make them optional, and then I have to use the dirty trick of connecting the steps and then making them optional with defaults :)

@wm75
Contributor

wm75 commented Oct 25, 2024

Also 0.0 doesn't seem like a good default for either of the thresholds.

These are non-optional parameters of tools. I can only set a default if I make them optional, and then I have to use the dirty trick of connecting the steps and then making them optional with defaults :)

The "better" trick is to use pick value after the optional param and if it's not set, pick the default value again :-)
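
The fallback logic behind this trick can be sketched in a few lines of Python. This is only an illustration of "take the first value that is set", not Galaxy's implementation; the function names and the default values below are hypothetical and were not agreed on in this thread.

```python
from typing import Optional, TypeVar

T = TypeVar("T")

# Hypothetical defaults, chosen only for the example.
DEFAULT_PADJ_THRESHOLD = 0.05
DEFAULT_LOG2FC_THRESHOLD = 1.0


def pick_first_non_null(*candidates: Optional[T]) -> T:
    """Return the first candidate that is not None."""
    for value in candidates:
        if value is not None:
            return value
    raise ValueError("all candidate values are null")


def resolve_threshold(user_value: Optional[float], default: float) -> float:
    """Use the optional user input if it was set, otherwise the default."""
    return pick_first_non_null(user_value, default)
```

Note that the check is `is not None` rather than truthiness, so an explicit 0.0 entered by the user is kept and is not replaced by the default.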

@lldelisle
Contributor

For me, 'Main factor counts' and 'Base factor counts' are not really meaningful.
For me a factor is for example 'Treatment' and then you have 2 levels: 'treated' and 'untreated'. One of them is the reference and I am not sure there is a consensus way to call the other level.
I propose: 'Counts in changed condition', 'Counts in reference condition'.

@lldelisle
Contributor

Also 0.0 doesn't seem like a good default for either of the thresholds.

These are non-optional parameters of tools. I can only set a default if I make them optional, and then I have to use the dirty trick of connecting the steps and then making them optional with defaults :)

The "better" trick is to use pick value after the optional param and if it's not set, pick the default value again :-)

Why don't we simply make it optional with a default value?

@pavanvidem
Member Author

For me, 'Main factor counts' and 'Base factor counts' are not really meaningful. For me a factor is for example 'Treatment' and then you have 2 levels: 'treated' and 'untreated'. One of them is the reference and I am not sure there is a consensus way to call the other level. I propose: 'Counts in changed condition', 'Counts in reference condition'.

Sounds better! I will change it.

@wm75
Contributor

wm75 commented Oct 25, 2024

For me, 'Main factor counts' and 'Base factor counts' are not really meaningful. For me a factor is for example 'Treatment' and then you have 2 levels: 'treated' and 'untreated'. One of them is the reference and I am not sure there is a consensus way to call the other level. I propose: 'Counts in changed condition', 'Counts in reference condition'.

Yes, the factor and factor levels should have better names. Alternatively, you could also use input params there.

@pavanvidem
Member Author

Also 0.0 doesn't seem like a good default for either of the thresholds.

These are non-optional parameters of tools. I can only set a default if I make them optional, and then I have to use the dirty trick of connecting the steps and then making them optional with defaults :)

The "better" trick is to use pick value after the optional param and if it's not set, pick the default value again :-)

Why don't we simply make it optional with a default value?

If a tool has this param as non-optional and we set the workflow input to optional, we cannot connect this value to the tool.

@wm75
Contributor

wm75 commented Oct 25, 2024

Also 0.0 doesn't seem like a good default for either of the thresholds.

These are non-optional parameters of tools. I can only set a default if I make them optional, and then I have to use the dirty trick of connecting the steps and then making them optional with defaults :)

The "better" trick is to use pick value after the optional param and if it's not set, pick the default value again :-)

Why don't we simply make it optional with a default value?

https://matrix.to/#/%23galaxyproject_iwc%3Agitter.im/%24-cW9o37aiBvL760m03fDaXVkbprQjQ8-Qj9Qs7KSlb0?via=matrix.org&via=gitter.im :-)

@lldelisle
Contributor

For me this is really close to being merged.

  • I think it would be good to use count data available on Zenodo to avoid taking up space in the GitHub repository.
  • I think there are too many datasets visible when you run the workflow:
    image
    I can make you a PR if you want.

@pavanvidem
Member Author

pavanvidem commented Nov 8, 2024

  • The count files are not so big, but no problem, I can move them to Zenodo.
  • I will hide everything except the outputs that are needed.

I am also building another 2 workflows (a paired and a single) in the same directory which will combine the quantification workflow and this one so that we will have complete DESeq2 workflows. But this PR can be merged before I add them.

@lldelisle
Contributor

Would you mind removing the test-data?

@pavanvidem
Member Author

Would you mind removing the test-data?

sure, forgot :)


github-actions bot commented Nov 8, 2024

Test Results (powered by Planemo)

Test Summary

Test State    Count
Total         1
Passed        1
Error         0
Failure       0
Skipped       0
Passed Tests
  • ✅ rnaseq-de-filtering-plotting.ga_0

    Workflow invocation details

    • Invocation Messages

    • Steps
      • Step 1: Counts from changed condition:

        • step_state: scheduled
      • Step 2: Counts from reference condition:

        • step_state: scheduled
      • Step 11: Differential Analysis:

        • step_state: scheduled

        • Jobs
          • Job 1:

            • Job state is ok

            Command Line:

            • cat '/tmp/shed_dir/toolshed.g2.bx.psu.edu/repos/iuc/deseq2/8fe98f7094de/deseq2/get_deseq_dataset.R' > /dev/null &&  Rscript '/tmp/shed_dir/toolshed.g2.bx.psu.edu/repos/iuc/deseq2/8fe98f7094de/deseq2/deseq2.R' --cores ${GALAXY_SLOTS:-1} -o '/tmp/tmpc02os3x9/job_working_directory/000/10/outputs/dataset_30f7df7f-be6e-4bb4-b78a-93205e0cb00a.dat' -p '/tmp/tmpc02os3x9/job_working_directory/000/10/outputs/dataset_ce0b6f1f-d7b0-4c4a-b429-82d724f2ff8d.dat' -A 0.1 -n '/tmp/tmpc02os3x9/job_working_directory/000/10/outputs/dataset_15ef0886-71bb-46a6-9b0f-03ef66a76a04.dat'              -H  -f '[["DEFactor", [{"BaseFactor": ["/tmp/tmpc02os3x9/files/6/b/5/dataset_6b5217f1-7077-4695-9d9b-95ead4abbbf4.dat", "/tmp/tmpc02os3x9/files/c/1/e/dataset_c1e6ad0e-4713-4e07-ab4f-9a034504840d.dat"]}, {"MainFactor": ["/tmp/tmpc02os3x9/files/0/e/4/dataset_0e49aaed-3dd7-4c04-9fe6-29fff8b1811d.dat", "/tmp/tmpc02os3x9/files/8/d/7/dataset_8d7b03b2-ff64-46f4-a5b4-e706fd640037.dat"]}]]]' -l '{"dataset_0e49aaed-3dd7-4c04-9fe6-29fff8b1811d.dat": "SRR5085169 Counts Table", "dataset_8d7b03b2-ff64-46f4-a5b4-e706fd640037.dat": "SRR5085170 Counts Table", "dataset_6b5217f1-7077-4695-9d9b-95ead4abbbf4.dat": "SRR5085167 Counts Table", "dataset_c1e6ad0e-4713-4e07-ab4f-9a034504840d.dat": "SRR5085168 Counts Table"}' -t 1

            Exit Code:

            • 0

            Standard Error:

            • Warning message:
              In Sys.setlocale("LC_MESSAGES", "en_US.UTF-8") :
                OS reports request to set locale to "en_US.UTF-8" cannot be honored
              estimating size factors
              estimating dispersions
              gene-wise dispersion estimates
              mean-dispersion relationship
              final dispersion estimates
              fitting model and testing
              

            Standard Output:

            • primary factor: DEFactor 
              
              ---------------------
              DESeq2 run information
              
              sample table:
                                        DEFactor
              SRR5085167 Counts Table BaseFactor
              SRR5085168 Counts Table BaseFactor
              SRR5085169 Counts Table MainFactor
              SRR5085170 Counts Table MainFactor
              
              design formula:
              ~DEFactor
              
              
              4 samples with counts over 7127 genes
              using disperion fit type: parametric 
              creating plots
              summary of results
              DEFactor: MainFactor vs BaseFactor
              
              out of 5734 with nonzero total read count
              adjusted p-value < 0.1
              LFC > 0 (up)       : 2, 0.035%
              LFC < 0 (down)     : 13, 0.23%
              outliers [1]       : 0, 0%
              low counts [2]     : 2963, 52%
              (mean count < 7)
              [1] see 'cooksCutoff' argument of ?results
              [2] see 'independentFiltering' argument of ?results
              
              NULL
              closing plot device
              null device 
                        1 
              Session information:
              
              R version 4.1.1 (2021-08-10)
              Platform: x86_64-conda-linux-gnu (64-bit)
              Running under: Debian GNU/Linux 10 (buster)
              
              Matrix products: default
              BLAS/LAPACK: /usr/local/lib/libopenblasp-r0.3.18.so
              
              locale:
               [1] LC_CTYPE=C.UTF-8       LC_NUMERIC=C           LC_TIME=C.UTF-8       
               [4] LC_COLLATE=C.UTF-8     LC_MONETARY=C.UTF-8    LC_MESSAGES=C.UTF-8   
               [7] LC_PAPER=C.UTF-8       LC_NAME=C              LC_ADDRESS=C          
              [10] LC_TELEPHONE=C         LC_MEASUREMENT=C.UTF-8 LC_IDENTIFICATION=C   
              
              attached base packages:
              [1] stats4    tools     stats     graphics  grDevices utils     datasets 
              [8] methods   base     
              
              other attached packages:
               [1] pheatmap_1.0.12             ggrepel_0.9.1              
               [3] ggplot2_3.3.5               rjson_0.2.20               
               [5] gplots_3.1.1                RColorBrewer_1.1-2         
               [7] DESeq2_1.34.0               SummarizedExperiment_1.24.0
               [9] Biobase_2.54.0              MatrixGenerics_1.6.0       
              [11] matrixStats_0.61.0          GenomicRanges_1.46.0       
              [13] GenomeInfoDb_1.30.0         IRanges_2.28.0             
              [15] S4Vectors_0.32.0            BiocGenerics_0.40.0        
              [17] getopt_1.20.3              
              
              loaded via a namespace (and not attached):
               [1] httr_1.4.2             bit64_4.0.5            splines_4.1.1         
               [4] gtools_3.9.2           assertthat_0.2.1       blob_1.2.2            
               [7] GenomeInfoDbData_1.2.7 pillar_1.6.4           RSQLite_2.2.8         
              [10] lattice_0.20-45        glue_1.5.1             digest_0.6.29         
              [13] XVector_0.34.0         colorspace_2.0-2       Matrix_1.3-4          
              [16] XML_3.99-0.8           pkgconfig_2.0.3        genefilter_1.76.0     
              [19] zlibbioc_1.40.0        purrr_0.3.4            xtable_1.8-4          
              [22] scales_1.1.1           BiocParallel_1.28.0    tibble_3.1.6          
              [25] annotate_1.72.0        KEGGREST_1.34.0        generics_0.1.1        
              [28] farver_2.1.0           ellipsis_0.3.2         cachem_1.0.6          
              [31] withr_2.4.3            survival_3.2-13        magrittr_2.0.1        
              [34] crayon_1.4.2           memoise_2.0.1          fansi_0.4.2           
              [37] lifecycle_1.0.1        munsell_0.5.0          locfit_1.5-9.4        
              [40] DelayedArray_0.20.0    AnnotationDbi_1.56.1   Biostrings_2.62.0     
              [43] compiler_4.1.1         caTools_1.18.2         rlang_0.4.12          
              [46] grid_4.1.1             RCurl_1.98-1.5         bitops_1.0-7          
              [49] labeling_0.4.2         gtable_0.3.0           DBI_1.1.1             
              [52] R6_2.5.1               dplyr_1.0.7            fastmap_1.1.0         
              [55] bit_4.0.4              utf8_1.2.2             KernSmooth_2.23-20    
              [58] parallel_4.1.1         Rcpp_1.0.7             vctrs_0.3.8           
              [61] geneplotter_1.72.0     png_0.1-7              tidyselect_1.1.1      
              

            Traceback:

            Job Parameters:

            • Job parameter Parameter value
              __input_ext "tabular"
              __workflow_invocation_uuid__ "b775d7849dd811ef9b8aa1042ab2eb7f"
              advanced_options {"auto_mean_filter_off": false, "esf": "", "fit_type": "1", "outlier_filter_off": false, "outlier_replace_off": false, "prefilter_conditional": {"__current_case__": 1, "prefilter": ""}}
              batch_factors None
              chromInfo "/tmp/tmpc02os3x9/galaxy-dev/tool-data/shared/ucsc/chrom/?.len"
              dbkey "?"
              header true
              output_options {"alpha_ma": "0.1", "output_selector": ["pdf", "normCounts"]}
              select_data {"__current_case__": 1, "how": "datasets_per_level", "rep_factorName": [{"__index__": 0, "factorName": "DEFactor", "rep_factorLevel": [{"__index__": 0, "countsFile": {"values": [{"id": 1, "src": "hdca"}]}, "factorLevel": "MainFactor"}, {"__index__": 1, "countsFile": {"values": [{"id": 2, "src": "hdca"}]}, "factorLevel": "BaseFactor"}]}]}
              tximport {"__current_case__": 1, "tximport_selector": "count"}
      • Step 12: toolshed.g2.bx.psu.edu/repos/iuc/compose_text_param/compose_text_param/0.1.1:

        • step_state: scheduled

        • Jobs
          • Job 1:

            • Job state is ok

            Command Line:

            • cd ../; python _evaluate_expression_.py

            Exit Code:

            • 0

            Traceback:

            Job Parameters:

            • Job parameter Parameter value
              __input_ext "input"
              __workflow_invocation_uuid__ "b775d7849dd811ef9b8aa1042ab2eb7f"
              chromInfo "/tmp/tmpc02os3x9/galaxy-dev/tool-data/shared/ucsc/chrom/?.len"
              components [{"__index__": 0, "param_type": {"__current_case__": 0, "component_value": "c7<", "select_param_type": "text"}}, {"__index__": 1, "param_type": {"__current_case__": 2, "component_value": "0.1", "select_param_type": "float"}}]
              dbkey "?"
      • Step 13: toolshed.g2.bx.psu.edu/repos/iuc/compose_text_param/compose_text_param/0.1.1:

        • step_state: scheduled

        • Jobs
          • Job 1:

            • Job state is ok

            Command Line:

            • cd ../; python _evaluate_expression_.py

            Exit Code:

            • 0

            Traceback:

            Job Parameters:

            • Job parameter Parameter value
              __input_ext "input"
              __workflow_invocation_uuid__ "b775d7849dd811ef9b8aa1042ab2eb7f"
              chromInfo "/tmp/tmpc02os3x9/galaxy-dev/tool-data/shared/ucsc/chrom/?.len"
              components [{"__index__": 0, "param_type": {"__current_case__": 0, "component_value": "abs(c3)>", "select_param_type": "text"}}, {"__index__": 1, "param_type": {"__current_case__": 2, "component_value": "0.5", "select_param_type": "float"}}]
              dbkey "?"
      • Step 14: toolshed.g2.bx.psu.edu/repos/iuc/deg_annotate/deg_annotate/1.1.0:

        • step_state: scheduled

        • Jobs
          • Job 1:

            • Job state is ok

            Command Line:

            • python '/tmp/shed_dir/toolshed.g2.bx.psu.edu/repos/iuc/deg_annotate/e98d4ab5b5bc/deg_annotate/deg_annotate.py' -in '/tmp/tmpc02os3x9/files/3/0/f/dataset_30f7df7f-be6e-4bb4-b78a-93205e0cb00a.dat' -m 'degseq' -g '/tmp/tmpc02os3x9/files/a/9/2/dataset_a92fa7b6-7b45-4dd7-af04-a49a481ca4b5.dat' -t 'exon' -i 'gene_id' -x 'transcript_id' -a 'gene_biotype, gene_name' -o '/tmp/tmpc02os3x9/job_working_directory/000/13/outputs/dataset_69e1e648-a0a6-4036-851a-f398531dfe1c.dat'

            Exit Code:

            • 0

            Standard Output:

            • DE(X)Seq output file     : /tmp/tmpc02os3x9/files/3/0/f/dataset_30f7df7f-be6e-4bb4-b78a-93205e0cb00a.dat
              Input file type          : degseq
              Annotation file          : /tmp/tmpc02os3x9/files/a/9/2/dataset_a92fa7b6-7b45-4dd7-af04-a49a481ca4b5.dat
              Feature type             : exon
              ID attribute             : gene_id
              Transcript attribute     : transcript_id
              Attributes to include    : gene_biotype, gene_name
              Annotated output file    : /tmp/tmpc02os3x9/job_working_directory/000/13/outputs/dataset_69e1e648-a0a6-4036-851a-f398531dfe1c.dat
              

            Traceback:

            Job Parameters:

            • Job parameter Parameter value
              __input_ext "tabular"
              __workflow_invocation_uuid__ "b775d7849dd811ef9b8aa1042ab2eb7f"
              advanced_parameters {"gff_attributes": "gene_biotype, gene_name", "gff_feature_attribute": "gene_id", "gff_feature_type": "exon", "gff_transcript_attribute": "transcript_id"}
              chromInfo "/tmp/tmpc02os3x9/galaxy-dev/tool-data/shared/ucsc/chrom/?.len"
              dbkey "?"
              mode "degseq"
      • Step 15: toolshed.g2.bx.psu.edu/repos/bgruening/text_processing/tp_awk_tool/9.3+galaxy1:

        • step_state: scheduled

        • Jobs
          • Job 1:

            • Job state is ok

            Command Line:

            • env -i $(which awk) --sandbox -v FS='	' -v OFS='	' --re-interval -f '/tmp/tmpc02os3x9/job_working_directory/000/14/configs/tmp44k_8pq7' '/tmp/tmpc02os3x9/files/1/5/e/dataset_15ef0886-71bb-46a6-9b0f-03ef66a76a04.dat' > '/tmp/tmpc02os3x9/job_working_directory/000/14/outputs/dataset_af5f16f2-7127-400d-b896-b4fc6eab3734.dat'

            Exit Code:

            • 0

            Traceback:

            Job Parameters:

            • Job parameter Parameter value
              __input_ext "input"
              __workflow_invocation_uuid__ "b775d7849dd811ef9b8aa1042ab2eb7f"
              chromInfo "/tmp/tmpc02os3x9/galaxy-dev/tool-data/shared/ucsc/chrom/?.len"
              code "END{print NF}"
              dbkey "?"
      • Step 16: Annotate DESeq2 table:

        • step_state: scheduled

        • Jobs
          • Job 1:

            • Job state is ok

            Command Line:

            • cat '/tmp/tmpc02os3x9/files/3/5/6/dataset_3560d077-c3ec-44e5-8982-3db1f89e4e3a.dat' >> '/tmp/tmpc02os3x9/job_working_directory/000/15/outputs/dataset_b490d2d5-d71c-455b-a2c5-03714b10dc89.dat' && cat '/tmp/tmpc02os3x9/files/6/9/e/dataset_69e1e648-a0a6-4036-851a-f398531dfe1c.dat' >> '/tmp/tmpc02os3x9/job_working_directory/000/15/outputs/dataset_b490d2d5-d71c-455b-a2c5-03714b10dc89.dat' && exit 0

            Exit Code:

            • 0

            Traceback:

            Job Parameters:

            • Job parameter Parameter value
              __input_ext "input"
              __workflow_invocation_uuid__ "b775d7849dd811ef9b8aa1042ab2eb7f"
              chromInfo "/tmp/tmpc02os3x9/galaxy-dev/tool-data/shared/ucsc/chrom/?.len"
              dbkey "?"
              queries [{"__index__": 0, "inputs2": {"values": [{"id": 15, "src": "hda"}]}}]
      • Step 17: param_value_from_file:

        • step_state: scheduled

        • Jobs
          • Job 1:

            • Job state is ok

            Command Line:

            • cd ../; python _evaluate_expression_.py

            Exit Code:

            • 0

            Traceback:

            Job Parameters:

            • Job parameter Parameter value
              __input_ext "tabular"
              __workflow_invocation_uuid__ "b775d7849dd811ef9b8aa1042ab2eb7f"
              chromInfo "/tmp/tmpc02os3x9/galaxy-dev/tool-data/shared/ucsc/chrom/?.len"
              dbkey "?"
              param_type "text"
              remove_newlines true
      • Step 18: Filter with p-adj threshold:

        • step_state: scheduled

        • Jobs
          • Job 1:

            • Job state is ok

            Command Line:

            • python '/tmp/tmpc02os3x9/galaxy-dev/tools/stats/filtering.py' '/tmp/tmpc02os3x9/files/b/4/9/dataset_b490d2d5-d71c-455b-a2c5-03714b10dc89.dat' '/tmp/tmpc02os3x9/job_working_directory/000/18/outputs/dataset_5f6b5ed4-8325-4719-a167-30f82c016fe0.dat' '/tmp/tmpc02os3x9/job_working_directory/000/18/configs/tmpqr8pzuch' 13 "str,float,float,float,float,float,float,str,int,int,str,str,str" 1

            Exit Code:

            • 0

            Standard Output:

            • Filtering with c7<0.1, 
              kept 0.22% of 7128 valid lines (7128 total lines).
              Skipped 4356 invalid line(s) starting at line #2773: "YDL246C	0.164326158122698	0.0145924704577571	0.0770926594275621	0.189284823822539	0.849869589153857	NA	chrIV	8682	9756	-	protein_coding	SOR2"
              

            Traceback:

            Job Parameters:

            • Job parameter Parameter value
              __input_ext "tabular"
              __workflow_invocation_uuid__ "b775d7849dd811ef9b8aa1042ab2eb7f"
              chromInfo "/tmp/tmpc02os3x9/galaxy-dev/tool-data/shared/ucsc/chrom/?.len"
              cond "c7<0.1"
              dbkey "?"
              header_lines "1"
      • Step 19: Generate Valcanot plot of DE genes:

        • step_state: scheduled

        • Jobs
          • Job 1:

            • Job state is ok

            Command Line:

            • Rscript '/tmp/tmpc02os3x9/job_working_directory/000/17/configs/tmpi9_wx118'

            Exit Code:

            • 0

            Standard Error:

            • Warning message:
              In Sys.setlocale("LC_MESSAGES", "en_US.UTF-8") :
                OS reports request to set locale to "en_US.UTF-8" cannot be honored
              Warning message:
              Removed 1393 rows containing missing values (geom_point). 
              

            Standard Output:

            • null device 
                        1 
              R version 4.0.5 (2021-03-31)
              Platform: x86_64-conda-linux-gnu (64-bit)
              Running under: Debian GNU/Linux 10 (buster)
              
              Matrix products: default
              BLAS/LAPACK: /usr/local/lib/libopenblasp-r0.3.15.so
              
              locale:
               [1] LC_CTYPE=C.UTF-8       LC_NUMERIC=C           LC_TIME=C.UTF-8       
               [4] LC_COLLATE=C.UTF-8     LC_MONETARY=C.UTF-8    LC_MESSAGES=C.UTF-8   
               [7] LC_PAPER=C.UTF-8       LC_NAME=C              LC_ADDRESS=C          
              [10] LC_TELEPHONE=C         LC_MEASUREMENT=C.UTF-8 LC_IDENTIFICATION=C   
              
              attached base packages:
              [1] stats     graphics  grDevices utils     datasets  methods   base     
              
              other attached packages:
              [1] ggrepel_0.9.1 ggplot2_3.3.3 dplyr_1.0.6  
              
              loaded via a namespace (and not attached):
               [1] Rcpp_1.0.6       magrittr_2.0.1   tidyselect_1.1.1 munsell_0.5.0   
               [5] colorspace_2.0-1 R6_2.5.0         rlang_0.4.11     fansi_0.5.0     
               [9] grid_4.0.5       gtable_0.3.0     utf8_1.2.1       withr_2.4.2     
              [13] ellipsis_0.3.2   digest_0.6.27    tibble_3.1.2     lifecycle_1.0.0 
              [17] crayon_1.4.1     purrr_0.3.4      farver_2.1.0     vctrs_0.3.8     
              [21] glue_1.4.2       labeling_0.4.2   compiler_4.0.5   pillar_1.6.1    
              [25] generics_0.1.0   scales_1.1.1     pkgconfig_2.0.3 
              

            Traceback:

            Job Parameters:

            • Job parameter Parameter value
              __input_ext "input"
              __workflow_invocation_uuid__ "b775d7849dd811ef9b8aa1042ab2eb7f"
              chromInfo "/tmp/tmpc02os3x9/galaxy-dev/tool-data/shared/ucsc/chrom/?.len"
              dbkey "?"
              fdr_col "7"
              header "yes"
              label_col "13"
              labels {"__current_case__": 0, "label_select": "signif", "top_num": "10"}
              lfc_col "3"
              lfc_thresh "0.5"
              out_options {"rscript_out": false}
              plot_options {"boxes": false, "legend": null, "legend_labs": "Down,Not Sig,Up", "title": null, "xlab": null, "xmax": null, "xmin": null, "ylab": null, "ymax": null}
              pval_col "6"
              signif_thresh "0.1"
      • Step 20: toolshed.g2.bx.psu.edu/repos/iuc/compose_text_param/compose_text_param/0.1.1:

        • step_state: scheduled

        • Jobs
          • Job 1:

            • Job state is ok

            Command Line:

            • cd ../; python _evaluate_expression_.py

            Exit Code:

            • 0

            Traceback:

            Job Parameters:

            • Job parameter Parameter value
              __input_ext "input"
              __workflow_invocation_uuid__ "b775d7849dd811ef9b8aa1042ab2eb7f"
              chromInfo "/tmp/tmpc02os3x9/galaxy-dev/tool-data/shared/ucsc/chrom/?.len"
              components [{"__index__": 0, "param_type": {"__current_case__": 0, "component_value": "c1-c", "select_param_type": "text"}}, {"__index__": 1, "param_type": {"__current_case__": 0, "component_value": "5", "select_param_type": "text"}}]
              dbkey "?"
      • Step 3: Count files have header:

        • step_state: scheduled
      • Step 21: Filter with log2 FC threshold:

        • step_state: scheduled

        • Jobs
          • Job 1:

            • Job state is ok

            Command Line:

            • python '/tmp/tmpc02os3x9/galaxy-dev/tools/stats/filtering.py' '/tmp/tmpc02os3x9/files/5/f/6/dataset_5f6b5ed4-8325-4719-a167-30f82c016fe0.dat' '/tmp/tmpc02os3x9/job_working_directory/000/19/outputs/dataset_2af24a7e-3106-4cfa-b8e1-47de4ac0bddb.dat' '/tmp/tmpc02os3x9/job_working_directory/000/19/configs/tmp0pzfu53j' 13 "str,float,float,float,float,float,float,str,int,int,str,str,str" 1

            Exit Code:

            • 0

            Standard Output:

            • Filtering with abs(c3)>0.5, 
              kept 93.75% of 16 valid lines (16 total lines).
              

            Traceback:

            Job Parameters:

            • Job parameter Parameter value
              __input_ext "tabular"
              __workflow_invocation_uuid__ "b775d7849dd811ef9b8aa1042ab2eb7f"
              chromInfo "/tmp/tmpc02os3x9/galaxy-dev/tool-data/shared/ucsc/chrom/?.len"
              cond "abs(c3)>0.5"
              dbkey "?"
              header_lines "1"
      • Step 22: join1:

        • step_state: scheduled

        • Jobs
          • Job 1:

            • Job state is ok

            Command Line:

            • python '/tmp/tmpc02os3x9/galaxy-dev/tools/filters/join.py' '/tmp/tmpc02os3x9/files/1/5/e/dataset_15ef0886-71bb-46a6-9b0f-03ef66a76a04.dat' '/tmp/tmpc02os3x9/files/2/a/f/dataset_2af24a7e-3106-4cfa-b8e1-47de4ac0bddb.dat' 1 1 '/tmp/tmpc02os3x9/job_working_directory/000/20/outputs/dataset_ec6a006a-9ec9-417a-b3bf-61d3345ab4ae.dat'   --index_depth=3 --buffer=50000000 --fill_options_file=/tmp/tmpc02os3x9/job_working_directory/000/20/configs/tmpenzlmvbu -H

            Exit Code:

            • 0

            Traceback:

            Job Parameters:

            • Job parameter Parameter value
              __input_ext "tabular"
              __workflow_invocation_uuid__ "b775d7849dd811ef9b8aa1042ab2eb7f"
              chromInfo "/tmp/tmpc02os3x9/galaxy-dev/tool-data/shared/ucsc/chrom/?.len"
              dbkey "?"
              field1 "1"
              field2 "1"
              fill_empty_columns {"__current_case__": 0, "fill_empty_columns_switch": "no_fill"}
              header "-H"
              partial ""
              unmatched ""
      • Step 23: Cut1:

        • step_state: scheduled

        • Jobs
          • Job 1:

            • Job state is ok

            Command Line:

            • perl '/tmp/tmpc02os3x9/galaxy-dev/tools/filters/cutWrapper.pl' '/tmp/tmpc02os3x9/files/e/c/6/dataset_ec6a006a-9ec9-417a-b3bf-61d3345ab4ae.dat' 'c1-c5' T '/tmp/tmpc02os3x9/job_working_directory/000/22/outputs/dataset_3df26632-7d82-4a60-98f4-ea3d0d3ccdf0.dat'

            Exit Code:

            • 0

            Traceback:

            Job Parameters:

            • Job parameter Parameter value
              __input_ext "tabular"
              __workflow_invocation_uuid__ "b775d7849dd811ef9b8aa1042ab2eb7f"
              chromInfo "/tmp/tmpc02os3x9/galaxy-dev/tool-data/shared/ucsc/chrom/?.len"
              columnList "c1-c5"
              dbkey "?"
              delimiter "T"
      • Step 24: Generate Heatmap of counts:

        • step_state: scheduled

        • Jobs
          • Job 1:

            • Job state is ok

            Command Line:

            • cat '/tmp/tmpc02os3x9/job_working_directory/000/23/configs/tmpnz5526j0' && Rscript '/tmp/tmpc02os3x9/job_working_directory/000/23/configs/tmpnz5526j0'

            Exit Code:

            • 0

            Standard Error:

            • Warning message:
              In Sys.setlocale("LC_MESSAGES", "en_US.UTF-8") :
                OS reports request to set locale to "en_US.UTF-8" cannot be honored
              
              Attaching package: ‘gplots’
              
              The following object is masked from ‘package:stats’:
              
                  lowess
              
              

            Standard Output:

            • options(show.error.messages=F, error=function(){cat(geterrmessage(), file=stderr()); q("no",1,F)})
              
              loc <- Sys.setlocale("LC_MESSAGES", "en_US.UTF-8")
              
              library("RColorBrewer")
              library("gplots")
              
              input <- read.delim('/tmp/tmpc02os3x9/files/3/d/f/dataset_3df26632-7d82-4a60-98f4-ea3d0d3ccdf0.dat', sep='\t', header=TRUE)
              
              mat_input <- data.matrix(input[,2:ncol(input)])
              rownames(mat_input) <- input[,1]
              
                  linput <- log2(mat_input+1)
              
                  scale <- "none"
              
              srtCol <- 30
                  rlabs <- FALSE
                  clabs <- NULL
                  label_margins <- c(8,1)
              
                  dendrogramtoplot <- "both"
                      reorder_cols <- TRUE
                      reorder_rows <- TRUE
                      layout_matrix <- rbind(c(4,3), c(2,1))
                      key_margins <- list(mar=c(4,0.5,2,1))
                      lheight <- c(1, 5)
                      lwidth <- c(1,3)
                  hclust_fun <- function(x) hclust(x, method='complete')
                      dist_fun <- function(x) dist(x, method='euclidean')
              
              ncolors <- 50
                  colused <- colorRampPalette(c("#ffffff", "#ff0000"))(ncolors)
              
                  pdf(file='/tmp/tmpc02os3x9/job_working_directory/000/23/outputs/dataset_82c305c1-113d-49fb-8f67-08cebf660d0a.dat')
              
              heatmap.2(linput, dendrogram=dendrogramtoplot, Colv=reorder_cols, Rowv=reorder_rows,
                  distfun=dist_fun, hclustfun=hclust_fun, scale = scale, labRow = rlabs, labCol = clabs,
                  col=colused, trace="none", density.info = "none", margins=label_margins,
                  main = '', cexCol=0.8, cexRow=0.8, srtCol=srtCol,
                  keysize=3, key.xlab='', key.title='', key.par=key_margins,
                  lmat=layout_matrix, lhei=lheight, lwid=lwidth)
              
              dev.off()
                      null device 
                        1 
              

            Traceback:

            Job Parameters:

            • Job parameter Parameter value
              __input_ext "input"
              __workflow_invocation_uuid__ "b775d7849dd811ef9b8aa1042ab2eb7f"
              chromInfo "/tmp/tmpc02os3x9/galaxy-dev/tool-data/shared/ucsc/chrom/?.len"
              cluster_cond {"__current_case__": 0, "cluster": "yes", "cluster_cols_rows": "both", "clustering": "complete", "distance": "euclidean"}
              colorchoice {"__current_case__": 1, "color1": "#ffffff", "color2": "#ff0000", "type": "two"}
              dbkey "?"
              image_file_format "pdf"
              key ""
              labels "columns"
              title ""
              transform "log2plus1"
              zscore_cond {"__current_case__": 0, "scale": "none", "zscore": "none"}
      • Step 25: Generate Heatmap of Z-scores:

        • step_state: scheduled

        • Jobs
          • Job 1:

            • Job state is ok

            Command Line:

            • cat '/tmp/tmpc02os3x9/job_working_directory/000/24/configs/tmpy67zuyx5' && Rscript '/tmp/tmpc02os3x9/job_working_directory/000/24/configs/tmpy67zuyx5'

            Exit Code:

            • 0

            Standard Error:

            • Warning message:
              In Sys.setlocale("LC_MESSAGES", "en_US.UTF-8") :
                OS reports request to set locale to "en_US.UTF-8" cannot be honored
              
              Attaching package: ‘gplots’
              
              The following object is masked from ‘package:stats’:
              
                  lowess
              
              

            Standard Output:

            • options(show.error.messages=F, error=function(){cat(geterrmessage(), file=stderr()); q("no",1,F)})
              
              loc <- Sys.setlocale("LC_MESSAGES", "en_US.UTF-8")
              
              library("RColorBrewer")
              library("gplots")
              
              input <- read.delim('/tmp/tmpc02os3x9/files/3/d/f/dataset_3df26632-7d82-4a60-98f4-ea3d0d3ccdf0.dat', sep='\t', header=TRUE)
              
              mat_input <- data.matrix(input[,2:ncol(input)])
              rownames(mat_input) <- input[,1]
              
                  linput <- mat_input
              
                  linput <- t(apply(linput, 1, scale))
                  colnames(linput) <- colnames(input)[2:ncol(input)]
                  rownames(linput) <- input[,1]
                  scale <- "none"
              
              srtCol <- 30
                  rlabs <- FALSE
                  clabs <- NULL
                  label_margins <- c(8,1)
              
                  dendrogramtoplot <- "both"
                      reorder_cols <- TRUE
                      reorder_rows <- TRUE
                      layout_matrix <- rbind(c(4,3), c(2,1))
                      key_margins <- list(mar=c(4,0.5,2,1))
                      lheight <- c(1, 5)
                      lwidth <- c(1,3)
                  hclust_fun <- function(x) hclust(x, method='complete')
                      dist_fun <- function(x) dist(x, method='euclidean')
              
              ncolors <- 50
                  colused <- colorRampPalette(c("#0000ff", "#ffffff", "#ff0000"))(ncolors)
              
                  pdf(file='/tmp/tmpc02os3x9/job_working_directory/000/24/outputs/dataset_627adaec-f2aa-43e8-9bf5-60372550aa96.dat')
              
              heatmap.2(linput, dendrogram=dendrogramtoplot, Colv=reorder_cols, Rowv=reorder_rows,
                  distfun=dist_fun, hclustfun=hclust_fun, scale = scale, labRow = rlabs, labCol = clabs,
                  col=colused, trace="none", density.info = "none", margins=label_margins,
                  main = '', cexCol=0.8, cexRow=0.8, srtCol=srtCol,
                  keysize=3, key.xlab='', key.title='', key.par=key_margins,
                  lmat=layout_matrix, lhei=lheight, lwid=lwidth)
              
              dev.off()
                      null device 
                        1 
              

            Traceback:

            Job Parameters:

            • Job parameter Parameter value
              __input_ext "input"
              __workflow_invocation_uuid__ "b775d7849dd811ef9b8aa1042ab2eb7f"
              chromInfo "/tmp/tmpc02os3x9/galaxy-dev/tool-data/shared/ucsc/chrom/?.len"
              cluster_cond {"__current_case__": 0, "cluster": "yes", "cluster_cols_rows": "both", "clustering": "complete", "distance": "euclidean"}
              colorchoice {"__current_case__": 2, "color1": "#0000ff", "color2": "#ffffff", "color3": "#ff0000", "type": "three"}
              dbkey "?"
              image_file_format "pdf"
              key ""
              labels "columns"
              title ""
              transform "none"
              zscore_cond {"__current_case__": 1, "zscore": "rows"}
      • Step 4: Gene Annotaton:

        • step_state: scheduled
      • Step 5: Adjusted p-value threshold:

        • step_state: scheduled
      • Step 6: toolshed.g2.bx.psu.edu/repos/bgruening/text_processing/tp_text_file_with_recurring_lines/9.3+galaxy1:

        • step_state: scheduled

        • Jobs
          • Job 1:

            • Job state is ok

            Command Line:

            • times=1; yes -- 'GeneID__tc__Base mean__tc__log2(FC)__tc__StdErr__tc__Wald-Stats__tc__P-value__tc__P-adj__tc__Chromosome__tc__Start__tc__End__tc__Strand__tc__Feature__tc__Gene name' 2>/dev/null | head -n $times >> '/tmp/tmpc02os3x9/job_working_directory/000/6/outputs/dataset_df4e9fab-592e-4287-a630-be6cab5f172c.dat';

            Exit Code:

            • 0

            Traceback:

            Job Parameters:

            • Job parameter Parameter value
              __input_ext "input"
              __workflow_invocation_uuid__ "b775d7849dd811ef9b8aa1042ab2eb7f"
              chromInfo "/tmp/tmpc02os3x9/galaxy-dev/tool-data/shared/ucsc/chrom/?.len"
              dbkey "?"
              token_set [{"__index__": 0, "line": "GeneID\tBase mean\tlog2(FC)\tStdErr\tWald-Stats\tP-value\tP-adj\tChromosome\tStart\tEnd\tStrand\tFeature\tGene name", "repeat_select": {"__current_case__": 0, "repeat_select_opts": "user", "times": "1"}}]
      • Step 7: log2 fold change threshold:

        • step_state: scheduled
      • Step 8: toolshed.g2.bx.psu.edu/repos/iuc/pick_value/pick_value/0.2.0:

        • step_state: scheduled

        • Jobs
          • Job 1:

            • Job state is ok

            Command Line:

            • cd ../; python _evaluate_expression_.py

            Exit Code:

            • 0

            Traceback:

            Job Parameters:

            • Job parameter Parameter value
              __input_ext "input"
              __workflow_invocation_uuid__ "b775d7849dd811ef9b8aa1042ab2eb7f"
              chromInfo "/tmp/tmpc02os3x9/galaxy-dev/tool-data/shared/ucsc/chrom/?.len"
              dbkey "?"
              style_cond {"__current_case__": 1, "pick_style": "first_or_default", "type_cond": {"__current_case__": 2, "default_value": "0.05", "param_type": "float", "pick_from": [{"__index__": 0, "value": "0.1"}]}}
      • Step 9: toolshed.g2.bx.psu.edu/repos/bgruening/text_processing/tp_sed_tool/9.3+galaxy1:

        • step_state: scheduled

        • Jobs
          • Job 1:

            • Job state is ok

            Command Line:

            • sed --sandbox -r -f '/tmp/tmpc02os3x9/job_working_directory/000/8/configs/tmpi_5_ry3g' '/tmp/tmpc02os3x9/files/d/f/4/dataset_df4e9fab-592e-4287-a630-be6cab5f172c.dat' > '/tmp/tmpc02os3x9/job_working_directory/000/8/outputs/dataset_3560d077-c3ec-44e5-8982-3db1f89e4e3a.dat'

            Exit Code:

            • 0

            Traceback:

            Job Parameters:

            • Job parameter Parameter value
              __input_ext "input"
              __workflow_invocation_uuid__ "b775d7849dd811ef9b8aa1042ab2eb7f"
              adv_opts {"__current_case__": 0, "adv_opts_selector": "basic"}
              chromInfo "/tmp/tmpc02os3x9/galaxy-dev/tool-data/shared/ucsc/chrom/?.len"
              code "s/__tc__/\\t/g"
              dbkey "?"
      • Step 10: toolshed.g2.bx.psu.edu/repos/iuc/pick_value/pick_value/0.2.0:

        • step_state: scheduled

        • Jobs
          • Job 1:

            • Job state is ok

            Command Line:

            • cd ../; python _evaluate_expression_.py

            Exit Code:

            • 0

            Traceback:

            Job Parameters:

            • Job parameter Parameter value
              __input_ext "input"
              __workflow_invocation_uuid__ "b775d7849dd811ef9b8aa1042ab2eb7f"
              chromInfo "/tmp/tmpc02os3x9/galaxy-dev/tool-data/shared/ucsc/chrom/?.len"
              dbkey "?"
              style_cond {"__current_case__": 1, "pick_style": "first_or_default", "type_cond": {"__current_case__": 2, "default_value": "1.0", "param_type": "float", "pick_from": [{"__index__": 0, "value": "0.5"}]}}
    • Other invocation details
      • history_id

        • 2bc6b4f69d71020c
      • history_state

        • ok
      • invocation_id

        • 2bc6b4f69d71020c
      • invocation_state

        • scheduled
      • workflow_id

        • 2bc6b4f69d71020c
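
For anyone comparing the two heatmap steps in the report above: Step 24 plots `log2(count + 1)` with no scaling, while Step 25 z-scores each row through `t(apply(linput, 1, scale))`. Below is a minimal Python sketch of those two transforms, assuming R's `scale()` defaults (centre on the row mean, divide by the sample standard deviation). The function names and the example counts are hypothetical and are not taken from the workflow or its test data.

```python
import math
import statistics


def log2_plus1(rows):
    """Step 24 transform: log2(count + 1), applied element-wise."""
    return [[math.log2(v + 1) for v in row] for row in rows]


def row_zscores(rows):
    """Step 25 transform: centre each row on its mean and divide by its
    sample standard deviation (n - 1 denominator), as R's scale() does by default."""
    scaled = []
    for row in rows:
        mean = statistics.mean(row)
        sd = statistics.stdev(row)  # sample SD; needs at least two values per row
        # R yields NaN for a constant row (0 / 0); mirror that here.
        scaled.append([(v - mean) / sd if sd > 0 else float("nan") for v in row])
    return scaled


# Hypothetical normalized counts: two genes (rows) across four samples (columns).
counts = [
    [0.0, 1.0, 3.0, 7.0],
    [10.0, 20.0, 30.0, 40.0],
]

print(log2_plus1(counts)[0])
print(row_zscores(counts)[1])
```

After z-scoring, every non-constant row has mean 0 and sample standard deviation 1, which is presumably why Step 25 uses a diverging blue-white-red palette while Step 24 uses a sequential white-red one.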

Contributor

@lldelisle lldelisle left a comment

Looks good to me. Thank you so much.
@mvdbeek do you want to review?

@lldelisle lldelisle merged commit a949697 into galaxyproject:main Nov 12, 2024
6 checks passed