first release of RNAseq DE analysis, filtering and plotting workflow #582

pavanvidem · 2024-10-25T09:03:21Z

This workflow is a downstream analysis for the existing rnaseq-pe and rnaseq-sr workflows. It takes 2 collections, performs differential expression analysis, annotates the results table, filters it, and generates a volcano plot, a z-score heatmap and a heatmap of normalized counts.

workflows/transcriptomics/rnaseq-de/README.md

lldelisle · 2024-10-25T10:14:17Z

👍 This looks good.

lldelisle · 2024-10-25T10:15:07Z

Maybe you should state in the directory or in the workflow that this is only for 2 conditions.

pavanvidem · 2024-10-25T10:16:58Z

> Maybe you should state in the directory or in the workflow that this is only for 2 conditions.

Sure! I will write it in the workflow description.

lldelisle · 2024-10-25T10:22:43Z

One of my colleagues proposed that the input could be a single collection plus a replicate assignment like 1,1,1,2,2,2 and a batch assignment like 1,2,3,1,2,3. I don't know whether this makes sense, or whether a samples plan like the one @wm75 proposed in #581 (with fewer columns) would be better.
Maybe we could just add 3 general workflows:

- Split a collection in 2 by samples plan: one column is the identifier, the other column is the group (the new collection name).
- Split a collection in 2 by pattern: one collection contains the identifiers that contain the pattern, the other contains the identifiers that do not.
- Split a collection in 2 by comma-separated values: the user input would be 1,1,1,2,2,2, but this one seems more error-prone to me.
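
To make the three proposals concrete, here is a rough Python sketch of the splitting logic only. It is not a Galaxy workflow or tool; the function names and the example identifiers are hypothetical and chosen purely for illustration.

```python
# Hypothetical helpers illustrating the three splitting strategies.
# None of these names come from Galaxy or from an existing workflow.

def split_by_samples_plan(identifiers, plan):
    """Group identifiers by the group name given in a samples plan.

    `plan` maps identifier -> group (the new collection name).
    """
    groups = {}
    for ident in identifiers:
        groups.setdefault(plan[ident], []).append(ident)
    return groups


def split_by_pattern(identifiers, pattern):
    """Return (identifiers containing the pattern, all the others)."""
    matching = [i for i in identifiers if pattern in i]
    rest = [i for i in identifiers if pattern not in i]
    return matching, rest


def split_by_assignment(identifiers, assignment):
    """Group identifiers by a comma-separated assignment such as '1,1,2,2'."""
    labels = [label.strip() for label in assignment.split(",")]
    if len(labels) != len(identifiers):
        # This is the error-prone part: the list must match the collection order.
        raise ValueError("assignment length does not match number of samples")
    groups = {}
    for ident, label in zip(identifiers, labels):
        groups.setdefault(label, []).append(ident)
    return groups
```

The comma-separated variant depends on the order of the elements in the collection, which is one reason it is easier to get wrong than the samples plan or the pattern.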

pavanvidem · 2024-10-25T13:21:34Z

Replicate assignments are fine because we know there are exactly 2 conditions. But I don't think it is possible to include batches unless we assume there is a fixed number of batches. Where would we use this batch information? In a multi-factor analysis?

lldelisle · 2024-10-25T13:43:49Z

I think we would need to rewrite the wrapper to allow setting the batches dynamically in workflows, so this is just for future development.

wm75 · 2024-10-25T13:46:29Z

A simple two-condition comparison might be good enough for this version.

wm75 · 2024-10-25T13:48:10Z

Can you arrange the inputs in a more meaningful way, @pavanvidem?
(The order depends on the Euclidean distance from the top-left corner 🙃)

wm75 · 2024-10-25T13:49:07Z

Also 0.0 doesn't seem like a good default for either of the thresholds.

lldelisle · 2024-10-25T13:49:10Z

I think this version is fine. I am just saying that, to make this more usable, we should think about developing workflows to split the count collection into 2 (in another PR).

pavanvidem · 2024-10-25T13:54:20Z

> Also 0.0 doesn't seem like a good default for either of the thresholds.

These are non-optional parameters of tools. I can only set a default if I make them optional and I have to use the dirty trick of connecting the steps and then making them optional with defaults :)

wm75 · 2024-10-25T13:55:59Z

> > Also 0.0 doesn't seem like a good default for either of the thresholds.
>
> These are non-optional parameters of tools. I can only set a default if I make them optional and I have to use the dirty trick of connecting the steps and then making them optional with defaults :)

The "better" trick is to use pick value after the optional param and if it's not set, pick the default value again :-)
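
As an illustration of the idea, here is a minimal Python sketch of "first value or default" semantics. It assumes the Pick value tool behaves like "return the first value that is set, otherwise the default"; the function name is hypothetical and this is not the tool's actual code.

```python
def first_or_default(values, default):
    """Return the first value that is set (not None), else the default.

    The explicit `is not None` check matters here: a threshold of 0.0 is
    falsy in Python but is still a value the user deliberately set.
    """
    for value in values:
        if value is not None:
            return value
    return default
```

With an optional workflow input, the user-supplied threshold is used when present and the default applies when the input is left empty.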

lldelisle · 2024-10-25T13:57:53Z

For me, 'Main factor counts' and 'Base factor counts' are not really meaningful.
For me a factor is for example 'Treatment' and then you have 2 levels: 'treated' and 'untreated'. One of them is the reference and I am not sure there is a consensus way to call the other level.
I propose: 'Counts in changed condition', 'Counts in reference condition'.

lldelisle · 2024-10-25T13:59:22Z

> > > Also 0.0 doesn't seem like a good default for either of the thresholds.
> >
> > These are non-optional parameters of tools. I can only set a default if I make them optional and I have to use the dirty trick of connecting the steps and then making them optional with defaults :)
>
> The "better" trick is to use pick value after the optional param and if it's not set, pick the default value again :-)

Why don't we simply make it optional with a default value?

pavanvidem · 2024-10-25T13:59:36Z

> For me, 'Main factor counts' and 'Base factor counts' are not really meaningful. For me a factor is for example 'Treatment' and then you have 2 levels: 'treated' and 'untreated'. One of them is the reference and I am not sure there is a consensus way to call the other level. I propose: 'Counts in changed condition', 'Counts in reference condition'.

Sounds better! I will change it.

wm75 · 2024-10-25T14:00:37Z

> For me, 'Main factor counts' and 'Base factor counts' are not really meaningful. For me a factor is for example 'Treatment' and then you have 2 levels: 'treated' and 'untreated'. One of them is the reference and I am not sure there is a consensus way to call the other level. I propose: 'Counts in changed condition', 'Counts in reference condition'.

Yes, the factor and factor levels should have better names. Alternatively, you could also use input params there.

pavanvidem · 2024-10-25T14:01:38Z

> > > > Also 0.0 doesn't seem like a good default for either of the thresholds.
> > >
> > > These are non-optional parameters of tools. I can only set a default if I make them optional and I have to use the dirty trick of connecting the steps and then making them optional with defaults :)
> >
> > The "better" trick is to use pick value after the optional param and if it's not set, pick the default value again :-)
>
> Why don't we simply make it optional with a default value?

If a tool has this param as non-optional and we set the input to optional, we cannot connect the value to the tool.

wm75 · 2024-10-25T14:02:27Z

> > > > Also 0.0 doesn't seem like a good default for either of the thresholds.
> > >
> > > These are non-optional parameters of tools. I can only set a default if I make them optional and I have to use the dirty trick of connecting the steps and then making them optional with defaults :)
> >
> > The "better" trick is to use pick value after the optional param and if it's not set, pick the default value again :-)
>
> Why don't we simply make it optional with a default value?

https://matrix.to/#/%23galaxyproject_iwc%3Agitter.im/%24-cW9o37aiBvL760m03fDaXVkbprQjQ8-Qj9Qs7KSlb0?via=matrix.org&via=gitter.im :-)

lldelisle · 2024-11-08T11:05:23Z

For me this is really close to being merged.

I think it would be good to use count data available on Zenodo to avoid taking up space on GitHub.
I think there are too many visible datasets when you run the workflow.

I can make you a PR if you want.

pavanvidem · 2024-11-08T11:11:50Z

The count files are not so big, but no problem, I can move them to Zenodo.
I will hide everything except what is needed.

I am also building another 2 workflows (one paired and one single) in the same directory, which will combine the quantification workflow with this one so that we will have complete DESeq2 workflows. But this PR can be merged before I add them.

workflows/transcriptomics/rnaseq-de/rnaseq-de-filtering-plotting.ga

lldelisle · 2024-11-08T13:49:45Z

Would you mind removing the test-data?

pavanvidem · 2024-11-08T13:50:47Z

> Would you mind removing the test-data?

Sure, I forgot :)

github-actions · 2024-11-08T13:56:44Z

Test Results (powered by Planemo)

Test Summary

Test State	Count
Total	1
Passed	1
Error	0
Failure	0
Skipped	0

Passed Tests

✅ rnaseq-de-filtering-plotting.ga_0

Workflow invocation details

Invocation Messages

Steps

Step 1: Counts from changed condition:
- step_state: scheduled
Step 2: Counts from reference condition:
- step_state: scheduled

Step 11: Differential Analysis:

step_state: scheduled

Jobs

Job 1:

Job state is ok

Command Line:

cat '/tmp/shed_dir/toolshed.g2.bx.psu.edu/repos/iuc/deseq2/8fe98f7094de/deseq2/get_deseq_dataset.R' > /dev/null &&  Rscript '/tmp/shed_dir/toolshed.g2.bx.psu.edu/repos/iuc/deseq2/8fe98f7094de/deseq2/deseq2.R' --cores ${GALAXY_SLOTS:-1} -o '/tmp/tmpc02os3x9/job_working_directory/000/10/outputs/dataset_30f7df7f-be6e-4bb4-b78a-93205e0cb00a.dat' -p '/tmp/tmpc02os3x9/job_working_directory/000/10/outputs/dataset_ce0b6f1f-d7b0-4c4a-b429-82d724f2ff8d.dat' -A 0.1 -n '/tmp/tmpc02os3x9/job_working_directory/000/10/outputs/dataset_15ef0886-71bb-46a6-9b0f-03ef66a76a04.dat'              -H  -f '[["DEFactor", [{"BaseFactor": ["/tmp/tmpc02os3x9/files/6/b/5/dataset_6b5217f1-7077-4695-9d9b-95ead4abbbf4.dat", "/tmp/tmpc02os3x9/files/c/1/e/dataset_c1e6ad0e-4713-4e07-ab4f-9a034504840d.dat"]}, {"MainFactor": ["/tmp/tmpc02os3x9/files/0/e/4/dataset_0e49aaed-3dd7-4c04-9fe6-29fff8b1811d.dat", "/tmp/tmpc02os3x9/files/8/d/7/dataset_8d7b03b2-ff64-46f4-a5b4-e706fd640037.dat"]}]]]' -l '{"dataset_0e49aaed-3dd7-4c04-9fe6-29fff8b1811d.dat": "SRR5085169 Counts Table", "dataset_8d7b03b2-ff64-46f4-a5b4-e706fd640037.dat": "SRR5085170 Counts Table", "dataset_6b5217f1-7077-4695-9d9b-95ead4abbbf4.dat": "SRR5085167 Counts Table", "dataset_c1e6ad0e-4713-4e07-ab4f-9a034504840d.dat": "SRR5085168 Counts Table"}' -t 1

Exit Code:

```
0
```

Standard Error:

Warning message:
In Sys.setlocale("LC_MESSAGES", "en_US.UTF-8") :
  OS reports request to set locale to "en_US.UTF-8" cannot be honored
estimating size factors
estimating dispersions
gene-wise dispersion estimates
mean-dispersion relationship
final dispersion estimates
fitting model and testing

Standard Output:

primary factor: DEFactor 

---------------------
DESeq2 run information

sample table:
                          DEFactor
SRR5085167 Counts Table BaseFactor
SRR5085168 Counts Table BaseFactor
SRR5085169 Counts Table MainFactor
SRR5085170 Counts Table MainFactor

design formula:
~DEFactor


4 samples with counts over 7127 genes
using disperion fit type: parametric 
creating plots
summary of results
DEFactor: MainFactor vs BaseFactor

out of 5734 with nonzero total read count
adjusted p-value < 0.1
LFC > 0 (up)       : 2, 0.035%
LFC < 0 (down)     : 13, 0.23%
outliers [1]       : 0, 0%
low counts [2]     : 2963, 52%
(mean count < 7)
[1] see 'cooksCutoff' argument of ?results
[2] see 'independentFiltering' argument of ?results

NULL
closing plot device
null device 
          1 
Session information:

R version 4.1.1 (2021-08-10)
Platform: x86_64-conda-linux-gnu (64-bit)
Running under: Debian GNU/Linux 10 (buster)

Matrix products: default
BLAS/LAPACK: /usr/local/lib/libopenblasp-r0.3.18.so

locale:
 [1] LC_CTYPE=C.UTF-8       LC_NUMERIC=C           LC_TIME=C.UTF-8       
 [4] LC_COLLATE=C.UTF-8     LC_MONETARY=C.UTF-8    LC_MESSAGES=C.UTF-8   
 [7] LC_PAPER=C.UTF-8       LC_NAME=C              LC_ADDRESS=C          
[10] LC_TELEPHONE=C         LC_MEASUREMENT=C.UTF-8 LC_IDENTIFICATION=C   

attached base packages:
[1] stats4    tools     stats     graphics  grDevices utils     datasets 
[8] methods   base     

other attached packages:
 [1] pheatmap_1.0.12             ggrepel_0.9.1              
 [3] ggplot2_3.3.5               rjson_0.2.20               
 [5] gplots_3.1.1                RColorBrewer_1.1-2         
 [7] DESeq2_1.34.0               SummarizedExperiment_1.24.0
 [9] Biobase_2.54.0              MatrixGenerics_1.6.0       
[11] matrixStats_0.61.0          GenomicRanges_1.46.0       
[13] GenomeInfoDb_1.30.0         IRanges_2.28.0             
[15] S4Vectors_0.32.0            BiocGenerics_0.40.0        
[17] getopt_1.20.3              

loaded via a namespace (and not attached):
 [1] httr_1.4.2             bit64_4.0.5            splines_4.1.1         
 [4] gtools_3.9.2           assertthat_0.2.1       blob_1.2.2            
 [7] GenomeInfoDbData_1.2.7 pillar_1.6.4           RSQLite_2.2.8         
[10] lattice_0.20-45        glue_1.5.1             digest_0.6.29         
[13] XVector_0.34.0         colorspace_2.0-2       Matrix_1.3-4          
[16] XML_3.99-0.8           pkgconfig_2.0.3        genefilter_1.76.0     
[19] zlibbioc_1.40.0        purrr_0.3.4            xtable_1.8-4          
[22] scales_1.1.1           BiocParallel_1.28.0    tibble_3.1.6          
[25] annotate_1.72.0        KEGGREST_1.34.0        generics_0.1.1        
[28] farver_2.1.0           ellipsis_0.3.2         cachem_1.0.6          
[31] withr_2.4.3            survival_3.2-13        magrittr_2.0.1        
[34] crayon_1.4.2           memoise_2.0.1          fansi_0.4.2           
[37] lifecycle_1.0.1        munsell_0.5.0          locfit_1.5-9.4        
[40] DelayedArray_0.20.0    AnnotationDbi_1.56.1   Biostrings_2.62.0     
[43] compiler_4.1.1         caTools_1.18.2         rlang_0.4.12          
[46] grid_4.1.1             RCurl_1.98-1.5         bitops_1.0-7          
[49] labeling_0.4.2         gtable_0.3.0           DBI_1.1.1             
[52] R6_2.5.1               dplyr_1.0.7            fastmap_1.1.0         
[55] bit_4.0.4              utf8_1.2.2             KernSmooth_2.23-20    
[58] parallel_4.1.1         Rcpp_1.0.7             vctrs_0.3.8           
[61] geneplotter_1.72.0     png_0.1-7              tidyselect_1.1.1

Traceback:

Job Parameters:

Job parameter	Parameter value
__input_ext	`"tabular"`
__workflow_invocation_uuid__	`"b775d7849dd811ef9b8aa1042ab2eb7f"`
advanced_options	`{"auto_mean_filter_off": false, "esf": "", "fit_type": "1", "outlier_filter_off": false, "outlier_replace_off": false, "prefilter_conditional": {"__current_case__": 1, "prefilter": ""}}`
batch_factors	`None`
chromInfo	`"/tmp/tmpc02os3x9/galaxy-dev/tool-data/shared/ucsc/chrom/?.len"`
dbkey	`"?"`
header	`true`
output_options	`{"alpha_ma": "0.1", "output_selector": ["pdf", "normCounts"]}`
select_data	`{"__current_case__": 1, "how": "datasets_per_level", "rep_factorName": [{"__index__": 0, "factorName": "DEFactor", "rep_factorLevel": [{"__index__": 0, "countsFile": {"values": [{"id": 1, "src": "hdca"}]}, "factorLevel": "MainFactor"}, {"__index__": 1, "countsFile": {"values": [{"id": 2, "src": "hdca"}]}, "factorLevel": "BaseFactor"}]}]}`
tximport	`{"__current_case__": 1, "tximport_selector": "count"}`

Step 12: toolshed.g2.bx.psu.edu/repos/iuc/compose_text_param/compose_text_param/0.1.1:

step_state: scheduled

Jobs

Job 1:

Job state is ok

Command Line:

```
cd ../; python _evaluate_expression_.py
```

Exit Code:

```
0
```

Traceback:

Job Parameters:

Job parameter	Parameter value
__input_ext	`"input"`
__workflow_invocation_uuid__	`"b775d7849dd811ef9b8aa1042ab2eb7f"`
chromInfo	`"/tmp/tmpc02os3x9/galaxy-dev/tool-data/shared/ucsc/chrom/?.len"`
components	`[{"__index__": 0, "param_type": {"__current_case__": 0, "component_value": "c7<", "select_param_type": "text"}}, {"__index__": 1, "param_type": {"__current_case__": 2, "component_value": "0.1", "select_param_type": "float"}}]`
dbkey	`"?"`

Step 13: toolshed.g2.bx.psu.edu/repos/iuc/compose_text_param/compose_text_param/0.1.1:

step_state: scheduled

Jobs

Job 1:

Job state is ok

Command Line:

```
cd ../; python _evaluate_expression_.py
```

Exit Code:

```
0
```

Traceback:

Job Parameters:

Job parameter	Parameter value
__input_ext	`"input"`
__workflow_invocation_uuid__	`"b775d7849dd811ef9b8aa1042ab2eb7f"`
chromInfo	`"/tmp/tmpc02os3x9/galaxy-dev/tool-data/shared/ucsc/chrom/?.len"`
components	`[{"__index__": 0, "param_type": {"__current_case__": 0, "component_value": "abs(c3)>", "select_param_type": "text"}}, {"__index__": 1, "param_type": {"__current_case__": 2, "component_value": "0.5", "select_param_type": "float"}}]`
dbkey	`"?"`

Step 14: toolshed.g2.bx.psu.edu/repos/iuc/deg_annotate/deg_annotate/1.1.0:

step_state: scheduled

Jobs

Job 1:

Job state is ok

Command Line:

python '/tmp/shed_dir/toolshed.g2.bx.psu.edu/repos/iuc/deg_annotate/e98d4ab5b5bc/deg_annotate/deg_annotate.py' -in '/tmp/tmpc02os3x9/files/3/0/f/dataset_30f7df7f-be6e-4bb4-b78a-93205e0cb00a.dat' -m 'degseq' -g '/tmp/tmpc02os3x9/files/a/9/2/dataset_a92fa7b6-7b45-4dd7-af04-a49a481ca4b5.dat' -t 'exon' -i 'gene_id' -x 'transcript_id' -a 'gene_biotype, gene_name' -o '/tmp/tmpc02os3x9/job_working_directory/000/13/outputs/dataset_69e1e648-a0a6-4036-851a-f398531dfe1c.dat'

Exit Code:

```
0
```

Standard Output:

DE(X)Seq output file     : /tmp/tmpc02os3x9/files/3/0/f/dataset_30f7df7f-be6e-4bb4-b78a-93205e0cb00a.dat
Input file type          : degseq
Annotation file          : /tmp/tmpc02os3x9/files/a/9/2/dataset_a92fa7b6-7b45-4dd7-af04-a49a481ca4b5.dat
Feature type             : exon
ID attribute             : gene_id
Transcript attribute     : transcript_id
Attributes to include    : gene_biotype, gene_name
Annotated output file    : /tmp/tmpc02os3x9/job_working_directory/000/13/outputs/dataset_69e1e648-a0a6-4036-851a-f398531dfe1c.dat

Traceback:

Job Parameters:

Job parameter	Parameter value
__input_ext	`"tabular"`
__workflow_invocation_uuid__	`"b775d7849dd811ef9b8aa1042ab2eb7f"`
advanced_parameters	`{"gff_attributes": "gene_biotype, gene_name", "gff_feature_attribute": "gene_id", "gff_feature_type": "exon", "gff_transcript_attribute": "transcript_id"}`
chromInfo	`"/tmp/tmpc02os3x9/galaxy-dev/tool-data/shared/ucsc/chrom/?.len"`
dbkey	`"?"`
mode	`"degseq"`

Step 15: toolshed.g2.bx.psu.edu/repos/bgruening/text_processing/tp_awk_tool/9.3+galaxy1:

step_state: scheduled

Jobs

Job 1:

Job state is ok

Command Line:

env -i $(which awk) --sandbox -v FS='	' -v OFS='	' --re-interval -f '/tmp/tmpc02os3x9/job_working_directory/000/14/configs/tmp44k_8pq7' '/tmp/tmpc02os3x9/files/1/5/e/dataset_15ef0886-71bb-46a6-9b0f-03ef66a76a04.dat' > '/tmp/tmpc02os3x9/job_working_directory/000/14/outputs/dataset_af5f16f2-7127-400d-b896-b4fc6eab3734.dat'

Exit Code:

```
0
```

Traceback:

Job Parameters:

Job parameter	Parameter value
__input_ext	`"input"`
__workflow_invocation_uuid__	`"b775d7849dd811ef9b8aa1042ab2eb7f"`
chromInfo	`"/tmp/tmpc02os3x9/galaxy-dev/tool-data/shared/ucsc/chrom/?.len"`
code	`"END{print NF}"`
dbkey	`"?"`

Step 16: Annotate DESeq2 table:

step_state: scheduled

Jobs

Job 1:

Job state is ok

Command Line:

cat '/tmp/tmpc02os3x9/files/3/5/6/dataset_3560d077-c3ec-44e5-8982-3db1f89e4e3a.dat' >> '/tmp/tmpc02os3x9/job_working_directory/000/15/outputs/dataset_b490d2d5-d71c-455b-a2c5-03714b10dc89.dat' && cat '/tmp/tmpc02os3x9/files/6/9/e/dataset_69e1e648-a0a6-4036-851a-f398531dfe1c.dat' >> '/tmp/tmpc02os3x9/job_working_directory/000/15/outputs/dataset_b490d2d5-d71c-455b-a2c5-03714b10dc89.dat' && exit 0

Exit Code:

```
0
```

Traceback:

Job Parameters:

Job parameter	Parameter value
__input_ext	`"input"`
__workflow_invocation_uuid__	`"b775d7849dd811ef9b8aa1042ab2eb7f"`
chromInfo	`"/tmp/tmpc02os3x9/galaxy-dev/tool-data/shared/ucsc/chrom/?.len"`
dbkey	`"?"`
queries	`[{"__index__": 0, "inputs2": {"values": [{"id": 15, "src": "hda"}]}}]`

Step 17: param_value_from_file:

step_state: scheduled

Jobs

Job 1:

Job state is ok

Command Line:

```
cd ../; python _evaluate_expression_.py
```

Exit Code:

```
0
```

Traceback:

Job Parameters:

Job parameter	Parameter value
__input_ext	`"tabular"`
__workflow_invocation_uuid__	`"b775d7849dd811ef9b8aa1042ab2eb7f"`
chromInfo	`"/tmp/tmpc02os3x9/galaxy-dev/tool-data/shared/ucsc/chrom/?.len"`
dbkey	`"?"`
param_type	`"text"`
remove_newlines	`true`

Step 18: Filter with p-adj threshold:

step_state: scheduled

Jobs

Job 1:

Job state is ok

Command Line:

python '/tmp/tmpc02os3x9/galaxy-dev/tools/stats/filtering.py' '/tmp/tmpc02os3x9/files/b/4/9/dataset_b490d2d5-d71c-455b-a2c5-03714b10dc89.dat' '/tmp/tmpc02os3x9/job_working_directory/000/18/outputs/dataset_5f6b5ed4-8325-4719-a167-30f82c016fe0.dat' '/tmp/tmpc02os3x9/job_working_directory/000/18/configs/tmpqr8pzuch' 13 "str,float,float,float,float,float,float,str,int,int,str,str,str" 1

Exit Code:

```
0
```

Standard Output:

Filtering with c7<0.1, 
kept 0.22% of 7128 valid lines (7128 total lines).
Skipped 4356 invalid line(s) starting at line #2773: "YDL246C	0.164326158122698	0.0145924704577571	0.0770926594275621	0.189284823822539	0.849869589153857	NA	chrIV	8682	9756	-	protein_coding	SOR2"

Traceback:

Job Parameters:

Job parameter	Parameter value
__input_ext	`"tabular"`
__workflow_invocation_uuid__	`"b775d7849dd811ef9b8aa1042ab2eb7f"`
chromInfo	`"/tmp/tmpc02os3x9/galaxy-dev/tool-data/shared/ucsc/chrom/?.len"`
cond	`"c7<0.1"`
dbkey	`"?"`
header_lines	`"1"`
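
For readers of this log: the Step 18 output above reports 4356 lines skipped as invalid, and the example line has `NA` in the P-adj column (c7), which is declared as `float`. A likely explanation is that `NA` cannot be parsed as a number. The following Python sketch illustrates that behaviour under this assumption; it is not the code of Galaxy's `filtering.py`, and the function names are hypothetical.

```python
def compose_condition(prefix, threshold):
    """Mimic composing a filter expression such as 'c7<0.1' from two parts."""
    return f"{prefix}{threshold}"


def filter_padj(rows, padj_index, threshold):
    """Keep rows whose adjusted p-value is below the threshold.

    Rows whose value cannot be parsed as a float (for example 'NA')
    are counted as skipped instead of being compared.
    """
    kept = []
    skipped = 0
    for row in rows:
        try:
            padj = float(row[padj_index])
        except ValueError:
            skipped += 1
            continue
        if padj < threshold:
            kept.append(row)
    return kept, skipped
```

Under this reading, genes without an adjusted p-value never reach the comparison, so they are dropped from the filtered table regardless of the threshold.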

Step 19: Generate Valcanot plot of DE genes:

step_state: scheduled

Jobs

Job 1:

Job state is ok

Command Line:

Rscript '/tmp/tmpc02os3x9/job_working_directory/000/17/configs/tmpi9_wx118'

Exit Code:

```
0
```

Standard Error:

Warning message:
In Sys.setlocale("LC_MESSAGES", "en_US.UTF-8") :
  OS reports request to set locale to "en_US.UTF-8" cannot be honored
Warning message:
Removed 1393 rows containing missing values (geom_point).

Standard Output:

null device 
          1 
R version 4.0.5 (2021-03-31)
Platform: x86_64-conda-linux-gnu (64-bit)
Running under: Debian GNU/Linux 10 (buster)

Matrix products: default
BLAS/LAPACK: /usr/local/lib/libopenblasp-r0.3.15.so

locale:
 [1] LC_CTYPE=C.UTF-8       LC_NUMERIC=C           LC_TIME=C.UTF-8       
 [4] LC_COLLATE=C.UTF-8     LC_MONETARY=C.UTF-8    LC_MESSAGES=C.UTF-8   
 [7] LC_PAPER=C.UTF-8       LC_NAME=C              LC_ADDRESS=C          
[10] LC_TELEPHONE=C         LC_MEASUREMENT=C.UTF-8 LC_IDENTIFICATION=C   

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] ggrepel_0.9.1 ggplot2_3.3.3 dplyr_1.0.6  

loaded via a namespace (and not attached):
 [1] Rcpp_1.0.6       magrittr_2.0.1   tidyselect_1.1.1 munsell_0.5.0   
 [5] colorspace_2.0-1 R6_2.5.0         rlang_0.4.11     fansi_0.5.0     
 [9] grid_4.0.5       gtable_0.3.0     utf8_1.2.1       withr_2.4.2     
[13] ellipsis_0.3.2   digest_0.6.27    tibble_3.1.2     lifecycle_1.0.0 
[17] crayon_1.4.1     purrr_0.3.4      farver_2.1.0     vctrs_0.3.8     
[21] glue_1.4.2       labeling_0.4.2   compiler_4.0.5   pillar_1.6.1    
[25] generics_0.1.0   scales_1.1.1     pkgconfig_2.0.3

Traceback:

Job Parameters:

Job parameter	Parameter value
__input_ext	`"input"`
__workflow_invocation_uuid__	`"b775d7849dd811ef9b8aa1042ab2eb7f"`
chromInfo	`"/tmp/tmpc02os3x9/galaxy-dev/tool-data/shared/ucsc/chrom/?.len"`
dbkey	`"?"`
fdr_col	`"7"`
header	`"yes"`
label_col	`"13"`
labels	`{"__current_case__": 0, "label_select": "signif", "top_num": "10"}`
lfc_col	`"3"`
lfc_thresh	`"0.5"`
out_options	`{"rscript_out": false}`
plot_options	`{"boxes": false, "legend": null, "legend_labs": "Down,Not Sig,Up", "title": null, "xlab": null, "xmax": null, "xmin": null, "ylab": null, "ymax": null}`
pval_col	`"6"`
signif_thresh	`"0.1"`

Step 20: toolshed.g2.bx.psu.edu/repos/iuc/compose_text_param/compose_text_param/0.1.1:

step_state: scheduled

Jobs

Job 1:

Job state is ok

Command Line:

```
cd ../; python _evaluate_expression_.py
```

Exit Code:

```
0
```

Traceback:

Job Parameters:

Job parameter	Parameter value
__input_ext	`"input"`
__workflow_invocation_uuid__	`"b775d7849dd811ef9b8aa1042ab2eb7f"`
chromInfo	`"/tmp/tmpc02os3x9/galaxy-dev/tool-data/shared/ucsc/chrom/?.len"`
components	`[{"__index__": 0, "param_type": {"__current_case__": 0, "component_value": "c1-c", "select_param_type": "text"}}, {"__index__": 1, "param_type": {"__current_case__": 0, "component_value": "5", "select_param_type": "text"}}]`
dbkey	`"?"`

Step 3: Count files have header:
- step_state: scheduled

Step 21: Filter with log2 FC threshold:

step_state: scheduled

Jobs

Job 1:

Job state is ok

Command Line:

python '/tmp/tmpc02os3x9/galaxy-dev/tools/stats/filtering.py' '/tmp/tmpc02os3x9/files/5/f/6/dataset_5f6b5ed4-8325-4719-a167-30f82c016fe0.dat' '/tmp/tmpc02os3x9/job_working_directory/000/19/outputs/dataset_2af24a7e-3106-4cfa-b8e1-47de4ac0bddb.dat' '/tmp/tmpc02os3x9/job_working_directory/000/19/configs/tmp0pzfu53j' 13 "str,float,float,float,float,float,float,str,int,int,str,str,str" 1

Exit Code:

```
0
```

Standard Output:

Filtering with abs(c3)>0.5, 
kept 93.75% of 16 valid lines (16 total lines).

Traceback:

Job Parameters:

Job parameter	Parameter value
__input_ext	`"tabular"`
__workflow_invocation_uuid__	`"b775d7849dd811ef9b8aa1042ab2eb7f"`
chromInfo	`"/tmp/tmpc02os3x9/galaxy-dev/tool-data/shared/ucsc/chrom/?.len"`
cond	`"abs(c3)>0.5"`
dbkey	`"?"`
header_lines	`"1"`

Step 22: join1:

step_state: scheduled

Jobs

Job 1:

Job state is ok

Command Line:

python '/tmp/tmpc02os3x9/galaxy-dev/tools/filters/join.py' '/tmp/tmpc02os3x9/files/1/5/e/dataset_15ef0886-71bb-46a6-9b0f-03ef66a76a04.dat' '/tmp/tmpc02os3x9/files/2/a/f/dataset_2af24a7e-3106-4cfa-b8e1-47de4ac0bddb.dat' 1 1 '/tmp/tmpc02os3x9/job_working_directory/000/20/outputs/dataset_ec6a006a-9ec9-417a-b3bf-61d3345ab4ae.dat'   --index_depth=3 --buffer=50000000 --fill_options_file=/tmp/tmpc02os3x9/job_working_directory/000/20/configs/tmpenzlmvbu -H

Exit Code:

```
0
```

Traceback:

Job Parameters:

Job parameter	Parameter value
__input_ext	`"tabular"`
__workflow_invocation_uuid__	`"b775d7849dd811ef9b8aa1042ab2eb7f"`
chromInfo	`"/tmp/tmpc02os3x9/galaxy-dev/tool-data/shared/ucsc/chrom/?.len"`
dbkey	`"?"`
field1	`"1"`
field2	`"1"`
fill_empty_columns	`{"__current_case__": 0, "fill_empty_columns_switch": "no_fill"}`
header	`"-H"`
partial	`""`
unmatched	`""`

Step 23: Cut1:

step_state: scheduled

Jobs

Job 1:

Job state is ok

Command Line:

perl '/tmp/tmpc02os3x9/galaxy-dev/tools/filters/cutWrapper.pl' '/tmp/tmpc02os3x9/files/e/c/6/dataset_ec6a006a-9ec9-417a-b3bf-61d3345ab4ae.dat' 'c1-c5' T '/tmp/tmpc02os3x9/job_working_directory/000/22/outputs/dataset_3df26632-7d82-4a60-98f4-ea3d0d3ccdf0.dat'

Exit Code:

```
0
```

Traceback:

Job Parameters:

Job parameter	Parameter value
__input_ext	`"tabular"`
__workflow_invocation_uuid__	`"b775d7849dd811ef9b8aa1042ab2eb7f"`
chromInfo	`"/tmp/tmpc02os3x9/galaxy-dev/tool-data/shared/ucsc/chrom/?.len"`
columnList	`"c1-c5"`
dbkey	`"?"`
delimiter	`"T"`

Step 24: Generate Heatmap of counts:

step_state: scheduled

Jobs

Job 1:

Job state is ok

Command Line:

cat '/tmp/tmpc02os3x9/job_working_directory/000/23/configs/tmpnz5526j0' && Rscript '/tmp/tmpc02os3x9/job_working_directory/000/23/configs/tmpnz5526j0'

Exit Code:

```
0
```

Standard Error:

Warning message:
In Sys.setlocale("LC_MESSAGES", "en_US.UTF-8") :
  OS reports request to set locale to "en_US.UTF-8" cannot be honored

Attaching package: ‘gplots’

The following object is masked from ‘package:stats’:

    lowess

Standard Output:

options(show.error.messages=F, error=function(){cat(geterrmessage(), file=stderr()); q("no",1,F)})

loc <- Sys.setlocale("LC_MESSAGES", "en_US.UTF-8")

library("RColorBrewer")
library("gplots")

input <- read.delim('/tmp/tmpc02os3x9/files/3/d/f/dataset_3df26632-7d82-4a60-98f4-ea3d0d3ccdf0.dat', sep='\t', header=TRUE)

mat_input <- data.matrix(input[,2:ncol(input)])
rownames(mat_input) <- input[,1]

    linput <- log2(mat_input+1)

    scale <- "none"

srtCol <- 30
    rlabs <- FALSE
    clabs <- NULL
    label_margins <- c(8,1)

    dendrogramtoplot <- "both"
        reorder_cols <- TRUE
        reorder_rows <- TRUE
        layout_matrix <- rbind(c(4,3), c(2,1))
        key_margins <- list(mar=c(4,0.5,2,1))
        lheight <- c(1, 5)
        lwidth <- c(1,3)
    hclust_fun <- function(x) hclust(x, method='complete')
        dist_fun <- function(x) dist(x, method='euclidean')

ncolors <- 50
    colused <- colorRampPalette(c("#ffffff", "#ff0000"))(ncolors)

    pdf(file='/tmp/tmpc02os3x9/job_working_directory/000/23/outputs/dataset_82c305c1-113d-49fb-8f67-08cebf660d0a.dat')

heatmap.2(linput, dendrogram=dendrogramtoplot, Colv=reorder_cols, Rowv=reorder_rows,
    distfun=dist_fun, hclustfun=hclust_fun, scale = scale, labRow = rlabs, labCol = clabs,
    col=colused, trace="none", density.info = "none", margins=label_margins,
    main = '', cexCol=0.8, cexRow=0.8, srtCol=srtCol,
    keysize=3, key.xlab='', key.title='', key.par=key_margins,
    lmat=layout_matrix, lhei=lheight, lwid=lwidth)

dev.off()
        null device 
          1

Traceback:

Job Parameters:

Job parameter	Parameter value
__input_ext	`"input"`
__workflow_invocation_uuid__	`"b775d7849dd811ef9b8aa1042ab2eb7f"`
chromInfo	`"/tmp/tmpc02os3x9/galaxy-dev/tool-data/shared/ucsc/chrom/?.len"`
cluster_cond	`{"__current_case__": 0, "cluster": "yes", "cluster_cols_rows": "both", "clustering": "complete", "distance": "euclidean"}`
colorchoice	`{"__current_case__": 1, "color1": "#ffffff", "color2": "#ff0000", "type": "two"}`
dbkey	`"?"`
image_file_format	`"pdf"`
key	`""`
labels	`"columns"`
title	`""`
transform	`"log2plus1"`
zscore_cond	`{"__current_case__": 0, "scale": "none", "zscore": "none"}`

Step 25: Generate Heatmap of Z-scores:

step_state: scheduled

Jobs

Job 1:

Job state is ok

Command Line:

cat '/tmp/tmpc02os3x9/job_working_directory/000/24/configs/tmpy67zuyx5' && Rscript '/tmp/tmpc02os3x9/job_working_directory/000/24/configs/tmpy67zuyx5'

Exit Code:

```
0
```

Standard Error:

Warning message:
In Sys.setlocale("LC_MESSAGES", "en_US.UTF-8") :
  OS reports request to set locale to "en_US.UTF-8" cannot be honored

Attaching package: ‘gplots’

The following object is masked from ‘package:stats’:

    lowess

Standard Output:

options(show.error.messages=F, error=function(){cat(geterrmessage(), file=stderr()); q("no",1,F)})

loc <- Sys.setlocale("LC_MESSAGES", "en_US.UTF-8")

library("RColorBrewer")
library("gplots")

input <- read.delim('/tmp/tmpc02os3x9/files/3/d/f/dataset_3df26632-7d82-4a60-98f4-ea3d0d3ccdf0.dat', sep='\t', header=TRUE)

mat_input <- data.matrix(input[,2:ncol(input)])
rownames(mat_input) <- input[,1]

    linput <- mat_input

    linput <- t(apply(linput, 1, scale))
    colnames(linput) <- colnames(input)[2:ncol(input)]
    rownames(linput) <- input[,1]
    scale <- "none"

srtCol <- 30
    rlabs <- FALSE
    clabs <- NULL
    label_margins <- c(8,1)

    dendrogramtoplot <- "both"
        reorder_cols <- TRUE
        reorder_rows <- TRUE
        layout_matrix <- rbind(c(4,3), c(2,1))
        key_margins <- list(mar=c(4,0.5,2,1))
        lheight <- c(1, 5)
        lwidth <- c(1,3)
    hclust_fun <- function(x) hclust(x, method='complete')
        dist_fun <- function(x) dist(x, method='euclidean')

ncolors <- 50
    colused <- colorRampPalette(c("#0000ff", "#ffffff", "#ff0000"))(ncolors)

    pdf(file='/tmp/tmpc02os3x9/job_working_directory/000/24/outputs/dataset_627adaec-f2aa-43e8-9bf5-60372550aa96.dat')

heatmap.2(linput, dendrogram=dendrogramtoplot, Colv=reorder_cols, Rowv=reorder_rows,
    distfun=dist_fun, hclustfun=hclust_fun, scale = scale, labRow = rlabs, labCol = clabs,
    col=colused, trace="none", density.info = "none", margins=label_margins,
    main = '', cexCol=0.8, cexRow=0.8, srtCol=srtCol,
    keysize=3, key.xlab='', key.title='', key.par=key_margins,
    lmat=layout_matrix, lhei=lheight, lwid=lwidth)

dev.off()
        null device 
          1

Traceback:

Job Parameters:

Job parameter	Parameter value
__input_ext	`"input"`
__workflow_invocation_uuid__	`"b775d7849dd811ef9b8aa1042ab2eb7f"`
chromInfo	`"/tmp/tmpc02os3x9/galaxy-dev/tool-data/shared/ucsc/chrom/?.len"`
cluster_cond	`{"__current_case__": 0, "cluster": "yes", "cluster_cols_rows": "both", "clustering": "complete", "distance": "euclidean"}`
colorchoice	`{"__current_case__": 2, "color1": "#0000ff", "color2": "#ffffff", "color3": "#ff0000", "type": "three"}`
dbkey	`"?"`
image_file_format	`"pdf"`
key	`""`
labels	`"columns"`
title	`""`
transform	`"none"`
zscore_cond	`{"__current_case__": 1, "zscore": "rows"}`
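
The generated R script for Step 25 computes row-wise z-scores with `t(apply(linput, 1, scale))`, that is, each gene's values are centred on the row mean and divided by the row's sample standard deviation. A minimal Python sketch of the same calculation follows; the function name is hypothetical, and it assumes every row has at least two values.

```python
from statistics import mean, stdev


def row_zscores(matrix):
    """Scale each row to mean 0 and sample standard deviation 1.

    Uses the sample standard deviation (n - 1 denominator), which is
    what R's scale() uses. Constant rows have zero spread, so their
    z-scores are undefined and are returned as NaN.
    """
    scaled = []
    for row in matrix:
        mu = mean(row)
        sd = stdev(row)
        if sd == 0:
            scaled.append([float("nan")] * len(row))
        else:
            scaled.append([(x - mu) / sd for x in row])
    return scaled
```

Because the scaling is done per gene, the heatmap shows each gene's pattern across samples rather than its absolute expression level.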

Step 4: Gene Annotaton:
- step_state: scheduled
Step 5: Adjusted p-value threshold:
- step_state: scheduled

Step 6: toolshed.g2.bx.psu.edu/repos/bgruening/text_processing/tp_text_file_with_recurring_lines/9.3+galaxy1:

step_state: scheduled

Jobs

Job 1:

Job state is ok

Command Line:

times=1; yes -- 'GeneID__tc__Base mean__tc__log2(FC)__tc__StdErr__tc__Wald-Stats__tc__P-value__tc__P-adj__tc__Chromosome__tc__Start__tc__End__tc__Strand__tc__Feature__tc__Gene name' 2>/dev/null | head -n $times >> '/tmp/tmpc02os3x9/job_working_directory/000/6/outputs/dataset_df4e9fab-592e-4287-a630-be6cab5f172c.dat';

Exit Code:

```
0
```

Traceback:

Job Parameters:

| Job parameter | Parameter value |
| --- | --- |
| __input_ext | `"input"` |
| __workflow_invocation_uuid__ | `"b775d7849dd811ef9b8aa1042ab2eb7f"` |
| chromInfo | `"/tmp/tmpc02os3x9/galaxy-dev/tool-data/shared/ucsc/chrom/?.len"` |
| dbkey | `"?"` |
| token_set | `[{"__index__": 0, "line": "GeneID\tBase mean\tlog2(FC)\tStdErr\tWald-Stats\tP-value\tP-adj\tChromosome\tStart\tEnd\tStrand\tFeature\tGene name", "repeat_select": {"__current_case__": 0, "repeat_select_opts": "user", "times": "1"}}]` |

Step 7: log2 fold change threshold:
- step_state: scheduled

Step 8: toolshed.g2.bx.psu.edu/repos/iuc/pick_value/pick_value/0.2.0:

step_state: scheduled

Jobs

Job 1:

Job state is ok

Command Line:

```
cd ../; python _evaluate_expression_.py
```

Exit Code:

```
0
```

Traceback:

Job Parameters:

| Job parameter | Parameter value |
| --- | --- |
| __input_ext | `"input"` |
| __workflow_invocation_uuid__ | `"b775d7849dd811ef9b8aa1042ab2eb7f"` |
| chromInfo | `"/tmp/tmpc02os3x9/galaxy-dev/tool-data/shared/ucsc/chrom/?.len"` |
| dbkey | `"?"` |
| style_cond | `{"__current_case__": 1, "pick_style": "first_or_default", "type_cond": {"__current_case__": 2, "default_value": "0.05", "param_type": "float", "pick_from": [{"__index__": 0, "value": "0.1"}]}}` |

Step 9: toolshed.g2.bx.psu.edu/repos/bgruening/text_processing/tp_sed_tool/9.3+galaxy1:

step_state: scheduled

Jobs

Job 1:

Job state is ok

Command Line:

sed --sandbox -r -f '/tmp/tmpc02os3x9/job_working_directory/000/8/configs/tmpi_5_ry3g' '/tmp/tmpc02os3x9/files/d/f/4/dataset_df4e9fab-592e-4287-a630-be6cab5f172c.dat' > '/tmp/tmpc02os3x9/job_working_directory/000/8/outputs/dataset_3560d077-c3ec-44e5-8982-3db1f89e4e3a.dat'

Exit Code:

```
0
```

Traceback:

Job Parameters:

| Job parameter | Parameter value |
| --- | --- |
| __input_ext | `"input"` |
| __workflow_invocation_uuid__ | `"b775d7849dd811ef9b8aa1042ab2eb7f"` |
| adv_opts | `{"__current_case__": 0, "adv_opts_selector": "basic"}` |
| chromInfo | `"/tmp/tmpc02os3x9/galaxy-dev/tool-data/shared/ucsc/chrom/?.len"` |
| code | `"s/__tc__/\\t/g"` |
| dbkey | `"?"` |
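
Steps 6 and 9 work together: step 6 writes a single header line in which `__tc__` stands in for a tab, and step 9 runs the sed expression `s/__tc__/\t/g` to turn every placeholder into a real tab. The following Python sketch mirrors that substitution for illustration only. The names `HEADER_TEMPLATE` and `expand_header` are hypothetical, and the workflow itself uses the sed tool, not Python.

```python
import re

# Header line as written by step 6, with "__tc__" as the tab placeholder.
HEADER_TEMPLATE = (
    "GeneID__tc__Base mean__tc__log2(FC)__tc__StdErr__tc__Wald-Stats"
    "__tc__P-value__tc__P-adj__tc__Chromosome__tc__Start__tc__End"
    "__tc__Strand__tc__Feature__tc__Gene name"
)


def expand_header(template):
    """Replace every "__tc__" placeholder with a tab (same effect as s/__tc__/\\t/g)."""
    return re.sub(r"__tc__", "\t", template)


header = expand_header(HEADER_TEMPLATE)
columns = header.split("\t")
```

Splitting the expanded header on tabs gives the column names of the annotated results table, starting with `GeneID` and ending with `Gene name`.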

Step 10: toolshed.g2.bx.psu.edu/repos/iuc/pick_value/pick_value/0.2.0:

step_state: scheduled

Jobs

Job 1:

Job state is ok

Command Line:

```
cd ../; python _evaluate_expression_.py
```

Exit Code:

```
0
```

Traceback:

Job Parameters:

| Job parameter | Parameter value |
| --- | --- |
| __input_ext | `"input"` |
| __workflow_invocation_uuid__ | `"b775d7849dd811ef9b8aa1042ab2eb7f"` |
| chromInfo | `"/tmp/tmpc02os3x9/galaxy-dev/tool-data/shared/ucsc/chrom/?.len"` |
| dbkey | `"?"` |
| style_cond | `{"__current_case__": 1, "pick_style": "first_or_default", "type_cond": {"__current_case__": 2, "default_value": "1.0", "param_type": "float", "pick_from": [{"__index__": 0, "value": "0.5"}]}}` |
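
Steps 8 and 10 both use `pick_style: first_or_default`, which is what makes the two thresholds optional: a value supplied by the user wins, otherwise the default applies. The sketch below shows that selection rule under the assumption that an unset input arrives as `None`. The function `first_or_default` here is a hypothetical stand-in, not the pick_value tool's implementation.

```python
def first_or_default(pick_from, default_value):
    """Return the first supplied value as a float, or the default if none is set.

    Assumption for this sketch: an input the user left empty is None.
    """
    for value in pick_from:
        if value is not None:
            return float(value)
    return float(default_value)


# Values taken from the job parameters of steps 8 and 10 in this test run.
padj_threshold = first_or_default(["0.1"], "0.05")
log2fc_threshold = first_or_default(["0.5"], "1.0")

# With no user-supplied value, the default would be used instead.
padj_fallback = first_or_default([None], "0.05")
```

In this test run both thresholds were supplied, so under this rule the defaults of 0.05 and 1.0 would not be used.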

Other invocation details
- history_id
  - 2bc6b4f69d71020c
- history_state
  - ok
- invocation_id
  - 2bc6b4f69d71020c
- invocation_state
  - scheduled
- workflow_id
  - 2bc6b4f69d71020c

lldelisle

Looks good to me. Thank you so much.
@mvdbeek do you want to review?

first release of RNAseq DE analysis, filtering and plotting workflow

7cd6935

pavanvidem mentioned this pull request Oct 25, 2024

Update PE RNA-seq workflow #544

Merged

lldelisle reviewed Oct 25, 2024

workflows/transcriptomics/rnaseq-de/README.md

lldelisle mentioned this pull request Oct 25, 2024

Workflows to write to split collection into 2 #583

Closed

Make thresholds optional, header param, better descriptions

ad0df1e

jmchilton mentioned this pull request Oct 30, 2024

Sample Sheet Workflow Inputs galaxyproject/galaxy#19085

Open

nekrut mentioned this pull request Oct 31, 2024

Workflows for "Analysis" page galaxyproject/brc-analytics#144

Open

pavanvidem added 2 commits November 8, 2024 14:46

hide unnecessary outputs and move count files to Zenodo

6844577

add release

e37a207

lldelisle reviewed Nov 8, 2024

workflows/transcriptomics/rnaseq-de/rnaseq-de-filtering-plotting.ga

remove test data

0c49c70

lldelisle approved these changes Nov 8, 2024

lldelisle merged commit a949697 into galaxyproject:main Nov 12, 2024
6 checks passed
