Mmuphin wrapper #6584
base: main
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,22 @@ | ||
name: mmuphin | ||
owner: iuc | ||
description: "MMUPHin is an R package implementing meta-analysis methods for microbial community profiles" | ||
homepage_url: https://huttenhower.sph.harvard.edu/mmuphin | ||
long_description: | | ||
  MMUPHin is a Bioconductor package implementing meta-analysis methods for microbial community profiles. It has interfaces for: a) covariate-controlled batch and study effect adjustment, b) meta-analytic differential abundance testing, and meta-analytic discovery of c) discrete (cluster-based) or d) continuous unsupervised population structure. | ||
|
||
  Overall, MMUPHin enables the normalization and combination of multiple microbial community studies. It can then help in identifying microbes, genes, or pathways that are differential with respect to combined phenotypes. Finally, it can find clusters or gradients of sample types that reproduce consistently among studies. | ||
remote_repository_url: https://github.com/biobakery/MMUPHin | ||
type: unrestricted | ||
categories: | ||
  - Metagenomics | ||
auto_tool_repositories: | ||
  name_template: "{{ tool_id }}" | ||
  description_template: "Wrapper for the mmuphin function: {{ tool_name }}" | ||
suite: | ||
  name: "suite_mmuphin" | ||
  description: "A suite of tools that brings the mmuphin project into Galaxy." | ||
  long_description: | | ||
    MMUPHin is a Bioconductor package implementing meta-analysis methods for microbial community profiles. It has interfaces for: a) covariate-controlled batch and study effect adjustment, b) meta-analytic differential abundance testing, and meta-analytic discovery of c) discrete (cluster-based) or d) continuous unsupervised population structure. | ||
|
||
    Overall, MMUPHin enables the normalization and combination of multiple microbial community studies. It can then help in identifying microbes, genes, or pathways that are differential with respect to combined phenotypes. Finally, it can find clusters or gradients of sample types that reproduce consistently among studies. | ||
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,27 @@ | ||
<?xml version="1.0"?> | ||
<macros> | ||
<token name="@TOOL_VERSION@">1.18.1</token> | ||
<token name="@VERSION_SUFFIX@">0</token> | ||
<token name="@PROFILE@">21.05</token> | ||
|
||
<xml name="xrefs"> | ||
<xrefs> | ||
<xref type="bio.tools">mmuphin</xref> | ||
<xref type="bioconductor">mmuphin</xref> | ||
|
||
</xrefs> | ||
</xml> | ||
<xml name="requirements"> | ||
<requirements> | ||
<requirement type="package" version="@TOOL_VERSION@">bioconductor-mmuphin</requirement> | ||
<requirement type="package" version="2.0.3">r-magrittr</requirement> | ||
<requirement type="package" version="1.1.4">r-dplyr</requirement> | ||
<requirement type="package" version="0.33">r-dt</requirement> | ||
</requirements> | ||
</xml> | ||
<xml name="citations"> | ||
<citations> | ||
<citation type="doi">10.18129/B9.bioc.MMUPHin</citation> | ||
</citations> | ||
</xml> | ||
</macros> |
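The R script in the tool wrapper further down also loads ggplot2 and readr, which the requirements macro above does not list. A minimal sketch of how the macro could declare them, assuming the conda-forge package names `r-ggplot2` and `r-readr`. Versions are deliberately left out here and would need to be pinned to whatever the tool is actually tested against:

```xml
<xml name="requirements">
    <requirements>
        <requirement type="package" version="@TOOL_VERSION@">bioconductor-mmuphin</requirement>
        <!-- Assumed conda-forge names for the extra libraries the R script loads; pin versions before merging -->
        <requirement type="package">r-ggplot2</requirement>
        <requirement type="package">r-readr</requirement>
    </requirements>
</xml>
```

If the script stops loading these libraries, the extra entries are unnecessary.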
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,139 @@ | ||
<tool id="mmuphin" name="mmuphin" version="@TOOL_VERSION@+galaxy@VERSION_SUFFIX@" profile="@PROFILE@"> | ||
<description>Performing meta-analyses of microbiome studies</description> | ||
<macros> | ||
<import>macros.xml</import> | ||
</macros> | ||
<expand macro="xrefs"/> | ||
<expand macro="requirements"/> | ||
<command detect_errors="exit_code"><![CDATA[ | ||
Rscript '$rscript' | ||
]]></command> | ||
|
||
<configfiles> | ||
<configfile name="rscript"><![CDATA[ | ||
library(MMUPHin) | ||
library(magrittr) | ||
library(dplyr) | ||
library(ggplot2) | ||
library(readr) | ||
|
||
source("adjust_batch.R") | ||
You will also need to add the adjust_batch.R script to your test-data and source that here. |
||
|
||
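# Read input files | ||
print("Reading input files") | ||
# Assumption: the first column of each table holds the feature or sample identifiers | ||
abd_df <- as.data.frame(read_tsv("$input_data")) | ||
abd_data <- as.matrix(abd_df[, -1, drop = FALSE]) | ||
rownames(abd_data) <- abd_df[[1]] | ||
meta_data <- as.data.frame(read_tsv("$input_metadata")) | ||
rownames(meta_data) <- meta_data[[1]] | ||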
|
||
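# Define the control list; adjust_batch() expects named entries | ||
# The additional_options. prefix is needed because these parameters live in that section | ||
controls <- list( | ||
zero_inflation = grepl("TRUE", "$additional_options.zero_inflation"), | ||
pseudo_count = if (nzchar("$additional_options.pseudo_count")) as.numeric("$additional_options.pseudo_count") else NULL, | ||
conv = as.numeric("$additional_options.conv"), | ||
maxit = as.integer(as.numeric("$additional_options.maxit")), | ||
verbose = grepl("TRUE", "$additional_options.verbose"), | ||
# diagnostic_plot takes an output file name, or NULL to skip the figure | ||
diagnostic_plot = if (grepl("TRUE", "$additional_options.diagnostic_plot")) "adjust_batch_diagnostic.pdf" else NULL | ||
) | ||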
|
||
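# Perform batch adjustment | ||
# data_column parameters resolve to 1-based column numbers, so look up the header names | ||
covariate_cols <- suppressWarnings(as.integer(strsplit("$covariates_input", ",")[[1]])) | ||
result <- adjust_batch(feature_abd = abd_data, | ||
batch = colnames(meta_data)[as.integer("$batch_input")], | ||
covariates = if (length(covariate_cols) == 0 || anyNA(covariate_cols)) NULL else colnames(meta_data)[covariate_cols], | ||
data = meta_data, | ||
control = controls | ||
) | ||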
|
||
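# Save results into output files | ||
print(result) | ||
# Escape the R dollar operator so Cheetah does not treat it as a template variable | ||
write.table(result\$feature_abd_adj, file = "$output", sep = "\t", quote = FALSE) | ||
write.table(result\$control, file = "$control_output", sep = "\t", quote = FALSE) | ||
# Copy the diagnostic figure, if one was written, to the Galaxy output | ||
if (file.exists("adjust_batch_diagnostic.pdf")) file.copy("adjust_batch_diagnostic.pdf", "$diagnostic_plot_output", overwrite = TRUE) | ||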
]]></configfile> | ||
</configfiles> | ||
Can you please clean those things up? Thanks. |
||
<inputs> | ||
<param name="input_data" type="data" format="tabular" label="Data (or features) file"/> | ||
<param name="input_metadata" type="data" format="tabular" label="Metadata file"/> | ||
<param argument="batch_input" type="data_column" data_ref="input_metadata" use_header_names="true" label="batch" /> | ||
Can you please improve all labels and help text? They are not very user-friendly IMHO. What does a metadata file need to look like? Or the feature file? "batch"? Maybe "the column in which the batch identifier is specified"?
@bgruening does this work? |
||
<param argument="covariates_input" type="data_column" data_ref="input_metadata" use_header_names="true" optional="true" label="covariates" /> | ||
<section name="additional_options" title="Additional Options" expanded="true"> | ||
<param argument="zero_inflation" type="boolean" truevalue="TRUE" falsevalue="FALSE" checked="true" label="Run zero-inflated model"/> | ||
<param argument="pseudo_count" type="float" optional="true" label="Pseudo count" help="Pseudo count to add to feature_abd before the method's log transformation. Defaults to NULL, in which case it is set to half of the minimal non-zero value in feature_abd"/> | ||
<param argument="conv" type="float" value="0.0001" optional="true" label="Convergence threshold" help="Convergence threshold for the method's iterative algorithm for shrinking batch effect parameters"/> | ||
<param argument="maxit" type="integer" value="1000" optional="true" label="Maximum number of iterations" help="Maximum number of iterations allowed for the method's iterative algorithm. Defaults to 1000"/> | ||
<param argument="verbose" type="boolean" truevalue="TRUE" falsevalue="FALSE" checked="true" label="Print verbose information"/> | ||
We usually don't expose those parameters to the user.
@bgruening, so should I remove them?
Yes, and set a useful default.
Hi @bgruening, I have made the required changes. Does this work? |
||
<param argument="diagnostic_plot" type="boolean" truevalue="TRUE" falsevalue="FALSE" checked="true" label="Generate diagnostic figure file (default name: adjust_batch_diagnostic.pdf)"/> | ||
</section> | ||
</inputs> | ||
|
||
|
||
<outputs> | ||
<data name="output" format="tabular" label="Adjusted abundance table"/> | ||
<data name="diagnostic_plot_output" format="pdf" label="diagnostic figure file"/> | ||
<data name="control_output" format="tabular" label="control list used in batch adjustment"/> | ||
</outputs> | ||
<tests> | ||
<test> | ||
<param name="input_data" value="CRC_abd.tsv"/> | ||
<param name="input_metadata" value="CRC_meta.tsv"/> | ||
<param name="batch_input" value="1"/> | ||
<param name="covariates_input" value="2"/> | ||
<section name="additional_options"> | ||
<param name="zero_inflation" value="TRUE"/> | ||
<param name="pseudo_count" value="3"/> | ||
<param name="conv" value="0.0001"/> | ||
<param name="maxit" value="1000"/> | ||
<param name="verbose" value="TRUE"/> | ||
<param name="diagnostic_plot" value="TRUE"/> | ||
</section> | ||
|
||
<output name="output"> | ||
<assert_contents> | ||
<has_size value="150053" delta="1000" /> | ||
</assert_contents> | ||
</output> | ||
<output name="diagnostic_plot_output" file="adjust_batch_diagnostic.pdf" ftype="pdf"/> | ||
<output name="control_output"> | ||
<assert_contents> | ||
<has_size value="1500" delta="100" /> | ||
</assert_contents> | ||
</output> | ||
</test> | ||
</tests> | ||
<help><![CDATA[ | ||
@HELP_HEADER@ | ||
MMUPHin | ||
======= | ||
MMUPHin is an R package implementing meta-analysis methods for microbial community profiles. It has interfaces for: | ||
|
||
a) Performing batch (study) effect adjustment with adjust_batch | ||
------------------------------------------------------------------ | ||
It aims to correct for technical batch effects in microbial feature abundances. Batch effects refer to variations in data that arise not from the biological or experimental variables of interest but due to differences in technical or procedural factors during data collection or processing. For example: | ||
|
||
- Different equipment or lab environments. | ||
- Different operators handling the experiment. | ||
- Variations in sample preparation, sequencing runs, or platforms. | ||
|
||
These unwanted variations can obscure true biological signals and introduce bias, making it critical to adjust for batch effects to ensure accurate and comparable results across datasets. | ||
|
||
The function adjust_batch in the MMUPHin package is designed to correct batch effects in microbiome data. | ||
|
||
Inputs: | ||
======= | ||
- A feature-by-sample abundance matrix (e.g., microbial abundances). | ||
- A metadata file, which contains information about samples, including batch identifiers and optional covariates. | ||
|
||
Output: | ||
======= | ||
A batch-adjusted abundance matrix for downstream analyses. | ||
|
||
b) Meta-analytic differential abundance testing. | ||
c) Meta-analytic discovery of discrete (cluster-based) or continuous unsupervised population structure. | ||
|
||
Meta-analysis methods are statistical techniques used to combine and synthesize data from multiple independent studies, typically to derive a more precise or generalizable conclusion. This approach is commonly used in fields such as medicine, psychology, and biology to aggregate research findings and increase the statistical power of analyses by pooling data from different experiments or studies. | ||
|
||
|
||
]]></help> | ||
<expand macro="citations"/> | ||
</tool> |
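The review comments above ask for friendlier labels and help text on the inputs. One possible shape for the two parameters singled out, as a sketch only. The label and help wording below is illustrative and not taken from the PR:

```xml
<!-- Hypothetical wording; the parameter names and types are unchanged from the diff above -->
<param name="input_metadata" type="data" format="tabular" label="Sample metadata table" help="Tab-separated table with a header row that describes the samples, including a batch identifier column and any optional covariate columns."/>
<param argument="batch_input" type="data_column" data_ref="input_metadata" use_header_names="true" label="Column containing the batch (study) identifier" help="Column of the metadata table that names the batch or study each sample belongs to."/>
```

The same pattern (a label that says what to select, plus a help attribute that says what the file or column must contain) would apply to the feature table and the covariates parameter.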
long_description is redundant here.
801cc88
long_description was duplicated in earlier code. It is completely removed now in 801cc88.
I misunderstood and removed the whole thing but made the correction later: 7fafc59