interaction:fixed effects and corrections modified #5928

Merged
merged 6 commits into from
Apr 29, 2024
Changes from 1 commit
11 changes: 9 additions & 2 deletions tools/maaslin2/maaslin2.xml
@@ -55,7 +55,7 @@ cd outputFolder && mkdir -p figures/ && cp *.pdf figures
<inputs>
<param name="input_data" type="data" format="tabular" label="Data (or features) file"/>
<param name="input_metadata" type="data" format="tabular" label="Metadata file"/>
-<param argument="--fixed_effects" type="select" multiple="true" optional="true" label="Interactions: Fixed effects" help="The fixed effects for the model, comma-delimited for multiple effects">
+<param argument="--fixed_effects" type="text" multiple="true" optional="true" label="Interactions: Fixed effects" help="The fixed effects for the model, comma-delimited for multiple effects">
Member

Why should this be a text parameter? Shouldn't we fix the help instead?

Contributor Author

We can do that now, but if more association factors are added in the future, we would have to change it again, as we are doing now. So I would rather make it a text parameter.

Contributor

In fact, both fixed_effects and random_effects should be of type data_column (see:

<param name="cols" type="data_column" data_ref="input" multiple="true"
for an example).
For both of them, multiple columns from the input can be chosen. The original option values make no sense, because they hard-code the column names of the test data.

Contributor Author

I am getting column numbers instead of names after using type="data_column". Is there another type that returns the column names directly instead of numbers?

Member

Have you tried setting numerical=false? https://docs.galaxyproject.org/en/latest/dev/schema.html#id51

Contributor Author

Setting numerical="false" did not work; setting use_header_names="true" did, and the UI shows column names now, but the test cases are failing. I do not understand why the tests still accept numerical values as valid options, so I am working on that now.

tool_test_output.json

Member

I could be wrong, but the use_header_names="true" option is really just cosmetic and only affects the UI; you still need to give numbers in the tests.

Very good idea to use use_header_names="true"!!!

Contributor

Hi @renu-pal, it was a surprise to me that there is no easy way to get the column names onto the command line from Galaxy. However, after some fiddling I found a way to get them from the input file:

## get column names of fixed and random effects from the input file, since Galaxy
## can only return indices with type="data_column"

## get header (first line only; strip the newline so the last column name stays clean)
#set $input = open(str($input_metadata), 'r')
#set $header = $input.readline().rstrip('\n').split('\t')
#silent $input.close()

## get fixed effects (data_column indices are 1-based, the header list is 0-based)
#set $fixed_effects_val = []
#for $i in $fixed_effects:
    #silent $fixed_effects_val.append($header[int($i) - 1])
#end for
#set $fixed_effects = ','.join($fixed_effects_val)

## get random effects (same 1-based to 0-based shift)
#set $random_effects_val = []
#for $i in $random_effects:
    #silent $random_effects_val.append($header[int($i) - 1])
#end for
#set $random_effects = ','.join($random_effects_val)

Maybe let the user know in the help that the column names must be in the first line of the table, or modify the code to skip comments when parsing the header.
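
The second option, skipping comments when parsing the header, could be sketched in shell roughly as follows. This is only an illustration under assumptions not stated in the thread: comment lines are taken to start with `#`, and the temporary file, its columns, and the chosen indices 2 and 3 are made up.

```shell
# Hypothetical metadata table with comment lines in front of the header.
meta=$(mktemp)
printf '# exported table\n# another comment\nsample\tdiagnosis\tage\n' > "$meta"
printf 's1\tCD\t34\n' >> "$meta"

# Ignore lines starting with '#', print the chosen columns of the first
# remaining line (the header), then stop reading the file.
header_cols=$(awk -v OFS=',' -F'\t' '!/^#/ { print $2,$3; exit }' "$meta")
echo "$header_cols"
rm -f "$meta"
```

With this sample input the script prints `diagnosis,age`.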

Contributor

Forget about that; we need to get the column names on the command line, not in the Cheetah code...

Contributor

So, an update: this way, using awk, the file is only parsed on script execution.

## get column names of fixed and random effect from the input file, since galaxy 
## can only return indices with type="data_column" 
## using awk so that the file is only parsed on command line execution

#set idx = []
#for $i in $fixed_effects:
    #silent idx.append(f'${i}')
#end for
#set idx_for_awk = ','.join(idx)

fixed_effects=`awk -v OFS=',' -F"\t" 'NR == 1 { print $idx_for_awk}' '$input_metadata'` &&
echo 'Assigned fixed effects as:' \$fixed_effects &&

#set idx = []
#for $i in $random_effects:
    #silent idx.append(f'${i}')
#end for
#set idx_for_awk = ','.join(idx)

random_effects=`awk -v OFS=',' -F"\t" 'NR == 1 { print $idx_for_awk}' '$input_metadata'` &&
echo 'Assigned random effects as:' \$random_effects &&
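
To try the awk step outside Galaxy, here is a minimal standalone sketch. The temporary file, its column names, and the indices 2 and 3 are assumptions made for illustration; in the tool, the `$2,$3` part is what the template would render from `idx_for_awk`.

```shell
# Hypothetical metadata table; the first line holds the column names.
meta=$(mktemp)
printf 'sample\tdiagnosis\tage\tsite\n' > "$meta"
printf 's1\tCD\t34\tA\n' >> "$meta"

# Indices 2 and 3 stand in for the 1-based values a data_column param returns.
# NR == 1 restricts awk to the header line; OFS=',' joins the fields with commas.
fixed_effects=$(awk -v OFS=',' -F'\t' 'NR == 1 { print $2,$3 }' "$meta")
echo "Assigned fixed effects as: $fixed_effects"
rm -f "$meta"
```

For this sample table the script prints `Assigned fixed effects as: diagnosis,age`. Because awk fields are 1-based, the indices can be passed through unchanged.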

<option value="diagnosis" selected="true">diagnosis</option>
<option value="dysbiosisnonIBD" selected="true">dysbiosisnonIBD</option>
<option value="dysbiosisUC" selected="true">dysbiosisUC</option>
@@ -87,7 +87,14 @@ cd outputFolder && mkdir -p figures/ && cp *.pdf figures
<option value="NEGBIN">NEGBIN</option>
<option value="ZINB">ZINB</option>
</param>
-<param argument="--correction" type="text" value="BH" optional="true" label="Correction" help="The correction method for computing the q-value"/>
+<param argument="--correction" type="select" value="BH" multiple="true" optional="true" label="Correction" help="The correction method for computing the q-value">
+<option value="holm">holm</option>
+<option value="hochberg">hochberg</option>
+<option value="hommel">hommel</option>
+<option value="bonferroni">bonferroni</option>
+<option value="BH">BH</option>
Contributor

Can you add the full names for BH and BY as well? Maybe also add a link to the explanation in the help and check that they all work; a test or two would be good as well.

+<option value="BY">BY</option>
+</param>
renu-pal marked this conversation as resolved.
<param argument="--standardize" type="boolean" truevalue="--standardize TRUE" falsevalue="--standardize FALSE" checked="true" label="Apply z-score so continuous metadata are on the same scale"/>
</section>
<section name="output" title="Set Plotting Output" expanded="true">