Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add: implement tool to add rank names to phyloseq object #6625

Open
wants to merge 6 commits into
base: main
Choose a base branch
from

Conversation

MaraBesemer
Copy link
Contributor

@MaraBesemer MaraBesemer commented Dec 12, 2024

FOR CONTRIBUTOR:

  • I have read the CONTRIBUTING.md document and this tool is appropriate for the tools-iuc repo.
  • License permits unrestricted use (educational + commercial)
  • This PR adds a new tool or tool collection
  • This PR updates an existing tool or tool collection
  • This PR does something else (explain below)

@MaraBesemer
Copy link
Contributor Author

@paulzierep

tools/phyloseq/add_rank_names_to_phyloseq.xml Outdated Show resolved Hide resolved
<output name="output" ftype="phyloseq">
<!-- Check if the output contains the first character -->
<assert_contents>
<has_text text="B"/>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you please be a bit more strict in this test?

change version and add profile
Copy link
Contributor

@paulzierep paulzierep left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With my more generic approach, it will be good I think.
Please change the code comments to english.

print(tax_table(physeq))

# Zuordnung der Rangnamen
rank_names <- c("Kingdom", "Phylum", "Class", "Order", "Family", "Genus", "Species") # Anpassen je nach Bedarf
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There could be more ranks, e.g.: super kingdom, subspecies, strain.
Could you make this more generic, e.g.: the use provides a comma separated list of ranks.
You could make "Kingdom", "Phylum", "Class", "Order", "Family", "Genus", "Species" the default.
If the number of ranks does not match the column numbers of the tax_table the tool fails.

# Zuordnung der Rangnamen
rank_names <- c("Kingdom", "Phylum", "Class", "Order", "Family", "Genus", "Species") # Anpassen je nach Bedarf

# Füge eine leere Spalte für Species hinzu, falls sie fehlt
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think not needed, if the logic I proposed before is used

print(tax_table(physeq))

# Extrahiere das erste Zeichen aus dem ersten Eintrag des tax_table (z.B. Kingdom)
first_char <- substr(tax_table(physeq)[1, 1], 1, 1)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what is that for ?

<tests>
<test>
<param name="input" value="output.phyloseq" ftype="phyloseq"/>
<output name="output" ftype="phyloseq">
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you could compare the new phylosq object to expected output, like:

<output name="ampvis" value="AalborgWWTPs-subset_samples.rds" ftype="ampvis2" compare="sim_size"/>

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Better just use an has_size assertion and remove value="AalborgWWTPs-subset_samples.rds".

cat("Current tax_table:\n")
print(tax_table(physeq))

# Zuordnung der Rangnamen
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Guess English comments would be better.

<tests>
<test>
<param name="input" value="output.phyloseq" ftype="phyloseq"/>
<output name="output" ftype="phyloseq">
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Better just use an has_size assertion and remove value="AalborgWWTPs-subset_samples.rds".

cat("Current tax_table:\n")
print(tax_table(physeq))

# Add a column for Species if missing
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this part is needed (the add species part), we cannot really know which rank is missing if it is missing, the user should provide names matching the columns exactly, otherwise the tool should fail, you could change if (ncol(tax_table(physeq)) > length(rank_names)) { to if (ncol(tax_table(physeq)) == length(rank_names)) { instead, and provide a message for further help

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants