Reinstate obsoleted commands e.g. calc_distmx, fastx_getseqs etc. #8

KasperSkytte · 2024-06-18T08:51:43Z

Can the fastx_getseqs be included too? Seems it has been removed compared to version 11:
Unknown command-line option -fastx_getseqs

rcedgar · 2024-06-18T12:39:56Z

As noted in README.md:

Compared to earlier versions, functionality which is sufficiently covered by other open-source projects has been removed. In particular, there is no support for OTU table manipulation or diversity analysis which is well supported by other tools such as QIIME and DADA2. The goal here is to simplify the package as much as reasonably possible to encourage collaborators to join the open-source project.

Binaries for usearch versions 5 through 11 are provided at https://github.com/rcedgar/usearch_old_binaries/, licensed under CC0-1.0 (public domain). There are no plans to provide source code for the older versions.

If you find the obsoleted commands useful, then you can use the older binaries.
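For simple cases, such as pulling a subset of sequences out of a FASTA file by label, a short script can also stand in for a removed command. The following is a minimal sketch only, not the usearch implementation of fastx_getseqs: the function names `read_fasta` and `get_seqs` are hypothetical, only FASTA input is handled, and matching on the full header line after `>` is an assumption.

```python
def read_fasta(lines):
    """Yield (label, sequence) pairs from an iterable of FASTA lines."""
    label, chunks = None, []
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith(">"):
            # A new record starts: emit the previous one, if any.
            if label is not None:
                yield label, "".join(chunks)
            label, chunks = line[1:], []
        elif label is not None:
            # Sequence data may be wrapped over several lines.
            chunks.append(line.strip())
    if label is not None:
        yield label, "".join(chunks)


def get_seqs(lines, wanted_labels):
    """Return the records whose full label is in wanted_labels (assumed exact match)."""
    wanted = set(wanted_labels)
    return [(label, seq) for label, seq in read_fasta(lines) if label in wanted]


# Hypothetical in-memory input; a real run would pass an open file handle instead.
fasta_lines = [">Otu1\n", "ACGT\n", "AC\n", ">Otu2\n", "GGTT\n"]
for label, seq in get_seqs(fasta_lines, ["Otu1"]):
    print(f">{label}\n{seq}")
```

The older binaries remain the reference for exact behaviour, for example with FASTQ input.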

KasperSkytte · 2024-06-18T12:44:05Z

Alright. I would have expected all the fastx_* commands to be treated the same way, since some of them are still included. But anyway, I'll use an earlier version then, thanks.

rcedgar · 2024-06-18T12:47:31Z

Let's leave this one open so that people can see it, this is already the second time a similar issue has been opened.

hjarnek · 2024-09-08T14:08:04Z

Hi @rcedgar! What about otutab_xtalk? I have not seen another software with similar functionality of the UNCROSS2 algorithm. Or do you not recommend using it anymore?

rcedgar · 2024-09-08T14:31:12Z

I think cross-talk is too hard to measure accurately, and you cannot filter it without losing too many low-abundance species. I don't use it myself, but it's all a judgement call, it's impossible to account for all the errors and biases in amplicon sequencing so it's totally up to the user what they feel comfortable with. If you think it is useful, you can use one of the older binaries, they are all licensed in the public domain now.

hjarnek · 2024-09-08T15:01:51Z

> I think cross-talk is too hard to measure accurately, and you cannot filter it without losing too many low-abundance species. I don't use it myself, but it's all a judgement call, it's impossible to account for all the errors and biases in amplicon sequencing so it's totally up to the user what they feel comfortable with. If you think it is useful, you can use one of the older binaries, they are all licensed in the public domain now.

Ok, but many people just use a static cutoff value on read abundances to filter out cross-talk. I work with metazoan metabarcoding, and because of multicellularity I think compositional analysis is very rough. I'm happy with just presence/absence, basically. I understand that when you do real microbial compositional analysis, whether there are a handful or no reads of certain OTUs doesn't matter that much, so skipping cross-talk filtering altogether isn't a big deal. But for presence/absence, it matters a lot more. Surely, using UNCROSS2 with lenient parameters must be better than just applying a static read cutoff and discarding all abundances below, say, 10?
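For concreteness, the static cutoff described above can be sketched as follows. This is an illustrative sketch, not part of usearch: the table layout (a dict of OTU to per-sample read counts), the function names, and the default of 10 reads are all assumptions chosen to mirror the example in the text.

```python
def apply_static_cutoff(otu_table, min_reads=10):
    """Set every count below min_reads to zero (hypothetical helper)."""
    return {
        otu: {
            sample: (count if count >= min_reads else 0)
            for sample, count in counts.items()
        }
        for otu, counts in otu_table.items()
    }


def to_presence_absence(otu_table):
    """Convert read counts to True/False presence calls."""
    return {
        otu: {sample: count > 0 for sample, count in counts.items()}
        for otu, counts in otu_table.items()
    }


# Made-up counts for two OTUs in two samples.
table = {
    "Otu1": {"S1": 1200, "S2": 4},
    "Otu2": {"S1": 9, "S2": 10},
}
filtered = apply_static_cutoff(table, min_reads=10)
print(to_presence_absence(filtered))
```

Note that the same threshold is applied to every OTU and sample regardless of sequencing depth or of how abundant the OTU is elsewhere in the run, which is the limitation being raised here.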

And speaking of which... I can't find any documentation on how to adjust the parameters of otutab_xtalk in USEARCH11, although you talk a bit about the effects of adjusting them in your paper. Am I missing something?

rcedgar · 2024-09-08T15:09:27Z

Setting an abundance cutoff makes a trade-off between FPs and FNs where in general you cannot get a meaningful estimate of the FP rate or the FN rate. Perhaps you could estimate rates from mock community control samples with a strongly skewed abundance curve, but including control samples seems not to be a widely accepted practice, and it's unclear to me how dependent cross-talk rates are on the index sequences plus abundance biases such as operon count, primer differences and so on. So I don't see a principled way to set parameters or make a recommendation, it seems to me it is up to the user (plus PIs, referees, editors, funders etc.) to figure out what seems reasonable to them.
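The trade-off can be made concrete with a mock community sample, where the true composition is known. The sketch below is illustrative only: the function name, the OTU names, and the counts are made up, and it assumes every OTU outside the expected set is a false positive (cross-talk or another artifact), which a real analysis would need to justify.

```python
def mock_error_counts(observed, expected, cutoff):
    """Count false positives and false negatives in one mock sample.

    observed: dict of OTU -> read count in the mock sample
    expected: set of OTUs known to be in the mock community
    cutoff:   minimum read count for an OTU to be called present
    """
    called = {otu for otu, reads in observed.items() if reads >= cutoff}
    false_positives = len(called - expected)   # called, but not in the mock
    false_negatives = len(expected - called)   # in the mock, but filtered out
    return false_positives, false_negatives


# Made-up mock sample with a strongly skewed abundance curve.
observed = {"Otu1": 5000, "Otu2": 40, "Otu3": 3, "Otu4": 12, "Otu5": 2}
expected = {"Otu1", "Otu2", "Otu3"}

for cutoff in (1, 10, 50):
    fp, fn = mock_error_counts(observed, expected, cutoff)
    print(f"cutoff={cutoff}: FP={fp}, FN={fn}")
```

Raising the cutoff removes spurious OTUs but also removes genuine low-abundance members, and the counts obtained from one mock sample need not transfer to real samples with different index sequences or abundance biases.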

hjarnek · 2024-09-08T15:44:16Z

> [...] but including control samples seems not to be a widely accepted practice

Yeah, unfortunately not. And I don't have any control over what happens in the wet lab to generate the data I work with. So given there are no mock samples, using UNCROSS2 still seems better than applying a strict read abundance cutoff, right? Is it possible to adjust the parameters of otutab_xtalk in USEARCH11, like s, Nmin and fmax, to make it more lenient? Maybe if the source code was included here, it could be a starting point for people to tweak according to their needs? I just want to get away from the static abundance cutoffs.

KasperSkytte changed the title ~~fastx_getseqs~~ include fastx_getseqs Jun 18, 2024

rcedgar changed the title ~~include fastx_getseqs~~ Reinstate obsoleted commands e.g. calc_distmx, fastx_getseqs etc. Jun 18, 2024

KasperSkytte closed this as completed Jun 18, 2024

rcedgar reopened this Jun 18, 2024
