Support negation of entire categories #109

Open
ppavlidis opened this issue Jul 17, 2024 · 3 comments
Comments

@ppavlidis

ppavlidis commented Jul 17, 2024

I'm not sure how we would present this to users. We've discussed handling negation better before, and it has now come up as a use case:

Find all experiments that don't involve a disease, i.e. that lack a disease annotation.

or

Find data sets that have control samples for a Disease factor

We usually hide annotations like "Reference role" from the browser and I think that is still a good call.

That kind of thing might make sense if you want "normal" samples / data sets, by some definition of "normal".

@arteymix
Member

arteymix commented Aug 8, 2024

We need to handle negative subqueries in the backend for this kind of feature to work.

Right now, for a filter that involves a subquery (e.g. characteristics.category = disease), we do something like:

select * from datasets where id in (select id from datasets join datasets.characteristics  c where c.category = 'disease')

Using a negative clause (i.e. characteristics.category != disease) would result in:

select * from datasets where id in (select id from datasets join datasets.characteristics  c where c.category <> 'disease')

But that's not what is expected: it returns datasets that have at least one characteristic that does not have the URI in question. Instead, we should negate the membership test rather than the predicate:

select * from datasets where id not in (select id from datasets join datasets.characteristics  c where c.category = 'disease')
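To make the difference between the three queries concrete, here is a minimal sketch using an in-memory SQLite database. The schema (a `characteristics` table with a `dataset_id` column) and the sample rows are assumptions made up for illustration; the real backend joins through `datasets.characteristics` and its schema is not shown in this thread.

```python
import sqlite3

# Hypothetical schema and data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table datasets (id integer primary key, name text);
    create table characteristics (dataset_id integer not null, category text not null);
    insert into datasets values
        (1, 'disease only'), (2, 'disease and tissue'),
        (3, 'tissue only'), (4, 'no annotations');
    insert into characteristics values
        (1, 'disease'), (2, 'disease'), (2, 'tissue'), (3, 'tissue');
""")

def ids(sql):
    """Run a query and return the matching dataset ids, sorted."""
    return sorted(row[0] for row in conn.execute(sql))

# Current behavior: at least one characteristic is a disease.
positive = ids("select id from datasets where id in "
               "(select dataset_id from characteristics where category = 'disease')")

# Naive negation: at least one characteristic is NOT a disease.
naive = ids("select id from datasets where id in "
            "(select dataset_id from characteristics where category <> 'disease')")

# Intended negation: no characteristic is a disease.
negated = ids("select id from datasets where id not in "
              "(select dataset_id from characteristics where category = 'disease')")

print(positive, naive, negated)
```

With this data, the naive negation still returns dataset 2 (which has a disease annotation) and misses dataset 4 (which has no annotations at all), while the `not in` form returns exactly the datasets lacking a disease annotation. One caveat of `not in` in SQL: if the subquery can yield NULL, the outer query matches nothing, which is why the sketch declares `dataset_id` as `not null`.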

For the syntax, I'm thinking of something like:

none(characteristics.category = disease): no characteristic matches the predicate
any(characteristics.category = disease): at least one characteristic matches the predicate (current behavior)
all(characteristics.category = disease): all characteristics match the predicate
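The intended semantics of the three quantifiers can be sketched in plain Python. This is only a model of the proposal, not the backend implementation: the function names and the dict-based representation of a characteristic are hypothetical.

```python
from typing import Callable, Iterable

# A characteristic is modelled as a dict such as {"category": "disease"}.
Predicate = Callable[[dict], bool]

def any_match(characteristics: Iterable[dict], predicate: Predicate) -> bool:
    """At least one characteristic matches the predicate."""
    return any(predicate(c) for c in characteristics)

def none_match(characteristics: Iterable[dict], predicate: Predicate) -> bool:
    """No characteristic matches the predicate."""
    return not any(predicate(c) for c in characteristics)

def all_match(characteristics: Iterable[dict], predicate: Predicate) -> bool:
    """Every characteristic matches the predicate."""
    return all(predicate(c) for c in characteristics)

def is_disease(c: dict) -> bool:
    return c["category"] == "disease"

mixed = [{"category": "disease"}, {"category": "tissue"}]
unannotated: list = []

mixed_result = (any_match(mixed, is_disease),
                none_match(mixed, is_disease),
                all_match(mixed, is_disease))
empty_result = (any_match(unannotated, is_disease),
                none_match(unannotated, is_disease),
                all_match(unannotated, is_disease))
```

In this model a dataset with no characteristics satisfies both `none` and `all` (the latter vacuously). Whether the backend treats unannotated datasets the same way is not stated in this thread.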

@arteymix
Member

arteymix commented Aug 8, 2024

I would start with only any and none. The all quantifier can be obtained by negating the predicate and wrapping it in none, but we do not support negating the in operator. This will require PavlidisLab/Gemma#1192.
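The equivalence relied on here is that "all characteristics match p" is the same as "no characteristic matches not p". A small sketch that checks this on a few sample category lists (the helper names and the samples are hypothetical):

```python
def none_match(items, predicate):
    """True when no item satisfies the predicate."""
    return not any(predicate(x) for x in items)

def all_via_none(items, predicate):
    """Express all(p) as none(not p)."""
    return none_match(items, lambda x: not predicate(x))

def is_disease(category):
    return category == "disease"

samples = [[], ["disease"], ["disease", "tissue"], ["tissue"]]

via_none = [all_via_none(s, is_disease) for s in samples]
direct = [all(is_disease(c) for c in s) for s in samples]
```

Both lists agree on every sample, which is why supporting negated predicates is enough to derive all from none.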

@arteymix
Member

arteymix commented Aug 9, 2024

The backend now supports quantifiers with the syntax above. It's now possible to implement the requested feature.


2 participants