Faceting/Grouping Only Allows for a single selection per group #112

Open
redcape opened this issue Jan 19, 2021 · 3 comments
@redcape

redcape commented Jan 19, 2021

Action: Select a Source, Journal, Published, etc. facet value on the site.

What Happens: All other options go away except for the selected facet in its group and the other facets recalculate their sizes and values.

What's Expected: All facets recalculate their sizes and values. The values in the selected facet's group may or may not change, but the selected value does not change. The alternative values in the selected facet's group are not zero - they show the count as if no selection had been made in that group only.

Example where the behavior is as expected, on a different site running what is probably a Lucene-based stack - this may be hard-coded.
LinkedIn's search is a better example, where there is actual value discovery (current/past companies).

For example, if the corpus is:
Doc  Source   Journal  Publish
1    WHO      BMJ      2020
2    WHO      BMJ      2019
3    WHO      Lancet   2020
4    Medline  Lancet   2020

With no selections:
Total: 4 - Docs 1,2,3,4
Source: WHO(3) Medline(1)
Journal: BMJ(2) Lancet(2)
Publish: 2020(3) 2019(1)

With Source WHO selection, expected:
Total: 3 - Docs 1,2,3
Source: WHO(3) Medline(1) <- notice Medline is 1 even though WHO is selected. This is the only group where the WHO filter is not applied.
Journal: BMJ(2) Lancet(1) <- Lancet is 1 because the WHO filter is applied here
Publish: 2020(2) 2019(1) <- 2020 goes to 2 because the WHO filter is applied here

With Source WHO selection and Lancet selection:
Total: 1 - Doc 3
Source: WHO(1, selected) Medline(1, unselected) <- Lancet filter applied, but not WHO at this group
Journal: Lancet(1, selected) BMJ(2, unselected) <- WHO filter applied, but not Lancet filter at this group
Publish: 2020(1) <- WHO and Lancet filters applied at this group - there are no other publish years
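
The expected counts above can be reproduced with a small in-memory sketch. This is plain Python, not Vespa; the helper names (`matches`, `facet_counts`, `hits`) are made up for illustration. The only rule it encodes is the one described above: each group is counted with every filter applied except its own.

```python
from collections import Counter

# Toy corpus copied from the example above.
DOCS = [
    {"id": 1, "source": "WHO", "journal": "BMJ", "publish": 2020},
    {"id": 2, "source": "WHO", "journal": "BMJ", "publish": 2019},
    {"id": 3, "source": "WHO", "journal": "Lancet", "publish": 2020},
    {"id": 4, "source": "Medline", "journal": "Lancet", "publish": 2020},
]

FIELDS = ("source", "journal", "publish")


def matches(doc, selections, skip=None):
    """True if doc satisfies every selected group, ignoring the `skip` group."""
    return all(
        doc[field] in values
        for field, values in selections.items()
        if field != skip and values
    )


def facet_counts(docs, selections, fields=FIELDS):
    """Count each group's values with all filters applied except that group's own."""
    return {
        field: dict(Counter(d[field] for d in docs if matches(d, selections, skip=field)))
        for field in fields
    }


def hits(docs, selections):
    """Ids of the documents matching every selection (the result list)."""
    return [d["id"] for d in docs if matches(d, selections)]


if __name__ == "__main__":
    selected = {"source": {"WHO"}, "journal": {"Lancet"}}
    print(hits(DOCS, selected))
    print(facet_counts(DOCS, selected))
```

Note that this makes one pass over the corpus per facet group, which is the in-memory equivalent of one counting query per group.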

I think this would be a better experience overall - removing all the unselected options doesn't make much sense for the user and the UI should encourage further discovery of available query expansions. I'm mostly looking to find out if Vespa offers this type of user experience / facet discovery without a separate query for each facet group - and if so - how?

@jobergum

Thank you for the detailed request,

There is a lot going on here: there is the front-end implementation, and there are the core Vespa grouping features. But yes, we could allow the user to select multiple values per displayed facet field (e.g. source:who, source:medline). It would make the UI a little harder to navigate, though, as we would have to wait for the check boxes and then offer some kind of submit functionality. If one is quick, one can still select multiple values per field, as shown below.

image

Generally, grouping is performed over the documents matching the query and filters. If a filter specifies +source:who, the grouping only runs over those documents where source contains who. Supporting discovery/expansion and showing counts outside of the current filter requires another query without that filtering constraint.
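
As a rough sketch of what "another query without the filtering constraint" could look like on the client side, the snippet below builds one counting request per facet group, leaving out that group's own filter. It only builds request bodies and does not talk to Vespa. The function names are hypothetical, the field names (`source`, `journal`) are taken from this thread, and it assumes those fields can be matched with `contains` and that the user's query string is sent alongside for `userQuery()`.

```python
def quote(value):
    """Wrap a value in double quotes for YQL, escaping backslashes and quotes."""
    return '"' + str(value).replace("\\", "\\\\").replace('"', '\\"') + '"'


def filter_clause(selections, skip=None):
    """AND together the selected groups (OR within a group), leaving out `skip`."""
    parts = []
    for field, values in selections.items():
        if field == skip or not values:
            continue
        ors = " or ".join(f"{field} contains {quote(v)}" for v in values)
        parts.append(f"({ors})")
    return " and ".join(parts)


def counting_queries(selections, fields):
    """One hits=0 grouping request per facet group, without that group's filter."""
    queries = {}
    for field in fields:
        where = "userQuery()"
        clause = filter_clause(selections, skip=field)
        if clause:
            where += " and " + clause
        yql = (
            f"select * from sources * where {where} | "
            f"all(group({field}) max(10) order(-count()) each(output(count())));"
        )
        queries[field] = {"hits": 0, "yql": yql}
    return queries


if __name__ == "__main__":
    selected = {"source": ["WHO"], "journal": ["Lancet"]}
    for field, body in counting_queries(selected, ["source", "journal"]).items():
        print(field, body["yql"])
```

Groups without any selection see the same filters as the result query, so their grouping could ride along with that query instead of getting a request of their own.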

@jobergum

And you made me spend the entire morning looking at dogs 🐶

image

Faceted search is a fascinating problem with many nuances, from UI elements to backend implementations.

@redcape
Author

redcape commented Jan 19, 2021

Glad I could brighten your morning with dogs!

I agree that the checkbox-then-submit approach would be awkward, and in the end it would still produce the same output: the group filtered down to just the selected values, with no opportunity for query expansion.

Multiple queries make sense: one, maybe two, for counting (discovery, then count), then the actual result query. I'm hoping there is a way to avoid one separate query per facet group, though.

Can a query like this be written for the UI for counting? What would the actual YQL look like? Hopefully the intent of the where clauses inside all(...) comes through. I used contains, which may not be right; I only started diving deeper into Vespa over the last few days, so I'm not very familiar with the query syntax yet...

Counting query:

{
'hits': 0,
'yql': 'select id,source,journal from sources * where userQuery() and (source contains "WHO" OR journal  contains "Lancet") | 
all(
  where(journal contains "Lancet") all(group(source) order(-count()) each(output(count())))
  where(source contains "WHO") all(group(journal) max(10) order(-count()) each(output(count())))
  where(source contains "WHO" AND journal contains "Lancet") all(group(time.year(timestamp)) max(10) order(-max(time.year(timestamp))) each(output(count())) as(year))
);'
}

Results query:

{
'hits': 10,
'yql': 'select id,source,journal from sources * where userQuery() and (source contains "WHO" AND journal contains "PloS One");'
}
