Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add fuzzy-date function to case search #35065

Merged
merged 8 commits into from
Sep 12, 2024
Merged

Add fuzzy-date function to case search #35065

merged 8 commits into from
Sep 12, 2024

Conversation

MartinRiese
Copy link
Contributor

@MartinRiese MartinRiese commented Aug 29, 2024

Product Description

Add a function to fuzzy search dates.

Technical Summary

https://dimagi.atlassian.net/browse/USH-4861

Feature Flag

Case search needs to be enabled

Safety Assurance

Safety story

Automated test coverage

Added new unit test

QA Plan

Update Potential Duplicate Clients case list in staging and perform search.

Rollback instructions

  • This PR can be reverted after deploy with no further considerations

Labels & Review

  • Risk label is set correctly
  • The set of people pinged as reviewers is appropriate for the level of risk of the change

@MartinRiese
Copy link
Contributor Author

@esoergel No tests or documentation yet. But I was curious about your thoughts on this? I was think about adding a third parameter to let you specify a range in years for how far the permutations could be away from the original date.

Copy link
Contributor

@esoergel esoergel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Implementation looks great, and the profiling results so far sound quite promising

MartinRiese and others added 3 commits September 3, 2024 13:41
Use should query with two clauses: 1) terms for all permutations and
2) term for just the orignal date will give cases with an exact date
match a higher score.
@@ -233,7 +233,7 @@ def multiplex_to_adapter(domain):
return None


def case_property_query(case_property_name, value, fuzzy=False, multivalue_mode=None):
def case_property_query(case_property_name, value, fuzzy=False, multivalue_mode=None, boost_first=False):
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@esoergel I am feeling a bit iffy about adding yet another flag here. I see two alternatives:

  1. just create a separate function case_property_query_boost_first and call it from fuzzy_date
  2. use _base_property_query directly in fuzzy_date probably renaming it without the underscore in the process.
    I am curious, if you have a preference?

Should be formatted correctly beforehand though
"2042-03-21",
"2042-12-30"
], boost_first=True)
)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Know you're still working on this. Have you thought about adding a test that validates the output of the query is as expected?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@AddisonDunn is there a test the run ES queries against test data already? I am having trouble finding any.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The tests in test_case_search_filters test that a query in a case search function returns the cases expected.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@AddisonDunn pushed the test

{'_id': 'c1', 'dob': date(2020, 3, 1)},
{'_id': 'c2', 'dob': date(2020, 1, 3)},
{'_id': 'c3', 'dob': date(2002, 3, 1)},
{'_id': 'c4', 'dob': date(2020, 3, 4)},
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice

self.assertItemsEqual(
query.get_ids(),
output
)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think other tests put the xpath query as part of a call to get_case_search_query under the query. Kind of silly though IMO I think it basically does the same thing

@@ -435,6 +436,21 @@ def test_fuzzy_case_property_query(self):
['c3']
)

def test_fuzzy_date(self):
print("############## test_fuzzy_date")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you want this print statement?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do not.

@MartinRiese MartinRiese merged commit 81ea359 into master Sep 12, 2024
13 checks passed
@MartinRiese MartinRiese deleted the riese/fuzzy_date branch September 12, 2024 15:21
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants