Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Removing link duplicates #187

Merged
merged 6 commits into from
Apr 19, 2022
Merged

Conversation

Martin-Rey
Copy link

Hi,

This a clean attempt at addressing the comments in #182, after rebasing against the main branch following #183 proved unnecessarily messy.

I think the tests cover all cases described in #182. The only remaining question is whether we should attempt to delete non-maximal duplicate links, and if yes, whether the behaviour should be to keep the most recent addition to the database, or the highest weight.

@apontzen
Copy link
Member

pre-commit.ci autofix

@apontzen
Copy link
Member

Thanks - just to understand, this current version will keep multiple links between the same two halos, with the same name, but different weights, is that correct?

This is an edge case that probably nobody will ever see, but still I think that's probably not what most people would expect. I think there is a good argument for dropping all links with the same name linking the same two halos. I am happy with this being done on the basis of keeping the most recent addition rather than the highest weight.

@Martin-Rey
Copy link
Author

The latest change should now remove links with the same name, between the same halos, but with different weight. I added a test to verify that the latest link addition to the database is retained.

The maximal duplicate (same halos, same weight, same name) might feel like an edge case, but it actually occurs naturally when re-importing properties multiple times in an existing database (the current importer duplicates properties and links without checking if they already exist #162). And this was my base use case for developing this feature :)

Let me know if you have any last comments/suggestions

@apontzen apontzen merged commit 32a0b8a into pynbody:master Apr 19, 2022
@Martin-Rey Martin-Rey deleted the new-attempt-duplicates branch April 19, 2022 15:32
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants