-
Notifications
You must be signed in to change notification settings - Fork 12
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Begin using curies.Converter
in more places
#397
Conversation
This PR automatically instantiates a `curies.Converter` for a given SPARQL configuration and re-implements the compress/expand operations using the `curies` package
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks awesome!! THANK YOU!
@hrshdhgd I will leave it to you to deal with the PR - I left 1/2 comments here and there to check, but nothing critical.
I ran this PR locally and there are 4 errors as of now. First one is a CLI error which will be fixed automatically once the following are fixed. Three pytest errors occurring as of now:
From what I can tell the first two errors are thrown by The third error is because the prefix |
I agree but it seems to me that only GitHub environments seem to be problematic. Locally, it seems to work fine. Plus Chris insists all our projects following a standard architecture, hence my hands are sort of tied 🙂 . |
@hrshdhgd thanks for following up on all of this. can you take care of looking into these tests that have inconsistent data? Great to see that we are implicitly now checking data integrity using |
tests/test_parsers.py
Outdated
# f"{self.obographs_file} has the wrong number of mappings.", | ||
# ) | ||
|
||
@unittest.skip(reason="Not sure what is broken in this graph") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry but no; what exactly is going on here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looking into it
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've figured out what this was testing and written a more detailed explanation of what it was doing before and why we don't need it anymore.
The other option is we can make the OBO parser much less lenient, and then we will be throwing exceptions left and right (not recommended)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
so yes, I have now added a detailed explanation of why this test is not needed. We should keep it skipped, or even better, remove it.
There are issues in mapping-commons/sssom-py#397 because the base schema does not have a valid prefix map - `dc` and `dcterms` are both used as prefixes for the same URI prefix. This happens because LinkML makes some inference about what needs to be there - there are some assorted uses throughout the schema definition with both DC and DCTERMS. This PR updates the schema to only use DCTERMS and puts explicit entries for DCTERMS in the prefix map to address this. It then regenerates the whole project.
This was possible thanks to mapping-commons/sssom#302
@hrshdhgd it appears this PR is passing, even with the Poetry changes removed. |
Yes , I had a quick solution that worked! It won't sustain for future PRs if we're using the latest versions of poetry. |
I merged #399 to main. Now if you merge the main branch down to this one, all should be well. I'll let you do it since you're actively working on this branch. This will eliminate the conflict. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have convinced myself that deleting that broken obographs test was the right thing to do! Thanks for your work on this, I am happy now; the one comment is purely an FYI, the behaviour is desireable.
@@ -117,21 +118,10 @@ def test_parse_obographs(self): | |||
write_table(msdf, file) | |||
self.assertEqual( | |||
len(msdf.df), | |||
9881, | |||
8099, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@cthoyt I reinstated this test. Analysing the warnings, I noticed that the 1800 less parsed entities (which is good, since they are not in the prefix map) are due to biopragmatics/bioregistry#917.
I think this is entirely independent of this PR, so I will approve this, this is just FYI.
@hrshdhgd ready for your review and merge, I am happy with it! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks much cleaner than before! Thanks @cthoyt !
Part of #363
This PR does the following:
curies
that has the strict compress and expand functionscuries
functionalitycurie_from_uri
to usecuries
(will make a follow-up PR that replaces this completely)