-
Notifications
You must be signed in to change notification settings - Fork 15
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[WIP] Make our estimators compatible with scikit-learn #116
base: master
Are you sure you want to change the base?
Conversation
Currently our |
Codecov Report
@@ Coverage Diff @@
## master #116 +/- ##
==========================================
+ Coverage 57.04% 57.35% +0.30%
==========================================
Files 113 114 +1
Lines 6560 7027 +467
==========================================
+ Hits 3742 4030 +288
- Misses 2818 2997 +179
|
dc1ee9f
to
0377ca7
Compare
I've adapted FETALinear's init. I'll have to do the same for our other cores and estimators and then take care of the other parts of the scikit-learn estimator interface. |
0377ca7
to
9450337
Compare
This is turning out to be a lot more effort than expected. To make it somewhat manageable and reviewable I'll proceed as follows:
|
The first part is ready for review: #117 |
9450337
to
031dbc0
Compare
031dbc0
to
21bf823
Compare
Rebased, now that #118 is merged. |
45267a0
to
352cba0
Compare
I have continued work on top of #119 to avoid merge conflicts. We're getting very close to passing the I already started by removing all default values for the regularizer parameters (82a2869). @kiudee do you think that is the right call for the regularizers as well? I had to patch scikit-learn to get more useful/actionable error messages in the test. I'll try to upstream those improvements once scikit-learn/scikit-learn#17756 is resolved. |
352cba0
to
e7e5b0c
Compare
e7e5b0c
to
ceeca54
Compare
All our estimators are default constructible now and the test passes with scikit-learn/scikit-learn#17936 🎉 That means that we cannot add the test before the PR is merged and a new sklearn release is out though. |
The PR was merged upstream. I'm not sure if it'll make it into 0.23.3 or if we have to wait for 0.24. Either way, there should be a stable version with the necessary patch at some point. |
d214673
to
20a757e
Compare
4c637b5
to
b43cb3e
Compare
Good news and bad news. Good news: After #159, we are now compliant with "no attributes in init" 🎉 Bad news: Basically all other checks are unusable as long as our |
Just to re-state the problem (we talked about this in some other PR/issue, but I can't remember which): According to scikit-learn, It should be possible to flatten each ranking instance into a single feature array to please scikit-learn. We could write functions to transform between the two formats. The question is if that is what we want. |
🥳
I think that (writing converters and having flat format be default) would compromise a bit too much. Our setting is a different one and it would not make sense to press it into that format. It would only buy us compatibility with a few preprocessors (let me know if I forgot an important use case), which for our setting likely do not make sense. Anything important we lose? |
I tend to agree. I will look into writing a wrapper anyway, which will at least enable us to make use of the remaining checks. |
Okay, I have done that. Just for rankers for now, I'll fix the checks there first (of course they don't all pass). I already added some little fixes to this PR. The checks are very slow though, so we'll probably not run them with the regular testsuite. |
0f4ae0d
to
9e0f3f4
Compare
The "default constructible" (i.e. no required __init__ parameters) property is the most basic property of a scikit-learn estimator. It is a prerequisite for all other estimator checks.
Required by the scikit-learn estimator API for easier fit-predict chaining.
This reverts commit 2d35098.
9e0f3f4
to
4901119
Compare
4901119
to
0c7ca20
Compare
Now that we use poetry & nix, I have included the environment with the patched scikit-learn that is necessary to run these tests. A simple |
Description
This is a work-in-progress of fixing #94. See that issue for motivation and context.
Currently the one added test is failing since FETADiscreteChoice does not yet conform to the interface. The commit is mostly just to get the PR started. I will incrementally fix the estimators and add them to the list of tested estimators.
Collecting and testing all of our estimators will be easier after #115 is done.
How Has This Been Tested?
Does this close/impact existing issues?
Fixes #94.
Types of changes
Checklist: