Add ignore_nan argument to concordance_cc() #43
@dkounadis your implementation proposed in #41 is indeed faster and returns the same results. What I have not completely understood yet is why you could simply skip the ...
I used the expression from Wikipedia: When the correlation coefficient is computed on a N-length data set (i.e, ...
Cool, thanks for that information. This should be documentation enough if we need to figure this out again.
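For later reference, the sample form of the concordance correlation coefficient for N paired values, as commonly stated, is 2 * s_xy / (s_x^2 + s_y^2 + (mean_x - mean_y)^2), with variances and covariance normalized by 1/N. The following is only a minimal sketch of that expression, not the code from #41; the function name `ccc_sketch` and the NaN return for a zero denominator are assumptions made for illustration.

```python
import numpy as np


def ccc_sketch(truth, prediction):
    """Concordance correlation coefficient from sample moments (illustrative sketch)."""
    truth = np.asarray(truth, dtype=float)
    prediction = np.asarray(prediction, dtype=float)
    mean_t = truth.mean()
    mean_p = prediction.mean()
    # Variances and covariance normalized by 1/N (not 1/(N-1))
    var_t = np.mean((truth - mean_t) ** 2)
    var_p = np.mean((prediction - mean_p) ** 2)
    cov = np.mean((truth - mean_t) * (prediction - mean_p))
    denominator = var_t + var_p + (mean_t - mean_p) ** 2
    if denominator == 0:
        # Assumption: both inputs constant and identical, so the value is undefined
        return np.nan
    return 2 * cov / denominator
```

Identical inputs such as `[0, 1, 2, 3]` give a value of 1, and a reversed copy gives -1, which is a quick sanity check for any faster implementation.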
So after this PR, we should probably have another one that adds ...
Don't know how urgent it is. I think we have to revisit the handling of ...
Co-authored-by: Johannes Wagner <[email protected]>
Seems a bit strange to me to only support it with one particular function.
Feel free to implement it ;) I just had the impression that nobody was asking for it, whereas for ...
There is also still a corresponding issue with #14, so it's a known fact. If you like we can extend that issue or open another one for ...
What I would propose is to simply add the following to all our functions:

    if ignore_nan:
        mask = ~(np.isnan(truth) + np.isnan(prediction))
        truth = truth[mask]
        prediction = prediction[mask]
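As a self-contained illustration of the proposed masking, here is a small sketch. The helper name `drop_nan_pairs` is hypothetical and not part of audmetric; it uses `|` for the element-wise OR, which is equivalent to `+` on boolean arrays, and `np.nan` because the `np.NaN` alias was removed in NumPy 2.0.

```python
import numpy as np


def drop_nan_pairs(truth, prediction):
    """Remove every position where truth or prediction is NaN (hypothetical helper)."""
    truth = np.asarray(truth, dtype=float)
    prediction = np.asarray(prediction, dtype=float)
    # Keep a pair only if neither of its two values is NaN
    mask = ~(np.isnan(truth) | np.isnan(prediction))
    return truth[mask], prediction[mask]


# NaN at different positions in the two inputs: both positions are dropped
truth, prediction = drop_nan_pairs([0, np.nan, 2, 3], [0, 1, np.nan, 3])
```

Masking both arrays with the same mask keeps the remaining samples aligned, which any pairwise metric relies on.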
tests/test_concordance_cc.py (outdated)

            False,
        ),
        (
            [0, 1, 2, 3, 4, 5, 6, np.NaN],
I think in addition we should also add cases where np.NaN is in either truth or prediction, and in both but at different locations.
I updated the tests and added an additional test for different np.NaN locations. The tests can now also specify the expected truth and prediction values after the mask is applied, which avoids using the same masking code in both the test and the implementation.
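One way such test cases could look is sketched below. This is not the actual content of tests/test_concordance_cc.py: the function `ccc` is a hypothetical stand-in for the function under test, and the case values are made up for illustration. Each case lists the expected truth and prediction after masking by hand, so the test does not repeat the masking logic.

```python
import numpy as np


def ccc(truth, prediction, ignore_nan=False):
    # Hypothetical stand-in for the function under test, not audmetric's code
    truth = np.asarray(truth, dtype=float)
    prediction = np.asarray(prediction, dtype=float)
    if ignore_nan:
        mask = ~(np.isnan(truth) | np.isnan(prediction))
        truth, prediction = truth[mask], prediction[mask]
    mean_t, mean_p = truth.mean(), prediction.mean()
    cov = np.mean((truth - mean_t) * (prediction - mean_p))
    denominator = truth.var() + prediction.var() + (mean_t - mean_p) ** 2
    return 2 * cov / denominator


# (truth, prediction, expected truth after mask, expected prediction after mask)
CASES = [
    ([0, 1, 2, np.nan], [0, 1, 3, 4], [0, 1, 2], [0, 1, 3]),            # NaN in truth only
    ([0, 1, 2, 3], [0, np.nan, 2, 4], [0, 2, 3], [0, 2, 4]),            # NaN in prediction only
    ([np.nan, 1, 2, 3, 4], [0, 1, np.nan, 3, 5], [1, 3, 4], [1, 3, 5]),  # both, different locations
]


def check_case(truth, prediction, masked_truth, masked_prediction):
    """Compare ignore_nan=True on raw input with the plain result on hand-masked input."""
    result = ccc(truth, prediction, ignore_nan=True)
    expected = ccc(masked_truth, masked_prediction, ignore_nan=False)
    return bool(np.isclose(result, expected))
```

In a pytest suite, the `CASES` list could be passed to `pytest.mark.parametrize` with `check_case` turned into an assertion.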
tests/test_concordance_cc.py (outdated)

    prediction = np.array(list(prediction))
    truth = np.array(list(truth))

    if len(prediction) < 2:
Do we actually need those special cases where we return np.NaN, or can we simplify the function now?
Sorry, forgot to remove this. We don't need this and it is now removed.
Closes #41
Relates to #14
Adds an ignore_nan=False argument to audmetric.concordance_cc(). If True, all samples that contain NaN as part of truth or prediction are ignored.
It further uses the implementation proposed in #41 to speed up the calculation of CCC compared to the current main branch. Using the code mentioned at the end of this repo we get:
and for ignore_nan=True: