[BUG] bottleneck gives erroneous standard deviation in Pandas with float32 array. #443

Open
jamespreed opened this issue Feb 20, 2024 · 4 comments

@jamespreed

When bottleneck is installed in an environment with Pandas, it causes Pandas to return an incorrect result for .std on a constant float32 array (it should return 0.0).

To Reproduce
First install pandas. The result is the same when using conda and pip.

conda create -n testenv python=3.11 conda-forge::pandas==2.2.0 -y
conda activate testenv

Running the following code gives the expected result:

import pandas as pd

print(pd.Series([271.46] * 150000, dtype='float32').std())
# prints: 0.0

Now in the same environment, install bottleneck. It is the only additional package installed.

conda install bottleneck -y

Running the same code gives an incorrect result:

import pandas as pd

print(pd.Series([271.46] * 150000, dtype='float32').std())
# prints: 0.229433074593544
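
One plausible mechanism for a nonzero result on constant data, not confirmed anywhere in this thread, is that the sum is accumulated in single precision. Once the running total is large, each added 271.46 is rounded to the coarse float32 grid, so the mean drifts away from the true value. The sketch below emulates float32 rounding with the standard library only; `to_f32` and `naive_std` are illustrative names, and this is not claimed to be bottleneck's actual implementation.

```python
import math
import struct


def to_f32(x):
    """Round a Python float (float64) to the nearest float32 value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def naive_std(values, rounder=lambda x: x):
    """Two-pass sample standard deviation (ddof=1).

    Every intermediate result is passed through `rounder`, so passing
    `to_f32` emulates doing all the arithmetic in single precision.
    """
    n = len(values)
    total = 0.0
    for v in values:
        total = rounder(total + v)
    mean = rounder(total / n)

    sq_sum = 0.0
    for v in values:
        d = rounder(v - mean)
        sq_sum = rounder(sq_sum + rounder(d * d))
    return math.sqrt(rounder(sq_sum / (n - 1)))


# The Series in the report holds float32 values, so start from the float32
# representation of 271.46 in both cases.
data = [to_f32(271.46)] * 150000

std_f32 = naive_std(data, rounder=to_f32)  # all arithmetic rounded to float32
std_f64 = naive_std(data)                  # plain float64 arithmetic

print("float32 accumulation:", std_f32)
print("float64 accumulation:", std_f64)
```

With float64 accumulation the sum of these values is exact, so the mean equals the input value and the result is exactly 0.0. With float32 accumulation the result is clearly nonzero, which is why upcasting to float64 before reducing is a common defence.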

Version info:
Windows 11, Python 3.11, conda 23.5.2
Output from conda list:

# packages in environment at C:\Users\JamesReed\miniconda3\envs\testenv:
#
# Name                    Version                   Build  Channel
blas                      1.0                         mkl
bottleneck                1.3.7           py311hd7041d2_0
bzip2                     1.0.8                he774522_0
ca-certificates           2023.12.12           haa95532_0
intel-openmp              2023.1.0         h59b6b97_46320
libffi                    3.4.4                hd77b12b_0
mkl                       2023.1.0         h6b88ed4_46358
mkl-service               2.4.0           py311h2bbff1b_1
mkl_fft                   1.3.8           py311h2bbff1b_0
mkl_random                1.2.4           py311h59b6b97_0
numpy                     1.26.3          py311hdab7c0b_0
numpy-base                1.26.3          py311hd01c5d8_0
openssl                   3.0.13               h2bbff1b_0
pandas                    2.2.0           py311hf63dbb6_0    conda-forge
pip                       23.3.1          py311haa95532_0
python                    3.11.7               he1021f5_0
python-dateutil           2.8.2              pyhd3eb1b0_0
python-tzdata             2023.3             pyhd3eb1b0_0
python_abi                3.11                    2_cp311    conda-forge
pytz                      2023.3.post1    py311haa95532_0
setuptools                68.2.2          py311haa95532_0
six                       1.16.0             pyhd3eb1b0_1
sqlite                    3.41.2               h2bbff1b_0
tbb                       2021.8.0             h59b6b97_0
tk                        8.6.12               h2bbff1b_0
tzdata                    2023d                h04d1e81_0
ucrt                      10.0.20348.0         haa95532_0
vc                        14.2                 h21ff451_1
vc14_runtime              14.38.33130         h82b7239_18    conda-forge
vs2015_runtime            14.38.33130         hcb4865c_18    conda-forge
wheel                     0.41.2          py311haa95532_0
xz                        5.4.5                h8cc25b3_0
zlib                      1.2.13               h8cc25b3_0

Additional context
I also reported the bug in the Pandas GitHub repo: pandas-dev/pandas#57505
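
One way to check whether bottleneck is responsible, without uninstalling it, is to toggle the pandas option `compute.use_bottleneck` and compare the two results. This is a minimal sketch, assuming that option governs the `.std` code path in this setup; the function name `std_with_and_without_bottleneck` is made up for illustration.

```python
import pandas as pd


def std_with_and_without_bottleneck(values, dtype="float32"):
    """Return (std with bottleneck allowed, std with bottleneck disabled)."""
    s = pd.Series(values, dtype=dtype)
    # If bottleneck is not installed, both branches use the pandas/NumPy path.
    with pd.option_context("compute.use_bottleneck", True):
        accelerated = s.std()
    with pd.option_context("compute.use_bottleneck", False):
        plain = s.std()
    return accelerated, plain


accelerated, plain = std_with_and_without_bottleneck([271.46] * 150000)
print("bottleneck allowed: ", accelerated)
print("bottleneck disabled:", plain)
```

If the two numbers differ, the discrepancy comes from the accelerated path. Setting `pd.set_option("compute.use_bottleneck", False)` at startup, or casting the Series to float64 before calling `.std()`, would then be possible stopgaps.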

@rdbisme
Collaborator

rdbisme commented Feb 23, 2024

Would you be able to do a git bisect to see whether this is a regression that was introduced recently, or a bug that has always been there?

@rdbisme
Collaborator

rdbisme commented Feb 23, 2024

This might be related: #164

@rdbisme
Collaborator

rdbisme commented Feb 23, 2024

Also you can check if this fixes your problem: #414

@jamespreed
Author

Would you be able to do a git bisect to see whether this is a regression that was introduced recently, or a bug that has always been there?

I am really sorry, I don't know how to do that :(
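
For anyone unfamiliar with the bisect requested above: `git bisect` runs a binary search over the commit history between a known-good and a known-bad revision. After `git bisect start`, `git bisect bad`, and `git bisect good <some older tag>`, the command `git bisect run python check_std.py` lets git drive the search from the script's exit code: 0 means good, 125 means skip this commit, and any other code from 1 to 127 means bad. Below is a rough sketch of such a script. It assumes the problem reproduces by calling `bn.nanstd` directly (not confirmed in this thread) and that bottleneck is rebuilt at each step, for example with `pip install -e .`. The file name `check_std.py` and the helper names are hypothetical.

```python
def exit_code_for(std_value, tolerance=1e-6):
    """Map the computed std of a constant array to a git-bisect exit code.

    0 marks the commit as good, 1 marks it as bad.
    """
    return 0 if abs(std_value) <= tolerance else 1


def main():
    try:
        import numpy as np
        import bottleneck as bn
    except ImportError:
        # The build or import failed, so ask git bisect to skip this commit.
        return 125
    data = np.full(150000, 271.46, dtype=np.float32)
    # ddof=1 matches the sample standard deviation that pandas' .std() reports.
    return exit_code_for(float(bn.nanstd(data, ddof=1)))


# When saved as check_std.py, end the file with: raise SystemExit(main())
```

When the run finishes, git prints the first commit it considers bad, and `git bisect reset` returns the working tree to where it started. If every tested commit comes out bad, the behaviour is probably not a recent regression.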
