BUG: Series.clip raises with pyarrow dtype backend #7415

Open
2 of 3 tasks
FBruzzesi opened this issue Jan 2, 2025 · 0 comments
Labels
bug 🦗 Something isn't working Triage 🩹 Issues that need triage

Modin version checks

  • I have checked that this issue has not already been reported.

  • I have confirmed this bug exists on the latest released version of Modin.

  • I have confirmed this bug exists on the main branch of Modin. (In order to do this you can follow this guide.)

Reproducible Example

import pandas as pd
import modin.pandas as mpd

df_pd = pd.DataFrame({"x": range(5), "lower": [2, 3, 2, 2, 2], "upper": [4, 4, 3, 3, 2]})

def clip_func(dframe):
    return dframe["x"].clip(dframe["lower"], dframe["upper"], axis=0)
 
clip_func(df_pd)
0    2
1    3
2    2
3    3
4    2
Name: x, dtype: int64

clip_func(df_pd.convert_dtypes(dtype_backend="pyarrow"))
0    2
1    3
2    2
3    3
4    2
Name: x, dtype: int64[pyarrow]

clip_func(mpd.DataFrame(df_pd))
UserWarning: __array_ufunc__ is not currently supported by PandasOnDask, defaulting to pandas implementation.
UserWarning: __array_ufunc__ is not currently supported by PandasOnDask, defaulting to pandas implementation.
0    2
1    3
2    2
3    3
4    2
Name: x, dtype: int64

clip_func(mpd.DataFrame(df_pd.convert_dtypes(dtype_backend="pyarrow")))
ERROR - Compute Failed
Key:       _deploy_dask_func-4f65c833-ac3a-4b99-8f53-ed8209c7528d
State:     executing
Task:  <Task '_deploy_dask_func-4f65c833-ac3a-4b99-8f53-ed8209c7528d' _deploy_dask_func(..., ...)>
Exception: 'NotImplementedError("<class \'pandas.core.arrays.arrow.array.ArrowExtensionArray\'> does not support reshape as backed by a 1D pyarrow.ChunkedArray.")'

Issue Description

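Series.clip raises a NotImplementedError in Modin when the lower and upper bounds are themselves Series and the data uses dtype_backend="pyarrow". The same call works in Modin with the default backend, and in pandas with both backends.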

Expected Behavior

The result should match the other dtype backend and plain pandas: a Series with values 2, 3, 2, 3, 2 and dtype int64[pyarrow].

Error Logs

Out[13]: 2025-01-02 20:31:56,915 - distributed.worker - ERROR - Compute Failed
Key:       _deploy_dask_func-4f65c833-ac3a-4b99-8f53-ed8209c7528d
State:     executing
Task:  <Task '_deploy_dask_func-4f65c833-ac3a-4b99-8f53-ed8209c7528d' _deploy_dask_func(..., ...)>
Exception: 'NotImplementedError("<class \'pandas.core.arrays.arrow.array.ArrowExtensionArray\'> does not support reshape as backed by a 1D pyarrow.ChunkedArray.")'
Traceback:
  File "/home/fbruzzesi/open-source/narwhals/.venv/lib/python3.12/site-packages/modin/core/execution/dask/implementations/pandas_on_dask/partitioning/virtual_partition.py", line 294, in _deploy_dask_func
    result = deployer(axis, f_to_deploy, f_args, f_kwargs, *args, **kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/fbruzzesi/open-source/narwhals/.venv/lib/python3.12/site-packages/modin/logging/logger_decorator.py", line 144, in run_and_log
    return obj(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^
  File "/home/fbruzzesi/open-source/narwhals/.venv/lib/python3.12/site-packages/modin/core/dataframe/pandas/partitioning/axis_partition.py", line 457, in deploy_axis_func
    result = func(dataframe, *f_args, **f_kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/fbruzzesi/open-source/narwhals/.venv/lib/python3.12/site-packages/modin/core/storage_formats/pandas/query_compiler.py", line 2365, in <lambda>
    axis, lambda df: df.clip(**kwargs), shape_preserved=True
                     ^^^^^^^^^^^^^^^^^
  File "/home/fbruzzesi/open-source/narwhals/.venv/lib/python3.12/site-packages/pandas/core/generic.py", line 9102, in clip
    result = result._clip_with_one_bound(
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/fbruzzesi/open-source/narwhals/.venv/lib/python3.12/site-packages/pandas/core/generic.py", line 8882, in _clip_with_one_bound
    return self.where(subset, threshold, axis=axis, inplace=inplace)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/fbruzzesi/open-source/narwhals/.venv/lib/python3.12/site-packages/pandas/core/generic.py", line 10984, in where
    return self._where(cond, other, inplace, axis, level)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/fbruzzesi/open-source/narwhals/.venv/lib/python3.12/site-packages/pandas/core/generic.py", line 10709, in _where
    other = np.reshape(other, (-1, 1))
            ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/fbruzzesi/open-source/narwhals/.venv/lib/python3.12/site-packages/numpy/_core/fromnumeric.py", line 324, in reshape
    return _wrapfunc(a, 'reshape', shape, order=order)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/fbruzzesi/open-source/narwhals/.venv/lib/python3.12/site-packages/numpy/_core/fromnumeric.py", line 57, in _wrapfunc
    return bound(*args, **kwds)
           ^^^^^^^^^^^^^^^^^^^^
  File "/home/fbruzzesi/open-source/narwhals/.venv/lib/python3.12/site-packages/pandas/core/arrays/arrow/array.py", line 1171, in reshape
    raise NotImplementedError(

Installed Versions

commit : 0691c5cf90477d3503834d983f69350f250a6ff7
python : 3.12.2
python-bits : 64
OS : Linux
OS-release : 5.15.133.1-microsoft-standard-WSL2
Version : #1 SMP Thu Oct 5 21:02:42 UTC 2023
machine : x86_64
processor : x86_64
byteorder : little
LC_ALL : None
LANG : C.UTF-8
LOCALE : C.UTF-8

pandas : 2.2.3
numpy : 2.2.1
pytz : 2024.2
dateutil : 2.9.0.post0
pip : 24.3.1
IPython : 8.29.0
bs4 : 4.12.3
fsspec : 2024.12.0
hypothesis : 6.122.1
jinja2 : 3.1.4
matplotlib : 3.9.3
pyarrow : 18.1.0
pytest : 8.3.4
scipy : 1.14.1
xlrd : 2.0.1
tzdata : 2024.2

@FBruzzesi FBruzzesi added bug 🦗 Something isn't working Triage 🩹 Issues that need triage labels Jan 2, 2025