Unexpected dim order behavior #9718

Open
openSourcerer9000 opened this issue Nov 5, 2024 · 7 comments

Comments

@openSourcerer9000

openSourcerer9000 commented Nov 5, 2024

Edit: see below for updated request.

Is your feature request related to a problem?

The rest of the SciPy ecosystem requires NumPy arrays for everything, and dimension order is the only organization a NumPy array has. Moving back and forth between xarray and NumPy causes a lot of problems simply because dimensions are displayed in alphabetical order rather than in their actual order.
image

Describe the solution you'd like

No response

Describe alternatives you've considered

No response

Additional context

No response

@openSourcerer9000
Author

Never mind, it shows the order correctly after the data var.

@openSourcerer9000
Author

It's still tripping me up that a Dataset does not have a dim order. ds.transpose doesn't seem to actually do anything. I think the improvement would be to have ds.transpose actually change the order of dims for each data var to match the order imposed.
image
image

@dcherian
Contributor

It does actually do that. to_dataframe lets you specify an order using the dim_order kwarg: https://docs.xarray.dev/en/stable/generated/xarray.DataArray.to_dataframe.html
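
A minimal sketch of the first point, assuming a toy single-variable dataset (the variable name `temp` and the dims `x`/`y` are made up for illustration). It checks that `Dataset.transpose` reorders the dims of the data variable itself, which is what determines the axis order of the NumPy array you get back:

```python
import numpy as np
import xarray as xr

# Hypothetical single-variable dataset with dims stored as ("x", "y").
ds = xr.Dataset({"temp": (("x", "y"), np.arange(6).reshape(2, 3))})

# transpose returns a new Dataset; the original is left untouched.
ds_t = ds.transpose("y", "x")

print(ds["temp"].dims, ds["temp"].shape)      # dims and shape before
print(ds_t["temp"].dims, ds_t["temp"].shape)  # dims and shape after

# The NumPy array follows the variable's own dim order.
arr = ds_t["temp"].values
```

Inspecting `ds_t["temp"].dims` rather than the Dataset-level summary is the reliable way to see which axis is which before handing `arr` to NumPy or SciPy.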

openSourcerer9000 changed the title from "display dimensions in correct order in __repr__" to "Unexpected dim order behavior" on Nov 13, 2024

@openSourcerer9000
Author

openSourcerer9000 commented Nov 13, 2024

Isn't the dim order already specified? The behavior seems strange to me. It may be ambiguous with many data vars, but with a single data var it should be pretty clear. I think we often use a single var dataset over dataarray to merge variables or to avoid seeing "xarray_dataarray_variable" names pop up after serializing.

@dcherian
Contributor

dcherian commented Nov 13, 2024

Datasets do not, and will not, enforce consistency of dimension ordering among DataArrays.

So where it does matter, like in to_dataframe, we are forcing you to be explicit and write out what dimension order you want for that function. We can't just pick the dim order of the first variable because not all variables have the same dimensions.
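
As a rough illustration of why there is no obvious default, here is a sketch with a hypothetical dataset whose variables disagree on dim order (the names `a`, `b`, `c` are invented for the example). Passing `dim_order` makes the choice explicit:

```python
import numpy as np
import xarray as xr

# Hypothetical dataset: "a" is stored as (x, y), "b" as (y, x), "c" only has x.
ds = xr.Dataset(
    {
        "a": (("x", "y"), np.zeros((2, 3))),
        "b": (("y", "x"), np.ones((3, 2))),
        "c": (("x",), np.arange(2.0)),
    }
)

# No single variable defines "the" order, so spell it out for the conversion.
df = ds.to_dataframe(dim_order=["y", "x"])
print(df.index.names)
print(df.shape)
```

Every variable is broadcast against the requested dims, so the resulting frame has one row per (y, x) pair and one column per data variable.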

@keewis
Collaborator

keewis commented Nov 13, 2024

> We can't just pick the dim order of the first variable because not all variables have the same dimensions.

... and thus Dataset.to_dataframe uses ds.sizes as the default dimension order, which is not affected by Dataset.transpose.
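
A small sketch of that behavior, again on a made-up single-variable dataset (`temp`, `x`, `y` are illustrative names only). It compares the Dataset-level `sizes` order before and after `transpose`, then converts with and without an explicit `dim_order`:

```python
import numpy as np
import xarray as xr

ds = xr.Dataset({"temp": (("x", "y"), np.arange(6).reshape(2, 3))})
ds_t = ds.transpose("y", "x")

# The variable's dims change, but the dataset-level mapping keeps its order.
print(ds_t["temp"].dims)
print(list(ds.sizes), list(ds_t.sizes))

# Default conversion: index levels follow the dataset-level order.
df_default = ds_t.to_dataframe()
# Explicit conversion: index levels follow what was asked for.
df_explicit = ds_t.to_dataframe(dim_order=["y", "x"])

print(df_default.index.names, df_explicit.index.names)
```

This is why `transpose` can look like a no-op when judged by the `to_dataframe` output: the data variable is reordered, while the default index order comes from `ds.sizes`.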

> I think we often use a single var dataset over dataarray to merge variables or to avoid seeing "xarray_dataarray_variable" names pop up after serializing.

You can assign a name to an unnamed DataArray, which will be used by to_dataframe:

arr.rename("variable")
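
A slightly fuller sketch of the same idea, assuming a toy array with dims `("y", "x")` (the data and dim names are made up). Note that `rename` returns a new array rather than modifying `arr` in place, and that, as far as I can tell, `DataArray.to_dataframe` follows the array's own dim order when `dim_order` is not given:

```python
import numpy as np
import xarray as xr

# Unnamed array whose dims are stored as ("y", "x").
arr = xr.DataArray(np.arange(6).reshape(3, 2), dims=("y", "x"))

# rename returns a new, named array; the original stays unnamed.
named = arr.rename("variable")

df = named.to_dataframe()
print(df.columns.tolist())
print(df.index.names)
```

Staying with a named DataArray sidesteps both the placeholder column name and the Dataset-level dimension ordering.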
