We are now getting initial nested-column support in the HATS/LSDB ecosystem. We already have a couple of catalogs (ZTF alerts, SDSS DR7 spectra) with list-columns that represent nested data, which we could pack into a single nested column after reading the data.
Today we can nest these list-columns with code like this one:
This works, but the user experience is not ideal: how would a user know which columns can be packed (here it is done by name prefixes, which is ugly and does not scale), and how would a user save a catalog back in its initial format when calling to_hats?
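As a rough illustration of the name-prefix convention and why it does not scale, here is a minimal plain-Python sketch. It is not the snippet referred to above and uses no hats/lsdb/nested-pandas API; the column names (`lc_mjd`, `sed_flux`, ...) and the helper `group_by_prefix` are hypothetical.

```python
from collections import defaultdict


def group_by_prefix(columns, sep="_"):
    """Guess nested-column groups from column names (hypothetical helper).

    A column "lc_mag" is assumed to belong to a nested column "lc".
    Only prefixes shared by at least two columns are returned.
    """
    groups = defaultdict(list)
    for col in columns:
        prefix, found, field = col.partition(sep)
        if found and field:  # skip names without a separator, e.g. "ra"
            groups[prefix].append(col)
    return {p: cols for p, cols in groups.items() if len(cols) > 1}


# Hypothetical catalog columns: three scalar columns and two groups of list-columns.
columns = ["objectId", "ra", "dec", "lc_mjd", "lc_mag", "lc_magerr", "sed_wave", "sed_flux"]
groups = group_by_prefix(columns)
print(groups)
```

The guess depends entirely on naming: two unrelated scalar columns such as `obs_date` and `obs_site` would be grouped as a nested column `obs`, which is one reason explicit metadata is preferable.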
We can solve these issues with better nested-column support across the ecosystem:
hats: Parse catalog metadata that specifies which list-columns correspond to which nested column, e.g. mag and mjd form lc, while flux and wave form sed.
hats-import: Generate and save nested column metadata
lsdb: read_hats uses nested column metadata to pack list-columns into columns of NestedDtype. It still allows selecting individual "nested" columns, e.g. if "mag" and "magerr" are selected and "mjd" is not, the first two form an "lc" nested column.
lsdb: to_hats splits nested columns into list-columns and creates the appropriate metadata
nested-pandas: Reimplement parquet I/O according to HATS plans lincc-frameworks/nested-pandas#163
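The metadata-driven flow proposed above (pack on read, allow partial column selection, split again on write) can be sketched in plain Python. This is only an illustration under stated assumptions: the metadata layout `NESTED_COLUMNS` and the helpers `plan_packing`, `pack_row` and `split_row` are hypothetical and are not part of hats, hats-import, lsdb or nested-pandas.

```python
# Hypothetical metadata: nested column name -> the list-columns that form it.
NESTED_COLUMNS = {"lc": ["mjd", "mag", "magerr"], "sed": ["wave", "flux"]}


def plan_packing(selected, nested_columns):
    """Decide which selected columns get packed into which nested column."""
    packed, claimed = {}, set()
    for name, fields in nested_columns.items():
        present = [f for f in fields if f in selected]
        if present:  # a partial selection still forms the nested column
            packed[name] = present
            claimed.update(present)
    plain = [c for c in selected if c not in claimed]
    return packed, plain


def pack_row(row, packed):
    """Zip the list-columns of one row into a list of per-element records."""
    claimed = {f for fields in packed.values() for f in fields}
    out = {k: v for k, v in row.items() if k not in claimed}
    for name, fields in packed.items():
        if len({len(row[f]) for f in fields}) > 1:
            raise ValueError(f"list-columns of {name!r} differ in length")
        out[name] = [dict(zip(fields, vals)) for vals in zip(*(row[f] for f in fields))]
    return out


def split_row(row, packed):
    """Inverse of pack_row: restore the original list-columns."""
    out = {k: v for k, v in row.items() if k not in packed}
    for name, fields in packed.items():
        for f in fields:
            out[f] = [record[f] for record in row[name]]
    return out


# "mag" and "magerr" are selected, "mjd" is not: the first two still form "lc".
packed, plain = plan_packing(["ra", "mag", "magerr"], NESTED_COLUMNS)
row = {"ra": 10.5, "mag": [18.1, 18.3], "magerr": [0.1, 0.2]}
nested_row = pack_row(row, packed)
restored = split_row(nested_row, packed)
```

Because the same mapping drives both directions, writing the catalog back reproduces the initial list-column layout without relying on column-name conventions.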