Importing data without objectID or object index #274

Open
3 tasks
nevencaplar opened this issue Apr 4, 2024 · 0 comments
Labels
bug Something isn't working

Bug report

When data is converted from CSV to Parquet via pandas and the resulting Parquet files are then imported, the code fails at this line:

https://github.com/astronomy-commons/hipscat-import/blob/62a30a0768e5035c02df19d26037c369d007618f/src/hipscat_import/catalog/map_reduce.py#L152C1-L153C1

This seems to be because the pandas index is duplicated: each file carries the same index values that pandas assigned to its rows. The code then fails when it joins data from different files. The workaround was to write the Parquet files with index=False. I still need to go back and add more detail on exactly how this was done.

However, the import should not fail when the data contains duplicate index values that were created by pandas.
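
As a rough illustration of the duplicated index, here is a minimal pandas-only sketch. It does not reproduce the hipscat-import failure itself. The column names and the `write_without_index` helper are hypothetical, and the sketch assumes each input file was read into a DataFrame with the default pandas index.

```python
import pandas as pd

# Two hypothetical per-file tables; pandas gives each a default index 0..n-1.
part_a = pd.DataFrame({"ra": [10.0, 11.0], "dec": [-5.0, -4.0]})
part_b = pd.DataFrame({"ra": [12.0, 13.0], "dec": [-3.0, -2.0]})

# Concatenating keeps the original labels, so 0 and 1 each appear twice.
combined = pd.concat([part_a, part_b])
print(combined.index.tolist())   # [0, 1, 0, 1]
print(combined.index.is_unique)  # False

# Discarding the per-file labels yields a unique index again.
fixed = pd.concat([part_a, part_b], ignore_index=True)
print(fixed.index.is_unique)     # True


def write_without_index(df: pd.DataFrame, path: str) -> None:
    """Hypothetical helper: write df to Parquet without storing the pandas index."""
    df.to_parquet(path, index=False)
```

Calling `write_without_index(part_a, "part_a.parquet")` corresponds to the index=False workaround mentioned above. It needs a Parquet engine such as pyarrow to be installed.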

Before submitting
Please check the following:

  • I have described the situation in which the bug arose, including what code was executed, information about my environment, and any applicable data others will need to reproduce the problem.
  • I have included available evidence of the unexpected behavior (including error messages, screenshots, and/or plots) as well as a description of what I expected instead.
  • If I have a solution in mind, I have provided an explanation and/or pseudocode and/or task list.
@nevencaplar nevencaplar added the bug Something isn't working label Apr 4, 2024
@nevencaplar nevencaplar moved this to Todo in HATS / LSDB Aug 22, 2024
@nevencaplar nevencaplar removed the status in HATS / LSDB Aug 22, 2024