-
-
Notifications
You must be signed in to change notification settings - Fork 54
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
nnjai support #129
base: main
Are you sure you want to change the base?
nnjai support #129
Conversation
bb78c36
to
c70d3c1
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for this! This is a good start, I think there needs to be a few different changes before this can be merged, but a good start!
graph_weather/data/nnjai_wrapp.py
Outdated
""" | ||
|
||
import numpy as np | ||
from nnja.io import _check_authentication |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would suggest only checking for the library when importing the package. Then the AMSUDataset
doesn't have to be in a giant if/else.
from nnja.io import _check_authentication | |
try: | |
from nnja.io import _check_authentication | |
from nnja import DataCatalog | |
Except ModuleImportError: | |
print("NNJA-AI library not installed. Please install with `pip install git+https://github.com/brightbandtech/nnja-ai.git`") |
graph_weather/data/nnjai_wrapp.py
Outdated
latitude = row["LAT"] | ||
longitude = row["LON"] | ||
metadata = np.array([row[col] for col in self.metadata_columns], dtype=np.float32) | ||
return time, latitude, longitude, metadata |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would suggest returning this as a dictionary, so its easier to track.
return time, latitude, longitude, metadata | |
return {"timestamp": time, "latitude": latitude, "longitude": longitude, "metadata": metadata} |
a0358a2
to
c70d3c1
Compare
for more information, see https://pre-commit.ci
Here is the output for the suggestions you requested.
|
Hi, could you push the changes to this branch? I don't see any differences in the code? |
for more information, see https://pre-commit.ci
@jacobbieker any updates? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Its getting there! The only thing I would go with changing is to add a unit test in the tests/
directory that checks the output shapes from the dataset to ensure it is what it expects. You basically have the test here, under the main bit, just change it to check for the output shapes and put that part in a unit test, then it can be merged!
graph_weather/data/nnjai_wrapp.py
Outdated
return {key: torch.stack([item[key] for item in batch]) for key in batch[0].keys()} | ||
|
||
|
||
if __name__ == "__main__": |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you add a unit test for this dataset under the tests/
? Basically, have it do this, but instead of printing it out, check for the expected shapes and such.
for more information, see https://pre-commit.ci
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks good to me! Thanks for making the changes.
Thank you for guiding me. If there's anything more I could do for this issue then do let me know. |
Pull Request for issue #128
@jacobbieker
Description
This code provides a data pipeline for handling and processing satellite data using the NNJAI library. It focuses on creating a PyTorch Dataset class (
AMSUDataset
) for efficiently loading and accessing Advanced Microwave Sounding Unit (AMSU) data. Below is a detailed breakdown of its functionality:DataCatalog
using_check_authentication()
.Dataset
andDataLoader
for seamless compatibility with PyTorch-based pipelines.This implementation enhances satellite data processing workflows by providing an efficient and modular solution.
Fixes #128
How Has This Been Tested?
Please describe the tests that you ran to verify your changes. Provide instructions so we can reproduce.
Please also list any relevant details for your test configuration.
Unit Tests:
time
,primary_descriptors
, andadditional_variables
.__getitem__
to ensure data integrity.Integration Tests:
DataLoader
to iterate over the dataset and verified the randomness of shuffled outputs.If your changes affect data processing, have you plotted any changes? i.e. have you done a quick sanity check?
Checklist: