Script to fetch OpenMeteo Data(NWP Forecast and Historical data) #93
Conversation
Hi! Thank you for this work! Overall this is a good start, and thank you for spending the time on this.
For this repo, we ideally want to return a grid of data from OpenMeteo to train/run inference on it, not individual points. I know OpenMeteo's data is more aligned for individual points, but we would ideally extract values for a grid of latitude/longitude pairs that covers the globe, and return the data as an Xarray object. We want it to work similarly to getting data from WeatherBench 2 (see #86) where the data is in a regular grid across the whole world and is returned as an Xarray object. OpenMeteo includes models that are not global, so the output does not need to be a global grid of lat/lon data, but it should still be a grid, rather than a single point at a time, if that makes sense?
I would also recommend moving this under the graph_weather/data folder, so it's in with the rest of the repo and easier to test. Although not necessary for this PR, ideally there would be a single interface for getting weather data from different sources (WeatherBench 2, OpenMeteo, potentially others) that outputs a uniform format for the data, so having this code in that folder helps with that.
Again, thanks for all the work! Obviously happy to answer more questions and work with you on this PR to get it merged and used!
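To make the grid request concrete, here is a minimal sketch of building a regular latitude/longitude grid and flattening it into the list of points to query. The function name `make_latlon_grid`, the bounds, and the 1-degree step are illustrative assumptions, not part of this PR or of the OpenMeteo API.

```python
import numpy as np


def make_latlon_grid(lat_min, lat_max, lon_min, lon_max, step):
    """Return the 1-D latitude and longitude axes plus the flattened grid points.

    Hypothetical helper: the name and arguments are assumptions for illustration.
    """
    # Half a step is added so the upper bound is included despite float rounding.
    lats = np.arange(lat_min, lat_max + step / 2, step)
    lons = np.arange(lon_min, lon_max + step / 2, step)
    # indexing="ij" keeps latitude as the first axis and longitude as the second.
    lat2d, lon2d = np.meshgrid(lats, lons, indexing="ij")
    # One (latitude, longitude) row per grid cell, latitude varying slowest.
    points = np.column_stack([lat2d.ravel(), lon2d.ravel()])
    return lats, lons, points


lats, lons, points = make_latlon_grid(50.0, 52.0, -2.0, 1.0, 1.0)
```

Keeping the 1-D axes alongside the flattened points means the per-point results can later be reshaped to `(len(lats), len(lons))` without guessing the ordering.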
from retry_requests import retry


class WeatherDataFetcher:
Could this be changed to the following, so that the name is more descriptive of the data it is getting?
- class WeatherDataFetcher:
+ class OpenMeteoWeatherDataFetcher:
        retry_session = retry(cache_session, retries=5, backoff_factor=0.2)
        self.openmeteo = openmeteo_requests.Client(session=retry_session)

    def fetch_forecast_data(self, NWP, params):
- def fetch_forecast_data(self, NWP, params):
+ def fetch_forecast_data(self, nwp, params):
We don't want to hard code the NWP that we are using. Ideally, we also want type hints for the inputs and outputs.
Understood, I'll add type hints as per your suggestion.
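As a rough illustration of the suggestion above, the sketch below shows a typed helper that takes the NWP model as an argument instead of hard coding it. The helper name `build_forecast_params` is hypothetical, and the parameter keys (`latitude`, `longitude`, `hourly`, `models`) and the example model name are assumptions about the OpenMeteo request format that should be checked against its documentation.

```python
from typing import Any, Dict, Sequence


def build_forecast_params(
    nwp: str,
    latitudes: Sequence[float],
    longitudes: Sequence[float],
    variables: Sequence[str],
) -> Dict[str, Any]:
    """Assemble request parameters for one NWP model (hypothetical helper)."""
    if len(latitudes) != len(longitudes):
        raise ValueError("latitudes and longitudes must have the same length")
    return {
        "latitude": list(latitudes),
        "longitude": list(longitudes),
        "hourly": list(variables),
        # Assumed key for selecting the model; nothing is hard coded here.
        "models": nwp,
    }


params = build_forecast_params(
    "icon_global", [50.0, 51.0], [0.0, 1.0], ["temperature_2m"]
)
```

With type hints on both inputs and output, the same helper can serve any model name the caller passes in.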
        hourly = response.Hourly()

        # Extract variables
        hourly_variables = {
Ideally, the variables that are extracted are not hardcoded, but can be passed in as arguments.
Sure, I will do that. Could you please provide guidance on which variables should be included in the Xarray Dataset?
Sorry, this reply slipped through. I would default to all available variables and make one of the arguments a list of parameter names. I think there should be a way to get all the available parameters for a model from the API or something?
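One possible shape for this, sketched under assumptions: the variable names become an optional argument, and a default list stands in for "all available" until there is a confirmed way to query that from the API. `extract_hourly_variables` and `DEFAULT_HOURLY_VARIABLES` are hypothetical names, and the sketch assumes the hourly block exposes `Variables(i).ValuesAsNumpy()` in the same order the variables were requested.

```python
from typing import Any, Dict, Optional, Sequence

# Hypothetical placeholder for "all available variables"; not a complete list.
DEFAULT_HOURLY_VARIABLES = (
    "temperature_2m",
    "relative_humidity_2m",
    "wind_speed_10m",
)


def extract_hourly_variables(
    hourly: Any, variables: Optional[Sequence[str]] = None
) -> Dict[str, Any]:
    """Map each requested variable name to its values.

    Assumes the response returns variables in the order they were requested.
    """
    names = list(variables) if variables is not None else list(DEFAULT_HOURLY_VARIABLES)
    return {
        name: hourly.Variables(index).ValuesAsNumpy()
        for index, name in enumerate(names)
    }
```

The same `variables` list would need to be used when building the request, so that the positional lookup stays aligned with the names.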
            hourly_data[variable_name] = variable_values

        # Create a DataFrame from the dictionary
        hourly_dataframe = pd.DataFrame(data=hourly_data)
For this, we want the data to be returned in an Xarray Dataset that has coordinates of latitude, longitude, and time_utc, with the variables as DataArrays in the Dataset.
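A minimal sketch of the requested output format, assuming each variable has already been arranged as an array of shape (time, latitude, longitude). The function name `to_dataset` and the sample values are illustrative only.

```python
import numpy as np
import xarray as xr


def to_dataset(values, times, lats, lons):
    """Wrap gridded arrays in a Dataset with time_utc, latitude, longitude coords.

    `values` maps a variable name to an array of shape (time, latitude, longitude).
    """
    dims = ("time_utc", "latitude", "longitude")
    return xr.Dataset(
        data_vars={name: (dims, np.asarray(array)) for name, array in values.items()},
        coords={"time_utc": times, "latitude": lats, "longitude": lons},
    )


times = np.array(["2024-01-01T00", "2024-01-01T01"], dtype="datetime64[ns]")
lats = np.array([50.0, 51.0])
lons = np.array([0.0, 1.0, 2.0])
ds = to_dataset({"temperature_2m": np.zeros((2, 2, 3))}, times, lats, lons)
```

Each entry in `data_vars` becomes a DataArray that shares the three coordinates, which is what allows selection such as `ds.sel(latitude=50.0)` across all variables at once.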
        print(f"Timezone difference to GMT+0 {response.UtcOffsetSeconds()} s")


def main():
This would be great in the tests folder, as a pytest test! So then we can automatically run this on all code changes.
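As one way this could look, here is a sketch of a pytest-style test that runs offline by using a fake hourly block instead of calling the API. The helper `hourly_time_axis` and the `FakeHourly` class are hypothetical, and the sketch assumes the hourly block exposes `Time()`, `TimeEnd()`, and `Interval()` in seconds.

```python
import numpy as np


def hourly_time_axis(hourly):
    """Build the UTC time axis from the block's start, end, and interval in seconds."""
    seconds = np.arange(
        hourly.Time(), hourly.TimeEnd(), hourly.Interval(), dtype="int64"
    )
    return seconds.astype("datetime64[s]")


class FakeHourly:
    """Offline stand-in for the hourly block of an API response."""

    def Time(self):
        return 0

    def TimeEnd(self):
        return 3 * 3600

    def Interval(self):
        return 3600


def test_hourly_time_axis():
    times = hourly_time_axis(FakeHourly())
    assert len(times) == 3
    assert times[0] == np.datetime64("1970-01-01T00:00:00")
    assert times[-1] == np.datetime64("1970-01-01T02:00:00")
```

Because nothing here touches the network, a test like this can run on every code change without depending on the API being reachable.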
Hi @jacobbieker,
I'm encountering an issue while creating an xarray dataset from the OpenMeteo data. I can successfully fetch data for multiple coordinates, but building the dataset fails with a dimension error, even though the lengths of the dimensions are the same.
I'm planning to add an argument for NWP (Numerical Weather Prediction) if we need to specify a particular NWP in the function. What do you think about this approach?
Hmm, this is a bit hard to debug from this, but if you add the latitude and longitude coordinates to each data point, that might then work to reshape into a grid?
For adding an argument to specify the NWP, that is perfect! We want to be able to access all the NWPs from OpenMeteo from this, so that would be ideal.
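A sketch of the reshaping idea, assuming the per-point results are collected as rows that each carry their own time, latitude, and longitude. The column names and the helper `points_to_grid` are illustrative; it relies on pandas' `DataFrame.to_xarray`, which turns a MultiIndex into dimensions.

```python
import pandas as pd


def points_to_grid(frame: pd.DataFrame):
    """Reshape long-format point records into a gridded xarray Dataset.

    Each row must have a unique (time_utc, latitude, longitude) combination.
    """
    indexed = frame.set_index(["time_utc", "latitude", "longitude"])
    return indexed.to_xarray()


times = pd.to_datetime(["2024-01-01 00:00", "2024-01-01 01:00"])
rows = [
    {"time_utc": t, "latitude": lat, "longitude": lon, "temperature_2m": lat + lon}
    for t in times
    for lat in (50.0, 51.0)
    for lon in (0.0, 1.0, 2.0)
]
ds = points_to_grid(pd.DataFrame(rows))
```

Since the grid is built from the coordinates attached to each row rather than from array positions, the order in which points were fetched does not matter.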
Pull Request
Description
This pull request adds functionality to fetch both forecast and historical weather data using the OpenMeteo API. It introduces a new class WeatherDataFetcher to encapsulate the data fetching logic and provides methods to fetch forecast data for specific models and historical data from the OpenMeteo API.
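- WeatherDataFetcher: a new class to handle weather data fetching.
- fetch_forecast_data and fetch_historical_data: methods to fetch forecast data and historical data.
- process_hourly_data: a method for the hourly data extracted from the API response.
- print_location_info: a method for the location information extracted from the API response.
Fixes #90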