diff --git a/README.md b/README.md
index 67ad920..a955c2f 100644
--- a/README.md
+++ b/README.md
@@ -1,6 +1,6 @@
 # noaastn
 
-![](https://github.com/UBC-MDS/noaastn/workflows/build/badge.svg) [![codecov](https://codecov.io/gh/UBC-MDS/noaastn/branch/main/graph/badge.svg)](https://codecov.io/gh/UBC-MDS/noaastn) ![Release](https://github.com/UBC-MDS/noaastn/workflows/Release/badge.svg) [![Documentation Status](https://readthedocs.org/projects/noaastn/badge/?version=latest)](https://noaastn.readthedocs.io/en/latest/?badge=latest)
+![Buildbadge](https://github.com/UBC-MDS/noaastn/workflows/build/badge.svg) [![codecov](https://codecov.io/gh/UBC-MDS/noaastn/branch/main/graph/badge.svg)](https://codecov.io/gh/UBC-MDS/noaastn) [![Deploy](https://github.com/UBC-MDS/noaastn/actions/workflows/deploy.yml/badge.svg)](https://github.com/UBC-MDS/noaastn/actions/workflows/deploy.yml) [![Documentation Status](https://readthedocs.org/projects/noaastn/badge/?version=latest)](https://noaastn.readthedocs.io/en/latest/?badge=latest)
 
 The US National Oceanic and Atmospheric Administration (NOAA) collects and provides access to weather data from land-based weather stations within the US and around the world ([Land-Based Station Data](https://www.ncdc.noaa.gov/data-access/land-based-station-data)). One method for accessing these data is through a publically accessible FTP site. This package allows users to easily download data from a given station for a given year, extract several key weather parameters from the raw data files, and visualize the variation in these parameters over time. The weather parameters that are extracted with this package are:
 
@@ -40,10 +40,47 @@ The US National Oceanic and Atmospheric Administration (NOAA) collects and provi
 
 There are few packages in the python ecosystem like [noaa](https://pypi.org/project/noaa/), [noaa-coops](https://pypi.org/project/noaa-coops/), [noaa-sdk](https://pypi.org/project/noaa-sdk/) that do analysis related to NOAA weather station data. These tools are more focused on using the NOAA's [API service](https://www.ncei.noaa.gov/support/access-data-service-api-user-documentation) to obtain forecast information. They do not provide an interface to obtain historical weather data from the NOAA's FTP site, process and visualize key weather parameters like this package do.
 
+## Usage
+
+Typical usage will begin with downloading the list of available weather stations in the country of interest using the `get_stations_info()` function. A dataframe is returned which can be reviewed to find a suitable station in the area of interest. Alternatively, the NOAA provides a [graphical interface](https://gis.ncdc.noaa.gov/maps/ncei/cdo/hourly) for exploring the available weather stations.
+
+```
+>>> from noaastn import noaastn
+>>> noaastn.get_stations_info(country = "US")
+```
+
+![Tabular output from get_stations_info function](img/get_stations_info.png)
+
+After selecting a weather station number, the `get_weather_data()` function can be used to download various weather parameters for the station number and year of interest. The following usage example downloads weather data from station number "911650-22536" for the year 2020 and saves the data to a variable called 'weather_data'. 'weather_data' will be a data frame containing a time series of the following parameters for the station and year of interest:
+
+- air temperature (degrees Celsius)
+- atmospheric pressure (hectopascals)
+- wind speed (m/s)
+- wind direction (angular degrees)
+
+```
+>>> weather_data = noaastn.get_weather_data("911650-22536", 2020)
+>>> print(weather_data)
+```
+
+![Tabular output from get_weather_data function](img/get_weather_data.png)
+
+The function `plot_weather_data()` can be used to visualize a time series of any of the available weather parameters either on a mean daily or mean monthly basis. The function returns an Altair chart object which can be saved or displayed in any environment which can render Altair objects.
+
+```
+>>> noaastn.plot_weather_data(weather_data, col_name="air_temp", time_basis="monthly")
+```
+
+![Altair chart with time series of air temperature](img/plot_weather_data.png)
+
+## Documentation
+
+Documentation for this package can be found on [Read the Docs](https://noaastn.readthedocs.io/en/latest/)
+
 ## Contributors
 
 We welcome and recognize all contributions. You can see a list of current contributors in the [contributors tab](https://github.com/UBC-MDS/noaastn/graphs/contributors).
 
-### Credits
+## Credits
 
 This package was created with Cookiecutter and the UBC-MDS/cookiecutter-ubc-mds project template, modified from the [pyOpenSci/cookiecutter-pyopensci](https://github.com/pyOpenSci/cookiecutter-pyopensci) project template and the [audreyr/cookiecutter-pypackage](https://github.com/audreyr/cookiecutter-pypackage).
diff --git a/img/get_stations_info.png b/img/get_stations_info.png
new file mode 100644
index 0000000..2a0c9eb
Binary files /dev/null and b/img/get_stations_info.png differ
diff --git a/img/get_weather_data.png b/img/get_weather_data.png
new file mode 100644
index 0000000..e8227db
Binary files /dev/null and b/img/get_weather_data.png differ
diff --git a/img/plot_weather_data.png b/img/plot_weather_data.png
new file mode 100644
index 0000000..2b9666a
Binary files /dev/null and b/img/plot_weather_data.png differ
diff --git a/noaastn/noaastn.py b/noaastn/noaastn.py
index dd6aae6..ef0c7b4 100644
--- a/noaastn/noaastn.py
+++ b/noaastn/noaastn.py
@@ -103,9 +103,8 @@ def get_weather_data(station_number, year):
     Loads and cleans weather data for a given NOAA station ID and year.
     Returns a dataframe containing a time series of air temperature (degrees
     Celsius), atmospheric pressure (hectopascals), wind speed (m/s), and wind
-    direction (angular degrees). Also saves a copy of the raw data file
-    downloaded from the NOAA FTP server at
-    ftp://ftp.ncei.noaa.gov/pub/data/noaa/.
+    direction (angular degrees). The raw data file is downloaded from the NOAA
+    FTP server at ftp://ftp.ncei.noaa.gov/pub/data/noaa/.
 
     Parameters
     ----------
@@ -145,23 +144,17 @@ def get_weather_data(station_number, year):
     # data from NOAA FTP site.
     filename = station_number + "-" + str(year) + ".gz"
 
-    noaa_ftp = FTP("ftp.ncei.noaa.gov")
-    noaa_ftp.login()  # Log in (no user name or password required)
-    noaa_ftp.cwd("pub/data/noaa/" + str(year) + "/")
-
     compressed_data = io.BytesIO()
 
     try:
+        noaa_ftp = FTP("ftp.ncei.noaa.gov")
+        noaa_ftp.login()  # Log in (no user name or password required)
+        noaa_ftp.cwd("pub/data/noaa/" + str(year) + "/")
         noaa_ftp.retrbinary("RETR " + filename, compressed_data.write)
     except error_perm as e_mess:
-        if re.search("(No such file or directory)", str(e_mess)):
-            print(
-                "Data not available for that station number / year combination"
-            )
-        else:
-            print("Error generated from NOAA FTP site: \n", e_mess)
+        print("Error generated from NOAA FTP site: \n", e_mess)
         noaa_ftp.quit()
-        return
+        return 'FTP Error'
 
     noaa_ftp.quit()
 
diff --git a/tests/test_get_weather_data.py b/tests/test_get_weather_data.py
index 572995c..9812b91 100644
--- a/tests/test_get_weather_data.py
+++ b/tests/test_get_weather_data.py
@@ -44,3 +44,10 @@ def test_station_number_coding():
     assert (
         weather_df.stn.unique()[0] == station_number
     ), "Station number should match entries values in stn column"
+
+
+def test_ftp_error_handling():
+    assert (
+        noaastn.get_weather_data("999999-99999", 1750) == "FTP Error"
+    ), """Entry of invalid station/year combination should return string
+    'FTP error'."""
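
Note on the new return value: after this change `get_weather_data()` returns the string `'FTP Error'` when the FTP request fails, and a data frame otherwise. The sketch below shows one way calling code might guard against that sentinel. It is an illustration only; `fetch_or_none` and `FTP_ERROR` are hypothetical names that are not part of the package, and the fetch function is passed in so the example runs without network access.

```python
from typing import Any, Callable, Optional

# Sentinel string returned by get_weather_data() in this diff (assumed unchanged).
FTP_ERROR = "FTP Error"


def fetch_or_none(
    fetch: Callable[[str, int], Any], station_number: str, year: int
) -> Optional[Any]:
    """Call `fetch` and map the 'FTP Error' sentinel to None.

    `fetch` is expected to behave like noaastn.get_weather_data after this
    change: a data frame on success, the string 'FTP Error' on failure.
    """
    result = fetch(station_number, year)
    # Check the type first: comparing a data frame to a string with `==`
    # gives an element-wise result rather than a single bool.
    if isinstance(result, str) and result == FTP_ERROR:
        return None
    return result


if __name__ == "__main__":
    # Hypothetical stand-in for noaastn.get_weather_data so the sketch runs offline.
    def fake_fetch(station_number: str, year: int) -> Any:
        if year == 2020:
            return {"stn": station_number, "year": year}
        return FTP_ERROR

    print(fetch_or_none(fake_fetch, "999999-99999", 1750))
    print(fetch_or_none(fake_fetch, "911650-22536", 2020))
```

In real use, `noaastn.get_weather_data` would be passed as `fetch`. The `isinstance` check is the design point: it keeps the guard safe when the result is a data frame.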