Parse metar from pandas dataframe into another dataframe #3476
Comments
Hi @jgoriasilva, here is what I use to do something similar:
df = metar.parse_metar_file(StringIO('\n'.join(val for val in data.metar)),
                            year=date.year, month=date.month)
Here I am using the datetime module to set a date, and StringIO (from the io module) to turn the joined string into a file-like object that can be passed to the METAR parser from MetPy. The above also assumes the Pandas DataFrame is called data and has a column named metar.
Thanks for your answer @kgoebber. That looks good, but doing it that way I would lose the information in the original data DataFrame, particularly the alignment between the parsed METARs and the rows of the original DataFrame (parse_metar_file and parse_metar_to_dataframe generate an arbitrary index). What I would like to do is process the METAR data from a column of an existing DataFrame and add the columns that parse_metar_to_dataframe generates to that same DataFrame. Maybe I'm overlooking something here, but this is how I'm currently doing it:
The problem is that, doing it that way, I sometimes get a ParseError for the few rows that contain a problematic METAR report, which is an additional problem I just found:
I'm still looking for a solution for this as well.
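One possible way to keep the alignment and survive the bad rows is sketched below. It is only an illustration under stated assumptions: parse_metar_column is a hypothetical helper (not part of MetPy), parser is any callable that returns a one-row DataFrame for one report (for example a functools.partial around parse_metar_to_dataframe with year and month fixed), the index of the original DataFrame is assumed to be unique, and the broad except is a placeholder for whatever error type the parser actually raises.

```python
import pandas as pd


def parse_metar_column(df, column, parser):
    """Parse each report in df[column] and join the result onto df.

    Hypothetical helper: `parser` takes one report string and returns a
    one-row DataFrame. Assumes df.index has no duplicate labels.
    """
    parsed = {}
    failed = []
    for idx, report in df[column].items():
        try:
            # Keep the single parsed row, keyed by the original index label.
            parsed[idx] = parser(report).iloc[0]
        except Exception:  # assumption: narrow this to the parser's real error type
            failed.append(idx)

    # Labels missing from `parsed` (the failed rows) become NaN after reindexing.
    result = pd.DataFrame.from_dict(parsed, orient="index").reindex(df.index)
    # rsuffix avoids a clash if a parsed column shares a name with an original one.
    return df.join(result, rsuffix="_parsed"), failed
```

The second return value lists the index labels of the reports that could not be parsed, so they can be inspected separately instead of aborting the whole run.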
It's exceedingly frustrating that there's not a way to get Pandas to just expand the tuple into multiple columns, because otherwise:
from functools import partial
from metpy.io.metar import parse_metar
import pandas as pd
obs = ['KADS 122347Z 17013G20KT 13SM SCT039 23/14 A2986',
'KBCT 122353Z 12008KT 10SM FEW032 22/16 A3009',
'KCWA 122347Z 28010KT 10SM CLR 16/M02 A2969',
'KOUN 122345Z 19014KT 10SM CLR 24/09 A2975']
s = pd.Series(obs)
parser = partial(parse_metar, year=2024, month=4)
s.apply(parser)
gives:
What data are you working with that's giving you a column with reports in it?
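As a workaround for the expansion step, a Series of namedtuples can be rebuilt into a DataFrame by hand. The sketch below assumes the parser returns a namedtuple per report, as the remark about expanding the tuple suggests. expand_namedtuples is a hypothetical helper name, and Obs is a made-up stand-in so the example runs without MetPy.

```python
from collections import namedtuple

import pandas as pd


def expand_namedtuples(parsed):
    """Turn a Series of namedtuples into a DataFrame that keeps the Series index."""
    # pandas takes the column names from the namedtuple fields.
    return pd.DataFrame(parsed.tolist(), index=parsed.index)


# Hypothetical stand-in for the namedtuple a METAR parser might return.
Obs = namedtuple("Obs", ["station_id", "wind_speed"])
parsed = pd.Series([Obs("KADS", 13), Obs("KOUN", 14)], index=["a", "b"])
expanded = expand_namedtuples(parsed)
```

With the snippet above, the equivalent call would be expand_namedtuples(s.apply(parser)). Because the index is carried over, the result can be joined back onto the DataFrame the reports came from.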
What should we add?
I have a DataFrame in which one column holds METAR reports as strings. Currently, parse_metar_to_dataframe only accepts a single string as input, so applying it to that column (with pandas.Series.apply, for example) generates one DataFrame per string, resulting in a Series of DataFrames.
It would be much easier to use the parser if it could accept a pandas Series and return a single DataFrame with the same columns it produces now, but where each row is one parsed METAR, instead of a separate one-row DataFrame for each parsed string.
I might be missing something with the usage, but as I understand there is no way to do it without creating unnecessary overhead with the
Reference
It would be fairly simple to implement this. I can do it on my side and open a pull request adding a new function that uses the existing parse_metar (from metpy.io) but accepts a pandas Series of str (or a list of str) and returns a single pandas DataFrame.