Reshaping data associations using "nest" #696
trantor
started this conversation in
Show and tell
Replies: 1 comment
-
Nice!! :)
-
Hello everyone.
Taking inspiration from a colleague's recent request, here is a situation where `mlr` comes in handy when data gets sent to you in a different format than the one you'd like.
So, on to an actual case.
When dealing with e-mail systems, there may be situations where a single address, say that of a group/distribution list, is associated with multiple other addresses, e.g. the members of said group/list.
Usually, when dealing with such associations, it's handy to have a "de-normalized" form of sorts: two columns, where the same value for a field can appear multiple times and each row is a unique combination of the two fields.
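As a rough illustration of that shape, here's a small Python sketch with made-up data. The `LIST` column name and the example.com addresses are placeholders, not taken from the original spreadsheet. Each row is one (list, member) pair, so a question like "who is on this list?" becomes a simple filter.

```python
import csv
import io

# Hypothetical long ("de-normalized") form: one row per (list, member) pair.
# The LIST column name and all addresses are invented for illustration.
LONG_FORM = """\
LIST,ADDRESS
staff@example.com,alice@example.com
staff@example.com,bob@example.com
admins@example.com,alice@example.com
"""

rows = list(csv.DictReader(io.StringIO(LONG_FORM)))

# The same LIST value repeats, but every (LIST, ADDRESS) pair is unique.
pairs = [(r["LIST"], r["ADDRESS"]) for r in rows]
assert len(pairs) == len(set(pairs))

# Looking up the members of one list is a one-line filter.
staff_members = [r["ADDRESS"] for r in rows if r["LIST"] == "staff@example.com"]
print(staff_members)
```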
However, when your average human being is asked to fill in information like this in a spreadsheet, it might end up with one row per list and the member addresses spread across a varying number of `ADDRESS X` columns.
Now that's rather less convenient to process in an automated fashion, especially when the number of columns varies wildly from one record to the next.
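A sketch of what such a spreadsheet export could look like, again with invented data (the `LIST` column name and the addresses are placeholders). It assumes short rows are padded with empty cells, which is common for spreadsheet exports but is a guess about the original file.

```python
import csv
import io

# Hypothetical wide form as exported from a spreadsheet: one row per list,
# members spread over "ADDRESS 1", "ADDRESS 2", ... columns.
# The LIST column name and all addresses are invented for illustration.
WIDE_FORM = """\
LIST,ADDRESS 1,ADDRESS 2,ADDRESS 3
staff@example.com,alice@example.com,bob@example.com,carol@example.com
admins@example.com,alice@example.com,,
"""

rows = list(csv.DictReader(io.StringIO(WIDE_FORM)))

# Count how many address cells are actually filled in on each row.
filled = {
    r["LIST"]: sum(1 for k, v in r.items() if k.startswith("ADDRESS") and v)
    for r in rows
}
print(filled)
```

Anything consuming this layout has to know how many `ADDRESS` columns exist and skip the blanks, which is exactly the inconvenience described above.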
And that's where `mlr` comes in, converting it back to a 2-column form using the `nest` verb.
We need to manually edit the header first, renaming our `ADDRESS X` columns to `ADDRESS_1`, `ADDRESS_2`, `ADDRESS_3`, etc., so that `nest` recognizes that they should be merged back into a single field.
Once that's done, we first implode the `ADDRESS_*` columns back into one `ADDRESS` column, then explode the values again, this time across records rather than across fields. Finally, we filter out records with an empty `ADDRESS` value.
There you go. Hopefully something similar will be helpful to someone. 👋
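For readers who want to see what the implode, explode and filter steps described above do to each record, here's the same reshape spelled out in plain Python rather than Miller. It's only a sketch of the logic, not the `mlr` command from the post: the `LIST` column name, the sample addresses, the helper function names and the `;` separator used for the intermediate imploded value are all assumptions.

```python
import csv
import io

# Invented sample with the header already renamed to ADDRESS_1, ADDRESS_2, ...
WIDE_FORM = """\
LIST,ADDRESS_1,ADDRESS_2,ADDRESS_3
staff@example.com,alice@example.com,bob@example.com,carol@example.com
admins@example.com,alice@example.com,,
"""

SEP = ";"  # assumed separator for the intermediate, imploded value


def implode_across_fields(record, name, sep=SEP):
    """Merge name_1, name_2, ... into a single `name` field joined by sep."""
    prefix = name + "_"
    values = [v for k, v in record.items() if k.startswith(prefix)]
    rest = {k: v for k, v in record.items() if not k.startswith(prefix)}
    rest[name] = sep.join(values)
    return rest


def explode_across_records(record, name, sep=SEP):
    """Yield one copy of the record per sep-separated value of `name`."""
    for value in record[name].split(sep):
        yield {**record, name: value}


long_rows = []
for rec in csv.DictReader(io.StringIO(WIDE_FORM)):
    imploded = implode_across_fields(rec, "ADDRESS")
    for out in explode_across_records(imploded, "ADDRESS"):
        if out["ADDRESS"] != "":  # drop the padding left by short rows
            long_rows.append(out)

for row in long_rows:
    print(row["LIST"], row["ADDRESS"], sep=",")
```

The intermediate imploded value for the second sample row is `alice@example.com;;`, which is why the final filter on empty `ADDRESS` values is needed.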