keep original columns for clean_data #1
Comments
This is in accordance with @dirkschumacher's suggestion in #1. Also, I found the `comment()` function, which seems really useful for this task :)
IIRC, @thibautjombart is a bit opposed to this concept, since the user can already do:
old_data <- the_data
the_data <- linelist::clean_data(the_data)
Keeping the original and modified data together helps spot errors, especially if you use a magic function like "clean_data" that might behave differently in future releases of linelist. What about adding a parameter that defaults to including the original columns?
I think that's a good idea! Plus, there could be a function that uses diffObj to compare the cleaned and original columns.
I agree, as an additional argument. I would add the columns so that the original and 'cleaned' variables are next to each other:
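To make the side-by-side idea concrete, here is a minimal base R sketch. It is not linelist's API: `keep_original_columns()` and `toy_clean()` are hypothetical names, `toy_clean()` only stands in for the real cleaning step, and the sketch assumes the cleaned data frame has the same number of rows and columns, in the same order, as the original.

```r
# Hypothetical helper (not part of linelist): put each cleaned column
# directly before its original, which gets a suffix added to its name.
keep_original_columns <- function(original, cleaned, suffix = "_original") {
  stopifnot(
    is.data.frame(original), is.data.frame(cleaned),
    ncol(original) == ncol(cleaned), nrow(original) == nrow(cleaned)
  )
  out <- list()
  for (i in seq_along(cleaned)) {
    clean_name <- names(cleaned)[i]
    out[[clean_name]] <- cleaned[[i]]                  # cleaned values
    out[[paste0(clean_name, suffix)]] <- original[[i]] # untouched values
  }
  data.frame(out, check.names = FALSE, stringsAsFactors = FALSE)
}

# Toy stand-in for the real cleaning step (assumption, for illustration only):
# lower-cases names and character values and trims whitespace.
toy_clean <- function(x) {
  names(x) <- gsub("[^a-z0-9]+", "_", tolower(names(x)))
  x[] <- lapply(x, function(col) {
    if (is.character(col)) tolower(trimws(col)) else col
  })
  x
}

raw <- data.frame(
  `Date of Onset` = c("2018-01-05", "2018-02-10"),
  Gender = c(" Male", "FEMALE "),
  check.names = FALSE, stringsAsFactors = FALSE
)

cleaned <- toy_clean(raw)
both <- keep_original_columns(raw, cleaned)
names(both)
```

With this layout the cleaned and original values of a variable sit in adjacent columns, so a wrong transformation is easy to spot by eye. Whether the suffix goes on the original or on the cleaned column is a design choice left open here.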
Merge branch regex-varnames into master
Having a magical function that does everything is great, but I can imagine that keeping the original values of the columns helps to trust the transformation.
E.g. the values before the transformations could be kept in the resulting data frame with an added suffix, such as `date_of_onset_original` or something.