-
Notifications
You must be signed in to change notification settings - Fork 124
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add CSV Import File Validations #1211
Conversation
I think it's normal to have requirements on headers for CSV uploads. So, we could make a note on the upload page that there must be an |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What do you think about moving the validation logic to its own single responsibility? We can use these validations for file type and email header for CSV imports from all third party services. Maybe we can keep it in app/services/organizations/importers/concerns
?
That does make sense if we have a separate import service for each third party service. I'm not sure if that is ultimately what we will need to do. There is a good chance this service could handle many different third party csv's without being overly complex. And we wouldn't need to repeat a lot of code. I'm hopeful it will be as simple as handling the "email" column and potentially the "Timestamp" column as those two are what the service depends on. The rest of the columns just get added as question/answer. I can't imagine any other things that would cause an issues from one form type to another. If we can keep one clean and simple service, I don't see the need to move the validation logic. If you think it is more organized to move to a concern even with only one import service, I'm happy to do it. |
Ah yes…if we only need one service for all third party forms it makes sense to keep as is! |
|
||
first_row = CSV.foreach(@file.to_path).first | ||
raise EmailColumnError if first_row.nil? | ||
raise EmailColumnError unless first_row.include?("Email") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
shall we add lowercase email
here as well just in case the user doesn't adhere to case requirement?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I'm confused on what you are asking. If the header row doesn't include "Email"(capitalized) the error is raised. The service is currently using "Email". Do you want "email" to work as well?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry yeh that’s right. We may as well allow Email and email as valid because users won’t adhere to case rules half the time.
raise FileTypeError unless @file.content_type == "text/csv" | ||
|
||
first_row = CSV.foreach(@file.to_path).first | ||
raise EmailColumnError if first_row.nil? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I am understanding this correctly, we are raising EmailColumnError
if the first row is empty, right? Should we have a different error class to handle this case? Something like NoDataError
(I am sure there is a better name than that).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes it is checking for an empty file as an edge case (came up during testing). I just reused the same error because technically the file does not have the "Email" column but I can add a more specific error.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice work! I like the structure.
end | ||
Status.new(@errors.empty?, @count, @no_match, @errors) | ||
end | ||
|
||
private | ||
|
||
def validate_file |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the vein of single responsibility, we could have separate validation methods for each validation. Might be something to consider if we add more validations. I am fine either way in this case.
email_headers = ["Email", "email", "Email Address", "email address"] | ||
email_headers.each do |e| | ||
@email_header = e if first_row.include?(e) | ||
end |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the simplest way I can think of at the moment to allow both "Email" and "email". Since I had to do this change anyways, it is easy enough to add new naming variation's so I added the stakeholders as well.
There isn't any extra logic to deal with the scenario of multiple headers having an email name.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah I like this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Another thought, do we need to also validate the timestamp header the same way we do for email?
Good point. It is the default for google but someone could alter their csv and we should handle that. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great! I think we should update the CSV import page to state those two headers need to be present for it to work. I will add that in a PR I am making with small fixes here and there.
Resolves #1210
✍️ Description
This adds a validation check within the csv import service. The file type is checked and the headers are checked for the column named "Email". We still have the issue where an organization could name their "Email" column something different like "Email Address" which would produce an error.
📷 Screenshots/Demos