Add CSV Import File Validations #1211

Merged

jmilljr24 merged 7 commits into main from 1210-csv-import-file-validation

Dec 6, 2024

Collaborator

jmilljr24 commented Dec 5, 2024

Resolves #1210

✍️ Description

This adds a validation check within the CSV import service. The file type is checked, and the headers are checked for a column named "Email". We still have the issue where an organization could name their "Email" column something different like "Email Address" which would produce an error.
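
For readers skimming the thread, here is a minimal sketch of the two checks described above, not the service's actual code. The error class names `FileTypeError` and `EmailColumnError` and the `CSV.foreach(...).first` call come from the diff excerpts in the review comments; the keyword arguments and the `with_csv` helper are assumptions made so the example runs on its own, outside Rails.

```ruby
require "csv"
require "tempfile"

# Stand-ins for the error classes that appear in the diff excerpts.
class FileTypeError < StandardError; end
class EmailColumnError < StandardError; end

# Assumption: content type and path are passed in directly here. The real
# service reads them from the uploaded file object.
def validate_file(content_type:, path:)
  raise FileTypeError unless content_type == "text/csv"

  # Without a block, CSV.foreach returns an enumerator; .first is the header
  # row, or nil when the file is empty.
  first_row = CSV.foreach(path).first
  raise EmailColumnError if first_row.nil? || !first_row.include?("Email")

  first_row
end

# Hypothetical helper: writes the given text to a temporary .csv file and
# yields its path, so the check can be tried without an upload.
def with_csv(contents)
  Tempfile.create(["import", ".csv"]) do |file|
    file.write(contents)
    file.flush
    yield file.path
  end
end
```

Calling `with_csv("Timestamp,Email Address\n") { |path| validate_file(content_type: "text/csv", path: path) }` raises `EmailColumnError`, which is the naming problem mentioned in the description.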

📷 Screenshots/Demos


          add validation and error messages

a824103

jmilljr24 marked this pull request as draft

December 5, 2024 14:26


          add tests

e0e5f86

jmilljr24 marked this pull request as ready for review

December 5, 2024 15:50

jmilljr24 mentioned this pull request

External Form Upload: Display import summary / errors #1208

Merged

Collaborator

kasugaijin commented Dec 5, 2024

> We still have the issue where an organization could name their "Email" column something different like "Email Address" which would produce an error.

I think it's normal to have requirements on headers for CSV uploads. So, we could make a note on the upload page that there must be an email header. We could also consider looking up either email or email address but we'd need to start handling case, spaces and it could get messy. Might be easier to enforce just this one column to keep it simple.
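
To make the "handling case, spaces" option concrete, here is a rough sketch of what header normalization could look like. Nothing in it is from the PR: `ACCEPTED_EMAIL_HEADERS`, `normalize_header`, and `find_email_header` are hypothetical names, and the accepted list is an assumption.

```ruby
# Assumed set of accepted names, compared after normalization.
ACCEPTED_EMAIL_HEADERS = ["email", "email address"].freeze

# Lowercase, trim, and collapse internal whitespace so that
# "  EMAIL   Address " and "email address" compare equal.
def normalize_header(header)
  header.to_s.strip.downcase.gsub(/\s+/, " ")
end

# Returns the header exactly as it appears in the file (so it can still be
# used to look up values in each row), or nil if no header matches.
def find_email_header(header_row)
  header_row.find { |h| ACCEPTED_EMAIL_HEADERS.include?(normalize_header(h)) }
end

find_email_header(["Timestamp", " EMAIL  Address "]) # => " EMAIL  Address "
find_email_header(["Timestamp", "Name"])             # => nil
```

The trade-off is the one raised above: this accepts more files, but every new rule (punctuation, other languages, abbreviations) adds to the list of things to maintain.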

kasugaijin reviewed

Collaborator

kasugaijin left a comment

What do you think about moving the validation logic to its own single responsibility? We can use these validations for file type and email header for CSV imports from all third party services. Maybe we can keep it in app/services/organizations/importers/concerns?

Collaborator Author

jmilljr24 commented Dec 6, 2024

> What do you think about moving the validation logic to its own single responsibility? We can use these validations for file type and email header for CSV imports from all third party services. Maybe we can keep it in app/services/organizations/importers/concerns?

That does make sense if we have a separate import service for each third party service. I'm not sure if that is ultimately what we will need to do. There is a good chance this service could handle many different third party CSVs without being overly complex, and we wouldn't need to repeat a lot of code. I'm hopeful it will be as simple as handling the "email" column and potentially the "Timestamp" column, as those two are what the service depends on. The rest of the columns just get added as question/answer. I can't imagine anything else that would cause issues from one form type to another.

If we can keep one clean and simple service, I don't see the need to move the validation logic. If you think it is more organized to move to a concern even with only one import service, I'm happy to do it.

Collaborator

kasugaijin commented Dec 6, 2024

Ah yes…if we only need one service for all third party forms it makes sense to keep as is!

kasugaijin reviewed

app/services/organizations/importers/google_csv_import_service.rb Outdated

+                      first_row = CSV.foreach(@file.to_path).first
+                      raise EmailColumnError if first_row.nil?
+                      raise EmailColumnError unless first_row.include?("Email")

Collaborator

kasugaijin Dec 6, 2024

Shall we add lowercase email here as well, just in case the user doesn't adhere to the case requirement?

Collaborator Author

jmilljr24 Dec 6, 2024

I think I'm confused on what you are asking. If the header row doesn't include "Email" (capitalized), the error is raised. The service is currently using "Email". Do you want "email" to work as well?

Collaborator

kasugaijin Dec 6, 2024

Sorry, yeah, that's right. We may as well allow Email and email as valid because users won't adhere to case rules half the time.

kasugaijin reviewed

app/services/organizations/importers/google_csv_import_service.rb Outdated

+                      raise FileTypeError unless @file.content_type == "text/csv"
+                      first_row = CSV.foreach(@file.to_path).first
+                      raise EmailColumnError if first_row.nil?

Collaborator

kasugaijin Dec 6, 2024

If I am understanding this correctly, we are raising EmailColumnError if the first row is empty, right? Should we have a different error class to handle this case? Something like NoDataError (I am sure there is a better name than that).

Collaborator Author

jmilljr24 Dec 6, 2024

Yes it is checking for an empty file as an edge case (came up during testing). I just reused the same error because technically the file does not have the "Email" column but I can add a more specific error.

kasugaijin reviewed

app/services/organizations/importers/google_csv_import_service.rb

Collaborator

kasugaijin Dec 6, 2024

Nice work! I like the structure.

kasugaijin reviewed

app/services/organizations/importers/google_csv_import_service.rb

                       end
                       Status.new(@errors.empty?, @count, @no_match, @errors)
                     end
                     private
+                    def validate_file

Collaborator

kasugaijin Dec 6, 2024

In the vein of single responsibility, we could have separate validation methods for each validation. Might be something to consider if we add more validations. I am fine either way in this case.
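
As a sketch of that idea, assuming one private method and one error class per check: `CsvFileValidator` and `EmptyFileError` are hypothetical names (the thread only floats "NoDataError" as a placeholder), and the example takes the CSV text as a string so it runs without an uploaded file.

```ruby
require "csv"

class FileTypeError < StandardError; end
class EmptyFileError < StandardError; end # hypothetical name for the empty-file case
class EmailColumnError < StandardError; end

# Hypothetical validator with one method per check.
class CsvFileValidator
  def initialize(content_type:, csv_text:)
    @content_type = content_type
    @csv_text = csv_text
  end

  # Runs the checks in order and raises the first failure; returns true
  # when every check passes.
  def validate!
    validate_file_type
    validate_not_empty
    validate_email_header
    true
  end

  private

  # CSV.parse_line returns the first row as an array, or nil for empty input.
  def header_row
    @header_row ||= CSV.parse_line(@csv_text)
  end

  def validate_file_type
    raise FileTypeError unless @content_type == "text/csv"
  end

  def validate_not_empty
    raise EmptyFileError if header_row.nil?
  end

  def validate_email_header
    raise EmailColumnError unless header_row.include?("Email")
  end
end
```

Because the empty-file check runs before the header check, `validate_email_header` can assume a header row exists, and each failure maps to its own error message.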

jmilljr24 added 4 commits

December 6, 2024 13:17


          add error for empty file

53d1ae7


          add array of valid email header variations

50ecffc


          Merge branch 'main' into 1210-csv-import-file-validation

e922cd8


          remove debugger

e93623c

jmilljr24 commented

app/services/organizations/importers/google_csv_import_service.rb

Comment on lines +62 to +65

+                      email_headers = ["Email", "email", "Email Address", "email address"]
+                      email_headers.each do |e|
+                        @email_header = e if first_row.include?(e)
+                      end

Collaborator Author

jmilljr24 Dec 6, 2024

This is the simplest way I can think of at the moment to allow both "Email" and "email". Since I had to make this change anyway, it is easy enough to add new naming variations, so I added the stakeholders as well.

There isn't any extra logic to deal with the scenario of multiple headers having an email name.
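
To illustrate the multiple-header scenario: with the loop shape shown in the diff, each match overwrites the previous one, so the last matching entry in the list wins. The sketch below reproduces that behaviour in a standalone method and compares it with `Array#find`, which stops at the first match. The method names and the `EMAIL_HEADERS` constant are hypothetical; only the list contents come from the diff.

```ruby
EMAIL_HEADERS = ["Email", "email", "Email Address", "email address"].freeze

# Same loop shape as the diff: every match overwrites the previous one, so
# the last matching entry of EMAIL_HEADERS wins.
def last_matching_header(first_row)
  match = nil
  EMAIL_HEADERS.each { |e| match = e if first_row.include?(e) }
  match
end

# Alternative: stop at the first entry of EMAIL_HEADERS found in the row.
def first_matching_header(first_row)
  EMAIL_HEADERS.find { |e| first_row.include?(e) }
end

row = ["Timestamp", "Email", "email address"]
last_matching_header(row)  # => "email address"
first_matching_header(row) # => "Email"
```

Either choice is defensible. The point is only that a file with two email-like columns gets resolved by list order, not by the column order in the CSV.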

Collaborator

kasugaijin Dec 6, 2024

Yeah I like this.

kasugaijin reviewed

Collaborator

kasugaijin left a comment

Another thought, do we need to also validate the timestamp header the same way we do for email?

Collaborator Author

jmilljr24 commented Dec 6, 2024

> Another thought, do we need to also validate the timestamp header the same way we do for email?

Good point. It is the default for Google, but someone could alter their CSV and we should handle that.
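
One way the two header checks could share a shape is a small table of required columns, each with its accepted names and its own error. This is an illustrative sketch, not the merged code: `TimestampColumnError`, `REQUIRED_HEADERS`, and `resolve_required_headers` are hypothetical names, and "Timestamp" is the only timestamp spelling mentioned in this thread.

```ruby
class EmailColumnError < StandardError; end
class TimestampColumnError < StandardError; end # hypothetical name

# Each required column lists the header names accepted for it and the error
# to raise when none of them is present.
REQUIRED_HEADERS = {
  email: {
    accepted: ["Email", "email", "Email Address", "email address"],
    error: EmailColumnError
  },
  timestamp: {
    accepted: ["Timestamp"],
    error: TimestampColumnError
  }
}.freeze

# Returns a hash such as { email: "Email", timestamp: "Timestamp" } naming
# the header actually present for each required column. Raises the column's
# error class when none of its accepted names is in the header row.
def resolve_required_headers(first_row)
  REQUIRED_HEADERS.to_h do |key, rule|
    found = rule[:accepted].find { |name| first_row.include?(name) }
    raise rule[:error] if found.nil?

    [key, found]
  end
end

resolve_required_headers(["Timestamp", "email address", "Name"])
# => { email: "email address", timestamp: "Timestamp" }
```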


          add timestamp header validation

3911c0e

jmilljr24 requested a review from kasugaijin

December 6, 2024 19:38

kasugaijin approved these changes

Collaborator

kasugaijin left a comment

Looks great! I think we should update the CSV import page to state those two headers need to be present for it to work. I will add that in a PR I am making with small fixes here and there.

jmilljr24 merged commit 4d3912d into main

5 checks passed

jmilljr24 deleted the 1210-csv-import-file-validation branch

December 6, 2024 20:05

khiga8 mentioned this pull request

Staff Dashboard - External Form Upload: Display import summary / errors #1112

Closed
