Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DEV-940/DEV-955 - simplify canister usage #40

Merged
merged 2 commits into from
Oct 24, 2023

Conversation

aelkiss
Copy link
Member

@aelkiss aelkiss commented Oct 20, 2023

  • remove preflight check for redirects file
  • add helper for overriding services in tests

Tests pass for me with no stack issues. I did occasionally see the stack issues in the process of cleaning this up, but not with respect to the redirects file not being found. I suspect it might have been happening when something in Services was overridden, but the test failed and the original value wasn't restored. Using the provided override_service test helper should avoid that (i.e. the original value will be restored even if the test raises an exception)

@moseshll Assuming you're good with this, I'm good with the other branch then being merged to main.

- remove preflight check for redirects file
- add helper for overriding services in tests
@aelkiss aelkiss requested a review from moseshll October 20, 2023 12:47
ENV["NO_DB"] == "1" or Services[:no_external_data?]
end

Services.register(:no_external_data?) do
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not used except in Services, so removed.

HathiTrust::Services.register(:no_external_data?) { @save_no_external_data }
HathiTrust::Services.register(:redirects) { @save_redirects }
context "using nonexistent redirect file" do
override_service(:redirect_file) { "no_such_redirects_file.gz.txt" }
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

redirects itself should get re-composed after the dependent redirect_file gets updated. So long as redirect_file itself doesn't throw an exception, I don't think this should cause any stack issues.

it "bails out" do
expect {
CICTL::Commands.start(["index", "all", "--no-wait", "--quiet", "--log", test_log])
}.to raise_error(Errno::ENOENT)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMO the "file not found" error this raises is clear enough.

@@ -120,13 +112,17 @@
end

describe "#index today" do
# Create new update and delete files in a temp directory based on fixtures.
after(:each) { HathiTrust::Services.register(:data_directory) { @save_dd } }
Copy link
Member Author

@aelkiss aelkiss Oct 20, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The override_service helper doesn't work here, because we need to do part of the test with the original data_directory and then later override it. That might point to a need to separately define fixtures or something for the tests rather than doing everything with data_directory, but I don't think it's worth trying to untangle right now.

@aelkiss aelkiss changed the title DEV-940 - simplify canister usage DEV-940/DEV-955 - simplify canister usage Oct 20, 2023
Copy link
Contributor

@moseshll moseshll left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm happy with everything here, with one exception: removing the preflight check. That was added not because of the Canister stack issues, but because last time I was in the production code I noticed that cictl index all -- so far as I could tell -- would first purge the Solr index (solr_client.empty_records!) before getting around to checking the redirects file and only then bail out with an error, leaving an empty index. Even with us running this manually this is pretty sketchy behavior.

I'd like to see some kind of redirects check done early enough to prevent that.

@aelkiss
Copy link
Member Author

aelkiss commented Oct 20, 2023

Got it. I will see if I can add a test for that.

@aelkiss
Copy link
Member Author

aelkiss commented Oct 20, 2023

Also regarding the redirects it doesn't look like we test the no_redirects? case, and MockRedirects is not defined. I'll add that back as well.

If the redirect file does not exist, can't be read, etc, then indexing
can fail partway through.
@aelkiss
Copy link
Member Author

aelkiss commented Oct 20, 2023

I pushed another commit that adds back MockRedirects and adds a test around ensuring that if things aren't present it bails out. I tried a few different options and didn't find anything entirely satisfactory - either way knowledge of the redirects file kind of leaks out, and the question is how much... I think this keeps it relatively isolated, but I think it would be preferable for cictl not to need to worry that the redirects exist at all. I think the fundamental issue is that we don't have any general hooks for early initialization, so (without preflight) we load the redirects on demand, but we also don't have a way to roll back if something goes wrong. In the solr cloud future where each time we index we create a new empty core, I think we can avoid having to worry about the pre-flighting, but aren't there yet.

@aelkiss aelkiss requested a review from moseshll October 20, 2023 18:26
Copy link
Contributor

@moseshll moseshll left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm happy with this.
There's one irritation I have about saving and restoring Canister registrants (both in code I wrote and in override_service) and that is the inability to get the procs that were registered, only the results. To truly restore the Canister state, the original proc should be re-run post-restore and the result memoized again. But that would require an additional accessor built into Canister, and I dunno if it's worth it.
Edit: and I should note that this kind of thing only should pertain to running tests -- I don't expect to be saving/restoring Canister stuff in a production environment.

@aelkiss
Copy link
Member Author

aelkiss commented Oct 20, 2023

I'm happy with this. There's one irritation I have about saving and restoring Canister registrants (both in code I wrote and in override_service) and that is the inability to get the procs that were registered, only the results. To truly restore the Canister state, the original proc should be re-run post-restore and the result memoized again. But that would require an additional accessor built into Canister, and I dunno if it's worth it. Edit: and I should note that this kind of thing only should pertain to running tests -- I don't expect to be saving/restoring Canister stuff in a production environment.

Agreed. Probably something to consider upstream for canister. I created mlibrary/canister#8, and also put that on my "someday/maybe" list - if it becomes a real problem in the future we can address if nobody else gets to it.

@aelkiss aelkiss merged commit 08b41e9 into DEV-940_ENV_to_Canister Oct 24, 2023
1 check passed
@aelkiss aelkiss deleted the DEV-940_canister_2 branch October 24, 2023 14:01
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants