end recovery on shutdown #944

Merged
merged 1 commit into from
Aug 15, 2024

Conversation

buck54321
Contributor

The (*Wallet).recovery loop is unmonitored, and shutting down the wallet during recovery without locking the wallet first will hang. This change ensures that the recovery loop is ended when (*Wallet).Stop is called.
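The pattern described above can be sketched in isolation. This is a minimal illustration, not the btcwallet implementation: the `scanner` type, its `run` and `stop` methods, and `errForcedStop` are hypothetical names chosen for this example. The loop watches its own quit channel, and `stop` both signals the loop and hands back a channel the caller can wait on.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
	"time"
)

// errForcedStop is a hypothetical error returned when the scan is interrupted.
var errForcedStop = errors.New("scan: forced shutdown")

// scanner is a hypothetical stand-in for a long-running recovery loop.
type scanner struct {
	stopOnce sync.Once
	quit     chan struct{} // closed to request an early exit
	done     chan struct{} // closed when run returns
}

func newScanner() *scanner {
	return &scanner{quit: make(chan struct{}), done: make(chan struct{})}
}

// run scans the given number of "blocks" until it finishes or is asked to quit.
func (s *scanner) run(blocks int) error {
	defer close(s.done)
	for i := 0; i < blocks; i++ {
		select {
		case <-s.quit:
			return errForcedStop
		case <-time.After(time.Millisecond): // pretend to fetch a block
		}
	}
	return nil
}

// stop requests an early exit and returns a channel that is closed once
// run has returned. It is safe to call more than once.
func (s *scanner) stop() <-chan struct{} {
	s.stopOnce.Do(func() { close(s.quit) })
	return s.done
}

func main() {
	s := newScanner()
	errCh := make(chan error, 1)
	go func() { errCh <- s.run(1_000_000) }()

	// Without this signal, shutdown would have to wait for the full scan.
	<-s.stop()
	fmt.Println(<-errCh)
}
```

Waiting on the returned channel is what lets a shutdown path block until the loop has actually exited, rather than only requesting the exit.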

Collaborator

@guggero guggero left a comment

tACK, LGTM 🎉

func (w *Wallet) endRecoveryAndWait() {
	if recoverySyncI := w.recovering.Load(); recoverySyncI != nil {
		recoverySync := recoverySyncI.(*recoverySyncer)
		// If recovery is still running, it will end early with an error
Collaborator

nit: (pre-existing): Add newline before comment?

Member

@Roasbeef Roasbeef left a comment

I think we should add a new unit test that shows the deadlock that can happen re improper shutdown. Then this patch could be applied above, demonstrating concrete resolution.
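One common way for a test to surface a hang like this is to run the shutdown call in a goroutine and fail if it does not return within a timeout. The sketch below shows only that general technique. `returnsWithin` is a hypothetical helper, not something from btcwallet, and the two channels merely model a loop that was never signalled versus one that was.

```go
package main

import (
	"fmt"
	"time"
)

// returnsWithin reports whether fn returns before the timeout elapses.
// (Hypothetical helper for illustration; not part of btcwallet.)
func returnsWithin(fn func(), timeout time.Duration) bool {
	done := make(chan struct{})
	go func() {
		defer close(done)
		fn()
	}()
	select {
	case <-done:
		return true
	case <-time.After(timeout):
		return false
	}
}

func main() {
	never := make(chan struct{}) // never closed: models an unsignalled loop
	signalled := make(chan struct{})
	close(signalled) // models a loop that was told to stop

	fmt.Println(returnsWithin(func() { <-never }, 50*time.Millisecond))
	fmt.Println(returnsWithin(func() { <-signalled }, 50*time.Millisecond))
}
```

Note that the goroutine stuck on the blocked call is leaked when the timeout fires, which is usually acceptable in a test that is about to fail anyway.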

@buck54321 buck54321 force-pushed the end-recovery-on-shutdown branch 2 times, most recently from efaf377 to aba7c35 August 10, 2024 16:12
Collaborator

@guggero guggero left a comment

Commits could be squashed, otherwise looks good. Thanks for the test.

Member

@Roasbeef Roasbeef left a comment

So I tried to get the test to fail by reverting the fix (the endRecovery call in the Stop method) with this patch:

diff --git a/wallet/wallet.go b/wallet/wallet.go
index 4bde6225..7d93d153 100644
--- a/wallet/wallet.go
+++ b/wallet/wallet.go
@@ -280,8 +280,6 @@ func (w *Wallet) quitChan() <-chan struct{} {
 
 // Stop signals all wallet goroutines to shutdown.
 func (w *Wallet) Stop() {
-	<-w.endRecovery()
-
 	w.quitMu.Lock()
 	quit := w.quit
 	w.quitMu.Unlock()
diff --git a/wallet/wallet_test.go b/wallet/wallet_test.go
index 4c1efd7f..019958e0 100644
--- a/wallet/wallet_test.go
+++ b/wallet/wallet_test.go
@@ -420,11 +420,9 @@ func TestEndRecovery(t *testing.T) {
 	// Recovery is running
 	getBlockHashCalls(3)
 
-	// Closing the quit channel, e.g. Stop() without endRecovery, alone will not
-	// end the recovery loop.
-	w.quitMu.Lock()
-	close(w.quit)
-	w.quitMu.Unlock()
+	// Call stop directly simulating a normal shutdown.
+	w.Stop()
+
 	// Continues scanning.
 	getBlockHashCalls(3)
 
@@ -471,9 +469,7 @@ func TestEndRecovery(t *testing.T) {
 
 	// testWallet starts a couple of other unrelated goroutines that need to be
 	// killed, so we still need to close the quit channel.
-	w.quitMu.Lock()
-	close(w.quit)
-	w.quitMu.Unlock()
+	w.Stop()
 
 	select {
 	case <-waitedForShutdown:

With that applied locally, the test still passes both with and without the race detector. I think the fix itself is sound (Stop now actually signals the recovery loop to force an exit), but I think the test needs a bit more tuning.

Perhaps what we want to do instead is introspect a bit more into the call to recovery?

btcwallet/wallet/wallet.go

Lines 468 to 475 in aba7c35

// If the wallet requested an on-chain recovery of its funds, we'll do
// so now.
if w.recoveryWindow > 0 {
	if err := w.recovery(chainClient, birthdayStamp); err != nil {
		return fmt.Errorf("unable to perform wallet recovery: "+
			"%w", err)
	}
}

@Roasbeef
Member

Scratch the comment above. I needed another change to the test, to stop it from explicitly calling endRecovery:

diff --git a/wallet/wallet.go b/wallet/wallet.go
index 4bde6225..7d93d153 100644
--- a/wallet/wallet.go
+++ b/wallet/wallet.go
@@ -280,8 +280,6 @@ func (w *Wallet) quitChan() <-chan struct{} {
 
 // Stop signals all wallet goroutines to shutdown.
 func (w *Wallet) Stop() {
-	<-w.endRecovery()
-
 	w.quitMu.Lock()
 	quit := w.quit
 	w.quitMu.Unlock()
diff --git a/wallet/wallet_test.go b/wallet/wallet_test.go
index 4c1efd7f..06ebb45c 100644
--- a/wallet/wallet_test.go
+++ b/wallet/wallet_test.go
@@ -420,11 +420,9 @@ func TestEndRecovery(t *testing.T) {
 	// Recovery is running
 	getBlockHashCalls(3)
 
-	// Closing the quit channel, e.g. Stop() without endRecovery, alone will not
-	// end the recovery loop.
-	w.quitMu.Lock()
-	close(w.quit)
-	w.quitMu.Unlock()
+	// Call stop directly simulating a normal shutdown.
+	w.Stop()
+
 	// Continues scanning.
 	getBlockHashCalls(3)
 
@@ -461,19 +459,9 @@ func TestEndRecovery(t *testing.T) {
 	// Recovery is running
 	getBlockHashCalls(3)
 
-	// endRecovery is required to exit the unmonitored goroutine.
-	end := w.endRecovery()
-	select {
-	case <-blockHashCalled:
-	case <-recoveryDone:
-	}
-	<-end
-
 	// testWallet starts a couple of other unrelated goroutines that need to be
 	// killed, so we still need to close the quit channel.
-	w.quitMu.Lock()
-	close(w.quit)
-	w.quitMu.Unlock()
+	w.Stop()
 
 	select {
 	case <-waitedForShutdown:
@@ -481,6 +469,7 @@ func TestEndRecovery(t *testing.T) {
 		t.Fatal("WaitForShutdown never returned")
 	}
 
 	if !strings.EqualFold(err.Error(), "recovery: forced shutdown") {
 		t.Fatal("wrong error")
 	}
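The difference between the two patches above can be illustrated with a toy model. Everything here is hypothetical (`svc`, `endLoop`, `stop`, `exited`, and the `fixed` flag are invented for this sketch and do not mirror btcwallet's types). The point is only that a test which signals the loop itself passes whether or not `stop` does so, while a test that goes through `stop` alone distinguishes the two.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// svc is a hypothetical service with one background loop.
type svc struct {
	once     sync.Once
	loopQuit chan struct{} // watched by the loop
	loopDone chan struct{} // closed when the loop exits
	fixed    bool          // whether stop() signals the loop
}

func newSvc(fixed bool) *svc {
	s := &svc{
		loopQuit: make(chan struct{}),
		loopDone: make(chan struct{}),
		fixed:    fixed,
	}
	go func() {
		defer close(s.loopDone)
		<-s.loopQuit
	}()
	return s
}

// endLoop signals the loop directly; it is safe to call more than once.
func (s *svc) endLoop() {
	s.once.Do(func() { close(s.loopQuit) })
}

// stop models shutdown; only the fixed variant signals the loop.
func (s *svc) stop() {
	if s.fixed {
		s.endLoop()
	}
}

// exited reports whether the loop exits within the timeout.
func (s *svc) exited(timeout time.Duration) bool {
	select {
	case <-s.loopDone:
		return true
	case <-time.After(timeout):
		return false
	}
}

func main() {
	for _, fixed := range []bool{false, true} {
		// Masked check: the test ends the loop itself before stopping.
		masked := newSvc(fixed)
		masked.endLoop()
		masked.stop()

		// Strict check: only stop() is called.
		strict := newSvc(fixed)
		strict.stop()

		fmt.Printf("fixed=%v masked=%v strict=%v\n", fixed,
			masked.exited(50*time.Millisecond),
			strict.exited(50*time.Millisecond))
	}
}
```

In this model only the strict check reports a difference between the unfixed and fixed variants, which is the property a regression test for the shutdown path needs.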

Member

@Roasbeef Roasbeef left a comment

LGTM 🐊

I think this is g2g once the commits are squashed and the minor comment-style nits are addressed.

@buck54321 buck54321 force-pushed the end-recovery-on-shutdown branch from aba7c35 to 8e2426a August 13, 2024 17:54
@Roasbeef Roasbeef merged commit 6ecae9c into btcsuite:master Aug 15, 2024
3 checks passed