end recovery on shutdown #944

buck54321 · 2024-08-07T13:24:19Z

The (*Wallet).recovery loop is unmonitored, and shutting down the wallet during recovery without locking the wallet first will hang. This change ensures that the recovery loop is ended when (*Wallet).Stop is called.
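As a rough illustration of the behavior described above, here is a minimal sketch of a long-running loop that a stop call can end and wait on. It is not btcwallet's code: `toyWallet`, `start`, `stop`, and `errForcedShutdown` are hypothetical names, and the loop watches a plain quit channel, whereas the real recovery loop is signaled through its own mechanism.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
	"time"
)

// errForcedShutdown is a hypothetical sentinel error for this sketch.
var errForcedShutdown = errors.New("recovery: forced shutdown")

// toyWallet is a stand-in for illustration only; it is not btcwallet's Wallet.
type toyWallet struct {
	quit     chan struct{}
	stopOnce sync.Once
	wg       sync.WaitGroup
}

func newToyWallet() *toyWallet {
	return &toyWallet{quit: make(chan struct{})}
}

// recovery pretends to scan blocks and checks for a stop request before
// each one, so a shutdown cannot be blocked behind the scan.
func (w *toyWallet) recovery(blocks int) error {
	for i := 0; i < blocks; i++ {
		select {
		case <-w.quit:
			return errForcedShutdown
		default:
		}
		time.Sleep(time.Millisecond) // simulated per-block work
	}
	return nil
}

// start runs recovery in a tracked goroutine and returns its result channel.
func (w *toyWallet) start(blocks int) <-chan error {
	result := make(chan error, 1)
	w.wg.Add(1)
	go func() {
		defer w.wg.Done()
		result <- w.recovery(blocks)
	}()
	return result
}

// stop signals the loop to exit and waits until it has actually returned.
func (w *toyWallet) stop() {
	w.stopOnce.Do(func() { close(w.quit) })
	w.wg.Wait()
}

func main() {
	w := newToyWallet()
	result := w.start(1000000) // far more work than can finish before stop
	time.Sleep(5 * time.Millisecond)
	w.stop()
	fmt.Println(<-result) // prints "recovery: forced shutdown"
}
```

The point of the sketch is that `stop` both signals and waits, so the caller knows the loop is gone when `stop` returns.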

guggero

tACK, LGTM 🎉

guggero · 2024-08-07T15:00:41Z

wallet/wallet.go

+func (w *Wallet) endRecoveryAndWait() {
+	if recoverySyncI := w.recovering.Load(); recoverySyncI != nil {
+		recoverySync := recoverySyncI.(*recoverySyncer)
+		// If recovery is still running, it will end early with an error


nit: (pre-existing): Add newline before comment?
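The load-then-type-assert shape in the snippet above suggests the active syncer is kept in an atomic value; that is an assumption, not something the excerpt confirms. Under that assumption, a self-contained sketch of "signal the active syncer, then wait for it" could look like this. The `syncer` and `holder` types and their fields are hypothetical.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// syncer is a hypothetical stand-in for the recoverySyncer in the diff.
type syncer struct {
	quit chan struct{} // closed to request an early exit
	done chan struct{} // closed by the loop once it has exited
	once sync.Once     // guards against closing quit twice
}

func newSyncer() *syncer {
	return &syncer{quit: make(chan struct{}), done: make(chan struct{})}
}

// run stands in for the recovery loop: it exits when asked to.
func (s *syncer) run() {
	<-s.quit
	close(s.done)
}

// holder keeps the active syncer, if any, in an atomic.Value.
type holder struct {
	recovering atomic.Value // holds *syncer once stored
}

// endAndWait signals the active syncer and blocks until it has exited.
// It reports whether a syncer was present at all.
func (h *holder) endAndWait() bool {
	v := h.recovering.Load() // nil if Store was never called
	if v == nil {
		return false
	}
	s := v.(*syncer)
	s.once.Do(func() { close(s.quit) })
	<-s.done
	return true
}

func main() {
	var h holder
	fmt.Println(h.endAndWait()) // no syncer stored yet

	s := newSyncer()
	h.recovering.Store(s)
	go s.run()
	fmt.Println(h.endAndWait()) // signals the syncer and waits for it
}
```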

Roasbeef

I think we should add a new unit test that shows the deadlock that can happen on improper shutdown. Then this patch could be applied on top, demonstrating a concrete resolution.
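One common way for a test to expose a hang like this is to run the call in a goroutine and fail if it does not return within a deadline. The helper below is a hypothetical sketch of that idea, not the test added in this PR; `returnsWithin` and `demoTimeout` are made-up names.

```go
package main

import (
	"fmt"
	"time"
)

// demoTimeout is an arbitrary deadline chosen for this sketch.
const demoTimeout = 50 * time.Millisecond

// returnsWithin reports whether fn returns before the timeout elapses.
// If fn hangs, its goroutine is leaked, which is usually acceptable in a
// test that is about to fail anyway.
func returnsWithin(timeout time.Duration, fn func()) bool {
	done := make(chan struct{})
	go func() {
		defer close(done)
		fn()
	}()
	select {
	case <-done:
		return true
	case <-time.After(timeout):
		return false
	}
}

func main() {
	never := make(chan struct{}) // never closed, so receiving blocks forever

	fmt.Println(returnsWithin(demoTimeout, func() {}))          // returns promptly
	fmt.Println(returnsWithin(demoTimeout, func() { <-never })) // simulated hang
}
```

In a real test, the `false` branch would become a `t.Fatal` call, so the hang shows up as a test failure rather than a stuck test run.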

guggero

Commits could be squashed, otherwise looks good. Thanks for the test.

wallet/wallet_test.go

Roasbeef

So I tried to get the test to fail by removing the fix (the endRecovery call in the Stop method) with this patch:

diff --git a/wallet/wallet.go b/wallet/wallet.go
index 4bde6225..7d93d153 100644
--- a/wallet/wallet.go
+++ b/wallet/wallet.go
@@ -280,8 +280,6 @@ func (w *Wallet) quitChan() <-chan struct{} {
 
 // Stop signals all wallet goroutines to shutdown.
 func (w *Wallet) Stop() {
-	<-w.endRecovery()
-
 	w.quitMu.Lock()
 	quit := w.quit
 	w.quitMu.Unlock()
diff --git a/wallet/wallet_test.go b/wallet/wallet_test.go
index 4c1efd7f..019958e0 100644
--- a/wallet/wallet_test.go
+++ b/wallet/wallet_test.go
@@ -420,11 +420,9 @@ func TestEndRecovery(t *testing.T) {
 	// Recovery is running
 	getBlockHashCalls(3)
 
-	// Closing the quit channel, e.g. Stop() without endRecovery, alone will not
-	// end the recovery loop.
-	w.quitMu.Lock()
-	close(w.quit)
-	w.quitMu.Unlock()
+	// Call stop directly simulating a normal shutdown.
+	w.Stop()
+
 	// Continues scanning.
 	getBlockHashCalls(3)
 
@@ -471,9 +469,7 @@ func TestEndRecovery(t *testing.T) {
 
 	// testWallet starts a couple of other unrelated goroutines that need to be
 	// killed, so we still need to close the quit channel.
-	w.quitMu.Lock()
-	close(w.quit)
-	w.quitMu.Unlock()
+	w.Stop()
 
 	select {
 	case <-waitedForShutdown:

With that applied locally, the test still passes both with and without the race detector. I think the fix itself is sound (Stop now actually signals the recovery loop to force an exit), but I think the test needs a bit more tuning.

Perhaps what we want to do instead is introspect a bit more into the call to recovery?

btcwallet/wallet/wallet.go

Lines 468 to 475 in aba7c35

    
	// If the wallet requested an on-chain recovery of its funds, we'll do
	// so now.
	if w.recoveryWindow > 0 {
		if err := w.recovery(chainClient, birthdayStamp); err != nil {
			return fmt.Errorf("unable to perform wallet recovery: "+
				"%w", err)
		}
	}

Roasbeef · 2024-08-13T02:03:39Z

Scratch the comment above. I needed one more change to the test, which stops it from explicitly calling endRecovery:

diff --git a/wallet/wallet.go b/wallet/wallet.go
index 4bde6225..7d93d153 100644
--- a/wallet/wallet.go
+++ b/wallet/wallet.go
@@ -280,8 +280,6 @@ func (w *Wallet) quitChan() <-chan struct{} {
 
 // Stop signals all wallet goroutines to shutdown.
 func (w *Wallet) Stop() {
-	<-w.endRecovery()
-
 	w.quitMu.Lock()
 	quit := w.quit
 	w.quitMu.Unlock()
diff --git a/wallet/wallet_test.go b/wallet/wallet_test.go
index 4c1efd7f..06ebb45c 100644
--- a/wallet/wallet_test.go
+++ b/wallet/wallet_test.go
@@ -420,11 +420,9 @@ func TestEndRecovery(t *testing.T) {
 	// Recovery is running
 	getBlockHashCalls(3)
 
-	// Closing the quit channel, e.g. Stop() without endRecovery, alone will not
-	// end the recovery loop.
-	w.quitMu.Lock()
-	close(w.quit)
-	w.quitMu.Unlock()
+	// Call stop directly simulating a normal shutdown.
+	w.Stop()
+
 	// Continues scanning.
 	getBlockHashCalls(3)
 
@@ -461,19 +459,9 @@ func TestEndRecovery(t *testing.T) {
 	// Recovery is running
 	getBlockHashCalls(3)
 
-	// endRecovery is required to exit the unmonitored goroutine.
-	end := w.endRecovery()
-	select {
-	case <-blockHashCalled:
-	case <-recoveryDone:
-	}
-	<-end
-
 	// testWallet starts a couple of other unrelated goroutines that need to be
 	// killed, so we still need to close the quit channel.
-	w.quitMu.Lock()
-	close(w.quit)
-	w.quitMu.Unlock()
+	w.Stop()
 
 	select {
 	case <-waitedForShutdown:
@@ -481,6 +469,7 @@ func TestEndRecovery(t *testing.T) {
 		t.Fatal("WaitForShutdown never returned")
 	}
 
 	if !strings.EqualFold(err.Error(), "recovery: forced shutdown") {
 		t.Fatal("wrong error")
 	}

Roasbeef

LGTM 🐊

I think this is g2g once the commits are squashed and the minor comment-style nits are addressed.

This was referenced Aug 7, 2024

BTC Wallet Sync Issues decred/dcrdex#2899

Open

end recovery on shutdown dcrlabs/ltcwallet#9

Merged

guggero approved these changes Aug 7, 2024

martonp approved these changes Aug 8, 2024

Roasbeef requested changes Aug 9, 2024

buck54321 force-pushed the end-recovery-on-shutdown branch 2 times, most recently from efaf377 to aba7c35 Compare August 10, 2024 16:12

guggero approved these changes Aug 12, 2024

buck54321 mentioned this pull request Aug 12, 2024

btc: update dep decred/dcrdex#2913

Merged

Roasbeef reviewed Aug 13, 2024

Roasbeef approved these changes Aug 13, 2024

end recovery on shutdown

8e2426a

buck54321 force-pushed the end-recovery-on-shutdown branch from aba7c35 to 8e2426a Compare August 13, 2024 17:54

Roasbeef merged commit 6ecae9c into btcsuite:master Aug 15, 2024
3 checks passed

buck54321 mentioned this pull request Aug 18, 2024

btc/ltc: update wallet deps decred/dcrdex#2922

Merged
