channeldb: flatten the htlc attempts bucket #5635
Conversation
0320066 to e2ace14
Interestingly, I had thought about this too. I think the current approach will potentially increase the data size of each DB tx. If we have many attempts inside a payment, will that be an issue?
Previously I thought we could add a state to the payment so that we wouldn't need to fetch the htlc attempts in every query. All the attempt info can be abstracted behind that state: if the payment is in a pending state, we need the attempt info; otherwise, we don't bother fetching them. WDYT?
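A rough sketch of that idea, assuming channeldb's existing types (the payment-state check and the fetchPaymentWithoutHtlcs/fetchHtlcAttempts helpers are hypothetical, not part of this PR):

```go
// Hypothetical: only in-flight payments need the per-attempt data, so a
// payment-level status check could let read paths skip the attempt fetches.
func fetchPaymentMaybeWithHtlcs(bucket kvdb.RBucket) (*MPPayment, error) {
	// fetchPaymentWithoutHtlcs is a hypothetical helper that reads only
	// the creation info, status and (optional) failure reason.
	payment, err := fetchPaymentWithoutHtlcs(bucket)
	if err != nil {
		return nil, err
	}

	// Terminal payments (settled/failed) don't need the attempt details.
	if payment.Status != StatusInFlight {
		return payment, nil
	}

	// Pending payments still need their attempts for further routing
	// decisions.
	htlcsBucket := bucket.NestedReadBucket(paymentHtlcsBucket)
	if htlcsBucket != nil {
		payment.HTLCs, err = fetchHtlcAttempts(htlcsBucket)
		if err != nil {
			return nil, err
		}
	}

	return payment, nil
}
```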
channeldb/payments.go
Outdated
@@ -312,6 +320,7 @@ func fetchPayment(bucket kvdb.RBucket) (*MPPayment, error) {

	var htlcs []HTLCAttempt
	htlcsBucket := bucket.NestedReadBucket(paymentHtlcsBucket)
nit: extra new line
done
// Get the bucket which contains the payment, fail if the key
// does not have a bucket.
bucket := paymentsBucket.NestedReadWriteBucket(k)
if bucket == nil {
Could the paymentsBucket be empty?
It could, but then ForEach will not iterate.
Actually, what I wanted to ask was whether the above bucket could be nil or not... If it could be, it seems we shouldn't return an error below.
Sorry, I misunderstood what you meant. The semantics of NestedRead(Write)Bucket are a bit tricky since it never returns an error. When it returns nil, it means that the bucket doesn't exist. In this case the payments-root-bucket can either be empty (no payments yet) or contain only buckets, which is what we ensure with this check.
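In other words, a sketch of the pattern being discussed (error message illustrative):

```go
// NestedReadWriteBucket never returns an error; a nil result is the only
// signal that the key doesn't point at a nested bucket.
bucket := paymentsBucket.NestedReadWriteBucket(k)
if bucket == nil {
	// The payments root bucket should contain only sub-buckets, so a
	// non-bucket key means the DB is in an unexpected state.
	return fmt.Errorf("key %x is not a bucket", k)
}
```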
channeldb/migration23/migration.go
Outdated
}

// Delete the attempt bucket.
return htlcsBucket.DeleteNestedBucket(aid)
Just to be safe, I think we need to do it in three steps,
- get the old keys
- put the new keys
- delete the old keys iff the second step succeeded
This all happens in one huge write transaction, therefore the order here doesn't play a huge role: if anything goes wrong, the whole transaction is rolled back.
But that is actually the main concern. Because we need every migration to be capable of running atomically, we'll end up migrating all payments in one transaction. For some nodes that might be a huge chunk of data.
Not sure what we can realistically do about it here...
I only have two ideas:
- Just go with it and hope it will be fine for all users. If anyone runs into a problem, we can advise them to delete their failed payments first (we might need [WIP] Batched payment deletion #5368 and a CLI tool for using that RPC as well).
- Do the migration in four steps: one for each key (info/settle/fail) where we just copy the data, and then a last step that deletes the attempt ID buckets. That way the individual transactions would still be atomic (meaning either all payments are copied or none are), but lnd will only start once all 4 steps have run. But if, let's say, step 3 fails, you're in a bad state as well, so this sounds like a bad idea...
Okay, PTAL. I think we should be safe from over-allocating ourselves. The only issue may be that bbolt is unable to handle such a large txn. Also, due to multiple bugs in bbolt cursors we can't safely Delete [1] or Put [2] inside the cursor as it turns out, so I rewrote the migration to iterate in Go instead.
[1] boltdb/bolt#357
[2] boltdb/bolt#268
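A sketch of that pattern (migrateAttempt is an illustrative stand-in for the copy logic, not the actual migration code): collect the keys while iterating, and only mutate the bucket after the ForEach cursor is done.

```go
// Collect the attempt IDs first; Put/Delete during cursor iteration is not
// safe with bbolt, so all mutations happen after ForEach returns.
var attemptIDs [][]byte
err := htlcsBucket.ForEach(func(aid, _ []byte) error {
	attemptIDs = append(attemptIDs, aid)
	return nil
})
if err != nil {
	return err
}

for _, aid := range attemptIDs {
	// Copy the attempt's info/settle/fail values to the flattened keys
	// (illustrative helper), then drop the old per-attempt sub-bucket.
	if err := migrateAttempt(htlcsBucket, aid); err != nil {
		return err
	}
	if err := htlcsBucket.DeleteNestedBucket(aid); err != nil {
		return err
	}
}
```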
> This all happens in one huge write transaction.
Yeah I was thinking we should do it in separate transactions, at least for the deletion.
Another idea is to limit how many records we do at once. Maybe we could update N payments each time, and perform this update several times?
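A rough sketch of that batching idea (entirely hypothetical; the helpers below don't exist, and lnd's migration framework currently runs each migration in a single transaction):

```go
// Hypothetical: migrate at most batchSize payments per DB transaction and
// repeat until no payments are left to migrate.
const batchSize = 1000

for {
	var migrated int

	// updateInOneTx stands in for whatever transaction helper the
	// migration framework would expose for chunked migrations.
	err := updateInOneTx(db, func(tx kvdb.RwTx) error {
		var err error
		migrated, err = migrateNextPayments(tx, batchSize)
		return err
	})
	if err != nil {
		return err
	}

	if migrated == 0 {
		break
	}
}
```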
channeldb/migration23/migration.go
Outdated
// Collect attempt/settle/fail infos.
attemptInfo := attemptBucket.Get(oldAttemptInfoKey)
if attemptInfo != nil {
	key := string(htlcBucketKey(htlcAttemptInfoKey, aid))
Why do we convert it into string?
The migration code is now very different, so we avoid this conversion.
Great PR, very easy to follow the diff!
Added a comment about the DB transaction size, but not sure what we can realistically do about it. Knowing how many round trips this will save us, I think we do want this change to be implemented.
channeldb/migration23/migration.go
Outdated
}

settleInfo := attemptBucket.Get(oldSettleInfoKey)
if settleInfo != nil {
Do we need to care about nil vs. empty slice here, depending on the backend used? Maybe use len() > 0 instead? They should never be empty if they exist, right?
They should never be empty, but I agree len() > 0 is safer. Done.
@@ -127,6 +127,10 @@ you.
  to make it less likely that we retry etcd transactions and make the commit
  queue more scalable.

* [Flatten the payment-htlcs-bucket](https://github.com/lightningnetwork/lnd/pull/5635)
  in order to make it possible to prefetch all htlc attempts of a payment in one
  DB operation.
Depending on the decision we make concerning the potentially large DB transaction, we might need to add some more information here about what the user can do if they run into problems.
ptal
538ff09 to 6fd0cab
Thanks for the reviews @guggero and @yyforyongyu. Changed migration logic to be more memory friendly and added more tests to cover all cases. PTAL.
LGTM 🎉
Tested the migration with 790k payments and it took about 40 seconds on my machine which isn't too bad at all!
htlcsMap[aid].Failure, err = readHtlcFailInfo(v)
if err != nil {
	return err
}
I'm wondering if there should be a default case that returns an error? Since there shouldn't be any other keys in this bucket anymore.
Yes indeed, that's a good point. Done
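A sketch of such a default case (exact key names and error wording are illustrative; the switch runs over the flattened per-attempt keys):

```go
switch {
case bytes.HasPrefix(k, htlcAttemptInfoKey):
	// ... decode the attempt info ...

case bytes.HasPrefix(k, htlcSettleInfoKey):
	// ... decode the settle info ...

case bytes.HasPrefix(k, htlcFailInfoKey):
	// ... decode the failure info ...

default:
	// After the migration only the three prefixes above may appear
	// inside a payment's htlcs bucket, so anything else is an error.
	return fmt.Errorf("unknown htlc attempt key: %x", k)
}
```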
func fetchHtlcAttemptInfo(bucket kvdb.RBucket) (*HTLCAttemptInfo, error) {
	b := bucket.Get(htlcAttemptInfoKey)
	if b == nil {
		return nil, errNoAttemptInfo
We're slightly changing the logic here. Now we won't fail anymore if there isn't an attempt info key for a payment. But I guess that's probably okay?
Yes, you're right. Added back a sanity check to ensure that all attempts have an attempt info.
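For illustration, such a check could compare the number of attempt-info records seen against the number of reconstructed attempts (the attemptInfoCount counter also shows up in a later diff in this conversation; names are otherwise illustrative):

```go
// Every attempt must have exactly one attempt-info record, while the
// settle/fail records are optional, so a count mismatch means an attempt
// is missing its info.
if attemptInfoCount != len(htlcsMap) {
	return nil, errNoAttemptInfo
}
```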
channeldb/migration23/migration.go
Outdated
if err := payments.ForEach(func(hash, v []byte) error {
	// Get the bucket which contains the payment, fail if the key
	// does not have a bucket.
	bucket := payments.NestedReadWriteBucket(hash)
nit: could use a read bucket.
done
// Collect all payment hashes so we can migrate payments one-by-one to
// avoid any bugs bbolt might have when invalidating cursors.
// For 100 million payments, this would need about 3 GiB memory so we
How many attempts are there for those 100 million payments?
As far as I understand there's no hard upper bound, but it's certainly limited by the graph itself, so we can say it's constant per payment? Also, since we migrate per payment, we won't pre-allocate for each attempt.
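For reference on the 3 GiB figure: a payment hash is 32 bytes, so 100 million hashes alone come to roughly 100,000,000 × 32 B ≈ 3.2 GB (≈ 3 GiB), which is where the estimate in the code comment comes from.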
Gotcha. I asked because I was curious about the 3 GiB memory usage.
What happens when the program exits during those 40 seconds? Will the migration continue again after restart?
Since it's one large DB transaction (which is what makes this tricky memory-wise for both bolt and remote DBs), it's only committed at the end. So if the process exits, the DB is rolled back and the full migration retries on the next startup.
6fd0cab to f023631
LGTM👍
More thoughts on this topic. I think the htlc attempts are only needed when the payment is not in a terminal state. Once the payment is done (failed/succeeded), the extra bandwidth used to carry htlc attempts is unnecessary. And since a payment is usually completed in a short time (60-second default timeout, iirc), most of the time those htlc attempt fetches could be avoided. It would be nice if we could somehow specify what to query in that one big DB transaction, kind of like constructing SQL queries beforehand the old-fashioned way. Of course this is future work/discussion.
hash3Str = "99eb3f0a48f954e495d0c14ac63df04af8cefa59dafdbcd3d5046d1f564784d1"
hash3 = hexStr(hash3Str)

// failing1 will fail because all payment hashes should point to sub
Very nice docs!
f023631 to f2cc783
Visit https://dashboard.github.orijtech.com?back=0&pr=5635&remote=true&repo=bhandras%2Flnd to see benchmark details.
LGTM
Just some small nits that can be cleaned up, but can be tacked onto another of the dependent PRs, want to keep this PR train moving along 🚄
// failure information, if any.
htlcFailInfoKey = []byte("htlc-fail-info")
// htlcFailInfoKey is the key used as the prefix of an HTLC attempt
// failure information, if any.The HTLC attempt ID is concatenated at
Missing space after the prior here, also some extra spaces after The.
htlcBucket := bucket.NestedReadBucket(k)
attemptInfoCount := 0
err := bucket.ForEach(func(k, v []byte) error {
	aid := byteOrder.Uint64(k[len(k)-8:])
The 8 seems to be the attempt ID itself; it could be lifted to a constant somewhere above.
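For example (constant name illustrative), since the attempt ID is a uint64 appended to the end of each key:

```go
// htlcAttemptIDLength is the byte length of the attempt ID that is
// appended to every per-attempt key.
const htlcAttemptIDLength = 8

aid := byteOrder.Uint64(k[len(k)-htlcAttemptIDLength:])
```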
Congrats @bhandras and well done! Indeed, to chime in on your report of a 25% improvement, this change showed good improvements across the benchmarks per https://dashboard.github.orijtech.com/benchmark/8dcabf559b0b435c9a05c203224c9fdd /cc @cuonglm @kirbyquerby @willpoint. Thanks @Roasbeef for adopting and using bencher :-) We plan on introducing better UX to improve the workflow and results.
This PR is a split from #5392: in order to be able to fetch the htlc attempts of a payment in one backend transaction, we flatten the payment-htlcs-bucket so that it no longer contains a sub-bucket for each individual attempt; instead we postfix the attempt info keys with the attempt ID. Furthermore, we shorten the key prefixes for convenience.
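Roughly, the schema change looks like this (prefix names are placeholders; the actual shortened prefixes are defined in the PR):

```
Before (one nested bucket per attempt):
  payment-htlcs-bucket/
    <attempt id>/            (nested bucket)
      attempt info
      settle info
      fail info

After (flat bucket, attempt ID appended to each key):
  payment-htlcs-bucket/
    <attempt-info-prefix><attempt id>
    <settle-info-prefix><attempt id>
    <fail-info-prefix><attempt id>
```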
Why is this flattening important? When using a remote DB backend (etcd or Postgres), fetching a payment can be especially heavy since each DB operation results in a round trip to the DB. By flattening the htlc bucket, where each attempt uses a global ID, we may be able to fetch the whole bucket in one go. An implementation of such a strategy can be found here: #5640
The speedup is roughly 25% when using the async-payments-benchmark integration test, and it helps with achieving stable performance in the bottlepay benchmark suite.