Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Panic if we're running with outdated state instead of force-closing #1564

Merged
merged 2 commits into from
Jun 27, 2022

Conversation

TheBlueMatt
Copy link
Collaborator

When we receive a channel_reestablish with a data_loss_protect
that proves we're running with a stale state, instead of
force-closing the channel, we immediately panic. This lines up with
our refusal to run if we find a ChannelMonitor which is stale
compared to our ChannelManager during ChannelManager
deserialization. Ultimately both are an indication of the same
thing - that the API requirements on chain::Watch were violated.

In the "running with outdated state but ChannelMonitor(s) and
ChannelManager lined up" case specifically its likely we're running
off of an old backup, in which case connecting to peers with
channels still live is explicitly dangerous. That said, because
this could be an operator error that is correctable, panicing
instead of force-closing may allow for normal operation again in
the future (cc #1207).

In any case, we provide instructions in the panic message for how
to force-close channels prior to peer connection, as well as a note
on how to broadcast the latest state if users are willing to take
the risk.

Note that this is still somewhat unsafe until we resolve #1563.

@TheBlueMatt TheBlueMatt added this to the 0.0.110 milestone Jun 23, 2022
@codecov-commenter
Copy link

codecov-commenter commented Jun 23, 2022

Codecov Report

Merging #1564 (162024b) into main (3676a05) will increase coverage by 0.06%.
The diff coverage is 85.93%.

❗ Current head 162024b differs from pull request most recent head caa2a9a. Consider uploading reports for the commit caa2a9a to get more accurate results

@@            Coverage Diff             @@
##             main    #1564      +/-   ##
==========================================
+ Coverage   91.04%   91.10%   +0.06%     
==========================================
  Files          80       80              
  Lines       44034    44412     +378     
  Branches    44034    44412     +378     
==========================================
+ Hits        40091    40462     +371     
- Misses       3943     3950       +7     
Impacted Files Coverage Δ
lightning/src/ln/channel.rs 88.72% <0.00%> (-0.01%) ⬇️
lightning/src/util/events.rs 41.66% <ø> (ø)
lightning/src/ln/functional_tests.rs 96.92% <79.48%> (-0.30%) ⬇️
lightning-background-processor/src/lib.rs 95.20% <100.00%> (ø)
lightning-persister/src/lib.rs 93.45% <100.00%> (ø)
lightning/src/ln/chanmon_update_fail_tests.rs 97.71% <100.00%> (ø)
lightning/src/ln/channelmanager.rs 84.74% <100.00%> (+0.35%) ⬆️
lightning/src/ln/payment_tests.rs 99.25% <100.00%> (ø)
lightning/src/ln/priv_short_conf_tests.rs 96.60% <100.00%> (ø)
lightning/src/chain/onchaintx.rs 93.98% <0.00%> (-0.93%) ⬇️
... and 4 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 3676a05...caa2a9a. Read the comment docs.

Copy link
Contributor

@wpaulino wpaulino left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, need to fix the build errors reported in https://github.com/lightningdevkit/rust-lightning/runs/7032273201.

lightning/src/ln/channelmanager.rs Outdated Show resolved Hide resolved
lightning/src/ln/channel.rs Outdated Show resolved Hide resolved
lightning/src/ln/channel.rs Outdated Show resolved Hide resolved
lightning/src/ln/channel.rs Outdated Show resolved Hide resolved
Copy link
Contributor

@jkczyz jkczyz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like fuzz tests need updating.

lightning/src/ln/channelmanager.rs Outdated Show resolved Hide resolved
lightning/src/ln/channel.rs Outdated Show resolved Hide resolved
lightning/src/ln/channel.rs Outdated Show resolved Hide resolved
return Err(ChannelError::CloseDelayBroadcast(
"We have fallen behind - we have received proof that if we broadcast remote is going to claim our funds - we can't do any automated broadcasting".to_owned()
));
macro_rules! log_and_panic {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is a macro needed? Could just use a variable for $err_msg.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not directly - format strings have to be an explicit string, they can't be a &'static str or whatever, so I'm not sure how to do this without either repeating the whole string or a macro.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe you can format it first with let err_msg = format!("..", ..) and then use "{}", err_msg in each statement. Or maybe there is a more idiomatic way of using format_args!?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, no, format_args doesn't seem to do it either. Do you find the current code particularly unreadable? It seems fine to me, given its the one way to do it without building the full string first.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nah, just figured we could do it without using a macro. Given it's gonna panic there's not a huge argument for building the string. Fine either way.

@TheBlueMatt TheBlueMatt force-pushed the 2022-06-panic-on-behind branch from 7126128 to 5fc55c3 Compare June 24, 2022 15:24
lightning/src/ln/channelmanager.rs Outdated Show resolved Hide resolved
return Err(ChannelError::CloseDelayBroadcast(
"We have fallen behind - we have received proof that if we broadcast remote is going to claim our funds - we can't do any automated broadcasting".to_owned()
));
macro_rules! log_and_panic {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe you can format it first with let err_msg = format!("..", ..) and then use "{}", err_msg in each statement. Or maybe there is a more idiomatic way of using format_args!?

lightning/src/ln/channel.rs Outdated Show resolved Hide resolved
@TheBlueMatt TheBlueMatt force-pushed the 2022-06-panic-on-behind branch from 5fc55c3 to 162024b Compare June 24, 2022 18:24
jkczyz
jkczyz previously approved these changes Jun 24, 2022
@wpaulino
Copy link
Contributor

@TheBlueMatt feel free to squash.

If a user restores from a backup that they know is stale, they'd
like to force-close all of their channels (or at least the ones
they know are stale) *without* broadcasting the latest state,
asking their peers to do so instead. This simply adds methods to do
so, renaming the existing `force_close_channel` and
`force_close_all_channels` methods to disambiguate further.
When we receive a `channel_reestablish` with a `data_loss_protect`
that proves we're running with a stale state, instead of
force-closing the channel, we immediately panic. This lines up with
our refusal to run if we find a `ChannelMonitor` which is stale
compared to our `ChannelManager` during `ChannelManager`
deserialization. Ultimately both are an indication of the same
thing - that the API requirements on `chain::Watch` were violated.

In the "running with outdated state but ChannelMonitor(s) and
ChannelManager lined up" case specifically its likely we're running
off of an old backup, in which case connecting to peers with
channels still live is explicitly dangerous. That said, because
this could be an operator error that is correctable, panicing
instead of force-closing may allow for normal operation again in
the future (cc lightningdevkit#1207).

In any case, we provide instructions in the panic message for how
to force-close channels prior to peer connection, as well as a note
on how to broadcast the latest state if users are willing to take
the risk.

Note that this is still somewhat unsafe until we resolve lightningdevkit#1563.
@TheBlueMatt
Copy link
Collaborator Author

$ git diff-tree -U1 162024b6 caa2a9a5
$

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Don't Always Broadcast latest state on startup if we'd previously closed-without-broadcast
4 participants