-
Notifications
You must be signed in to change notification settings - Fork 997
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Ensure ACKs are sent after backup #921
Conversation
…times sent to the master before binlog events were fully written and fsynced to disk during backup operations. Sending ACKs prematurely in semi-synchronous replication could lead to data loss if the replica fails after sending the ACK but before persisting the event. Key changes: - Introduced an `EventHandler` interface with a `HandleEvent` method for processing binlog events. This allows custom event handling logic to be injected into the replication stream. - Added an `eventHandler` field to `BinlogSyncer` and provided a `SetEventHandler` method to assign an event handler. This enables `BinlogSyncer` to delegate event processing to the assigned handler. - Implemented `BackupEventHandler` which writes binlog events to disk and ensures that each event is fsynced before returning. This ensures data durability before ACKs are sent. - Modified the `onStream` method in `BinlogSyncer` to separate event parsing (`parseEvent`) from event handling and ACK sending (`handleEventAndACK`). This adheres to the single-responsibility principle and makes the code cleaner. - Moved state updates (e.g., updating `b.nextPos`) and GTID set handling from `parseEvent` to `handleEventAndACK` to avoid side effects during parsing. - Ensured that ACKs are sent only after the event has been fully processed and fsynced by sending the ACK in `handleEventAndACK` after event handling.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
will review later
…h SynchronousEventHandler. Simplify the event processing in BinlogSyncerConfig by introducing SynchronousEventHandler for synchronous event handling. Update StartBackup, StartBackupWithHandler, and associated tests to reflect these changes.
bcb07a6
to
d162219
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
rest lgtm. Thanks.
|
||
// Force use raw mode | ||
b.parser.SetRawMode(true) | ||
|
||
// Set up the backup event handler | ||
backupHandler := &BackupEventHandler{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we don't need to change StartBackupWithHandler
now? Because your requirement does not use StartBackupWithHandler
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the logic of writing on disk is contained in
func (h *BackupEventHandler) HandleEvent(e *BinlogEvent) error {
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
in the old code, we still can use handler
to create io.WriteCloser
and write events to disk.
e, err := s.GetEvent(ctx)
...
if n, err := w.Write(e.RawData); err != nil {
return errors.Trace(err)
} else if n != len(e.RawData) {
return errors.Trace(io.ErrShortWrite)
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, but I don't want to duplicate the code in HandleEvent
. Do you want me to remove the BackupEventHandler
struct and have HandleEvent
be static?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
duplicate the code in HandleEvent
Sorry I don't know where is the duplication. Can you elaborate?
Do you want me to remove the BackupEventHandler struct and have HandleEvent be static?
No I don't mean that. I mean reverting StartBackupWithHandler
to old code. Your StartSynchronousBackup
is fine. And BackupEventHandler
seems no need to be public (exported)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
StartBackupWithHandler
implemented the code in HandleEvent
which is also used by the SynchronousEventHandler
in replication/binlogsyncer.go
. As a helpful shorthand to prevent the need for clients to reimplement HandleEvent
in their code, we can leave it extracted as its own method.
For example, if we reverted StartBackupWithHandler
we'd either have that code (if e.Header.EventType == ROTATE_EVENT
...) duplicated in HandleEvent
or, if we removed that as well (as you suggested) we'd have it duplicated in the test code. Does that make sense?
If you have a different idea of how this should all change, can you upload an example diff?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh I see the duplication is if e.Header.EventType == ROTATE_EVENT
, thanks. I think it's reasonable.
And here's my suggestions, please check them.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if we make newBackupEventHandler
private, clients won't be able to do what backup_test.go (an example client) is doing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if they are the same behaviour as BackupEventHandler, they can directly use SynchronousEventHandler. if they have more customised behaviour, current BackupEventHandler is not extendable to other behaviour I think (no place to inject new logic?). so they still need to write many lines of code.
maybe you think they can wrap a structure on BackupEventHandler, dispatch event by event type to BackupEventHandler or the parent structure? I can't think of a real use case for now.
oh you are right 👍
replication/backup_test.go
Outdated
|
||
if isSynchronous { | ||
// Set up a BackupEventHandler for synchronous mode | ||
backupHandler := &BackupEventHandler{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
after my previous comment, now the BackupEventHandler
is only used in this test. Maybe move its declaration to this test file.
… return the behavior to 30 days _between_ events, restore some comments, use struct instead of bool as recommended, add a note about SynchronousEventHandler and the parseEvent return values
18cacdc
to
c3188f5
Compare
c3188f5
to
06c9268
Compare
…ivate handler outside the package
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm now. please address if you like the new way to deal with select-contexts. not a big problem. I'll merge it when it's daytime in my timezone.
Summary
This PR addresses an issue where acknowledgments (ACKs) were sometimes sent to the master before binlog events were fully written and fsynced to disk during backup operations. Sending ACKs prematurely in semi-synchronous replication could lead to data loss if the replica fails after sending the ACK but before persisting the event.
To fix this issue, the following changes have been made:
SynchronousEventHandler
inBinlogSyncerConfig
to ensure that events are fully processed and persisted before sending ACKs.StartBackup
to use the synchronous event handler when it's set, enabling synchronous handling during backup operations.BackupEventHandler
that writes events to disk and ensures they are fully persisted before acknowledging.onStream
andparseEvent
methods inBinlogSyncer
to support both synchronous and asynchronous event handling modes.Testing
TestAsyncBackup
andTestSyncBackup
to verify the backup process in both asynchronous and synchronous modes.