This repository has been archived by the owner on Feb 20, 2023. It is now read-only.
Replication Message Serialization Speedup #1570
Labels
performance
Performance related issues or changes.
Comments
If I left out a format that you think would be worth trying, please let me know.
BSON
branch: https://github.com/jkosh44/noisepage/tree/bson
UPDATE: The original JSON implementation converted the record contents themselves to and from CBOR, and the original BSON numbers kept that conversion in. The updated numbers remove that conversion.
Log Throughput Primary
Replication | Durability | Modifications | Log Throughput (records per millisecond) |
---|---|---|---|
Sync | Sync | None | 3.8700566481216665 |
Async | Sync | None | 30.736948475102366 |
Async | Async | None | 29.028352465595663 |
Log Throughput Replica
Durability | Log Throughput (records per millisecond) |
---|---|
Sync | 3.8883037697513125 |
Async | 3.874525911 |
MessagePack
branch: https://github.com/jkosh44/noisepage/tree/messagepack
Log Throughput Primary
Log Throughput Replica
UBJSON
branch: https://github.com/jkosh44/noisepage/tree/ubjson
Log Throughput Primary
Log Throughput Replica
CBOR
branch: https://github.com/jkosh44/noisepage/tree/cbor
Log Throughput Primary
Log Throughput Replica
Summary
Currently, we serialize all messages related to replication using JSON. The implementation can be found here:
noisepage/src/include/replication/replication_messages.h
Lines 36 to 108 in 97eb7ec
noisepage/src/replication/replication_messages.cpp
Lines 19 to 56 in 97eb7ec
Turning replication on slows the database down significantly, and one of the primary causes is the JSON serialization of messages. Below are some performance results from running TPC-C on dev10 with 8 threads under various database configurations:
Below are some metrics on log throughput for the primary node with various database configurations:
For reference, below are some metrics on log throughput for the replica node:
Solution
One solution is to switch from JSON to a different message format. I plan to investigate a handful of alternatives and measure their impact on log throughput and request throughput.
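To compare candidate formats on equal footing, each one can be timed with the same harness. The following is a minimal sketch rather than NoisePage code: `FakeRecord`, `Serializer`, `ThroughputResult`, and `MeasureThroughput` are hypothetical names introduced for illustration. It reports records per millisecond, the unit used for log throughput in this issue.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <string>
#include <vector>

// Hypothetical stand-in for a replicated log record; NOT the NoisePage type.
struct FakeRecord {
  std::uint64_t id;
  std::vector<std::uint8_t> payload;
};

// A candidate format is modelled as a function from a record to its encoded bytes.
using Serializer = std::function<std::string(const FakeRecord &)>;

struct ThroughputResult {
  double records_per_ms;    // serialized records per millisecond
  std::size_t total_bytes;  // total encoded size, also keeps the loop from being optimized away
};

// Serializes `record` `count` times and reports the observed throughput.
ThroughputResult MeasureThroughput(const Serializer &serialize, const FakeRecord &record,
                                   std::size_t count) {
  assert(count > 0);
  std::size_t total_bytes = 0;
  const auto start = std::chrono::steady_clock::now();
  for (std::size_t i = 0; i < count; ++i) {
    total_bytes += serialize(record).size();
  }
  const std::chrono::duration<double, std::milli> elapsed =
      std::chrono::steady_clock::now() - start;
  const double ms = elapsed.count();
  // Guard against a zero reading on very coarse clocks.
  return ThroughputResult{ms > 0.0 ? static_cast<double>(count) / ms : 0.0, total_bytes};
}
```

Passing one `Serializer` per format into `MeasureThroughput` with the same record and count gives numbers that are directly comparable. A microbenchmark like this isolates serialization cost only, so it complements the end-to-end log throughput measurements rather than replacing them.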
Nlohmann
We use the Nlohmann JSON package to implement JSON in NoisePage. This package also comes with several binary formats built in. It's probably worth trying all of them, since each can be implemented by changing a couple of lines. Some require you to first convert your data to JSON before converting it to the binary format, and it's unclear to me whether this carries a significant performance penalty compared to converting directly to the message format.
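For comparison with the JSON-first path, here is a sketch of what encoding directly to bytes could look like. It is an illustration under assumptions, not the NoisePage message layout: `ReplicationMsg`, its fields, and the fixed-width little-endian layout are all hypothetical, and `EncodeText` shows only one common way of representing raw bytes in JSON text (an array of numbers).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical message shape; NOT the actual NoisePage replication message.
struct ReplicationMsg {
  std::uint64_t message_id;
  std::vector<std::uint8_t> payload;
};

// Appends `value` as 8 little-endian bytes.
void PutU64(std::vector<std::uint8_t> *out, std::uint64_t value) {
  for (int shift = 0; shift < 64; shift += 8) {
    out->push_back(static_cast<std::uint8_t>(value >> shift));
  }
}

// Reads 8 little-endian bytes starting at `offset`.
std::uint64_t GetU64(const std::vector<std::uint8_t> &in, std::size_t offset) {
  assert(offset + 8 <= in.size());
  std::uint64_t value = 0;
  for (int i = 0; i < 8; ++i) {
    value |= static_cast<std::uint64_t>(in[offset + i]) << (8 * i);
  }
  return value;
}

// Direct binary layout: [message_id: 8 bytes][payload length: 8 bytes][payload bytes].
std::vector<std::uint8_t> EncodeBinary(const ReplicationMsg &msg) {
  std::vector<std::uint8_t> out;
  out.reserve(16 + msg.payload.size());
  PutU64(&out, msg.message_id);
  PutU64(&out, msg.payload.size());
  out.insert(out.end(), msg.payload.begin(), msg.payload.end());
  return out;
}

// Returns false if the buffer is truncated or its length field is inconsistent.
bool DecodeBinary(const std::vector<std::uint8_t> &bytes, ReplicationMsg *out) {
  if (bytes.size() < 16) return false;
  const std::uint64_t length = GetU64(bytes, 8);
  if (length != bytes.size() - 16) return false;
  out->message_id = GetU64(bytes, 0);
  out->payload.assign(bytes.begin() + 16, bytes.end());
  return true;
}

// JSON-style text form with the payload written as an array of decimal numbers.
std::string EncodeText(const ReplicationMsg &msg) {
  std::string out = "{\"message_id\":" + std::to_string(msg.message_id) + ",\"payload\":[";
  for (std::size_t i = 0; i < msg.payload.size(); ++i) {
    if (i != 0) out += ',';
    out += std::to_string(msg.payload[i]);
  }
  out += "]}";
  return out;
}
```

In this layout the binary form costs a fixed 16-byte header plus one byte per payload byte, while the text form grows with the decimal width of every byte and has to be parsed character by character on the receiving side. Whether that difference matters in practice for NoisePage is exactly what the measurements would need to show.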
Alternatives
Below are a handful of message formats I have found from some brief research. I plan on narrowing this down to roughly 4 after some more research.
Dependency Bloat
One consideration when adopting a new message format is dependency bloat. I don't plan on writing my own implementation of any of these formats, so we'll have to bring in third-party libraries. It will be important to bring in no more than necessary.