feat: Add the ability to checkpoint an existing server, and spawn a read-only server on that view. #2548

Open
wants to merge 24 commits into
base: unstable

@nathanlo99 nathanlo99 commented Sep 20, 2024

It would be nice to be able to spawn a read-only server on a snapshot of a given server.

This PR implements this feature, and introduces the flag --snapshot-dir to manage the snapshot.

@nathanlo99
Author

--- FAIL: TestBitmap (846.21s)
    --- FAIL: TestBitmap/SETBIT/GETBIT/BITCOUNT/BITPOS_boundary_check_(type_bitmap) (150.90s)
        bitmap_test.go:204: 
            	Error Trace:	/Users/runner/work/kvrocks/kvrocks/tests/gocase/unit/type/bitmap/bitmap_test.go:204
            	Error:      	Received unexpected error:
            	            	read tcp 127.0.0.1:50435->127.0.0.1:50429: i/o timeout
            	Test:       	TestBitmap/SETBIT/GETBIT/BITCOUNT/BITPOS_boundary_check_(type_bitmap)

The above check failure seems to be an unrelated i/o timeout; is there a way to rerun the checks?

@git-hulk
Member

git-hulk commented Oct 5, 2024

> The above check failure seems to be an unrelated i/o timeout; is there a way to rerun the checks?

Yes, it's a flaky test and not related to this PR.

@nathanlo-hrt
Contributor

@PragmaTwice @git-hulk could you take another look at this? It seems the previous opportunity to merge was missed because new changes made it to the unstable branch.

@@ -75,12 +76,46 @@ const int64_t kIORateLimitMaxMb = 1024000;

using rocksdb::Slice;

static Status CreateSnapshot(Config &config, const std::string &snapshot_location) {
Member

Actually, I don't really understand how this works. It seems this creates a checkpoint when opening the file? Do you need to create it on the server and consume it in the CLI?

Author

@nathanlo99 nathanlo99 Nov 5, 2024

> Do you need to create it on the server and consume it in the CLI?

I'm not sure I understand the second part of your question.

> Actually, I don't really understand how this works. It seems this creates a checkpoint when opening the file?

As for how this works: it reads the data from the directory passed via --dir, uses RocksDB's snapshot functionality to make a read-only copy of that data in the --snapshot-dir directory, and then spawns a read-only server on that copy.

This is helpful for my team to debug a running kvrocks app without accidentally changing the values of the data or closing the existing read-write server.
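
The flow described above (copy the data directory, then serve the copy read-only) can be illustrated with a small standard-library sketch. It does not use RocksDB and is not the PR's code: `MakeSnapshotCopy` is a hypothetical helper that only mimics how RocksDB's checkpoint utility hard-links immutable SST files into a new directory when source and target share a filesystem, and copies them otherwise. It ignores the manifest, the WAL, and subdirectories, all of which a real checkpoint has to handle.

```cpp
#include <cassert>
#include <filesystem>
#include <fstream>
#include <string>
#include <system_error>

namespace fs = std::filesystem;

// Hypothetical helper (an assumption for illustration, not part of Kvrocks or
// RocksDB): materialize a point-in-time copy of `source_dir` in `snapshot_dir`.
// Each regular file is hard-linked; if linking fails (for example across
// filesystems) the file is copied byte for byte instead.
// Returns an empty string on success, otherwise an error message.
inline std::string MakeSnapshotCopy(const fs::path &source_dir, const fs::path &snapshot_dir) {
  std::error_code ec;
  if (!fs::is_directory(source_dir, ec)) {
    return "source is not a directory: " + source_dir.string();
  }

  fs::create_directories(snapshot_dir, ec);
  if (ec) return "cannot create snapshot dir: " + ec.message();

  // Refuse to mix a new snapshot with leftover files from an earlier one.
  if (!fs::is_empty(snapshot_dir, ec) || ec) {
    return "snapshot dir must be empty: " + snapshot_dir.string();
  }

  for (const auto &entry : fs::directory_iterator(source_dir, ec)) {
    std::error_code type_ec;
    if (!entry.is_regular_file(type_ec)) continue;  // subdirectories are skipped in this sketch

    const fs::path target = snapshot_dir / entry.path().filename();
    std::error_code link_ec;
    fs::create_hard_link(entry.path(), target, link_ec);
    if (link_ec) {
      std::error_code copy_ec;
      fs::copy_file(entry.path(), target, copy_ec);
      if (copy_ec) return "cannot copy " + entry.path().string() + ": " + copy_ec.message();
    }
  }
  if (ec) return "cannot list source dir: " + ec.message();
  return "";
}
```

Calling `MakeSnapshotCopy("kvrocks-dir", "debug-dir")` would leave `debug-dir` holding the same files as `kvrocks-dir`. Hard links share their bytes with the original, so this is only a safe snapshot because SST files are never modified in place; files deleted from the source afterwards stay readable through the snapshot.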

@mapleFU
Member

mapleFU commented Oct 22, 2024

Would you mind also adding an example of how you'd like to use this? (Preferably in the CLI docs or the project docs?)

@nathanlo99
Author

Apologies for the linter back-and-forth: I'm unable to build this locally with the same setup the linter uses, so a lot of these issues fall through the cracks.

@nathanlo99
Author

@mapleFU I'm not sure how to edit the docs, but one example workflow is the following.

Start a read-write Kvrocks server that stores its important data in ~/kvrocks-dir:

$ kvrocks [...] --dir ~/kvrocks-dir

Say the user realizes some of the important data is incorrect: they'd like to debug their script and run it on a copy of the data, possibly changing it. They'd like to do this without fear of corrupting or changing the original server's copy of the data:

$ mkdir ~/debug-dir
$ kvrocks [...] --dir ~/kvrocks-dir --snapshot-dir ~/debug-dir
