A bitcask file can get merged repeatedly [JIRA: RIAK-1844] #205
Comments
This problem was noticed on Riak 1.4.8, and although a lot has changed in 1.4.12, it looks to me like the issue may still be present. I am not sure whether Riak 2.0.x versions are affected or not.
Thanks for the report and the analysis, @dszoboszlay. I'm the engineer who looked at your assessment in the Zendesk ticket, and I agree with it. Bitcask in 2.0.1+ has more knobs to limit the number of disk bytes to merge, which would avoid massive merge spikes and make this less of a problem in many cases. We should have a fix soon that prevents the merge process from merging a file already marked for deletion, and that also filters those files out early, when needs_merge runs the heuristic to determine which files to put in the merge queue.
You may also consider speeding up the delete process by tweaking |
There is one delete process per partition; they don't interfere with each other, @dszoboszlay.
Are you sure? The |
Yes, you are right, @dszoboszlay, I hadn't noticed. That is indeed suboptimal.
This behavior has also been seen in the following Zendesk ticket: https://basho.zendesk.com/agent/tickets/9308
@engelsanchez FYI, seeing this on another ticket (1.4.12 installation): https://basho.zendesk.com/agent/tickets/11440
A bitcask file is not deleted immediately after a merge, but handed off to `bitcask_merge_delete` for a deferred delete. In the meantime it is marked by turning on its setuid bit.

Calling `bitcask:merge/1` merges the readable files, a list that does not include those marked for deletion.

However, the `riak_kv_bitcask_backend` starts a merge by specifying the exact list of files to merge. That list comes from `bitcask:needs_merge/1`, which does not filter out files marked for deletion.

This means that if a file is not deleted within 3 minutes after it is merged, it will be enlisted for the next merge too. This issue shouldn't happen too often, but it is magnified when all merging is delayed until the merge window, at which point a huge amount of merging activity can suddenly begin. Also, `bitcask_merge_delete` is a per-node (not per-vnode) process, so a long-running fold operation in any of the vnodes (e.g. MDC replication) may block the deferred delete queue for a long time.

The result is unnecessary use of disk IO and CPU, but there is no other risk, e.g. no chance of data loss.
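
To make the missing filter concrete, here is a rough sketch of the idea. It is a Python model, not Bitcask's actual Erlang code, and the function names (`mark_for_deletion`, `is_marked_for_deletion`, `filter_merge_candidates`) are hypothetical. It only assumes what is described above: a merged file is flagged by turning on its setuid bit, and the list of merge candidates should skip flagged files.

```python
import os
import stat


def mark_for_deletion(path):
    """Flag a file as pending deletion by turning on its setuid bit.

    Hypothetical helper modelling the marking step described above.
    """
    mode = stat.S_IMODE(os.stat(path).st_mode)
    os.chmod(path, mode | stat.S_ISUID)


def is_marked_for_deletion(path):
    """Return True if the file's setuid bit is set."""
    return bool(os.stat(path).st_mode & stat.S_ISUID)


def filter_merge_candidates(paths):
    """Drop files that are already waiting for a deferred delete.

    This is the filtering step the report says is missing when the
    list of files to merge is built.
    """
    return [p for p in paths if not is_marked_for_deletion(p)]
```

With a filter like this applied to the candidate list, a file that has been merged but is still waiting in the deferred delete queue would not be picked up again by the next merge.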