-
Notifications
You must be signed in to change notification settings - Fork 312
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
sanoid --prune-snapshots performance is bad because of iszfsbusy #912
Comments
@jimsalterjrs i think we can remove the whole send/recv check. zfs destroy will error out if the snapshot is in use and if we need to know the reason why it failed we can capture the output. What do you think? If you are okay with it a will create a PR for it. |
Sounds fine... I knew this was belt-and-suspenders, but did not expect iszfsbusy() to cause so much of a slowdown. Is the slowdown our own fault, or is something upstream performing less-well than it ought to? If the latter, we should probably file a bug upstream despite "fixing" it on our end by not bothering to check. Thanks for offering to create the PR! |
I think the performance impact is just to big when having "empty" snapshots (like sanoid frequently during the nights). Destroying them happens in an instant, but (Normally with a couple of snaps to be deleted, it should have no big effect, but when new people come to sanioid, they may do so because of an issue, like facing slowness and finding 100.000 snaps, so they fix the rules and then the performance impacts. I think I found exactly one such issue in the internet, where someone proposed a script being "ten times faster" (IIRC) possibly for the same reason.) But for @phreaker0 I have to state that I have no patch yet! I think changing the default value shouldn't be too difficult and I think could prepare a patch proposal. Or did you already took a look? What do you all think? |
…y making --force-prune default (and adding --no-force-prune to get old behavior
…y removing unneeded iszfsbusy()
…y making --force-prune default (and adding --no-force-prune to get old behavior)
…y making --force-prune default (and adding --double-check-prune to get old behavior)
…y making --force-prune default (and adding --double-check-prune to get old behavior)
…y making --force-prune default (and adding --no-force-prune to get old behavior)
…y removing unneeded iszfsbusy()
Indeed the change seems very straightforward (I hope I did not miss anything). I see three major approaches and I made a proposal patch for each:
|
That are alternatives - of course one and only one could be merged I'm happy to receive feedback for the choosen option and to improve my proposal according to your review! |
I noticed that on my machine sanoid appears a little slow. I reduced snaps to free a bit of space and first run sanoid got a bit to work on. I noticed that deletion takes ~100ms per zfs destroy, but sanoid needs ~500ms for each. I think because iszfsbusy() takes much time, and it is not executed once per data set, but for each snapshot (I think it is a race anyway).
Do I understand correctly
--force-prune
skips that check (and does nothing else)? Under which circumstances does it help?With
--force-prune
, I get more than5 x the speedEDIT: 10 x the speed and I think it is still safe unless the expiry rules would be so short that they would expire recent snapshots which could be used by replication (and then, if send would start a fraction of a second later, would be lost anyway).Do I understand correctly that this check normally should not be needed, and even if zfs send would be running, there would only be an error message but no disturbance? Do we still need this or could we make
--force-prune
the "default"?The text was updated successfully, but these errors were encountered: