HBASE-28951 rename the wal before retrying the wal-split with another worker #6534
While going through the code I saw some comments and code that do not align. As per this comment,
@Umeshkumar9414 So the idea here is to have a retry counter attached to the WAL name, and whenever the WAL split fails and another worker picks up the same WAL, it increments the counter?
@Umeshkumar9414 If the SplitWAL procedure fails and the root procedure also fails, then how is that handled?
In general I think the approach is OK, renaming is a typical way for fencing. But I suggest we keep the old behavior when there is no retry, so we can get better compatibility.
hbase-server/src/main/java/org/apache/hadoop/hbase/master/procedure/SplitWALProcedure.java
hbase-server/src/main/java/org/apache/hadoop/hbase/wal/AbstractFSWALProvider.java
} else {
  originalWALPath = walPath.substring(0, walPath.length() - RETRYING_EXT.length() - 3);
}
String walNewName =
So when retrying number is 0, we also have the '.retrying' suffix? Will this cause trouble when upgrading?
No, when the retry number (workerChangeCount) is 0 we don't have any suffix, so this should not cause any trouble when upgrading. As @mnpoonia pointed out, I do need to handle one case: when the SCP is rolled back and a second SCP creates another SplitWALProcedure, the name will already contain the retrying suffix.
I do not think the current code reflects your explanation here...
If you do not want to change the WAL name when retry count == 0, shouldn't you just return at the first if condition?
We call this method in the RELEASE_SPLIT_WORKER state. At this point the first try of the WAL split is already complete. We only reach here if the first try was not able to split the WAL.
Then please add this comment as a javadoc of this method?
done
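For readers following the naming discussion in this thread, here is a minimal sketch of how a retry suffix could be derived. It is not the code from this PR: the class name, both method names, and the three-digit zero-padded counter are assumptions inferred from the `- 3` in the substring call and from the `retrying001` example mentioned in the review. Only `RETRYING_EXT` comes from the diff.

```java
import java.util.Locale;

// Hypothetical helper, not part of HBase. Sketches the naming scheme discussed in this thread.
final class WalRetryNameSketch {
  // Same constant value as in the diff.
  static final String RETRYING_EXT = ".retrying";
  // Assumption: the counter is always three digits (e.g. "001").
  static final int COUNTER_WIDTH = 3;

  /** Returns walPath without a trailing ".retryingNNN" suffix, or walPath unchanged if absent. */
  static String stripRetrySuffix(String walPath) {
    int idx = walPath.lastIndexOf(RETRYING_EXT);
    int counterStart = idx + RETRYING_EXT.length();
    if (idx < 0 || counterStart + COUNTER_WIDTH != walPath.length()) {
      return walPath;
    }
    for (int i = counterStart; i < walPath.length(); i++) {
      if (!Character.isDigit(walPath.charAt(i))) {
        return walPath;
      }
    }
    return walPath.substring(0, idx);
  }

  /** Assumption: a worker change count of 0 keeps the original name for compatibility. */
  static String nameForRetry(String walPath, int workerChangeCount) {
    String original = stripRetrySuffix(walPath);
    if (workerChangeCount == 0) {
      return original;
    }
    return original + RETRYING_EXT + String.format(Locale.ROOT, "%03d", workerChangeCount);
  }
}
```

Stripping any existing suffix first is what would cover the rolled-back SCP case raised above, where a new SplitWALProcedure finds a WAL that already carries a retrying suffix.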
Yeah, I didn't make any changes for the case where there is no retry and kept that as it was.
Thanks @mnpoonia for pointing this out. I need to handle the case where the parent SCP fails and, let's say, we have created another SCP. It will just list all the files in the WAL directory and create a SplitWALProcedure for each of them, but yes, I need to handle the first retry with retryCount 0.
1b4fef4 to 10e9b7c
What about adding a new step after acquiring the worker to rename the WAL file, where we just append the worker's name to the WAL file name as a suffix? And we need to be very careful when dealing with retrying... There are several problems currently
cd0af96 to a2f732d
@@ -237,6 +237,9 @@ static void requestLogRoll(final WAL wal) {
/** File Extension used while splitting an WAL into regions (HBASE-2312) */
public static final String SPLITTING_EXT = "-splitting";

// Extension for the WAL where the split failed on one worker and is being retried on another.
public static final String RETRYING_EXT = ".retrying";

/**
 * Pattern used to validate a WAL file name see {@link #validateWALFilename(String)} for
 * description.
While splitting the WAL for the meta table, the WAL name can be rs.XXX.meta.retrying001. Do you think we should update WAL_FILE_NAME_PATTERN? Although during splitting we don't check for a valid WAL name.
@Apache9 what do you think?
a2f732d to dd576cd
public boolean ifExistRenameWALForRetry(String walPath, String postRenameWalPath)
  throws IOException {
  if (fs.exists(new Path(rootDir, walPath))) {
    if (!fs.rename(new Path(rootDir, walPath), new Path(rootDir, postRenameWalPath))) {
Hi, I'm not terribly familiar with wal split.
Does the WAL file get closed by this time? I'm asking because Ozone doesn't yet support renaming open files. And supporting that is quite a big project itself.
Even though that's not yet a huge problem for HBase, since HBase doesn't run on Ozone by default, it would be great if we don't attempt to rename open files.
Thanks!
At this point there are cases where the WAL file will still be open. In the current code we call recoverLease on the RS once the master assigns the split to the worker RS. @Apache9 do you think we should move the recoverLease to the master?
Before this rename we also rename the WAL directory for the RS. @jojochuang is renaming a directory different from renaming a file? If directory rename is also not supported when a file inside the directory is open, then we need changes in the current code as well.
I think this should have been done before we rename the wal directory?
Thanks guys.
I think what's more relevant for HBase is that it used to cause race conditions if the WAL files are kept open while being renamed. HBASE-27732 fixed one such bug -- because HDFS allows renaming open files, it doesn't fail immediately but it causes NPE later. Ozone fails right away with that bug. Took us a few days to find out.
(I need to check but I think directory rename is fine for Ozone in this case)
Since recoverLease doesn't support directory paths (https://github.com/apache/hadoop/blob/fb1bb6429dfb4e45687e0bc507c5a2ed26bd0bb0/hadoop-common-project/hadoop-common/src/site/markdown/filesystem/leaserecoverable.md), we need to call recoverLease for each file. We also have to call recoverLease before renaming the WAL file.
By the way, at least for Hadoop I think both orders (recoverLease after the rename or before it) are fine, since renaming is a metadata operation and the data is linked to INodes.
🎊 +1 overall
This message was automatically generated.
💔 -1 overall
This message was automatically generated.
Added a worker-change counter so that each retry gets a new name; this is needed in case the supposedly dead RS starts processing the WAL after some time. I checked the WAL name pattern that we use for validating the WAL, which is
(.+)\.(\d+)(\.[0-9A-Za-z]+)?
The new name fits this pattern.
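A quick way to check names against the pattern quoted above is sketched below. The class and method names are hypothetical, the sample WAL names are made up, and the sketch assumes whole-string matching and the `.retrying001` suffix form mentioned earlier in the review. Under those assumptions a regular WAL name with a retry suffix matches, while a meta WAL name with an extra retry suffix does not, because the pattern allows only one trailing alphanumeric segment after the numeric one.

```java
import java.util.regex.Pattern;

// Hypothetical checker, not HBase code. The regex text is copied from the comment above.
final class WalNamePatternCheck {
  static final Pattern WAL_FILE_NAME_PATTERN =
    Pattern.compile("(.+)\\.(\\d+)(\\.[0-9A-Za-z]+)?");

  /** Assumption: validation requires the whole file name to match the pattern. */
  static boolean isValid(String walFileName) {
    return WAL_FILE_NAME_PATTERN.matcher(walFileName).matches();
  }

  public static void main(String[] args) {
    String base = "rs1%2C16020%2C1700000000000.1700000001234"; // made-up WAL name
    System.out.println(isValid(base));                        // plain WAL
    System.out.println(isValid(base + ".retrying001"));       // retried WAL
    System.out.println(isValid(base + ".meta"));              // meta WAL
    System.out.println(isValid(base + ".meta.retrying001"));  // retried meta WAL
  }
}
```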