Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

HBASE-28951 rename the wal before retrying the wal-split with another worker #6534

Open
wants to merge 6 commits into
base: master
Choose a base branch
from

Conversation

Umeshkumar9414
Copy link
Contributor

Added an workerchange counter so that each time we can have a new name, that is needed in case the supposed dead RS starts to process the WAL after some time. I checked that wal name pattern, that we use for validating the wal is (.+)\.(\d+)(\.[0-9A-Za-z]+)? . This change is fitting there.

@virajjasani virajjasani self-requested a review December 12, 2024 19:01
@Umeshkumar9414
Copy link
Contributor Author

Umeshkumar9414 commented Dec 12, 2024

While going through the code I saw some comments and code that are not aligning. As per this comment, WALSplitter.splitLogFile should not be getting used in procedure based splitting but we are using the same method. We are calling SplitLogWorker.splitLog(that is deprecated) in SplitWALCallable. And inside this method here, we call WALSplitter.splitLogFile.
Can someone help me understad this? Is this a miss or it is known? cc @Apache9 , @apurtell

@Apache-HBase

This comment has been minimized.

@Apache-HBase

This comment has been minimized.

@mnpoonia
Copy link
Contributor

@Umeshkumar9414 So the idea here is to have a retry counter attached to the wal name. And whenever split wal fails and another worked picks up same wal, it increments the counter!!

@mnpoonia
Copy link
Contributor

mnpoonia commented Dec 16, 2024

@Umeshkumar9414 If the the splitwal proc fails and also root procedure fails the how is that handled?

Copy link
Contributor

@Apache9 Apache9 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In general I think the approach is OK, renaming is a typical way for fencing. But I suggest we keep the old behavior when there is no retry, so we can get better compatibility.

} else {
originalWALPath = walPath.substring(0, walPath.length() - RETRYING_EXT.length() - 3);
}
String walNewName =
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So when retrying number is 0, we also have the '.retrying' suffix? Will this cause trouble when upgrading?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No when we have retrying number (workerChangeCount) 0 we don't have any suffix. This should not cause any trouble in upgrading. As @mnpoonia pointed out I do need to handle one case when SCP rolled back and second SCP created another splitwalProcedure in that case the name will contian retrying suffix.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do not think the current code reflection your explaination here...

If you do not want to change the wal name when retry count == 0, you should just return at the first if condition?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We call this method in RELEASE_SPLIT_WORKER state. At this time first try of wal split is already complete. We only reach here if fir try is not able to split the wal.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Then please add this comment as a javadoc of this method?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@Umeshkumar9414
Copy link
Contributor Author

Umeshkumar9414 commented Dec 16, 2024

In general I think the approach is OK, renaming is a typical way for fencing. But I suggest we keep the old behavior when there is no retry, so we can get better compatibility.

Yeah I didn't do any changes when there is no retry and kept that as it was.

@Umeshkumar9414
Copy link
Contributor Author

@Umeshkumar9414 If the the splitwal p

Thanks @mnpoonia to point this out. I need to when the parent SCP fails and lets say we have created another SCP. It will just list all the files in WALDirectory and create SplitWalProcedure for all but yeah I need to handle the first retry with retryCount 0.

@Apache-HBase

This comment has been minimized.

@Apache-HBase

This comment has been minimized.

@Apache9
Copy link
Contributor

Apache9 commented Dec 17, 2024

What about add a new step after acquire worker to rename the wal file, where we just append the worker's name to the wal file name as suffix?

And we need to be very careful when dealing with retrying...

There are several problems currently

  1. Renaming failure does not always mean the old file is still there, maybe the renaming is complete but we just do not get the result because of a network issue, then in the current code we will consider the splitting as finished and cause data loss.
  2. We will do the renaming after release worker, if the renaming fails, when retrying we will release the worker again, this will cause trouble...

@Apache-HBase

This comment has been minimized.

@Apache-HBase

This comment has been minimized.

@Umeshkumar9414
Copy link
Contributor Author

While going through the code I saw some comments and code that are not aligning. As per this comment, WALSplitter.splitLogFile should not be getting used in procedure based splitting but we are using the same method. We are calling SplitLogWorker.splitLog(that is deprecated) in SplitWALCallable. And inside this method here, we call WALSplitter.splitLogFile. Can someone help me understad this? Is this a miss or it is known? cc @Apache9 , @apurtell

@Apache9 , @apurtell

@@ -237,6 +237,9 @@ static void requestLogRoll(final WAL wal) {
/** File Extension used while splitting an WAL into regions (HBASE-2312) */
public static final String SPLITTING_EXT = "-splitting";

// Extension for the WAL where the split failed on one worker and is being retried on another.
public static final String RETRYING_EXT = ".retrying";

/**
* Pattern used to validate a WAL file name see {@link #validateWALFilename(String)} for
* description.
Copy link
Contributor Author

@Umeshkumar9414 Umeshkumar9414 Dec 20, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

while splitting the wal for meta table. wal name can be rs.XXX.meta.retrying001. Do you think we should update the WAL_FILE_NAME_PATTERN. Althought in splitting we didn't check for valid wal name.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Apache9 what do you think?

@Apache-HBase

This comment has been minimized.

@Apache-HBase

This comment has been minimized.

public boolean ifExistRenameWALForRetry(String walPath, String postRenameWalPath)
throws IOException {
if (fs.exists(new Path(rootDir, walPath))) {
if (!fs.rename(new Path(rootDir, walPath), new Path(rootDir, postRenameWalPath))) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi, I'm not terribly familiar with wal split.
Does the WAL file get closed by this time? I'm asking because Ozone doesn't yet support renaming open files. And supporting that is quite a big project itself.

Even thought that's not yet a huge problem for HBase since HBase isn't default to run on Ozone, it would be great if we don't attempt to rename open files.

Thanks!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At this time there are cases when the WALFile will still be open. In current code we recoverLease at RS once master assign the splitting to the worker RS. @Apache9 do you think we should move the recoverLease to Master ?

Before this rename we also rename the WALdirectory for the rs. @jojochuang is renaming directory is different from file renaming ? If directory rename is also not supported when some file inside the directory is open then we need changes in current code as well.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this should have been done before we rename the wal directory?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks guys.

I think what's more relevant for HBase is that it used to cause race conditions if the WAL files are kept open while being renamed. HBASE-27732 fixed one such bug -- because HDFS allows renaming open files, it doesn't fail immediately but it causes NPE later. Ozone fails right away with that bug. Took us a few days to find out.

(I need to check but I think directory rename is fine for Ozone in this case)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As recoverLease doesn't support the directory path (https://github.com/apache/hadoop/blob/fb1bb6429dfb4e45687e0bc507c5a2ed26bd0bb0/hadoop-common-project/hadoop-common/src/site/markdown/filesystem/leaserecoverable.md), so we need to recoverlease for each file. We also have to recoverLease before renaming the walFile.

Btw at least for hadoop I think that both (recoverLease after rename or before rename) are fine. As renaming is a metadata operation and data is linked to INodes.

@Apache-HBase
Copy link

🎊 +1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 0m 35s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 0s No case conflicting files found.
+0 🆗 buf 0m 0s buf was not available.
+0 🆗 buf 0m 0s buf was not available.
+0 🆗 codespell 0m 0s codespell was not available.
+0 🆗 detsecrets 0m 0s detect-secrets was not available.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
+1 💚 hbaseanti 0m 0s Patch does not have any anti-patterns.
_ master Compile Tests _
+0 🆗 mvndep 0m 11s Maven dependency ordering for branch
+1 💚 mvninstall 2m 58s master passed
+1 💚 compile 3m 27s master passed
+1 💚 checkstyle 0m 41s master passed
+1 💚 spotbugs 3m 40s master passed
+1 💚 spotless 0m 42s branch has no errors when running spotless:check.
_ Patch Compile Tests _
+0 🆗 mvndep 0m 11s Maven dependency ordering for patch
+1 💚 mvninstall 2m 48s the patch passed
+1 💚 compile 3m 28s the patch passed
+1 💚 cc 3m 28s the patch passed
+1 💚 javac 3m 28s the patch passed
+1 💚 blanks 0m 1s The patch has no blanks issues.
-0 ⚠️ checkstyle 0m 36s /results-checkstyle-hbase-server.txt hbase-server: The patch generated 1 new + 0 unchanged - 0 fixed = 1 total (was 0)
+1 💚 spotbugs 3m 53s the patch passed
+1 💚 hadoopcheck 10m 44s Patch does not cause any errors with Hadoop 3.3.6 3.4.0.
+1 💚 hbaseprotoc 1m 13s the patch passed
+1 💚 spotless 0m 40s patch has no errors when running spotless:check.
_ Other Tests _
+1 💚 asflicense 0m 17s The patch does not generate ASF License warnings.
43m 4s
Subsystem Report/Notes
Docker ClientAPI=1.43 ServerAPI=1.43 base: https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-6534/5/artifact/yetus-general-check/output/Dockerfile
GITHUB PR #6534
Optional Tests dupname asflicense cc buflint bufcompat codespell detsecrets hbaseprotoc spotless javac spotbugs checkstyle compile hadoopcheck hbaseanti
uname Linux dd88e2ccf603 5.4.0-1103-aws #111~18.04.1-Ubuntu SMP Tue May 23 20:04:10 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/hbase-personality.sh
git revision master / dd576cd
Default Java Eclipse Adoptium-17.0.11+9
Max. process+thread count 86 (vs. ulimit of 30000)
modules C: hbase-protocol-shaded hbase-server U: .
Console output https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-6534/5/console
versions git=2.34.1 maven=3.9.8 spotbugs=4.7.3
Powered by Apache Yetus 0.15.0 https://yetus.apache.org

This message was automatically generated.

@Apache-HBase
Copy link

💔 -1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 0m 37s Docker mode activated.
-0 ⚠️ yetus 0m 2s Unprocessed flag(s): --brief-report-file --spotbugs-strict-precheck --author-ignore-list --blanks-eol-ignore-file --blanks-tabs-ignore-file --quick-hadoopcheck
_ Prechecks _
_ master Compile Tests _
+0 🆗 mvndep 0m 35s Maven dependency ordering for branch
+1 💚 mvninstall 3m 26s master passed
+1 💚 compile 1m 31s master passed
+1 💚 javadoc 0m 38s master passed
+1 💚 shadedjars 5m 51s branch has no errors when building our shaded downstream artifacts.
_ Patch Compile Tests _
+0 🆗 mvndep 0m 14s Maven dependency ordering for patch
+1 💚 mvninstall 3m 3s the patch passed
+1 💚 compile 1m 30s the patch passed
+1 💚 javac 1m 30s the patch passed
+1 💚 javadoc 0m 35s the patch passed
+1 💚 shadedjars 5m 47s patch has no errors when building our shaded downstream artifacts.
_ Other Tests _
+1 💚 unit 0m 32s hbase-protocol-shaded in the patch passed.
-1 ❌ unit 211m 51s /patch-unit-hbase-server.txt hbase-server in the patch failed.
241m 9s
Subsystem Report/Notes
Docker ClientAPI=1.43 ServerAPI=1.43 base: https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-6534/5/artifact/yetus-jdk17-hadoop3-check/output/Dockerfile
GITHUB PR #6534
Optional Tests unit javac javadoc compile shadedjars
uname Linux 153b6b6e2528 5.4.0-1103-aws #111~18.04.1-Ubuntu SMP Tue May 23 20:04:10 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/hbase-personality.sh
git revision master / dd576cd
Default Java Eclipse Adoptium-17.0.11+9
Test Results https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-6534/5/testReport/
Max. process+thread count 5332 (vs. ulimit of 30000)
modules C: hbase-protocol-shaded hbase-server U: .
Console output https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-6534/5/console
versions git=2.34.1 maven=3.9.8
Powered by Apache Yetus 0.15.0 https://yetus.apache.org

This message was automatically generated.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants