Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[#2197] Support reg app conf to server and avoid update committed/cached blockIds bitmap #2196

Merged
merged 6 commits into from
Oct 29, 2024

Conversation

maobaolong
Copy link
Member

@maobaolong maobaolong commented Oct 18, 2024

What changes were proposed in this pull request?

Support reg app conf to server.

Why are the changes needed?

Fix #2197

Avoid update committed/cached blockIds bitmap while storage type is with memory.

Does this PR introduce any user-facing change?

No.

How was this patch tested?

Locally

@maobaolong maobaolong changed the title [#XXXX] Support reg app conf to server and avoid update committed/cached blockIds bitmap [#2197] Support reg app conf to server and avoid update committed/cached blockIds bitmap Oct 18, 2024
Copy link

github-actions bot commented Oct 18, 2024

Test Results

 2 926 files  +1   2 926 suites  +1   6h 12m 47s ⏱️ - 1m 15s
 1 088 tests ±0   1 086 ✅ +1   2 💤 ±0  0 ❌  - 1 
13 630 runs  +1  13 600 ✅ +2  30 💤 ±0  0 ❌  - 1 

Results for commit 714ef98. ± Comparison against base commit 85b1f01.

♻️ This comment has been updated with latest results.

@maobaolong maobaolong force-pushed the regAppConfToServer.gh branch from 7342670 to 769372e Compare October 18, 2024 04:43
@maobaolong maobaolong closed this Oct 18, 2024
@maobaolong maobaolong reopened this Oct 18, 2024
@maobaolong maobaolong force-pushed the regAppConfToServer.gh branch from 769372e to ff4b590 Compare October 18, 2024 06:50
@jerqi
Copy link
Contributor

jerqi commented Oct 18, 2024

This may be better as a server configuration.

@maobaolong
Copy link
Member Author

@jerqi Done, add a new configuration

@codecov-commenter
Copy link

codecov-commenter commented Oct 18, 2024

Codecov Report

Attention: Patch coverage is 54.54545% with 25 lines in your changes missing coverage. Please review.

Project coverage is 52.97%. Comparing base (4c14b97) to head (d08c314).
Report is 3 commits behind head on master.

Files with missing lines Patch % Lines
.../apache/uniffle/client/api/ShuffleWriteClient.java 0.00% 6 Missing ⚠️
...ffle/client/request/RssRegisterShuffleRequest.java 0.00% 6 Missing ⚠️
...ava/org/apache/uniffle/server/ShuffleTaskInfo.java 72.22% 2 Missing and 3 partials ⚠️
...ffle/client/impl/grpc/ShuffleServerGrpcClient.java 0.00% 4 Missing ⚠️
...pache/uniffle/server/ShuffleServerGrpcService.java 0.00% 2 Missing ⚠️
...org/apache/uniffle/server/ShuffleFlushManager.java 50.00% 0 Missing and 1 partial ⚠️
.../org/apache/uniffle/server/ShuffleTaskManager.java 91.66% 0 Missing and 1 partial ⚠️
Additional details and impacted files
@@             Coverage Diff              @@
##             master    #2196      +/-   ##
============================================
+ Coverage     52.31%   52.97%   +0.65%     
- Complexity     2814     3218     +404     
============================================
  Files           452      487      +35     
  Lines         21211    26224    +5013     
  Branches       1950     2477     +527     
============================================
+ Hits          11097    13891    +2794     
- Misses         9405    11422    +2017     
- Partials        709      911     +202     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@maobaolong maobaolong closed this Oct 18, 2024
@maobaolong maobaolong reopened this Oct 18, 2024
@jerqi
Copy link
Contributor

jerqi commented Oct 18, 2024

Why do we need to register app configuration to server?

@@ -193,6 +193,7 @@ message ShuffleRegisterRequest {
string comparatorClass = 11;
int32 mergedBlockSize = 12;
string mergeClassLoader = 13;
map<string, string> appConf = 1000;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Prefer clear defining parameters instead of a map.

@maobaolong
Copy link
Member Author

@jerqi I plan to use the client side Conf as a app level session config, base on this, we do not need to send BlockLayout for each reportShuffleResult, we only need to get it from app session Conf from ShuffleTaskInfo.

Leverage the appConf, we can easy to extends new features without proto change, even without client upgrade.

@jerqi
Copy link
Contributor

jerqi commented Oct 21, 2024

@jerqi I plan to use the client side Conf as a app level session config, base on this, we do not need to send BlockLayout for each reportShuffleResult, we only need to get it from app session Conf from ShuffleTaskInfo.

Leverage the appConf, we can easy to extends new features without proto change, even without client upgrade.

I know. But I am no sure that this is a good idea. Application configuration can contain many things. It lacks constraint. We can't get enough information from the interface. It's easy to develop but it's hard to understand.

@maobaolong
Copy link
Member Author

maobaolong commented Oct 21, 2024

@jerqi I see. It make sense about your concern. How about a trade off that we only filter and send the rss config to server by filter the spark.rss prefix?

@jerqi
Copy link
Contributor

jerqi commented Oct 21, 2024

@jerqi I see. It make sense about your concern. How about a trade off that we only filter and send the rss config to server by filter the spark.rss prefix?

I feel that some options can be passed to the server. But important concepts should be extracted to explict fields in the interface such as BlockLayout

@maobaolong
Copy link
Member Author

@jerqi Thanks for your help for improve this PR.

@maobaolong maobaolong closed this Oct 21, 2024
@maobaolong maobaolong reopened this Oct 21, 2024
@maobaolong maobaolong force-pushed the regAppConfToServer.gh branch from d08c314 to 58e39ab Compare October 22, 2024 06:17
@maobaolong maobaolong requested a review from jerqi October 22, 2024 06:19
@maobaolong maobaolong force-pushed the regAppConfToServer.gh branch 3 times, most recently from 11328dc to eb87001 Compare October 22, 2024 11:52
@maobaolong maobaolong closed this Oct 22, 2024
@maobaolong maobaolong reopened this Oct 22, 2024
@maobaolong
Copy link
Member Author

Encountered famous flaky test QuorumTest.case8

org.opentest4j.AssertionFailedError: expected: <true> but was: <false>
	at org.junit.jupiter.api.AssertionUtils.fail(AssertionUtils.java:55)
	at org.junit.jupiter.api.AssertTrue.assertTrue(AssertTrue.java:40)
	at org.junit.jupiter.api.AssertTrue.assertTrue(AssertTrue.java:35)
	at org.junit.jupiter.api.Assertions.assertTrue(Assertions.java:179)
	at org.apache.uniffle.test.QuorumTest.case8(QuorumTest.java:829)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)

@maobaolong maobaolong reopened this Oct 23, 2024
@maobaolong maobaolong closed this Oct 23, 2024
@maobaolong maobaolong reopened this Oct 23, 2024
@maobaolong
Copy link
Member Author

ping @jerqi Sorry to remind

@@ -719,6 +719,11 @@ public class ShuffleServerConf extends RssBaseConf {
.booleanType()
.defaultValue(false)
.withDescription("Whether to enable app detail log");
public static final ConfigOption<Boolean> SERVER_WITH_MEMORY_STORAGE_TYPE_OPTIMIZE_ENABLED =
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we use camel style for last segment and 4 - 5 segments for the whole config option like
rss.server.storage.bitmapMemoryOptimizeEnabled.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@@ -307,4 +318,25 @@ public String toString() {
+ shuffleDetailInfos
+ '}';
}

public void setProperties(Map<String, String> properties, ShuffleServerConf serverConf) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need serverConf here?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since I need to check whether this server enabled bitmap optimized feature by SERVER_WITH_MEMORY_STORAGE_TYPE_OPTIMIZE_ENABLED

synchronized (bitmap) {
// With memory storage type should never need cachedBlockIds,
// since client do not need call finish shuffle rpc
if (!shuffleTaskInfo.isClientStorageTypeWithMemory()) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can get this config option from the server configuration directly instead getting from taskInfo.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jerqi It is different, this is a client side config, it configured by client, so I have to get it from shuffle TaskInfo.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe this config option should be same between server and client side.

@maobaolong maobaolong requested a review from jerqi October 28, 2024 01:24
@maobaolong
Copy link
Member Author

@jerqi Thanks for reply and review, I renamed the property key, I can change to other name if you feel it is not good. PTAL

@maobaolong maobaolong force-pushed the regAppConfToServer.gh branch from 199e00d to dcb37e3 Compare October 28, 2024 03:38
@jerqi
Copy link
Contributor

jerqi commented Oct 28, 2024

We would better add check in the commit rpc implement for server side if client uses commit rpc.

@maobaolong maobaolong closed this Oct 28, 2024
@maobaolong maobaolong reopened this Oct 28, 2024
boolean storageTypeWithMemory =
StorageType.withMemory(StorageType.valueOf(storageType.name()));
if (storageTypeWithMemory) {
LOG.warn(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should throw exception directly.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jerqi If we throw exception, it will never support for test mode and LOCALFILE client to access the server with LOCALFILE_MEMORY storageType?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK for me.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I means that it's ok for me to throw exception here.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jerqi Sorry for the misunderstand, done.

@maobaolong
Copy link
Member Author

@jerqi Thanks for remind me, I have misunderstood your suggestion, now I throw exception for these two method, PTAL

"finishShuffle should not be called while server-side configured StorageType to %s for appId %s",
storageType, appId);
LOG.error(errorMessage);
throw new IllegalStateException(errorMessage);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

UnsupportedException?

@maobaolong maobaolong merged commit fdcb54c into apache:master Oct 29, 2024
43 checks passed
@maobaolong
Copy link
Member Author

@jerqi Thanks for your review!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Improvement] Support reg app conf to server and avoid update committed/cached blockIds bitmap
3 participants