Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[CDAP-20835] Add retries for remote artifact and remote application calls #15577

Merged
merged 1 commit into from
Apr 3, 2024
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Original file line number Diff line number Diff line change
Expand Up @@ -16,6 +16,7 @@

package io.cdap.cdap.internal.app.runtime.artifact;

import com.google.common.annotations.VisibleForTesting;
import com.google.common.io.ByteStreams;
import com.google.common.reflect.TypeToken;
import com.google.gson.Gson;
Expand All @@ -24,14 +25,18 @@
import io.cdap.cdap.api.artifact.ArtifactRange;
import io.cdap.cdap.api.artifact.ArtifactScope;
import io.cdap.cdap.api.data.schema.Schema;
import io.cdap.cdap.api.retry.RetryableException;
import io.cdap.cdap.common.ArtifactNotFoundException;
import io.cdap.cdap.common.NotFoundException;
import io.cdap.cdap.api.service.ServiceUnavailableException;

Check warning on line 31 in cdap-app-fabric/src/main/java/io/cdap/cdap/internal/app/runtime/artifact/RemoteArtifactRepositoryReader.java

View workflow job for this annotation

GitHub Actions / Checkstyle

com.puppycrawl.tools.checkstyle.checks.imports.CustomImportOrderCheck

Wrong lexicographical order for 'io.cdap.cdap.api.service.ServiceUnavailableException' import. Should be before 'io.cdap.cdap.common.NotFoundException'.
import io.cdap.cdap.common.conf.Constants;
import io.cdap.cdap.common.http.DefaultHttpRequestConfig;
import io.cdap.cdap.common.id.Id;
import io.cdap.cdap.common.internal.remote.RemoteClient;
import io.cdap.cdap.common.internal.remote.RemoteClientFactory;
import io.cdap.cdap.common.service.Retries;
import io.cdap.cdap.common.service.RetryStrategies;
import io.cdap.cdap.common.service.RetryStrategy;
import io.cdap.cdap.gateway.handlers.AppLifecycleHttpHandler;
import io.cdap.cdap.gateway.handlers.ArtifactHttpHandlerInternal;
import io.cdap.cdap.internal.io.SchemaTypeAdapter;
Expand All @@ -49,11 +54,13 @@
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.TimeUnit;
import java.util.function.Predicate;
import org.apache.twill.filesystem.Location;
import org.apache.twill.filesystem.LocationFactory;


/**

Check warning on line 63 in cdap-app-fabric/src/main/java/io/cdap/cdap/internal/app/runtime/artifact/RemoteArtifactRepositoryReader.java

View workflow job for this annotation

GitHub Actions / Checkstyle

com.puppycrawl.tools.checkstyle.checks.javadoc.SummaryJavadocCheck

First sentence of Javadoc is missing an ending period.
* Implementation for fetching artifact metadata from remote {@link ArtifactHttpHandlerInternal}
*/
public class RemoteArtifactRepositoryReader implements ArtifactRepositoryReader {
Expand All @@ -65,11 +72,16 @@
}.getType();
private static final Type ARTIFACT_DETAIL_LIST_TYPE = new TypeToken<List<ArtifactDetail>>() {
}.getType();
private static final Predicate<Throwable> RETRYABLE_PREDICATE =
throwable -> (throwable instanceof RetryableException);
private static final int RETRY_BASE_DELAY_MILLIS = 100;
private static final int RETRY_MAX_DELAY_MILLIS = 5000;
private static final int RETRY_TIMEOUT_SECS = 300;

private final RemoteClient remoteClient;
private final LocationFactory locationFactory;

@Inject

Check warning on line 84 in cdap-app-fabric/src/main/java/io/cdap/cdap/internal/app/runtime/artifact/RemoteArtifactRepositoryReader.java

View workflow job for this annotation

GitHub Actions / Checkstyle

com.puppycrawl.tools.checkstyle.checks.javadoc.MissingJavadocMethodCheck

Missing a Javadoc comment.
public RemoteArtifactRepositoryReader(LocationFactory locationFactory,
RemoteClientFactory remoteClientFactory) {
this(locationFactory,
Expand All @@ -86,7 +98,7 @@

/**
* Fetches {@link ArtifactDetail} from {@link AppLifecycleHttpHandler}
* <p>

Check warning on line 101 in cdap-app-fabric/src/main/java/io/cdap/cdap/internal/app/runtime/artifact/RemoteArtifactRepositoryReader.java

View workflow job for this annotation

GitHub Actions / Checkstyle

com.puppycrawl.tools.checkstyle.checks.javadoc.JavadocParagraphCheck

<p> tag should be placed immediately before the first word, with no space after.

Check warning on line 101 in cdap-app-fabric/src/main/java/io/cdap/cdap/internal/app/runtime/artifact/RemoteArtifactRepositoryReader.java

View workflow job for this annotation

GitHub Actions / Checkstyle

com.puppycrawl.tools.checkstyle.checks.javadoc.JavadocParagraphCheck

<p> tag should be preceded with an empty line.
* Note that {@link Location} in {@link ArtifactDescriptor} doesn't get transported over, we need
* to instantiate it based on the location URI in the received {@link ArtifactDetail} to construct
* a complete {@link ArtifactDetail}.
Expand All @@ -98,7 +110,7 @@
artifactId.getNamespace().getId(),
artifactId.getName(),
artifactId.getVersion());
HttpRequest.Builder requestBuilder = remoteClient.requestBuilder(HttpMethod.GET, url);
HttpRequest.Builder requestBuilder = getRemoteClient().requestBuilder(HttpMethod.GET, url);
httpResponse = execute(requestBuilder.build());
ArtifactDetail detail = GSON.fromJson(httpResponse.getResponseBodyAsString(),
ARTIFACT_DETAIL_TYPE);
Expand Down Expand Up @@ -133,7 +145,7 @@
artifactId.getName(),
artifactId.getVersion(),
scope);
HttpURLConnection urlConn = remoteClient.openConnection(HttpMethod.GET, url);
HttpURLConnection urlConn = getRemoteClient().openConnection(HttpMethod.GET, url);
throwIfError(artifactId, urlConn);

// Use FilterInputStream and override close to ensure the connection is closed once the input stream is closed
Expand Down Expand Up @@ -161,7 +173,7 @@
range.getUpper().toString(),
limit,
order.name());
HttpRequest.Builder requestBuilder = remoteClient.requestBuilder(HttpMethod.GET, url);
HttpRequest.Builder requestBuilder = getRemoteClient().requestBuilder(HttpMethod.GET, url);
HttpResponse httpResponse = execute(requestBuilder.build());
List<ArtifactDetail> details = GSON.fromJson(httpResponse.getResponseBodyAsString(),
ARTIFACT_DETAIL_LIST_TYPE);
Expand All @@ -187,7 +199,13 @@

private HttpResponse execute(HttpRequest request)
throws IOException, NotFoundException, UnauthorizedException {
HttpResponse httpResponse = remoteClient.execute(request);

RetryStrategy baseRetryStrategy = RetryStrategies.exponentialDelay(
RETRY_BASE_DELAY_MILLIS, RETRY_MAX_DELAY_MILLIS, TimeUnit.MILLISECONDS);
HttpResponse httpResponse =
Retries.callWithRetries(() -> getRemoteClient().execute(request),
RetryStrategies.timeLimit(RETRY_TIMEOUT_SECS, TimeUnit.SECONDS, baseRetryStrategy),
RETRYABLE_PREDICATE);
if (httpResponse.getResponseCode() == HttpURLConnection.HTTP_NOT_FOUND) {
throw new NotFoundException(httpResponse.getResponseBodyAsString());
}
Expand All @@ -212,7 +230,7 @@
if (errorStream != null) {
errorMsg = new String(ByteStreams.toByteArray(errorStream), StandardCharsets.UTF_8);
}
switch (responseCode) {

Check warning on line 233 in cdap-app-fabric/src/main/java/io/cdap/cdap/internal/app/runtime/artifact/RemoteArtifactRepositoryReader.java

View workflow job for this annotation

GitHub Actions / Checkstyle

com.puppycrawl.tools.checkstyle.checks.coding.MissingSwitchDefaultCheck

switch without "default" clause.
case HttpURLConnection.HTTP_UNAVAILABLE:
throw new ServiceUnavailableException(Constants.Service.APP_FABRIC_HTTP, errorMsg);
case HttpURLConnection.HTTP_NOT_FOUND:
Expand All @@ -226,4 +244,9 @@
errorMsg));
}
}

@VisibleForTesting
public RemoteClient getRemoteClient() {
return remoteClient;
}
}
Original file line number Diff line number Diff line change
Expand Up @@ -16,17 +16,23 @@

package io.cdap.cdap.metadata;

import com.google.common.annotations.VisibleForTesting;
import com.google.common.reflect.TypeToken;
import com.google.gson.Gson;
import com.google.gson.GsonBuilder;
import com.google.gson.JsonObject;
import com.google.inject.Inject;
import io.cdap.cdap.api.retry.Idempotency;

Check warning on line 25 in cdap-app-fabric/src/main/java/io/cdap/cdap/metadata/RemoteApplicationDetailFetcher.java

View workflow job for this annotation

GitHub Actions / Checkstyle

com.puppycrawl.tools.checkstyle.checks.imports.UnusedImportsCheck

Unused import - io.cdap.cdap.api.retry.Idempotency.
import io.cdap.cdap.api.retry.RetryableException;
import io.cdap.cdap.common.NamespaceNotFoundException;
import io.cdap.cdap.common.NotFoundException;
import io.cdap.cdap.common.conf.Constants;
import io.cdap.cdap.common.http.DefaultHttpRequestConfig;
import io.cdap.cdap.common.internal.remote.RemoteClient;
import io.cdap.cdap.common.internal.remote.RemoteClientFactory;
import io.cdap.cdap.common.service.Retries;
import io.cdap.cdap.common.service.RetryStrategies;
import io.cdap.cdap.common.service.RetryStrategy;
import io.cdap.cdap.internal.app.ApplicationSpecificationAdapter;
import io.cdap.cdap.proto.ApplicationDetail;
import io.cdap.cdap.proto.id.ApplicationReference;
Expand All @@ -40,7 +46,9 @@
import java.lang.reflect.Type;
import java.net.HttpURLConnection;
import java.util.List;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;
import java.util.function.Predicate;

/**
* Fetch application detail via internal REST API calls.
Expand All @@ -57,6 +65,11 @@
private static final String APPLICATIONS_KEY = "applications";
private static final String NEXT_PAGE_TOKEN_KEY = "nextPageToken";

private static final Predicate<Throwable> RETRYABLE_PREDICATE =
throwable -> (throwable instanceof RetryableException);
private static final int RETRY_BASE_DELAY_MILLIS = 100;
private static final int RETRY_MAX_DELAY_MILLIS = 5000;
private static final int RETRY_TIMEOUT_SECS = 300;
private final RemoteClient remoteClient;

/**
Expand All @@ -79,7 +92,7 @@
throws IOException, NotFoundException, UnauthorizedException {
String url = String.format("namespaces/%s/app/%s",
appRef.getNamespace(), appRef.getApplication());
HttpRequest.Builder requestBuilder = remoteClient.requestBuilder(HttpMethod.GET, url);
HttpRequest.Builder requestBuilder = getRemoteClient().requestBuilder(HttpMethod.GET, url);
HttpResponse httpResponse;
httpResponse = execute(requestBuilder.build());
return GSON.fromJson(httpResponse.getResponseBodyAsString(), ApplicationDetail.class);
Expand All @@ -95,7 +108,7 @@
String token;

do {
HttpRequest.Builder requestBuilder = remoteClient.requestBuilder(HttpMethod.GET, url);
HttpRequest.Builder requestBuilder = getRemoteClient().requestBuilder(HttpMethod.GET, url);
HttpResponse httpResponse;
try {
httpResponse = execute(requestBuilder.build());
Expand Down Expand Up @@ -126,7 +139,12 @@

private HttpResponse execute(HttpRequest request)
throws IOException, NotFoundException, UnauthorizedException {
HttpResponse httpResponse = remoteClient.execute(request);
RetryStrategy baseRetryStrategy = RetryStrategies.exponentialDelay(
RETRY_BASE_DELAY_MILLIS, RETRY_MAX_DELAY_MILLIS, TimeUnit.MILLISECONDS);
HttpResponse httpResponse =
Retries.callWithRetries(() -> getRemoteClient().execute(request),
RetryStrategies.timeLimit(RETRY_TIMEOUT_SECS, TimeUnit.SECONDS, baseRetryStrategy),
RETRYABLE_PREDICATE);
if (httpResponse.getResponseCode() == HttpURLConnection.HTTP_NOT_FOUND) {
throw new NotFoundException(httpResponse.getResponseBodyAsString());
}
Expand All @@ -135,4 +153,9 @@
}
return httpResponse;
}

@VisibleForTesting
public RemoteClient getRemoteClient() {
return remoteClient;
}
}
Original file line number Diff line number Diff line change
Expand Up @@ -20,23 +20,28 @@
import io.cdap.cdap.api.artifact.ArtifactRange;
import io.cdap.cdap.api.artifact.ArtifactScope;
import io.cdap.cdap.api.artifact.ArtifactVersion;
import io.cdap.cdap.api.retry.RetryableException;
import io.cdap.cdap.common.id.Id;
import io.cdap.cdap.common.internal.remote.RemoteClient;
import io.cdap.cdap.internal.app.runtime.artifact.ArtifactDetail;
import io.cdap.cdap.internal.app.runtime.artifact.ArtifactRepositoryReader;
import io.cdap.cdap.internal.app.runtime.artifact.RemoteArtifactRepositoryReader;
import io.cdap.cdap.proto.artifact.ArtifactSortOrder;
import io.cdap.cdap.proto.id.ArtifactId;
import io.cdap.cdap.proto.id.NamespaceId;
import java.net.SocketTimeoutException;
import java.util.ArrayList;
import java.util.List;
import org.junit.Assert;
import org.junit.Test;
import org.mockito.Mockito;
import org.mockito.invocation.InvocationOnMock;

public class ArtifactHttpHandlerInternalTest extends ArtifactHttpHandlerTestBase {
@Test
public void testListArtifacts() throws Exception {
List<ArtifactInfo> artifactInfoList = null;
ArtifactInfo artifactInfo = null;

Check warning on line 44 in cdap-app-fabric/src/test/java/io/cdap/cdap/internal/app/services/http/handlers/ArtifactHttpHandlerInternalTest.java

View workflow job for this annotation

GitHub Actions / Checkstyle

com.puppycrawl.tools.checkstyle.checks.coding.VariableDeclarationUsageDistanceCheck

Distance between variable 'artifactInfo' declaration and its first usage is 4, but allowed 3. Consider making that variable final if you still need to store its value in advance (before method calls that might have side effects on the original value).

// Add a system artifact
String systemArtfiactName = "sysApp";
Expand Down Expand Up @@ -106,15 +111,51 @@
// Add a system artifact
String systemArtfiactName = "sysApp";
String systemArtifactVeresion = "1.0.0";
ArtifactId systemArtifactId = NamespaceId.SYSTEM.artifact(systemArtfiactName, systemArtifactVeresion);
ArtifactId systemArtifactId =
NamespaceId.SYSTEM.artifact(systemArtfiactName, systemArtifactVeresion);
addAppAsSystemArtifacts(systemArtifactId);

// Fetch ArtifactDetail using RemoteArtifactRepositoryReader and verify
ArtifactRepositoryReader remoteReader = getInjector().getInstance(RemoteArtifactRepositoryReader.class);
ArtifactRepositoryReader remoteReader =
getInjector().getInstance(RemoteArtifactRepositoryReader.class);
ArtifactDetail detail = remoteReader.getArtifact(Id.Artifact.fromEntityId(systemArtifactId));
Assert.assertNotNull(detail);
ArtifactDetail expectedDetail = getArtifactDetailFromRepository(Id.Artifact.fromEntityId(systemArtifactId));
Assert.assertTrue(detail.equals(expectedDetail));
ArtifactDetail expectedDetail =
getArtifactDetailFromRepository(Id.Artifact.fromEntityId(systemArtifactId));
Assert.assertEquals(detail, expectedDetail);

wipeData();
}

@Test
public void testRemoteArtifactRepositoryReaderGetArtifactWithRetries() throws Exception {
// Add a system artifact
String systemArtfiactName = "sysApp";
String systemArtifactVeresion = "1.0.0";
ArtifactId systemArtifactId =
NamespaceId.SYSTEM.artifact(systemArtfiactName, systemArtifactVeresion);
addAppAsSystemArtifacts(systemArtifactId);
// Fetch ArtifactDetail using RemoteArtifactRepositoryReader and verify
RemoteArtifactRepositoryReader remoteReader =
getInjector().getInstance(RemoteArtifactRepositoryReader.class);
RemoteClient remoteClient = remoteReader.getRemoteClient();

RemoteClient mockRemoteClient = Mockito.spy(remoteClient);
RemoteArtifactRepositoryReader mockRemoteReader = Mockito.spy(remoteReader);
Mockito.doReturn(mockRemoteClient).when(mockRemoteReader).getRemoteClient();

Mockito
.doThrow(new RetryableException(new SocketTimeoutException()))
.doAnswer(InvocationOnMock::callRealMethod)
.when(mockRemoteClient).execute(Mockito.any());

ArtifactDetail detail =
mockRemoteReader.getArtifact(Id.Artifact.fromEntityId(systemArtifactId));
Assert.assertNotNull(detail);
ArtifactDetail expectedDetail =
getArtifactDetailFromRepository(Id.Artifact.fromEntityId(systemArtifactId));
Assert.assertEquals(detail, expectedDetail);
Mockito.verify(mockRemoteClient, Mockito.times(2)).execute(Mockito.any());

wipeData();
}
Expand All @@ -138,8 +179,8 @@

ArtifactRange range = null;
List<ArtifactDetail> details = null;
ArtifactId systemArtifactId = null;

Check warning on line 182 in cdap-app-fabric/src/test/java/io/cdap/cdap/internal/app/services/http/handlers/ArtifactHttpHandlerInternalTest.java

View workflow job for this annotation

GitHub Actions / Checkstyle

com.puppycrawl.tools.checkstyle.checks.coding.VariableDeclarationUsageDistanceCheck

Distance between variable 'systemArtifactId' declaration and its first usage is 5, but allowed 3. Consider making that variable final if you still need to store its value in advance (before method calls that might have side effects on the original value).
ArtifactDetail expectedDetail = null;

Check warning on line 183 in cdap-app-fabric/src/test/java/io/cdap/cdap/internal/app/services/http/handlers/ArtifactHttpHandlerInternalTest.java

View workflow job for this annotation

GitHub Actions / Checkstyle

com.puppycrawl.tools.checkstyle.checks.coding.VariableDeclarationUsageDistanceCheck

Distance between variable 'expectedDetail' declaration and its first usage is 6, but allowed 3. Consider making that variable final if you still need to store its value in advance (before method calls that might have side effects on the original value).
int numLimits = 0;


Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -19,13 +19,16 @@
import com.google.common.collect.ImmutableList;
import io.cdap.cdap.AllProgramsApp;
import io.cdap.cdap.AppWithSchedule;
import io.cdap.cdap.api.retry.RetryableException;
import io.cdap.cdap.common.NamespaceNotFoundException;
import io.cdap.cdap.common.NotFoundException;
import io.cdap.cdap.common.conf.Constants;
import io.cdap.cdap.common.internal.remote.RemoteClient;
import io.cdap.cdap.gateway.handlers.AppLifecycleHttpHandlerInternal;
import io.cdap.cdap.internal.app.services.http.AppFabricTestBase;
import io.cdap.cdap.proto.ApplicationDetail;
import io.cdap.cdap.proto.id.ApplicationReference;
import java.net.SocketTimeoutException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
Expand All @@ -35,6 +38,8 @@
import org.junit.Test;
import org.junit.runner.RunWith;
import org.junit.runners.Parameterized;
import org.mockito.Mockito;
import org.mockito.invocation.InvocationOnMock;

/**
* Tests for {@link RemoteApplicationDetailFetcher} {@link LocalApplicationDetailFetcher} and
Expand Down Expand Up @@ -63,7 +68,7 @@

private ApplicationDetailFetcher getApplicationDetailFetcher(ApplicationDetailFetcherType type) {
ApplicationDetailFetcher fetcher = null;
switch (type) {

Check warning on line 71 in cdap-app-fabric/src/test/java/io/cdap/cdap/metadata/ApplicationDetailFetcherTest.java

View workflow job for this annotation

GitHub Actions / Checkstyle

com.puppycrawl.tools.checkstyle.checks.coding.MissingSwitchDefaultCheck

switch without "default" clause.
case LOCAL:
fetcher = AppFabricTestBase.getInjector().getInstance(LocalApplicationDetailFetcher.class);
break;
Expand Down Expand Up @@ -113,9 +118,30 @@

// Deploy the application
deploy(AllProgramsApp.class, 200, Constants.Gateway.API_VERSION_3_TOKEN, namespace);

// Get and validate the application
RemoteClient mockRemoteClient = null;
if (fetcherType == ApplicationDetailFetcherType.REMOTE) {
// test remote application detail fetcher retries
RemoteClient remoteClient = ((RemoteApplicationDetailFetcher) fetcher).getRemoteClient();

mockRemoteClient = Mockito.spy(remoteClient);
fetcher = Mockito.spy((RemoteApplicationDetailFetcher) fetcher);
Mockito
.doReturn(mockRemoteClient)
.when((RemoteApplicationDetailFetcher) fetcher).getRemoteClient();

Mockito
.doThrow(new RetryableException(new SocketTimeoutException()))
.doAnswer(InvocationOnMock::callRealMethod)
.when(mockRemoteClient).execute(Mockito.any());
}

ApplicationDetail appDetail = fetcher.get(new ApplicationReference(namespace, appName));
assertAllProgramAppDetail(appDetail);
if (mockRemoteClient != null) {
Mockito.verify(mockRemoteClient, Mockito.times(2)).execute(Mockito.any());
}

// Delete the application
Assert.assertEquals(
Expand All @@ -135,7 +161,7 @@
public void testGetAllApplicationsUsingScan() throws Exception {
ApplicationDetailFetcher fetcher = getApplicationDetailFetcher(fetcherType);
String namespace = TEST_NAMESPACE1;
ApplicationDetail appDetail = null;

Check warning on line 164 in cdap-app-fabric/src/test/java/io/cdap/cdap/metadata/ApplicationDetailFetcherTest.java

View workflow job for this annotation

GitHub Actions / Checkstyle

com.puppycrawl.tools.checkstyle.checks.coding.VariableDeclarationUsageDistanceCheck

Distance between variable 'appDetail' declaration and its first usage is 6, but allowed 3. Consider making that variable final if you still need to store its value in advance (before method calls that might have side effects on the original value).
Integer batchSize = 1;

// No applications have been deployed
Expand Down
Loading