-
Notifications
You must be signed in to change notification settings - Fork 344
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[CDAP-20993] Namespace and Repository Source control metadata refresh #15641
base: develop
Are you sure you want to change the base?
Conversation
|
||
private final List<SourceControlMetadataRecord> apps; | ||
private final String nextPageToken; | ||
private final Long lastRefreshTime; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why use boxed type? Is it nullable? It's better not to use box type if possible. Is there a meaning default value for it that you can use?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Its null if the refresh has not been run yet. I think it would be more appropiate to have default value as null in the API response than a actual long value
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think it can ever be null. The value comes from SourceControlMetadataRefresher.getLastRefreshTime
, which returns primitive long, and the implementation is defaulting it to 0L
@@ -2482,6 +2482,10 @@ public static final class SourceControlManagement { | |||
public static final String REPOSITORY_TTL_SECONDS = "source.control.repository.ttl.seconds"; | |||
public static final String METADATA_MIGRATION_INTERVAL_SECONDS = | |||
"source.control.metadata.migration.interval.seconds"; | |||
public static final String METADATA_REFRESH_INTERVAL_SECONDS = |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Dont think we need this
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
removed
|
||
private final List<SourceControlMetadataRecord> apps; | ||
private final String nextPageToken; | ||
private final Long lastRefreshTime; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Its null if the refresh has not been run yet. I think it would be more appropiate to have default value as null in the API response than a actual long value
...app-fabric/src/main/java/io/cdap/cdap/internal/app/services/ApplicationLifecycleService.java
Show resolved
Hide resolved
@@ -415,7 +427,9 @@ public ApplicationDetail getLatestAppDetail(ApplicationReference appRef) | |||
* reference is not found. | |||
*/ | |||
public SourceControlMetadataRecord getSourceControlMetadataRecord(ApplicationReference appRef) | |||
throws SourceControlMetadataNotFoundException { | |||
throws SourceControlMetadataNotFoundException, RepositoryNotFoundException { | |||
// Checking if repository config exists for the namespace |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IIRC RepositoryNotFoundException would be interpreted as 404 right
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, right
|
||
/** | ||
* Service for refreshing namespace and repository source control metadata of a namespace at | ||
* scheduled intervals and manually. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Change comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
...rc/main/java/io/cdap/cdap/internal/app/services/NamespaceSourceControlMetadataRefresher.java
Outdated
Show resolved
Hide resolved
...rc/main/java/io/cdap/cdap/internal/app/services/NamespaceSourceControlMetadataRefresher.java
Show resolved
Hide resolved
return this.lastRefreshTime.get(); | ||
} | ||
|
||
private void updateSourceControlMetaWithNoTransaction(ApplicationReference appRef, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
name it updateSourceControlMeta
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
SourceControlMetadataRecord record = lastRecord.get(); | ||
return !pageLimitReached || record == null ? null : record.getName(); | ||
}); | ||
List<SourceControlMetadataRecord> apps = new ArrayList<>(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we should check if repoconfig is present here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
...p-fabric/src/main/java/io/cdap/cdap/gateway/handlers/SourceControlManagementHttpHandler.java
Show resolved
Hide resolved
c7a05e4
to
06dc4b3
Compare
06dc4b3
to
5b6f2dd
Compare
return; | ||
} | ||
|
||
if (namespaceSourceControlMeta.getLastSyncedAt() != null |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
add few comments explaining each section
List<SourceControlMetadataRecord> records = new ArrayList<>(); | ||
String type = EntityType.APPLICATION.toString(); | ||
|
||
store.scan(request, type, records::add); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you can do
if (!repoFiles.contains(record.getName())) {
store.delete(new ApplicationReference(record.getNamespace(), record.getName()));
}
inside scan
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please update the
return nextPageToken; | ||
} | ||
|
||
public Long getLastRefreshTime() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Annotate with @Nullable
* @param lastRefreshTime The timestamp of the last refresh operation. | ||
*/ | ||
public ListSourceControlMetadataResponse(List<SourceControlMetadataRecord> apps, | ||
String nextPageToken, Long lastRefreshTime) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Annotate parameters that can be null with @Nullable
.
public class SingleSourceControlMetadataResponse { | ||
|
||
private final SourceControlMetadataRecord app; | ||
private final Long lastRefreshTime; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this nullable? If so, please annotate accordingly.
return apps; | ||
} | ||
|
||
public String getNextPageToken() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are these getter methods used? If not, no need to define them.
boolean pageLimitReached; | ||
pageLimitReached = applicationLifecycleService.scanSourceControlMetadata( | ||
scanRequest, batchSize, | ||
apps::add); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the old code we don't buffer the app, but in the new code we do. I remember do removed the buffered to avoid high memory consumption, but now we are bringing the same problem back. Is buffering needed? Seems like we can refactor the JsonPaginatedListResponder so that it can emit extra fields (lastRefreshTime in this case) before closing the json object.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can extend JsonPaginatedListResponder
to also add additional field.
@chtyim Is it ok to take this up as a fast followup for the change ?
SourceControlMetadataRecord record = apps.isEmpty() ? null :apps.get(apps.size() - 1); | ||
String nextPageToken = !pageLimitReached || record == null ? null : | ||
record.getName(); | ||
long lastRefreshTime = applicationLifecycleService.getLastRefreshTime(namespaceId); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems like unwise to sync all apps, shouldn't we just sync the apps fetched in this page?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As we are cloning the full repository which would be time taking it might be wasteful to just update few pipelines.
* NamespaceSourceControlMetadataRefreshService scheduler instances. | ||
*/ | ||
public class SourceControlMetadataRefresher { | ||
private ConcurrentMap<NamespaceId, Long> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
final
lastRefreshTimes = new ConcurrentHashMap<>(); | ||
private final TransactionRunner transactionRunner; | ||
private final SourceControlOperationRunner sourceControlOperationRunner; | ||
private final SecureStore secureStore; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
unused field
private final TransactionRunner transactionRunner; | ||
private final SourceControlOperationRunner sourceControlOperationRunner; | ||
private final SecureStore secureStore; | ||
private static final Logger LOG = LoggerFactory.getLogger( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
unsued. Also, static field should comes before non static fields
* Constructs a SourceControlMetadataRefreshService instance. | ||
*/ | ||
@Inject | ||
public SourceControlMetadataRefresher(TransactionRunner transactionRunner, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
injected constructor shouldn't be public. Use package private (i.e. no access modifier).
* @return The timestamp of the last refresh | ||
*/ | ||
public long getLastRefreshTime(NamespaceId namespaceId) { | ||
return lastRefreshTimes.getOrDefault(namespaceId, 0L); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So the refresh time is only kept in memory? Meaning the information will be gone if the pod restarted?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes we do not have a good place to store this information.
throws IOException, RepositoryNotFoundException { | ||
NamespaceId namespaceId = new NamespaceId(request.getNamespace()); | ||
// Checking if repository config is present in the namespace | ||
sourceControlMetadataRefresher.getRepositoryMeta(namespaceId); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's the point of calling get without using the result?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We want to check if RepositoryConfig exist or throw execption otherwise. @adrikagupta Please correct me if I am wrong but you chose this function to reduce duplicated code to fetch namespace repository config
lastRefreshTimes.putIfAbsent(namespaceId, createRefreshService(namespaceId).refreshMetadata()); | ||
} | ||
|
||
public RepositoryMeta getRepositoryMeta(NamespaceId namespace) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This method seems duplicating the one in SourceControlManagementService
. Why need to duplicate?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This class is actually a member of SourceControlManagementService
. We should move this to a common place. @adrikagupta I think you added a TODO for the same ?
Long lastSynced) throws IOException { | ||
StructuredTable repoTable = getRepositorySourceControlMetadataTable(); | ||
repoTable.upsert(getAllFields(appRef, isSynced, lastSynced)); | ||
} | ||
|
||
/** |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can't comment on line 120. But there is 21 lines of duplicated code detected by IntelliJ with the NamespaceSourceControlMetadataStore
class. Why are we duplicating code?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This two classes are handilng crud functions for two different (but similar) tables. Hence the duplication.
This PR adds the functionality of refreshing namespace and repository source control metadata in one transaction in /repository/apps call before fetching the apps for response. It does the refresh by getting the list of apps from remote repo, cleaning up the repository source control metadata table and then updating the isSynced (sync status) and last modified values accordingly. It also contains fixes to some bugs from the API impl PR in default store.