-
Notifications
You must be signed in to change notification settings - Fork 84
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[controller] do not delay RT topic deletion if the store is no longer hybrid #1234
base: main
Are you sure you want to change the base?
Conversation
...enice-controller/src/main/java/com/linkedin/venice/controller/kafka/TopicCleanupService.java
Show resolved
Hide resolved
...enice-controller/src/main/java/com/linkedin/venice/controller/kafka/TopicCleanupService.java
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you also add E2E test that verifies RT topic marked for deletion is
- it is not deleted when there are hybrid versions
- deleted once all hybrid versions are removed
9ec9fce
to
ddf8586
Compare
if (version.getHybridStoreConfig() != null && VersionStatus.isVersionErrored(versionStatus) | ||
&& VersionStatus.isVersionKilled(versionStatus)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't quite get this part.. so only when a hybrid version is errored and killed, we think it's hybrid?
private void safeDeleteRTTopic(String clusterName, String storeName, Store store) { | ||
private void safeDeleteRTTopic(String clusterName, String storeName) { | ||
Store store = getHelixVeniceClusterResources(clusterName).getStoreMetadataRepository().getStore(storeName); | ||
if (canDeleteRTTopic(clusterName, storeName, store)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like the conditions for RT topic deletion exist in multiple places, e.g. L3509, canDeleteRTTopic. is it possible to merge these conditions into a single function for simplicity?
if (canDeleteRTTopic(clusterName, storeName, store)) { | ||
deleteRTTopic(clusterName, storeName); | ||
} else { | ||
LOGGER.warn("Topic deletion for topic: {} is delayed.", Version.composeRealTimeTopic(storeName)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is it easy to log the reason for delayed deletion?
return !Version.containsHybridVersion(store.getVersions()) && canDeleteRTTopic(clusterName, storeName); | ||
} | ||
|
||
public boolean canDeleteRTTopic(String clusterName, String storeName) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we rename the function somehow to reflect the fact that we are querying the child controllers for their opinions?
truncateKafkaTopic(rtTopicToDelete); | ||
for (ControllerClient controllerClient: controllerClientMap.values()) { | ||
controllerClient.deleteKafkaTopic(rtTopicToDelete); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
these lines are duplicated later. Can we extract them into a small helper function?
@@ -135,10 +122,26 @@ public TopicCleanupService( | |||
} else { | |||
this.sourceOfTruthPubSubAdminAdapter = null; | |||
} | |||
|
|||
if (!this.admin.isParent()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is keyword this
here necessary? My understanding for our code convention is that this
is used when there is ambiguity or cases it has to be used.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's main reason to move the child controller api calls into constructor?
@@ -235,49 +238,46 @@ Admin getAdmin() { | |||
* If version topic deletion takes more than certain time it refreshes the entire topic list and start deleting from RT topics again. | |||
*/ | |||
void cleanupVeniceTopics() { | |||
// todo - instead of adding all topics into the same queue we must simply use separate queues for RT and nonRT |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit, todo -> "TODO" so that IDE can highlight it.
…ion if any version is still hybrid
ddf8586
to
8b1a3a8
Compare
Summary, imperative, start upper case, don't end with a period
When TopicCleanupService tries to delete RT topics, it does not have to wait for all the corresponding VT topics to get deleted if all the versions of the store has changed from hybrid to batch. If all the versions are batch, it can be safely assumed that no VT topic is consuming from that RT topic.
Resolves #XXX
How was this PR tested?
added a unit test in TestTopicCleanupService
Does this PR introduce any user-facing changes?