
KAFKA-18225: ClientQuotaCallback#updateClusterMetadata is unsupported by kraft #18196

Open · wants to merge 39 commits into base: trunk

Conversation

@m1a2st (Contributor) commented Dec 15, 2024

Jira: https://issues.apache.org/jira/browse/KAFKA-18225

ClientQuotaCallback#updateClusterMetadata is not implemented in KRaft mode. This PR implements it, targeting the 4.0 release, so that CustomQuotaCallbackTest passes in KRaft mode.
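For context, updateClusterMetadata is the ClientQuotaCallback hook (KIP-257) through which the broker hands the callback an up-to-date Cluster whenever metadata changes; the callback returns true if its computed quotas may have changed. A minimal sketch of a partition-aware callback built on it (PartitionCountQuotaCallback is a hypothetical example, much simpler than the GroupedUserQuotaCallback used by the test):

    import java.util
    import org.apache.kafka.common.Cluster
    import org.apache.kafka.common.security.auth.KafkaPrincipal
    import org.apache.kafka.server.quota.{ClientQuotaCallback, ClientQuotaEntity, ClientQuotaType}

    class PartitionCountQuotaCallback extends ClientQuotaCallback {
      @volatile private var cluster: Cluster = Cluster.empty()

      override def configure(configs: util.Map[String, _]): Unit = {}

      override def quotaMetricTags(quotaType: ClientQuotaType, principal: KafkaPrincipal,
                                   clientId: String): util.Map[String, String] =
        util.Map.of("user", principal.getName)

      // Example policy: scale the quota with the number of known topics.
      override def quotaLimit(quotaType: ClientQuotaType,
                              metricTags: util.Map[String, String]): java.lang.Double =
        10000.0 * math.max(1, cluster.topics().size())

      override def updateQuota(quotaType: ClientQuotaType, entity: ClientQuotaEntity,
                               newValue: Double): Unit = {}

      override def removeQuota(quotaType: ClientQuotaType, entity: ClientQuotaEntity): Unit = {}

      override def quotaResetRequired(quotaType: ClientQuotaType): Boolean = false

      // The hook this PR wires up for KRaft: cache the latest Cluster and
      // return true so the broker knows quota values may have changed.
      override def updateClusterMetadata(cluster: Cluster): Boolean = {
        this.cluster = cluster
        true
      }

      override def close(): Unit = {}
    }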

Committer Checklist (excluded from commit message)

  • Verify design and implementation
  • Verify test coverage and CI build status
  • Verify documentation (including upgrade notes)

@github-actions github-actions bot added the triage PRs from the community label Dec 15, 2024
@m1a2st m1a2st changed the title KAFKA-18225: ClientQuotaCallback#updateClusterMetadata is unsupported by kraft [WIP]KAFKA-18225: ClientQuotaCallback#updateClusterMetadata is unsupported by kraft Dec 15, 2024
@github-actions github-actions bot added core Kafka Broker kraft labels Dec 15, 2024
@@ -565,3 +531,97 @@ class KRaftMetadataCache(
}
}

object KRaftMetadataCache {

def toCluster(clusterId: String, image: MetadataImage): Cluster = {
@TaiJuWu (Contributor) commented Dec 15, 2024

This result differs from the old KRaftMetadataCache#toCluster, which filtered partitions based on the listener. If we decide to change this behavior, we should probably document it.
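For reference, a rough sketch of the listener-aware behavior being described, using ClusterImage#brokers and the BrokerRegistration#node accessor that appears in a later diff in this PR (illustrative only, not the old implementation verbatim):

    import org.apache.kafka.common.Node
    import org.apache.kafka.image.MetadataImage
    import scala.jdk.CollectionConverters._
    import scala.jdk.OptionConverters._

    // Only brokers exposing the request's listener contribute nodes; partitions
    // whose replicas are not reachable via that listener end up filtered out.
    def listenerNodes(image: MetadataImage, listenerName: String): Seq[Node] =
      image.cluster().brokers().values().asScala.toSeq
        .flatMap(b => b.node(listenerName).toScala)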

@m1a2st m1a2st changed the title [WIP]KAFKA-18225: ClientQuotaCallback#updateClusterMetadata is unsupported by kraft KAFKA-18225: ClientQuotaCallback#updateClusterMetadata is unsupported by kraft Dec 17, 2024
@github-actions github-actions bot removed the triage PRs from the community label Dec 19, 2024
@chia7712 (Member) commented Jan 8, 2025

@m1a2st please fix the conflicts

m1a2st added 2 commits January 8, 2025 19:39
# Conflicts:
#	core/src/test/scala/integration/kafka/api/CustomQuotaCallbackTest.scala
@m1a2st (Contributor, Author) commented Jan 8, 2025

Thanks @chia7712 for the reminder, resolved the conflicts.

@chia7712 (Member) left a comment

@m1a2st thanks for this fix!

import scala.collection.mutable.ArrayBuffer
import scala.jdk.CollectionConverters._

@Disabled("KAFKA-18213")
(Member) commented:

Nice to see this test getting fixed. Could you please close KAFKA-18213 as a duplicate?

(Contributor, Author) replied:

I moved it to L104; I didn't fix it.

(Contributor, Author) replied:

I will take a look at this failing test.

@@ -48,6 +51,11 @@ class DynamicClientQuotaPublisher(
): Unit = {
val deltaName = s"MetadataDelta up to ${newImage.highestOffsetAndEpoch().offset}"
try {
val clientQuotaCallback = conf.getConfiguredInstance(QuotaConfig.CLIENT_QUOTA_CALLBACK_CLASS_CONFIG, classOf[ClientQuotaCallback])
(Member) commented:

We should use the callback held by quotaManagers rather than creating a new one! They are different instances, so this approach cannot update the callback that quotaManagers actually uses.
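A minimal sketch of the suggested fix, reusing the instance already held by quotaManagers (this mirrors the code that appears later in this conversation):

    // Reuse the callback registered in quotaManagers so the metadata update is
    // visible to the quota managers that actually enforce quotas, instead of
    // configuring a fresh, disconnected instance.
    quotaManagers.clientQuotaCallback().ifPresent { callback =>
      callback.updateClusterMetadata(KRaftMetadataCache.toCluster(clusterId, newImage))
    }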

@@ -235,6 +235,13 @@ public Optional<Node> node(String listenerName) {
}
return Optional.of(new Node(id, endpoint.host(), endpoint.port(), rack.orElse(null), fenced));
}

public List<Node> nodes() {
List<Node> nodes = new ArrayList<>();
(Member) commented:

We can leverage an existing method, for example:

return listeners.keySet().stream().flatMap(l -> node(l).stream()).toList();

(node(l) returns Optional<Node>, so flatMap over Optional#stream drops brokers that don't expose the listener.)

@chia7712 (Member) left a comment

@m1a2st thanks for this patch. I tested it locally; please fix the following as well:

  1. deleteTopic needs to use an authorized admin.
  2. You must call removeQuotaOverrides after creating group1_user2 (see the sketch after this list); otherwise, the removeQuota method of the custom callback will not be invoked. This is an interesting discrepancy: in ZooKeeper mode, removeQuota is called when altering SASL/SCRAM credentials, whereas in KRaft mode this behavior is absent. I'm uncertain whether this constitutes a breaking change. It appears to be unusual behavior, as it is triggered by the addition of users. Instead of implementing this peculiar behavior, I suggest updating the documentation of the callback to reflect the actual implementation.
    cc @dajac and @cmccabe
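A rough sketch of the ordering suggested in point 2 (user and removeQuotaOverrides come from CustomQuotaCallbackTest; the user-creation call here is a hypothetical stand-in, not the test's exact helper):

    // After adding group1_user2, explicitly drop the overrides so the custom
    // callback's removeQuota is invoked in KRaft mode as well.
    addUser("group1_user2")   // hypothetical helper standing in for the test's user setup
    user.removeQuotaOverrides()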

@@ -179,13 +212,12 @@ class CustomQuotaCallbackTest extends IntegrationTestHarness with SaslSetup {
}

private def createTopic(topic: String, numPartitions: Int, leader: Int): Unit = {
// TODO createTopic
TestUtils.createTopicWithAdmin(createAdminClient(), topic, brokers, controllerServers, numPartitions)
(Member) commented:

You have to honor the leader argument so the partition leader is hosted by the correct broker; otherwise, this test will get flaky in the future.
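One way to honor it, assuming the replicaAssignment parameter of TestUtils.createTopicWithAdmin (a sketch of the suggestion, not necessarily what the PR ended up doing):

    private def createTopic(topic: String, numPartitions: Int, leader: Int): Unit = {
      // Pin each partition's single replica to the requested broker so the
      // leader is deterministic and the test cannot go flaky on placement.
      val assignment = (0 until numPartitions).map(p => p -> Seq(leader)).toMap
      TestUtils.createTopicWithAdmin(createAdminClient(), topic, brokers, controllerServers,
        replicaAssignment = assignment)
    }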

@m1a2st (Contributor, Author) commented Jan 13, 2025

Thanks for the review @chia7712, addressed all comments.

@chia7712 (Member) left a comment

@m1a2st thanks for this patch.

@@ -337,6 +371,7 @@ object GroupedUserQuotaCallback {
val DefaultProduceQuotaProp = "default.produce.quota"
val DefaultFetchQuotaProp = "default.fetch.quota"
val UnlimitedQuotaMetricTags = new util.HashMap[String, String]
val updateClusterMetadataCalls = new AtomicInteger
(Member) commented:

Why add this count?

quotaManagers.clientQuotaCallback().ifPresent(clientQuotaCallback => {
if (delta.topicsDelta() != null || delta.clusterDelta() != null) {
val cluster = KRaftMetadataCache.toCluster(clusterId, newImage)
clientQuotaCallback.updateClusterMetadata(cluster)
(Member) commented:

I don't think this line is required.

@@ -136,9 +148,11 @@ class CustomQuotaCallbackTest extends IntegrationTestHarness with SaslSetup {
// Create large number of partitions on another broker, should result in throttling on first partition
val largeTopic = "group1_largeTopic"
createTopic(largeTopic, numPartitions = 99, leader = 0)
user.removeThrottleMetrics()
(Member) commented:

This is unnecessary if we remove the redundant call to updateClusterMetadata.

user.waitForQuotaUpdate(8000, 2500, defaultRequestQuota)
user.produceConsume(expectProduceThrottle = true, expectConsumeThrottle = true)

user.removeQuotaOverrides()
(Member) commented:

Could we call removeQuotaOverrides after creating the new user, for consistency?

@chia7712 (Member) commented:

@m1a2st could you please revise the docs of ClientQuotaCallback? Some of the documentation is stale.

@chia7712 (Member) commented:

@m1a2st could you please fix the conflicts?

# Conflicts:
#	core/src/test/scala/integration/kafka/api/CustomQuotaCallbackTest.scala
@chia7712 (Member) commented:

@m1a2st please fix the conflicts :(

# Conflicts:
#	core/src/test/scala/integration/kafka/api/CustomQuotaCallbackTest.scala
@chia7712 (Member) commented:

@m1a2st please fix the conflicts

# Conflicts:
#	core/src/main/scala/kafka/server/metadata/KRaftMetadataCache.scala
Option(delta.clientQuotasDelta()).foreach { clientQuotasDelta =>
clientQuotaMetadataManager.update(clientQuotasDelta)
quotaManagers.clientQuotaCallback().ifPresent(clientQuotaCallback => {
if (delta.topicsDelta() != null || delta.clusterDelta() != null) {
(Member) commented:

How about moving this callback trigger to an individual publisher? That would decouple the callback from the other publishers, and also avoid producing incorrect error messages.
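A sketch of what such a dedicated publisher could look like; this is roughly the shape of the DynamicTopicClusterQuotaPublisher that appears later in this PR, with the constructor parameters and quota-manager calls assumed from the surrounding snippets:

    import kafka.server.KafkaConfig
    import kafka.server.QuotaFactory.QuotaManagers
    import kafka.server.metadata.KRaftMetadataCache
    import org.apache.kafka.image.loader.LoaderManifest
    import org.apache.kafka.image.publisher.MetadataPublisher
    import org.apache.kafka.image.{MetadataDelta, MetadataImage}
    import org.apache.kafka.server.fault.FaultHandler

    class DynamicTopicClusterQuotaPublisher(
      clusterId: String,
      conf: KafkaConfig,
      faultHandler: FaultHandler,
      nodeType: String,
      quotaManagers: QuotaManagers
    ) extends MetadataPublisher {

      override def name(): String = s"DynamicTopicClusterQuotaPublisher $nodeType id=${conf.nodeId}"

      override def onMetadataUpdate(delta: MetadataDelta, newImage: MetadataImage,
                                    manifest: LoaderManifest): Unit = {
        try {
          quotaManagers.clientQuotaCallback().ifPresent { callback =>
            // Only topic or cluster membership changes can affect the Cluster view.
            if (delta.topicsDelta() != null || delta.clusterDelta() != null) {
              val cluster = KRaftMetadataCache.toCluster(clusterId, newImage)
              if (callback.updateClusterMetadata(cluster)) {
                // The callback reported that quota values may have changed.
                quotaManagers.fetch.updateQuotaMetricConfigs()
                quotaManagers.produce.updateQuotaMetricConfigs()
                quotaManagers.request.updateQuotaMetricConfigs()
              }
            }
          }
        } catch {
          case t: Throwable => faultHandler.handleFault(
            "Uncaught exception while publishing dynamic topic/cluster quota changes", t)
        }
      }
    }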

@chia7712 (Member) left a comment

@m1a2st thanks for this patch

import org.apache.kafka.image.loader.LoaderManifest
import org.apache.kafka.server.fault.FaultHandler

class DynamicTopicClusterQuotaPublisher (
(Member) commented:

Please add comments to this new publisher. We should emphasize that this approach is temporary and that there is a follow-up JIRA.
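For example, a class-level comment along these lines (the follow-up JIRA number isn't stated in this thread, so it's left unnamed):

    /**
     * Temporary publisher that triggers ClientQuotaCallback#updateClusterMetadata
     * on topic/cluster metadata changes in KRaft mode (KAFKA-18225). Splitting
     * this from the other quota publishers is a stopgap; a follow-up JIRA tracks
     * the long-term design, after which this class should be revisited or removed.
     */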

))

// Set up the DynamicTopicClusterQuotaPublisher. This will enable quotas for the cluster and topics.
metadataPublishers.add(new DynamicTopicClusterQuotaPublisher(
(Member) commented:

Please add a test for the controller.

@@ -545,3 +511,96 @@ class KRaftMetadataCache(
}
}

object KRaftMetadataCache {
(Member) commented:

#18632 is trying to remove references to KRaftMetadataCache, so maybe we can move these helpers to MetadataCache?

clusterId,
config,
sharedServer.metadataPublishingFaultHandler,
"broker",
(Contributor) commented:

Controller?

@ijuma ijuma requested a review from cmccabe January 24, 2025 06:31