feat: add BoundServiceAccountToken trigger authentication type #6272
base: main
Conversation
Force-pushed from 538290e to ea01caa
@@ -58,6 +58,7 @@ import (
// +kubebuilder:rbac:groups=autoscaling,resources=horizontalpodautoscalers,verbs="*"
// +kubebuilder:rbac:groups="",resources=configmaps;configmaps/status,verbs=get;list;watch
// +kubebuilder:rbac:groups="",resources=events,verbs="*"
// +kubebuilder:rbac:groups="",resources=serviceaccounts/token,verbs=create;get
Should we include this by default, or disable it by default and explain how to enable it, @kedacore/keda-core-contributors? With this setting as the default, KEDA's service account can request tokens on behalf of any other service account in the cluster, which could mean a privilege escalation.
Users should define resourceNames if they want to reduce the exposure of this RBAC setting -> https://kubernetes.io/docs/reference/access-authn-authz/rbac/#referring-to-resources
We can (and we must) handle this option in Helm as we do with the * for the custom resources, but I'm not sure about the best approach here.
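For illustration, a rule restricted with resourceNames would take roughly this shape. This is a minimal sketch using the k8s.io/api/rbac/v1 Go types; the service account name example-scaler-sa is hypothetical, and end users would normally express the equivalent rule in their own Role manifest rather than in Go:

import rbacv1 "k8s.io/api/rbac/v1"

// Sketch of a namespaced rule that only allows creating tokens for one
// specific service account, instead of every service account in the namespace.
var tokenRule = rbacv1.PolicyRule{
    APIGroups:     []string{""},
    Resources:     []string{"serviceaccounts/token"},
    ResourceNames: []string{"example-scaler-sa"}, // hypothetical service account name
    Verbs:         []string{"create"},
}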
Yeah, I agree, this is not ideal.
Maybe we can restrict it with an environment variable? I know that for secrets there is the variable KEDA_CLUSTER_OBJECT_NAMESPACE, but I'm not exactly sure how it works. Is there existing code that would let us use something like WATCH_NAMESPACE as a comma-separated list of namespaces that KEDA is allowed to create/get service account tokens for?
I think we can just limit this using RBAC, and KEDA's code doesn't need to know about it. We added the flag for secrets because we fetch secrets across the whole cluster and the lister/informer generates a cluster-scoped cache. In this case, as TokenRequest isn't a cached API, I don't think we need to tell KEDA about limited access; the request will simply fail if RBAC blocks it.
The RBAC protection can be done just by removing the default permission and documenting how to add the RBAC permission for a given service account (using resourceNames) or globally. As the end users are the ones who deploy that RBAC, the decision is up to them.
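As a point of reference, the TokenRequest API is called directly rather than served from a cache. A minimal sketch with client-go (the clientset, namespace, and service account name are placeholders, not code from this PR):

import (
    "context"

    authenticationv1 "k8s.io/api/authentication/v1"
    metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
    "k8s.io/client-go/kubernetes"
)

// requestToken asks the API server for a short-lived token bound to the given
// service account; every call goes straight to the TokenRequest endpoint.
func requestToken(ctx context.Context, clientset kubernetes.Interface, namespace, saName string, expirySeconds int64) (string, error) {
    tr := &authenticationv1.TokenRequest{
        Spec: authenticationv1.TokenRequestSpec{
            ExpirationSeconds: &expirySeconds,
        },
    }
    resp, err := clientset.CoreV1().ServiceAccounts(namespace).CreateToken(ctx, saName, tr, metav1.CreateOptions{})
    if err != nil {
        return "", err
    }
    return resp.Status.Token, nil
}

If the RBAC rule above is missing (or its resourceNames don't include saName), this call simply returns a Forbidden error, which is the failure mode described here.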
Cool, makes sense to me, thanks!
func resolveBoundServiceAccountToken(ctx context.Context, client client.Client, logger logr.Logger, namespace string, bsat *kedav1alpha1.BoundServiceAccountToken, triggerAuthRef *kedav1alpha1.AuthenticationRef, acs *authentication.AuthClientSet) string {
    serviceAccountName, expiry := bsat.ServiceAccountName, bsat.Expiry
    if serviceAccountName == "" {
        logger.Error(fmt.Errorf("error trying to get token"), "serviceAccountName is required")
        return ""
    }
    var err error
    expirySeconds := ptr.Int64(3600)
    if expiry != "" {
        duration, err := time.ParseDuration(expiry)
        if err != nil {
            logger.Error(err, "error trying to parse expiry duration", "expiry", expiry)
            return ""
        }
        // convert duration to seconds
        expirySeconds = ptr.Int64(int64(duration.Seconds()))
    }

    triggerAuth, _, err := getTriggerAuth(ctx, client, triggerAuthRef, namespace)
    if err != nil {
        logger.Error(err, "error trying to get [cluster]triggerAuth", "TriggerAuth.Namespace", namespace, "TriggerAuth.Name", triggerAuthRef.Name)
        return ""
    }
    currentAnnotations := triggerAuth.GetAnnotations()
    encodedToken := currentAnnotations["keda-serviceAccountToken"]

    // check if the service account exists in the namespace
    serviceAccount := &corev1.ServiceAccount{}
    err = client.Get(ctx, types.NamespacedName{Name: serviceAccountName, Namespace: namespace}, serviceAccount)
    if err != nil {
        logger.Error(err, "error trying to get service account from namespace", "ServiceAccount.Namespace", namespace, "ServiceAccount.Name", serviceAccountName)
        return ""
    }

    // check if a token is already referenced in the TriggerAuthentication annotation
    if encodedToken != "" {
        tokenValid := checkTokenValidity(ctx, logger, encodedToken, expirySeconds, acs)
        switch tokenValid {
        case tokenStatusInvalid:
            // token is invalid or more than 50% of its expiry has passed; create a new token
            return generateAndAnnotateNewToken(ctx, client, logger, serviceAccountName, namespace, expirySeconds, triggerAuth, acs)
        case tokenStatusValid:
            return encodedToken
        default:
            // tokenStatusUnknown
            return ""
        }
    } else {
        // token doesn't exist; create a new token and embed it in the TriggerAuth
        logger.Info("Token doesn't exist; creating new token and embedding it in the triggerauth", "ServiceAccount.Namespace", namespace, "ServiceAccount.Name", serviceAccountName)
        return generateAndAnnotateNewToken(ctx, client, logger, serviceAccountName, namespace, expirySeconds, triggerAuth, acs)
    }
}
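For readers following along: checkTokenValidity and generateAndAnnotateNewToken are not shown in this excerpt. One way a stored token's validity could be checked is via the TokenReview API; the sketch below assumes that approach and is not necessarily what the PR does (the 50% staleness check would additionally need the token's own expiry information):

import (
    "context"
    "fmt"

    authv1 "k8s.io/api/authentication/v1"
    metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
    "k8s.io/client-go/kubernetes"
)

// reviewToken sends the token to the API server and reports whether it is
// still accepted; it does not say how much of its lifetime remains.
func reviewToken(ctx context.Context, clientset kubernetes.Interface, token string) (bool, error) {
    review := &authv1.TokenReview{
        Spec: authv1.TokenReviewSpec{Token: token},
    }
    resp, err := clientset.AuthenticationV1().TokenReviews().Create(ctx, review, metav1.CreateOptions{})
    if err != nil {
        return false, fmt.Errorf("token review failed: %w", err)
    }
    return resp.Status.Authenticated, nil
}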
Currently, KEDA only re-executes the resolver on scaler creation. That means the token's expiration won't be checked until it has already expired, triggering the error in the scaler and re-evaluating this code.
About storing the token in the TriggerAuthentication: I don't like this approach too much, as the token can be exposed just by reading the TriggerAuthentication CR, which isn't a sensitive API.
Considering these 2 things together, I'd not store the token anywhere and just request a fresh token each time, keeping it in the memory of the scaler. That solves the storage issue and removes the verification code (which wouldn't be executed until the token has already expired in the common scenario).
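A minimal sketch of what "keep it in the scaler's memory" could look like, building on the hypothetical requestToken helper sketched earlier in this thread; the boundSAToken holder and its fields are assumptions, not code from this PR:

// boundSAToken is a hypothetical in-memory holder kept inside a scaler;
// nothing is written back to the TriggerAuthentication object.
type boundSAToken struct {
    token     string
    expiresAt time.Time
}

// get returns the cached token, requesting a fresh one from the TokenRequest
// API when nothing is cached yet or the cached token has expired.
func (b *boundSAToken) get(ctx context.Context, clientset kubernetes.Interface, namespace, saName string, expirySeconds int64) (string, error) {
    if b.token != "" && time.Now().Before(b.expiresAt) {
        return b.token, nil
    }
    token, err := requestToken(ctx, clientset, namespace, saName, expirySeconds) // helper sketched above
    if err != nil {
        return "", err
    }
    b.token = token
    b.expiresAt = time.Now().Add(time.Duration(expirySeconds) * time.Second)
    return b.token, nil
}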
Okay, makes sense. Then should users still be able to set an "expiry"? I would imagine a failing query every expiry period might be annoying to some users for availability reasons, but it might help that it fails at expected intervals? Plus, it will be an extra TokenRequest API request every interval.
Or maybe it would be better to register scheduled goroutines with a Timer to auto-rotate every half expiry on each authRef?
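For illustration, the auto-rotation idea could look roughly like this; the rotation loop, its parameters, and the half-expiry ticker are assumptions (again reusing the hypothetical requestToken helper from earlier), not code from this PR:

// rotateEvery requests a new token every half expiry period until ctx is
// cancelled, handing each fresh token to the provided callback.
func rotateEvery(ctx context.Context, clientset kubernetes.Interface, namespace, saName string, expirySeconds int64, onToken func(string)) {
    ticker := time.NewTicker(time.Duration(expirySeconds/2) * time.Second)
    defer ticker.Stop()
    for {
        token, err := requestToken(ctx, clientset, namespace, saName, expirySeconds) // helper sketched above
        if err == nil {
            onToken(token)
        }
        select {
        case <-ctx.Done():
            return
        case <-ticker.C:
        }
    }
}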
Can we try to handle the expired token? Then we can try to refresh the token, and fail with an error if we are not able to do that.
I would imagine a failing query every expiry period might be annoying to some users for availability reasons
Actually, the query won't fail, because it's silently retried once after the first failure, refreshing the cache and requesting a new token from the TokenRequest API.
Thanks for the feedback everyone! Does my current implementation in 6c9e605 make sense then?
Force-pushed from d1785b0 to 44d840c
Force-pushed from 67740fd to 48ccb5f
Provide a description of what has been changed
Proposes to add a new [Cluster]TriggerAuthentication type called BoundServiceAccountToken, which allows users to bind a ServiceAccount token to a [Cluster]TriggerAuthentication object. You specify it directly in the [Cluster]TriggerAuthentication spec (see the sketch after this description).

You could already inject Kubernetes service account tokens into triggerAuth refs before by using the Secret trigger auth type, but instead of manually embedding the token in a long-lived secret, you can now specify the service account directly; the SA token is embedded in an annotation on the triggerAuth object, and the keda-operator auto-rotates the token once at least 50% of its expiry has passed. You specify which parameter to pull into the trigger with parameter, the serviceAccountName in the same namespace as the TriggerAuth CR, and the expiry as a duration. Note that Kubernetes doesn't allow expiries shorter than 10m. If you use a ClusterTriggerAuth, note that this works similarly to the Secret trigger auth, and the service account then has to be in the KEDA_CLUSTER_OBJECT_NAMESPACE namespace.

Before I write any tests, I'd like to get feedback first! :)
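For reference, a sketch of what the new API type might look like based on the fields named above; the field names and json tags are assumptions inferred from this description, not the PR's actual definitions:

// BoundServiceAccountToken is a sketch of the assumed TriggerAuthentication
// entry: the operator requests a token for the named service account and
// exposes it to the scaler under the given parameter.
type BoundServiceAccountToken struct {
    // Parameter is the trigger metadata key the token is mapped to.
    Parameter string `json:"parameter"`
    // ServiceAccountName is a service account in the same namespace as the
    // TriggerAuthentication (or in KEDA_CLUSTER_OBJECT_NAMESPACE for a
    // ClusterTriggerAuthentication).
    ServiceAccountName string `json:"serviceAccountName"`
    // Expiry is the requested token lifetime as a duration, e.g. "1h";
    // Kubernetes rejects expiries shorter than 10m.
    Expiry string `json:"expiry"`
}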
Checklist
Fixes: #6136