very slow apply performance due to openapi schema reparsing #1682
Comments
What about passing |
yes, the problem occurs before it even starts talking to the server about the manifests (and after it has set up its resource caches from the server's api-resources) |
There might be latency in the discovery endpoint. Could you please run it with increased log verbosity ( |
it is not an endpoint that is slow; it is spending all its time reading the certificate authority over and over again at 100%-150% cpu usage |
here is the output with network log:
|
if you give me some pointers how to create a cpu profile I can pinpoint the exact cause for you, I'm not very familiar with golang tooling. |
Reading the |
please read my full report. reading the cluster pem once does not take 11 seconds; reading it hundreds of times (it seems twice per input manifest), together with the other code associated with loading the kubeconfig, does. adding -v99 does not change the output, as nothing is logged for that part of the code
|
using |
Validation interacts with the API server, as shown here. I think there is an issue with your cluster. What is your cluster version? |
/remove-kind bug |
the cluster version is 1.30.6, but it is not a cluster problem; there is no network traffic at all during this part. have you tried running my reproducer yourself? I checked how to profile go, and the parsing of the cluster.pem is not the main problem (although it is very odd that this happens). the main cost is the decoding of the 1.3 MiB openapi schema, which appears to be done again for every single input manifest
|
yes, openapi schema validation can take a significant amount of time per resource. You can disable validation as you did above (which is strongly discouraged), but I don't think there is anything we can do about this. Validation should be done and, unfortunately, it takes time. |
why does the openapi schema need to be decoded for every single manifest? |
proof of concept patch just caching the parsed schemas:
--- a/staging/src/k8s.io/client-go/openapi3/root.go
+++ b/staging/src/k8s.io/client-go/openapi3/root.go
@@ -59,6 +59,7 @@ type Root interface {
 type root struct {
 	// OpenAPI client to retrieve the OpenAPI V3 documents.
 	client openapi.Client
+	schemas map[string]*spec3.OpenAPI
 }
 // Validate root implements the Root interface.
@@ -67,7 +68,7 @@ var _ Root = &root{}
 // NewRoot returns a structure implementing the Root interface,
 // created with the passed rest client.
 func NewRoot(client openapi.Client) Root {
-	return &root{client: client}
+	return &root{client: client, schemas: make(map[string]*spec3.OpenAPI)}
 }
 func (r *root) GroupVersions() ([]schema.GroupVersion, error) {
@@ -93,6 +94,12 @@ func (r *root) GroupVersions() ([]schema.GroupVersion, error) {
 }
 func (r *root) GVSpec(gv schema.GroupVersion) (*spec3.OpenAPI, error) {
+	apiPath := gvToAPIPath(gv)
+	schema, found := r.schemas[apiPath]
+	if found {
+		return schema, nil
+	}
+
 	openAPISchemaBytes, err := r.retrieveGVBytes(gv)
 	if err != nil {
 		return nil, err
@@ -100,6 +107,7 @@ func (r *root) GVSpec(gv schema.GroupVersion) (*spec3.OpenAPI, error) {
 	// Unmarshal the downloaded Group/Version bytes into the spec3.OpenAPI struct.
 	var parsedV3Schema spec3.OpenAPI
 	err = json.Unmarshal(openAPISchemaBytes, &parsedV3Schema)
+	r.schemas[apiPath] = &parsedV3Schema
 	return &parsedV3Schema, err
 }
makes it 10 times faster:
|
I don't see an obvious way to make kubectl reuse the decoded schemas across multiple resource visits, though the client-go openapi client already caches the json document in byte form. So a possible solution would be to adjust client-go to cache the decoded json instead of, or in addition to, the bytes; the kubectl issue would then be resolved automatically. |
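The idea of caching the decoded document rather than only its bytes can be sketched in isolation. This is a minimal illustration, not client-go code: `schemaCache`, `document`, and `fetchFunc` are hypothetical names. Unlike the proof-of-concept patch above, this sketch guards the map with a mutex and stores a document only when decoding succeeded. Both are assumptions about what a production version might want.

```go
package main

import (
	"encoding/json"
	"fmt"
	"sync"
)

// document is a hypothetical stand-in for a decoded OpenAPI v3 document.
type document struct {
	Paths map[string]json.RawMessage `json:"paths"`
}

// fetchFunc (hypothetical) returns the raw JSON bytes for a group/version path.
type fetchFunc func(path string) ([]byte, error)

// schemaCache memoizes decoded documents per path.
type schemaCache struct {
	mu      sync.Mutex
	fetch   fetchFunc
	decoded map[string]*document
	decodes int // number of json.Unmarshal calls, kept for illustration
}

func newSchemaCache(fetch fetchFunc) *schemaCache {
	return &schemaCache{fetch: fetch, decoded: make(map[string]*document)}
}

// get returns the decoded document for path, decoding it at most once
// per path as long as decoding succeeds. The lock is held during fetch
// and decode, which keeps the sketch simple at the cost of serializing callers.
func (c *schemaCache) get(path string) (*document, error) {
	c.mu.Lock()
	defer c.mu.Unlock()

	if doc, ok := c.decoded[path]; ok {
		return doc, nil
	}
	raw, err := c.fetch(path)
	if err != nil {
		return nil, err
	}
	c.decodes++
	var doc document
	if err := json.Unmarshal(raw, &doc); err != nil {
		return nil, err // failed decodes are not cached
	}
	c.decoded[path] = &doc
	return &doc, nil
}

func main() {
	cache := newSchemaCache(func(path string) ([]byte, error) {
		return []byte(`{"paths":{"/apis/apps/v1/deployments":{}}}`), nil
	})
	// Imitate 100 manifests that all need the same group/version schema.
	for i := 0; i < 100; i++ {
		if _, err := cache.get("apis/apps/v1"); err != nil {
			panic(err)
		}
	}
	fmt.Println("decodes:", cache.decodes) // prints "decodes: 1"
}
```

Because every caller receives the same pointer, the cached document has to be treated as read-only by its consumers.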
What happened:
kubectl apply is very slow at reading manifest files. kubectl appears to reread its configuration several times per manifest it is supposed to apply, which gets extremely slow: several minutes when applying a large number of manifests.
How to reproduce it (as minimally and precisely as possible):
Now run the following dry-run kubectl command and observe how often it opens (and reads) the certificate-authority file cluster.pem:
Increasing the number of objects in the manifests file increases the number of times it reads the cluster certificate from its configuration, which does not seem necessary.
edit: see below, the real problem is repeated json decoding of the openapi schema documents
When applying more manifests the time before kubectl even contacts the server increases to several minutes which can be very relevant for e.g. CI test runs.
The problem should not be the manifest parsing time, as e.g. python yaml or the golang yq version can parse these manifests in a fraction of the time it takes kubectl.