Karpenter integration is missing 'auto_conf.yaml' file #19527

Open
umaasik opened this issue Jan 31, 2025 · 1 comment

umaasik commented Jan 31, 2025

Without an auto_conf.yaml file at karpenter/datadog_checks/karpenter/data/auto_conf.yaml (installed under /etc/datadog-agent/conf.d/karpenter.d/ on the agent), the Karpenter integration (which isn't even documented here) cannot be ignored and causes errors such as:

/var/log/datadog/agent.log:2025-01-31 10:16:51 UTC | CORE | ERROR | (pkg/collector/python/datadog_agent.go:143 in LogMessage) | karpenter:955504ddc7914148 | (base.py:74) | There was an error scraping endpoint http://cluster-karpenter.karpenter.svc:8000/metrics: HTTPConnectionPool(host='cluster-karpenter.karpenter.svc', port=8000): Max retries exceeded with url: /metrics (Caused by NameResolutionError("<urllib3.connection.HTTPConnection object at 0x7f1f03c2d2e0>: Failed to resolve 'cluster-karpenter.karpenter.svc' ([Errno -2] Name or service not known)"))
/var/log/datadog/agent.log:2025-01-31 10:16:51 UTC | CORE | ERROR | (pkg/collector/worker/check_logger.go:71 in Error) | check:karpenter | Error running check: [{"message":"There was an error scraping endpoint http://cluster-karpenter.karpenter.svc:8000/metrics: HTTPConnectionPool(host='cluster-karpenter.karpenter.svc', port=8000): Max retries exceeded with url: /metrics (Caused by NameResolutionError(\"<urllib3.connection.HTTPConnection object at 0x7f1f03c2d2e0>: Failed to resolve 'cluster-karpenter.karpenter.svc' ([Errno -2] Name or service not known)\"))","traceback":"Traceback (most recent call last):\n  File \"/opt/datadog-agent/embedded/lib/python3.12/site-packages/datadog_checks/base/checks/base.py\", line 1290, in run\n    self.check(instance)\n  File \"/opt/datadog-agent/embedded/lib/python3.12/site-packages/datadog_checks/base/checks/openmetrics/v2/base.py\", line 75, in check\n    raise type(e)(\"There was an error scraping endpoint {}: {}\".format(endpoint, e)) from None\nrequests.exceptions.ConnectionError: There was an error scraping endpoint http://cluster-karpenter.karpenter.svc:8000/metrics: HTTPConnectionPool(host='cluster-karpenter.karpenter.svc', port=8000): Max retries exceeded with url: /metrics (Caused by NameResolutionError(\"<urllib3.connection.HTTPConnection object at 0x7f1f03c2d2e0>: Failed to resolve 'cluster-karpenter.karpenter.svc' ([Errno -2] Name or service not known)\"))\n"}]

I could probably configure the integration to point at the correct endpoint, but we currently have no use for Karpenter's metrics and would rather disable it. Given the logic introduced here, which checks for the presence of an auto_conf.yaml file, any integration enabled by default should ship this file so that it can be ignored with something like DD_IGNORE_AUTOCONF: "redisdb karpenter". That isn't the case at the moment:

$ root@datadog-agent-hwqtr:/# agent version
Agent 7.60.0 - Commit: 799e2984e8 - Serialization version: v5.0.134 - Go version: go1.22.8
$ root@datadog-agent-hwqtr:/# ls -la /etc/datadog-agent/conf.d/karpenter.d/
total 40
drwxr-xr-x   2 root root    31 Jan 31 09:40 .
drwxr-xr-x 218 root root  8192 Jan 31 09:40 ..
-rw-r--r--   1 root root 24978 Jan 31 09:40 conf.yaml.example

Whereas, for example, redisdb is fine and can be ignored:

$ root@datadog-agent-hwqtr:/# ls -la /etc/datadog-agent/conf.d/redisdb.d/
total 24
drwxr-xr-x   2 root root   53 Jan 31 09:40 .
drwxr-xr-x 218 root root 8192 Jan 31 09:40 ..
-rw-r--r--   1 root root  662 Jan 31 09:40 auto_conf.yaml
-rw-r--r--   1 root root 6666 Jan 31 09:40 conf.yaml.example
$ root@datadog-agent-hwqtr:/# grep -ir "redisdb" /var/log/*
/var/log/datadog/agent.log:2025-01-31 09:40:58 UTC | CORE | INFO | (comp/core/autodiscovery/providers/config_reader.go:248 in collectEntry) | Skipping 'auto_conf.yaml' for integration 'redisdb'

umaasik commented Jan 31, 2025

Many other integrations are also missing this file:

$ find /etc/datadog-agent/conf.d/ -type d -exec test ! -e {}/auto_conf.yaml \; -print
/etc/datadog-agent/conf.d/
/etc/datadog-agent/conf.d/argocd.d
/etc/datadog-agent/conf.d/kyverno.d
/etc/datadog-agent/conf.d/vsphere.d
/etc/datadog-agent/conf.d/avi_vantage.d
/etc/datadog-agent/conf.d/lighttpd.d
/etc/datadog-agent/conf.d/weaviate.d
/etc/datadog-agent/conf.d/aws_neuron.d
/etc/datadog-agent/conf.d/linkerd.d
/etc/datadog-agent/conf.d/weblogic.d
/etc/datadog-agent/conf.d/linux_proc_extras.d
/etc/datadog-agent/conf.d/yarn.d
/etc/datadog-agent/conf.d/azure_iot_edge.d
/etc/datadog-agent/conf.d/boundary.d
/etc/datadog-agent/conf.d/load.d
/etc/datadog-agent/conf.d/zeek.d
/etc/datadog-agent/conf.d/btrfs.d
/etc/datadog-agent/conf.d/mapr.d
/etc/datadog-agent/conf.d/zk.d
/etc/datadog-agent/conf.d/cacti.d
/etc/datadog-agent/conf.d/mapreduce.d
/etc/datadog-agent/conf.d/calico.d
/etc/datadog-agent/conf.d/marathon.d
/etc/datadog-agent/conf.d/cassandra.d
/etc/datadog-agent/conf.d/marklogic.d
/etc/datadog-agent/conf.d/cassandra_nodetool.d
/etc/datadog-agent/conf.d/ceph.d
/etc/datadog-agent/conf.d/memory.d
/etc/datadog-agent/conf.d/mesos_master.d
/etc/datadog-agent/conf.d/checkpoint_quantum_firewall.d
/etc/datadog-agent/conf.d/mesos_slave.d
/etc/datadog-agent/conf.d/mongo.d
/etc/datadog-agent/conf.d/cisco_aci.d
/etc/datadog-agent/conf.d/mysql.d
/etc/datadog-agent/conf.d/nagios.d
/etc/datadog-agent/conf.d/cisco_sdwan.d
/etc/datadog-agent/conf.d/cisco_secure_firewall.d
/etc/datadog-agent/conf.d/network.d
/etc/datadog-agent/conf.d/citrix_hypervisor.d
/etc/datadog-agent/conf.d/network_path.d
/etc/datadog-agent/conf.d/clickhouse.d
/etc/datadog-agent/conf.d/nfsstat.d
/etc/datadog-agent/conf.d/cloud_foundry_api.d
/etc/datadog-agent/conf.d/nginx.d
/etc/datadog-agent/conf.d/cloudera.d
/etc/datadog-agent/conf.d/nginx_ingress_controller.d
/etc/datadog-agent/conf.d/cockroachdb.d
/etc/datadog-agent/conf.d/ntp.d
/etc/datadog-agent/conf.d/confluent_platform.d
/etc/datadog-agent/conf.d/nvidia_triton.d
/etc/datadog-agent/conf.d/oom_kill.d
/etc/datadog-agent/conf.d/container.d
/etc/datadog-agent/conf.d/openldap.d
/etc/datadog-agent/conf.d/container_image.d
/etc/datadog-agent/conf.d/openmetrics.d
/etc/datadog-agent/conf.d/container_lifecycle.d
/etc/datadog-agent/conf.d/openstack.d
/etc/datadog-agent/conf.d/openstack_controller.d
/etc/datadog-agent/conf.d/containerd.d
/etc/datadog-agent/conf.d/oracle-dbm.d
/etc/datadog-agent/conf.d/oracle.d
/etc/datadog-agent/conf.d/orchestrator_ecs.d
/etc/datadog-agent/conf.d/cpu.d
/etc/datadog-agent/conf.d/orchestrator_pod.d
/etc/datadog-agent/conf.d/cri.d
/etc/datadog-agent/conf.d/ossec_security.d
/etc/datadog-agent/conf.d/crio.d
/etc/datadog-agent/conf.d/palo_alto_panorama.d
/etc/datadog-agent/conf.d/pan_firewall.d
/etc/datadog-agent/conf.d/dcgm.d
/etc/datadog-agent/conf.d/pgbouncer.d
/etc/datadog-agent/conf.d/directory.d
/etc/datadog-agent/conf.d/php_fpm.d
/etc/datadog-agent/conf.d/disk.d
/etc/datadog-agent/conf.d/ping_federate.d
/etc/datadog-agent/conf.d/dns_check.d
/etc/datadog-agent/conf.d/postfix.d
/etc/datadog-agent/conf.d/docker.d
/etc/datadog-agent/conf.d/postgres.d
/etc/datadog-agent/conf.d/druid.d
/etc/datadog-agent/conf.d/powerdns_recursor.d
/etc/datadog-agent/conf.d/ecs_fargate.d
/etc/datadog-agent/conf.d/eks_fargate.d
/etc/datadog-agent/conf.d/process.d
/etc/datadog-agent/conf.d/prometheus.d
/etc/datadog-agent/conf.d/envoy.d
/etc/datadog-agent/conf.d/proxysql.d
/etc/datadog-agent/conf.d/esxi.d
/etc/datadog-agent/conf.d/pulsar.d
/etc/datadog-agent/conf.d/ray.d
/etc/datadog-agent/conf.d/file_handle.d
/etc/datadog-agent/conf.d/flink.d
/etc/datadog-agent/conf.d/rethinkdb.d
/etc/datadog-agent/conf.d/fluentd.d
/etc/datadog-agent/conf.d/fluxcd.d
/etc/datadog-agent/conf.d/riakcs.d
/etc/datadog-agent/conf.d/fly_io.d
/etc/datadog-agent/conf.d/sap_hana.d
/etc/datadog-agent/conf.d/foundationdb.d
/etc/datadog-agent/conf.d/sbom.d
/etc/datadog-agent/conf.d/gearmand.d
/etc/datadog-agent/conf.d/scylla.d
/etc/datadog-agent/conf.d/gitlab.d
/etc/datadog-agent/conf.d/service_discovery.d
/etc/datadog-agent/conf.d/gitlab_runner.d
/etc/datadog-agent/conf.d/sidekiq.d
/etc/datadog-agent/conf.d/glusterfs.d
/etc/datadog-agent/conf.d/silk.d
/etc/datadog-agent/conf.d/go_expvar.d
/etc/datadog-agent/conf.d/singlestore.d
/etc/datadog-agent/conf.d/gunicorn.d
/etc/datadog-agent/conf.d/slurm.d
/etc/datadog-agent/conf.d/haproxy.d
/etc/datadog-agent/conf.d/snmp.d/default_profiles
/etc/datadog-agent/conf.d/snmp.d/profiles
/etc/datadog-agent/conf.d/snmp.d/traps_db
/etc/datadog-agent/conf.d/hazelcast.d
/etc/datadog-agent/conf.d/hdfs_datanode.d
/etc/datadog-agent/conf.d/hdfs_namenode.d
/etc/datadog-agent/conf.d/snowflake.d
/etc/datadog-agent/conf.d/hive.d
/etc/datadog-agent/conf.d/solr.d
/etc/datadog-agent/conf.d/hivemq.d
/etc/datadog-agent/conf.d/sonarqube.d
/etc/datadog-agent/conf.d/spark.d
/etc/datadog-agent/conf.d/http_check.d
/etc/datadog-agent/conf.d/hudi.d
/etc/datadog-agent/conf.d/sqlserver.d
/etc/datadog-agent/conf.d/ibm_ace.d
/etc/datadog-agent/conf.d/squid.d
/etc/datadog-agent/conf.d/ibm_db2.d
/etc/datadog-agent/conf.d/ssh_check.d
/etc/datadog-agent/conf.d/ibm_i.d
/etc/datadog-agent/conf.d/statsd.d
/etc/datadog-agent/conf.d/ibm_mq.d
/etc/datadog-agent/conf.d/strimzi.d
/etc/datadog-agent/conf.d/ibm_was.d
/etc/datadog-agent/conf.d/supervisord.d
/etc/datadog-agent/conf.d/ignite.d
/etc/datadog-agent/conf.d/suricata.d
/etc/datadog-agent/conf.d/impala.d
/etc/datadog-agent/conf.d/system_core.d
/etc/datadog-agent/conf.d/io.d
/etc/datadog-agent/conf.d/system_swap.d
/etc/datadog-agent/conf.d/systemd.d
/etc/datadog-agent/conf.d/jboss_wildfly.d
/etc/datadog-agent/conf.d/tcp_check.d
/etc/datadog-agent/conf.d/jetson.d
/etc/datadog-agent/conf.d/tcp_queue_length.d
/etc/datadog-agent/conf.d/jmx.d
/etc/datadog-agent/conf.d/teamcity.d
/etc/datadog-agent/conf.d/journald.d
/etc/datadog-agent/conf.d/tekton.d
/etc/datadog-agent/conf.d/kafka.d
/etc/datadog-agent/conf.d/telemetry.d
/etc/datadog-agent/conf.d/kafka_consumer.d
/etc/datadog-agent/conf.d/teleport.d
/etc/datadog-agent/conf.d/karpenter.d
/etc/datadog-agent/conf.d/temporal.d
/etc/datadog-agent/conf.d/kong.d
/etc/datadog-agent/conf.d/tenable.d
/etc/datadog-agent/conf.d/teradata.d
/etc/datadog-agent/conf.d/tibco_ems.d
/etc/datadog-agent/conf.d/tls.d
/etc/datadog-agent/conf.d/kube_metrics_server.d
/etc/datadog-agent/conf.d/activemq.d
/etc/datadog-agent/conf.d/kube_proxy.d
/etc/datadog-agent/conf.d/torchserve.d
/etc/datadog-agent/conf.d/activemq_xml.d
/etc/datadog-agent/conf.d/traefik_mesh.d
/etc/datadog-agent/conf.d/aerospike.d
/etc/datadog-agent/conf.d/kubeflow.d
/etc/datadog-agent/conf.d/traffic_server.d
/etc/datadog-agent/conf.d/airflow.d
/etc/datadog-agent/conf.d/kubelet.d
/etc/datadog-agent/conf.d/twemproxy.d
/etc/datadog-agent/conf.d/amazon_msk.d
/etc/datadog-agent/conf.d/kubernetes_apiserver.d
/etc/datadog-agent/conf.d/twistlock.d
/etc/datadog-agent/conf.d/ambari.d
/etc/datadog-agent/conf.d/kubernetes_cluster_autoscaler.d
/etc/datadog-agent/conf.d/uptime.d
/etc/datadog-agent/conf.d/varnish.d
/etc/datadog-agent/conf.d/appgate_sdp.d
/etc/datadog-agent/conf.d/kubevirt_api.d
/etc/datadog-agent/conf.d/vault.d
/etc/datadog-agent/conf.d/arangodb.d
/etc/datadog-agent/conf.d/kubevirt_controller.d
/etc/datadog-agent/conf.d/vertica.d
/etc/datadog-agent/conf.d/argo_rollouts.d
/etc/datadog-agent/conf.d/kubevirt_handler.d
/etc/datadog-agent/conf.d/vllm.d
/etc/datadog-agent/conf.d/argo_workflows.d
/etc/datadog-agent/conf.d/voltdb.d

I'm not sure which of these are also enabled by default, but it's worth knowing.
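The find output above also lists conf.d itself and the nested snmp.d subdirectories, which are not integrations. A narrower check could look something like the sketch below. It only considers top-level *.d directories. The function name missing_auto_conf is made up for illustration, and the default path is simply the one used in this issue:

```shell
# Hypothetical helper (not part of the agent): print each top-level
# integration directory (*.d) under the given conf.d path that has no
# auto_conf.yaml. Defaults to the conf.d path shown in this issue.
missing_auto_conf() {
  conf_dir="${1:-/etc/datadog-agent/conf.d}"
  for dir in "$conf_dir"/*.d; do
    # Skip non-directories, including the unexpanded glob when nothing matches.
    [ -d "$dir" ] || continue
    # Report the directory only when auto_conf.yaml is absent.
    [ -e "$dir/auto_conf.yaml" ] || printf '%s\n' "$dir"
  done
}
```

Running missing_auto_conf inside the agent container (optionally passing another conf.d path as the first argument) should print one directory per line. It still says nothing about which of those integrations are enabled by default.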
