otel-integration: add ebpf agent subchart #473

Merged
merged 5 commits into from
Dec 4, 2024
4 changes: 4 additions & 0 deletions otel-integration/CHANGELOG.md
@@ -2,6 +2,10 @@

## OpenTelemetry-Integration

### v0.0.118 / 2024-12-04

- [Feat] Add ebpf tracing agent subchart.

### v0.0.117 / 2024-12-03

- [Feat] Adding new configs to the Target Allocator.
7 changes: 6 additions & 1 deletion otel-integration/k8s-helm/Chart.yaml
@@ -1,7 +1,7 @@
apiVersion: v2
name: otel-integration
description: OpenTelemetry Integration
-version: 0.0.117
+version: 0.0.118
keywords:
  - OpenTelemetry Collector
  - OpenTelemetry Agent
@@ -34,6 +34,11 @@ dependencies:
    version: "0.98.6"
    repository: https://cgx.jfrog.io/artifactory/coralogix-charts-virtual
    condition: opentelemetry-gateway.enabled
  - name: coralogix-ebpf-agent
    alias: coralogix-ebpf-agent
    version: "0.1.4"
    repository: https://cgx.jfrog.io/artifactory/coralogix-charts
    condition: coralogix-ebpf-agent.enabled
sources:
  - https://github.com/coralogix/opentelemetry-helm-charts/tree/main/charts/opentelemetry-collector
maintainers:
48 changes: 48 additions & 0 deletions otel-integration/k8s-helm/README.md
@@ -125,6 +125,17 @@ Provides information about Kubernetes version.
- container_cpu_cfs_periods_total
- container_cpu_cfs_throttled_periods_total

## Coralogix EBPF Agent

coralogix-ebpf-agent is an agent developed by Coralogix that uses [eBPF](https://ebpf.io/what-is-ebpf/) to extract network traffic as spans (HTTP requests, SQL traffic, etc.), enabling [Coralogix APM](https://coralogix.com/docs/user-guides/apm/getting-started/introduction-to-apm/) capabilities without any service instrumentation.

Components:
- coralogix-ebpf-agent - The agent that extracts network traffic as spans, running as a DaemonSet.
- k8s-watcher - The agent that watches for changes in Kubernetes resources and publishes them to Redis Pub/Sub for coralogix-ebpf-agent to consume, running as a Deployment with 1 replica.
- redis - Redis Pub/Sub is used for communication between k8s-watcher and coralogix-ebpf-agent, running as a StatefulSet with 1 replica.

To enable the coralogix-ebpf-agent deployment, set `coralogix-ebpf-agent.enabled` to `true` in the `values.yaml` file.

# Prerequisites

Make sure you have at least these version of the following installed:
@@ -404,6 +415,43 @@ helm upgrade --install otel-coralogix-integration coralogix-charts-virtual/otel-
--render-subchart-notes -f gke-autopilot-values.yaml --set global.clusterName=<cluster_name> --set global.domain=<domain>
```

### Enabling Coralogix EBPF Agent

To enable the Coralogix EBPF agent, set `coralogix-ebpf-agent.enabled` to `true` in the `values.yaml` file.
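
As a minimal sketch, the override can also live in its own values file and be passed to Helm with `-f`, the same way `gke-autopilot-values.yaml` is passed in the command above. The file name `ebpf-values.yaml` is an assumption used only for illustration; the key itself comes from the chart's `values.yaml`.

```yaml
# ebpf-values.yaml (hypothetical file name, not part of the chart)
# Turns on the coralogix-ebpf-agent subchart; every other setting keeps its chart default.
coralogix-ebpf-agent:
  enabled: true
```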

#### Filtering Specific Services For Coralogix EBPF Agent

By default, the coralogix-ebpf-agent collects traffic from all services in the cluster.
To collect traffic only from specific services, or to exclude specific services, use the
`coralogix-ebpf-agent.ebpf_agent.sampler` parameter in `values.yaml` to change the service filtering behavior.

For example, collect only traffic coming from `carts-service` and `orders-service`:

```yaml
coralogix-ebpf-agent:
  enabled: true
  ebpf_agent:
    sampler:
      services_filter: ["carts-service", "orders-service"]
      services_filter_type: "Allow"
```

As another example, to collect traffic from all services except `currency-service`:

```yaml
coralogix-ebpf-agent:
  enabled: true
  ebpf_agent:
    sampler:
      services_filter: ["currency-service"]
      services_filter_type: "Deny"
```

#### What Is Considered A Service By Coralogix EBPF Agent?

A service is defined by the top-level owner of the container that performed the network request, in most cases a Deployment, StatefulSet, DaemonSet, or CronJob.
The name of the service is the name of that owner resource.
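
To illustrate the naming rule, here is a sketch of a Deployment manifest. The resource name, labels, and image are all hypothetical. The pods are owned by a ReplicaSet, which is in turn owned by the Deployment, so under the rule described above, requests from these containers would be attributed to the service `carts-service`.

```yaml
# Hypothetical workload used only to illustrate service naming.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: carts-service          # top-level owner name, expected to be used as the service name
spec:
  replicas: 2
  selector:
    matchLabels:
      app: carts
  template:
    metadata:
      labels:
        app: carts             # per the rule above, the owner name is used, not this label
    spec:
      containers:
        - name: carts
          image: example.com/carts:1.0.0   # placeholder image
```

This is the name that would go into `services_filter` when allowing or denying this workload.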

# How to use it

## Available Endpoints
3 changes: 2 additions & 1 deletion otel-integration/k8s-helm/central-agent-values.yaml
@@ -34,4 +34,5 @@ opentelemetry-agent-windows:
  enabled: false
opentelemetry-gateway:
  enabled: false

coralogix-ebpf-agent:
  enabled: false
2 changes: 2 additions & 0 deletions otel-integration/k8s-helm/central-tail-sampling-values.yaml
@@ -98,3 +98,5 @@ opentelemetry-cluster-collector:
  enabled: false
opentelemetry-agent-windows:
  enabled: false
coralogix-ebpf-agent:
  enabled: false
2 changes: 2 additions & 0 deletions otel-integration/k8s-helm/ci/tail-sampling-values.yaml
@@ -53,3 +53,5 @@ opentelemetry-cluster-collector:
  enabled: true
opentelemetry-agent-windows:
  enabled: false
coralogix-ebpf-agent:
  enabled: false
5 changes: 4 additions & 1 deletion otel-integration/k8s-helm/gke-autopilot-values.yaml
@@ -47,7 +47,7 @@ opentelemetry-agent:
- statsd
resources:
# slighly larger resource requests for gke/autopilot
-# since it was droping data
+# since it was dropping data
requests:
cpu: 100m
memory: 256Mi
@@ -71,3 +71,6 @@ opentelemetry-cluster-collector:

opentelemetry-agent-windows:
  enabled: false

coralogix-ebpf-agent:
  enabled: false
4 changes: 4 additions & 0 deletions otel-integration/k8s-helm/tail-sampling-values.yaml
@@ -58,5 +58,9 @@ opentelemetry-gateway:

opentelemetry-cluster-collector:
  enabled: true

opentelemetry-agent-windows:
  enabled: false

coralogix-ebpf-agent:
  enabled: false
3 changes: 3 additions & 0 deletions otel-integration/k8s-helm/values-cluster-ksm.yaml
@@ -38,3 +38,6 @@ opentelemetry-cluster-collector:
- prometheus
- k8s_cluster
- prometheus/ksm

coralogix-ebpf-agent:
  enabled: false
4 changes: 3 additions & 1 deletion otel-integration/k8s-helm/values-crd-override.yaml
@@ -19,4 +19,6 @@ opentelemetry-cluster-collector:
generate: true
configMap:
create: false


coralogix-ebpf-agent:
  enabled: false
3 changes: 3 additions & 0 deletions otel-integration/k8s-helm/values-windows-tailsampling.yaml
@@ -338,3 +338,6 @@ opentelemetry-agent:
opentelemetry-cluster-collector:
  nodeSelector:
    kubernetes.io/os: linux

coralogix-ebpf-agent:
  enabled: false
3 changes: 3 additions & 0 deletions otel-integration/k8s-helm/values-windows.yaml
@@ -361,3 +361,6 @@ opentelemetry-agent:
opentelemetry-cluster-collector:
  nodeSelector:
    kubernetes.io/os: linux

coralogix-ebpf-agent:
  enabled: false
37 changes: 36 additions & 1 deletion otel-integration/k8s-helm/values.yaml
@@ -5,7 +5,7 @@ global:
defaultSubsystemName: "integration"
logLevel: "warn"
collectionInterval: "30s"
-version: "0.0.117"
+version: "0.0.118"

extensions:
kubernetesDashboard:
@@ -1113,3 +1113,38 @@ opentelemetry-receiver:
enabled: false
zipkin:
enabled: false

coralogix-ebpf-agent:
  enabled: false
  ebpf_agent:
    debug: false
    debug_modules:
      - Otel
    otel:
      exporter:
        max_queue_size: 10240
        max_concurrent_exports: 3
    sampler:
      services_filter: []
      services_filter_type: "Allow" # Deny for blacklist
    resources:
      limits:
        cpu: "1"
        memory: 2Gi
      requests:
        cpu: 100m
        memory: 128Mi
  k8s_watcher:
    replicaCount: 1
    debug: false
    resources:
      limits:
        cpu: "1"
        memory: 2Gi
      requests:
        cpu: 100m
        memory: 128Mi
  priorityClass:
    create: false
    name: ""
    value: 1000000000