Error in Compactor Parsing Minio Host Causes Deployment Failure in Helm Loki Distributed Mode #13354

puffinjiang opened this issue Jun 29, 2024

Describe the bug

When deploying Loki in distributed mode with Helm, using MinIO for storage, the compactor connects to MinIO with an incorrect host, which leaves the service unavailable.

To Reproduce
Steps to reproduce the behavior:

  1. Create the values.yaml:
loki:
  schemaConfig:
    configs:
      - from: 2024-04-01
        store: tsdb
        object_store: s3
        schema: v13
        index:
          prefix: loki_index_
          period: 24h

  ingester:
    chunk_encoding: snappy
  tracing:
    enabled: true
  querier:
    max_concurrent: 4

  storage:
    type: s3
    bucketNames:
      chunks: "loki-chunks"
      ruler: "loki-ruler"
      # admin: "loki-admin"
    s3:
      s3: http://XXXXXX:[email protected]:9000/loki-data
      endpoint: http://192.168.199.190:9000
      secretAccessKey: XXXXXX
      accessKeyId: XXXXXX
      s3ForcePathStyle: false
      insecure: true
     
deploymentMode: Distributed

minio:
  enabled: false

ingester:
  replicas: 3
querier:
  replicas: 3
  maxUnavailable: 2
queryFrontend:
  replicas: 2
  maxUnavailable: 1
queryScheduler:
  replicas: 2
distributor:
  replicas: 3
  maxUnavailable: 2
compactor:
  replicas: 1
indexGateway:
  replicas: 2
  maxUnavailable: 1

bloomCompactor:
  replicas: 0
bloomGateway:
  replicas: 0

backend:
  replicas: 0
read:
  replicas: 0
write:
  replicas: 0

singleBinary:
  replicas: 0
  2. Use this command to deploy the service:
helm upgrade --values values.yaml --install loki grafana/loki -n monitor
  3. Checking the pods, I found that some services were always stuck in the Pending state (see the diagnostic sketch after the compactor logs):
kubectl get pod -n monitor

output:

NAME                                                     READY   STATUS    RESTARTS        AGE
loki-distributor-65978865f4-kwqnd                        0/1     Pending   0               5h28m
loki-ingester-zone-b-0                                   0/1     Pending   0               5h28m
loki-ingester-zone-c-0                                   0/1     Pending   0               5h28m
loki-querier-6c976695f6-vvgkp                            0/1     Pending   0               5h28m
loki-results-cache-0                                     2/2     Running   0               5h28m
loki-chunks-cache-0                                      2/2     Running   0               5h28m
loki-index-gateway-1                                     0/1     Pending   0               5h27m
loki-distributor-ccb7dcdb6-km4lm                         0/1     Pending   0               4h29m
loki-querier-78ff88888d-8b7sf                            0/1     Pending   0               4h29m
loki-query-scheduler-56f7c9c4c9-mjhq2                    1/1     Running   0               5h28m
loki-ingester-zone-a-0                                   1/1     Running   0               5h28m
loki-querier-6c976695f6-z8r8g                            1/1     Running   0               5h28m
loki-distributor-65978865f4-8zwlt                        1/1     Running   0               5h28m
loki-canary-th5bz                                        1/1     Running   0               5h28m
loki-index-gateway-0                                     1/1     Running   0               5h28m
loki-gateway-b858d5c76-tvfjs                             1/1     Running   0               5h28m
loki-query-scheduler-56f7c9c4c9-zs4nl                    0/1     Pending   0               13m
loki-query-frontend-56849c945f-52gz2                     1/1     Running   0               5h28m
loki-query-frontend-56849c945f-57mfr                     0/1     Pending   0               13m
loki-compactor-0                                         1/1     Running   0               13m
  4. View the compactor logs:
kubectl logs -f loki-compactor-0  -n monitor

output:

level=info ts=2024-06-29T15:50:04.039508293Z caller=main.go:120 msg="Starting Loki" version="(version=3.0.0, branch=HEAD, revision=b4f7181c7a)"
level=info ts=2024-06-29T15:50:04.041378725Z caller=server.go:354 msg="server listening on addresses" http=[::]:3100 grpc=[::]:9095
level=info ts=2024-06-29T15:50:04.042218001Z caller=memberlist_client.go:435 msg="Using memberlist cluster label and node name" cluster_label= node=loki-compactor-0-c467c7ae
level=info ts=2024-06-29T15:50:04.044342285Z caller=memberlist_client.go:541 msg="memberlist fast-join starting" nodes_found=1 to_join=4
level=info ts=2024-06-29T15:50:04.046026381Z caller=module_service.go:82 msg=starting module=analytics
level=info ts=2024-06-29T15:50:04.046662569Z caller=module_service.go:82 msg=starting module=runtime-config
level=info ts=2024-06-29T15:50:04.046885497Z caller=module_service.go:82 msg=starting module=server
level=info ts=2024-06-29T15:50:04.04703965Z caller=module_service.go:82 msg=starting module=memberlist-kv
level=info ts=2024-06-29T15:50:04.047068014Z caller=module_service.go:82 msg=starting module=compactor
level=info ts=2024-06-29T15:50:04.05019551Z caller=memberlist_client.go:561 msg="memberlist fast-join finished" joined_nodes=3 elapsed_time=5.855732ms
level=info ts=2024-06-29T15:50:04.050227681Z caller=memberlist_client.go:573 phase=startup msg="joining memberlist cluster" join_members=loki-memberlist
level=info ts=2024-06-29T15:50:04.050247858Z caller=basic_lifecycler.go:297 msg="instance not found in the ring" instance=loki-compactor-0 ring=compactor
level=info ts=2024-06-29T15:50:04.050262251Z caller=basic_lifecycler_delegates.go:63 msg="not loading tokens from file, tokens file path is empty"
level=info ts=2024-06-29T15:50:04.050412048Z caller=compactor.go:410 msg="waiting until compactor is JOINING in the ring"
level=info ts=2024-06-29T15:50:04.050441363Z caller=compactor.go:414 msg="compactor is JOINING in the ring"
level=info ts=2024-06-29T15:50:04.05380835Z caller=memberlist_client.go:580 phase=startup msg="joining memberlist cluster succeeded" reached_nodes=3 elapsed_time=3.570155ms
level=info ts=2024-06-29T15:50:05.051261002Z caller=compactor.go:424 msg="waiting until compactor is ACTIVE in the ring"
level=info ts=2024-06-29T15:50:05.225567068Z caller=compactor.go:428 msg="compactor is ACTIVE in the ring"
level=info ts=2024-06-29T15:50:05.225613034Z caller=loki.go:503 msg="Loki started" startup_time=1.200228407s
level=info ts=2024-06-29T15:50:10.22625152Z caller=compactor.go:489 msg="this instance has been chosen to run the compactor, starting compactor"
level=info ts=2024-06-29T15:50:10.226308266Z caller=compactor.go:518 msg="waiting 10m0s for ring to stay stable and previous compactions to finish before starting compactor"
level=error ts=2024-06-29T15:50:41.340839447Z caller=reporter.go:205 msg="failed to delete corrupted cluster seed file, deleting it" err="RequestError: send request failed\ncaused by: Delete \"http://loki-chunks.192.168.199.190:9000/loki_cluster_seed.json\": dial tcp: lookup loki-chunks.192.168.199.190 on 10.43.0.10:53: no such host"
level=error ts=2024-06-29T15:53:58.263301672Z caller=reporter.go:205 msg="failed to delete corrupted cluster seed file, deleting it" err="RequestError: send request failed\ncaused by: Delete \"http://loki-chunks.192.168.199.190:9000/loki_cluster_seed.json\": dial tcp: lookup loki-chunks.192.168.199.190 on 10.43.0.10:53: no such host"
level=error ts=2024-06-29T15:57:58.750344139Z caller=reporter.go:205 msg="failed to delete corrupted cluster seed file, deleting it" err="RequestError: send request failed\ncaused by: Delete \"http://loki-chunks.192.168.199.190:9000/loki_cluster_seed.json\": dial tcp: lookup loki-chunks.192.168.199.190 on 10.43.0.10:53: no such host"
level=info ts=2024-06-29T16:00:10.227246981Z caller=compactor.go:523 msg="compactor startup delay completed"
ts=2024-06-29T16:00:10.227316863Z caller=spanlogger.go:109 level=info msg="building table names cache"
ts=2024-06-29T16:00:10.248310633Z caller=spanlogger.go:109 level=info msg="table names cache built" duration=20.938542ms
level=error ts=2024-06-29T16:00:10.248335317Z caller=cached_client.go:189 msg="failed to build table names cache" err="RequestError: send request failed\ncaused by: Get \"http://loki-chunks.192.168.199.190:9000/?delimiter=%2F&list-type=2&prefix=index%2F\": dial tcp: lookup loki-chunks.192.168.199.190 on 10.43.0.10:53: no such host"
ts=2024-06-29T16:00:10.248358216Z caller=spanlogger.go:109 level=info msg="building table names cache"
ts=2024-06-29T16:00:10.264549616Z caller=spanlogger.go:109 level=info msg="table names cache built" duration=16.185ms
level=error ts=2024-06-29T16:00:10.264575614Z caller=cached_client.go:189 msg="failed to build table names cache" err="RequestError: send request failed\ncaused by: Get \"http://loki-chunks.192.168.199.190:9000/?delimiter=%2F&list-type=2&prefix=index%2F\": dial tcp: lookup loki-chunks.192.168.199.190 on 10.43.0.10:53: no such host"
ts=2024-06-29T16:00:10.264598303Z caller=spanlogger.go:109 level=info msg="building table names cache"
ts=2024-06-29T16:00:10.28709128Z caller=spanlogger.go:109 level=info msg="table names cache built" duration=22.486864ms
level=error ts=2024-06-29T16:00:10.287113904Z caller=cached_client.go:189 msg="failed to build table names cache" err="RequestError: send request failed\ncaused by: Get \"http://loki-chunks.192.168.199.190:9000/?delimiter=%2F&list-type=2&prefix=index%2F\": dial tcp: lookup loki-chunks.192.168.199.190 on 10.43.0.10:53: no such host"
level=error ts=2024-06-29T16:00:10.287159755Z caller=compactor.go:530 msg="failed to run compaction" err="failed to list tables: RequestError: send request failed\ncaused by: Get \"http://loki-chunks.192.168.199.190:9000/?delimiter=%2F&list-type=2&prefix=index%2F\": dial tcp: lookup loki-chunks.192.168.199.190 on 10.43.0.10:53: no such host"
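
As noted in step 3, several pods sit in Pending. Pending means the scheduler has not placed the pod yet, so the reason shows up in pod events rather than in container logs. A generic way to inspect this, using one of the pod names from the listing above:

kubectl describe pod loki-ingester-zone-b-0 -n monitor
kubectl get events -n monitor --sort-by=.lastTimestamp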

Why is the host loki-chunks.192.168.199.190 instead of 192.168.199.190?
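
The bucket-prefixed host looks like virtual-hosted-style S3 addressing, where the AWS SDK prepends the bucket name to the endpoint host whenever path-style addressing is disabled. Below is a minimal sketch of the same storage section with path-style forced on, assuming the chart passes s3ForcePathStyle straight through to Loki's S3 client (endpoint and credentials are the placeholders from the values above):

loki:
  storage:
    type: s3
    bucketNames:
      chunks: "loki-chunks"
      ruler: "loki-ruler"
    s3:
      endpoint: http://192.168.199.190:9000
      secretAccessKey: XXXXXX
      accessKeyId: XXXXXX
      # Path-style requests go to http://192.168.199.190:9000/loki-chunks/...
      # instead of http://loki-chunks.192.168.199.190:9000/...
      s3ForcePathStyle: true
      insecure: true

MinIO normally expects path-style requests unless a bucket DNS domain has been configured for it, and the rendered Loki configuration can be inspected with helm template loki grafana/loki --values values.yaml to see what ends up under storage_config.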

Expected behavior

The compactor can connect to MinIO, and all services run normally.

Environment:

  • OS: Debian GNU/Linux 12 (bookworm)
  • k3s: v1.29.3+k3s1 (8aecc26b)
  • Helm: v3.15.2
  • Loki chart: 6.6.4

Screenshots, Promtail config, or terminal output
N/A
