Skip to content

Home Eng

chengshiwen edited this page Oct 9, 2024 · 89 revisions

InfluxDB Cluster Wiki Document

CN doc EN doc LICENSE Releases GitHub stars Docker pulls

InfluxDB Cluster - An Open-Source Distributed Time Series Database, Open Source Alternative to InfluxDB Enterprise

🎉🎉 InfluxDB Cluster has now become an officially recommended solution by Huawei Cloud, please refer to Huawei Cloud Highly Available InfluxDB-Cluster Solution and Huawei Cloud HAIC Solution Documentation

TOC

Introduction

InfluxDB Cluster is an open source time series database with no external dependencies. It's useful for recording metrics, events, and performing analytics.

InfluxDB Cluster is inspired by InfluxDB Enterprise, InfluxDB v1.8.10 and InfluxDB v0.11.1, aiming to replace InfluxDB Enterprise.

InfluxDB Cluster is easy to maintain, and can be updated in real time with upstream InfluxDB 1.x.

Features

  • Built-in HTTP API so you don't have to write any server side code to get up and running.
  • Data can be tagged, allowing very flexible querying.
  • SQL-like query language.
  • Clustering is supported out of the box, so that you can scale horizontally to handle your data. Clustering is currently in production state.
  • Simple to install and manage, and fast to get data in and out.
  • It aims to answer queries in real-time. That means every data point is indexed as it comes in and is immediately available in queries that should return in < 100ms.

Architecture

An InfluxDB Cluster installation consists of two separate groups of processes: data nodes and meta nodes. Communication within a cluster looks like this:

architecture.png

Network Architecture:

network-architecture

The meta nodes communicate with each other via a TCP protocol and the Raft consensus protocol that all use port 8089 by default. This port must be reachable between the meta nodes. The meta nodes also expose an HTTP API bound to port 8091 by default that the influxd-ctl command uses.

Data nodes communicate with each other through a TCP protocol that is bound to port 8088. Data nodes communicate with the meta nodes through their HTTP API bound to 8091. These ports must be reachable between the meta and data nodes.

Within a cluster, all meta nodes must communicate with all other meta nodes. All data nodes must communicate with all other data nodes and all meta nodes.

Concepts

Clustering

Please refer to: Clustering. Especially, look out for

Glossary

Please refer to: Glossary. Especially, look out for

Design & Layout

Please refer to:

Docker Quickstart

Start cluster by docker compose

Download docker-compose.yml, then start 3 meta nodes and 2 data nodes by docker-compose:

docker-compose up -d
docker exec -it influxdb-meta-01 bash
influxd-ctl add-meta influxdb-meta-01:8091
influxd-ctl add-meta influxdb-meta-02:8091
influxd-ctl add-meta influxdb-meta-03:8091
influxd-ctl add-data influxdb-data-01:8088
influxd-ctl add-data influxdb-data-02:8088
influxd-ctl show

Stop and remove them when they are no longer in use:

docker-compose down -v

Note: To persist data in containers, be sure to mount the directories /etc/influxdb and /var/lib/influxdb for all meta and data nodes.

Create your first database

curl -XPOST "http://influxdb-data-01:8086/query" --data-urlencode "q=CREATE DATABASE mydb WITH REPLICATION 2"

Insert some data

curl -XPOST "http://influxdb-data-01:8086/write?db=mydb" \
-d 'cpu,host=server01,region=uswest load=42 1434055562000000000'

curl -XPOST "http://influxdb-data-02:8086/write?db=mydb&consistency=all" \
-d 'cpu,host=server02,region=uswest load=78 1434055562000000000'

curl -XPOST "http://influxdb-data-02:8086/write?db=mydb&consistency=quorum" \
-d 'cpu,host=server03,region=useast load=15.4 1434055562000000000'

Note: consistency=[any,one,quorum,all] sets the write consistency for the point. consistency is one if you do not specify consistency. See the Write consistency for detailed descriptions of each consistency option.

any: return success to the client as soon as any node has responded with a write success, or the receiving node has written the data to its hinted handoff queue.

one: return success to the client as soon as any node has responded with a write success, but not if the write is only in hinted handoff.

quorum: return success when a majority of nodes (more than half of replication factor) return success. This option is only useful if the replication factor is greater than 2, otherwise it is equivalent to all.

all: return success only when all nodes return success.

Query for the data

curl -G "http://influxdb-data-02:8086/query?pretty=true" --data-urlencode "db=mydb" \
--data-urlencode "q=SELECT * FROM cpu WHERE host='server01' AND time < now() - 1d"

Analyze the data

curl -G "http://influxdb-data-02:8086/query?pretty=true" --data-urlencode "db=mydb" \
--data-urlencode "q=SELECT mean(load) FROM cpu WHERE region='uswest'"

Kubernetes & Helm Chart

Visit: https://github.com/influxtsdb/helm-charts/tree/master/charts/influxdb-cluster

Download InfluxDB Cluster Helm chart, then run:

helm install influxdb-cluster ./influxdb-cluster

A release named influxdb-cluster will be started.

Note: To persist data in containers, be sure to mount the directory /var/lib/influxdb to PVCs for all meta and data nodes.

Installation

We recommend installing InfluxDB Cluster using one of the pre-built releases.

Complete the following steps to install an InfluxDB Cluster in your own environment:

Note: The installation of InfluxDB Cluster is exactly the same as that of InfluxDB Enterprise, please refer to Install an InfluxDB Enterprise cluster.

Meta node setup

0. Setup description and requirements

The Production Installation process sets up three meta nodes, with each meta node running on its own server.

InfluxDB Cluster requires at least three meta nodes and an odd number of meta nodes for high availability and redundancy.

Note 1: InfluxDB Cluster does not recommend having more than three meta nodes unless your servers or the communication between the servers have chronic reliability issues.

Note 2: Deploying multiple meta nodes on the same server is strongly discouraged since it creates a larger point of potential failure if that particular server is unresponsive. InfluxDB Cluster recommends deploying meta nodes on relatively small footprint servers.

Note 3: To start the cluster with a single meta node, pass the -single-server flag when starting the single meta node.

Suppose there are three servers: influxdb-meta-01, influxdb-meta-02 and influxdb-meta-03.

Ports: Meta nodes communicate over ports 8088, 8089, and 8091.

1. Add appropriate DNS entries for each of your servers

Note: If you only want to use the IP address instead of the hostname, skip the current step and go to step 2.

Ensure that your servers' hostnames and IP addresses are added to your network’s DNS environment.

Verification steps:

Before proceeding with the installation, verify on each meta and data server that the other servers are resolvable. Here is an example set of shell commands using ping:

ping -qc 1 influxdb-meta-01
ping -qc 1 influxdb-meta-02
ping -qc 1 influxdb-meta-03

2. Edit the configuration file

In /etc/influxdb/influxdb-meta.conf:

  • Uncomment hostname and set to the full hostname of the meta node.
hostname="influxdb-meta-0x"

Note: If you only want to use the IP address instead of the hostname, you must set hostname to an IP address.

3. Start the meta services

Start the meta service on server influxdb-meta-01, influxdb-meta-02 and influxdb-meta-03 respectively

/usr/bin/influxd-meta -config /etc/influxdb/influxdb-meta.conf

4. Join the meta nodes to the cluster

From one and only one meta node, join all meta nodes including itself. In our example, from influxdb-meta-01, run:

influxd-ctl add-meta influxdb-meta-01:8091
influxd-ctl add-meta influxdb-meta-02:8091
influxd-ctl add-meta influxdb-meta-03:8091

Note: If you only want to use the IP address instead of the hostname, then you should add the -bind option to specify the binding HTTP address of the meta node to be connected (the default value is localhost:8091), execute:

influxd-ctl -bind meta-01-ip:8091 add-meta meta-01-ip:8091
influxd-ctl -bind meta-01-ip:8091 add-meta meta-02-ip:8091
influxd-ctl -bind meta-01-ip:8091 add-meta meta-03-ip:8091
influxd-ctl -bind meta-01-ip:8091 show

The expected output is:

Added meta node x at influxdb-meta-0x:8091

Verification steps:

Issue the following command on any meta node:

influxd-ctl show

The expected output is:

Data Nodes
==========
ID  TCP Address  Version

Meta Nodes
==========
ID  TCP Address            Version
1   influxdb-meta-01:8091  1.8.11-c1.2.0
2   influxdb-meta-02:8091  1.8.11-c1.2.0
3   influxdb-meta-03:8091  1.8.11-c1.2.0

Data node setup

0. Setup description and requirements

The Production Installation process sets up two data nodes and each data node runs on its own server.

InfluxDB Cluster requires at least two data nodes for high availability and redundancy.

Note 1: that there is no requirement for each data node to run on its own server. However, best practices are to deploy each data node on a dedicated server.

Note 2: InfluxDB Cluster does not function as a load balancer. You will need to configure your own load balancer to send client traffic to the data nodes on port 8086 (the default port for the HTTP API).

Suppose there are two servers: influxdb-data-01 and influxdb-data-02.

Ports: Data nodes communicate over ports 8088, 8089, and 8091.

1. Add appropriate DNS entries for each of your servers

Note: If you only want to use the IP address instead of the hostname, skip the current step and go to step 2.

Ensure that your servers' hostnames and IP addresses are added to your network’s DNS environment.

Verification steps:

Before proceeding with the installation, verify on each meta and data server that the other servers are resolvable. Here is an example set of shell commands using ping:

ping -qc 1 influxdb-data-01
ping -qc 1 influxdb-data-02

2. Edit the configuration file

In /etc/influxdb/influxdb.conf:

  • Uncomment hostname and set to the full hostname of the data node.
hostname="influxdb-data-0x"

Note: If you only want to use the IP address instead of the hostname, you must set hostname to an IP address.

3. Start the data service

Start the data service on server influxdb-data-01 and influxdb-data-02 respectively

/usr/bin/influxd -config /etc/influxdb/influxdb.conf

Note: It is normal for Failed to create storage, failed to store statistics or meta service unavailable logs to appear before the data node is not joined to the cluster.

4. Join the data nodes to the cluster

You should join your data nodes to the cluster only when you are adding a brand new node, either during the initial creation of your cluster or when growing the number of data nodes. If you are replacing an existing data node with influxd-ctl update-data, skip the rest of this step.

Run the add-data command once and only once for each data node you are joining to the cluster:

influxd-ctl add-data influxdb-data-01:8088
influxd-ctl add-data influxdb-data-02:8088

Note: If you are on a data node, then you should add the -bind option to specify the binding address of the meta node to be connected (the default value is localhost:8091), execute:

influxd-ctl -bind influxdb-meta-01:8091 add-data influxdb-data-01:8088
influxd-ctl -bind influxdb-meta-01:8091 add-data influxdb-data-02:8088
influxd-ctl -bind influxdb-meta-01:8091 show

Note: If you only want to use the IP address instead of the hostname, then you should add the -bind option to specify the binding HTTP address of the meta node to be connected (the default value is localhost:8091), execute:

influxd-ctl -bind meta-01-ip:8091 add-data data-01-ip:8088
influxd-ctl -bind meta-01-ip:8091 add-data data-02-ip:8088
influxd-ctl -bind meta-01-ip:8091 show

The expected output is:

Added data node y at influxdb-data-0x:8088

Verification steps:

Issue the following command on any meta node:

influxd-ctl show

The expected output is:

Data Nodes
==========
ID  TCP Address            Version
4   influxdb-data-01:8088  1.8.11-c1.2.0
5   influxdb-data-02:8088  1.8.11-c1.2.0

Meta Nodes
==========
ID  TCP Address            Version
1   influxdb-meta-01:8091  1.8.11-c1.2.0
2   influxdb-meta-02:8091  1.8.11-c1.2.0
3   influxdb-meta-03:8091  1.8.11-c1.2.0

Configuration

Configure Cluster

Please refer to

Note: The configuration of InfluxDB Cluster is almost the same as that of InfluxDB Enterprise, The only difference is that InfluxDB Cluster uses [coordinator] setting, while InfluxDB Enterprise uses [cluster]

Key Settings

Please refer to Configure Key Settings

Unsupported Settings

Compared with InfluxDB Enterprise, the following configuration settings are not currently supported, and will be gradually supported in the future

Data node:

[monitor]
  remote-collect-interval
[hinted-handoff]
  retry-concurrency
  batch-size
[anti-entropy]
  max-fetch
  max-sync
  auto-repair-missing

Meta node:

[meta]
  ldap-allowed
  consensus-timeout

Query

InfluxQL

Please refer to: Influx Query Language (InfluxQL)

Flux

Please refer to: Flux data scripting language

Write

Please refer to: Write line protocols in InfluxDB

HTTP Endpoint

Data Node HTTP Endpoint

/query HTTP endpoint

Please refer to: /query HTTP endpoint

/write HTTP endpoint

Please refer to: /write HTTP endpoint

/api/v2/query HTTP endpoint

Please refer to: /api/v2/query HTTP endpoint

/api/v2/write HTTP endpoint

Please refer to: /api/v2/write HTTP endpoint

Web UI & GUI

Web UI

GUI

  • InfluxDB Studio: InfluxDB Studio is a desktop UI management tool for the InfluxDB time series database
  • Time Series Admin: Administration panel and querying interface for InfluxDB databases (Electron app / Docker container)
  • InfluxDB WorkBench: InfluxDB WorkBench is a GUI tool for the InfluxDB database written in Java that runs on Windows , Mac OS X and Unix/Linux
  • InfluxDB-GUI: The free InfluxDB GUI provides users with an intuitive and easy-to-use database management and operation experience
  • InfluxGUI: Standalone GUI for InfluxDB for debugging and verify Influx Query
  • InfluxDB GUI: InfluxDB GUI tool developed based on Tauri

Basic Guides

Migrate InfluxDB OSS to InfluxDB Cluster

Please refer to: Migrate InfluxDB OSS instances to InfluxDB Cluster

Write data with the InfluxDB API

Please refer to: Write data with the InfluxDB API

Query data with the InfluxDB API

Please refer to: Query data with the InfluxDB API

Authenticate requests to InfluxDB Cluster

Please refer to: Authenticate requests to InfluxDB Cluster

Downsample and retain data

Please refer to: Downsample and retain data

Hardware sizing guidelines

Please refer to: Hardware sizing guidelines

Calculate percentages in a query

Please refer to: Calculate percentages in a query

Administration Guides

Configure authentication

Please refer to: Configure authentication

Configure HTTPS over TLS

Please refer to: Configure HTTPS over TLS

Configure TCP and UDP ports

Please refer to: Configure TCP and UDP ports used in InfluxDB Cluster

Manage clusters

Note: Limited support, influxd-ctl already supports 13 commands, the remaining 7 commands backup, restore, copy-shard-status, kill-copy-shard, node-labels, entropy, ldap are not yet supported

Please refer to: Manage InfluxDB Cluster

Replace nodes

Please refer to: Replace InfluxDB Cluster meta nodes and data nodes

Rebalance clusters

Please refer to: Rebalance InfluxDB Cluster

Manage users and permissions

Note: Not yet supported

Please refer to: Manage users and permissions

Manage subscriptions

Please refer to: Manage subscriptions in InfluxDB Cluster

Rename hosts

Please refer to: Rename hosts in InfluxDB Cluster

Use Anti-entropy service

Note: Not yet supported

Please refer to: Use Anti-Entropy service in InfluxDB Cluster

Backup and restore

Note: Limited support, backup and restore (influxd-ctl backup, influxd-ctl restore) not yet supported

Please refer to: Back up and restore InfluxDB Cluster

Note: Export and import supported (influx_inspect export, influx -import)

Please refer to: Exporting and importing data

Upgrade

Please refer to: Upgrade InfluxDB Cluster

Log and trace

Please refer to: Log and trace InfluxDB Cluster operations

Tools

Please refer to: InfluxDB Cluster tools (influx, influxd, influxd-ctl, influx_inspect)

Diagnostics

Please refer to: Use InfluxQL for diagnostics

Troubleshoot

Please refer to: Troubleshoot InfluxDB Cluster

Release Notes

Please refer to: InfluxDB Cluster Releases

v1.8.11-c1.2.0

  • Apply important bug fixes from release 1.9 to 1.11
  • Fix known and security issues to improve security and stability

Note: Why does v1.8.11 exist?

Because v1.8.10 is the last version of 1.8 and has been out of maintenance for a long time. v1.8.11 is based on v1.8.10, and applied important bug fixes from 1.9 to 1.11. For details of the commits, see https://github.com/chengshiwen/influxdb/compare/v1.8.10...1.8.11

Development Plans

v1.8.11-c1.2.1

  • Backup and restore: influxd-ctl backup and influxd-ctl restore support
  • Copy shard: influxd-ctl copy-shard-status and influxd-ctl kill-copy-shard support
  • Hinted handoff: retry-concurrency and batch-size support to improve hinted handoff rewrite performance

v1.8.11-c1.3.0

  • Node labels: influxd-ctl node-labels support
  • Anti-Entropy: Anti-Entropy service and influxd-ctl entropy support
  • LDAP: influxd-ctl ldap support
  • User and permission

v1.11.7-c1.4.0

Related Projects

Project Stars Description Comments
chengshiwen/influxdb-cluster GitHub stars Open Source Alternative to InfluxDB Enterprise Almost identical to InfluxDB Enterprise based on the latest InfluxDB 1.8.10, high quality and lean code with significantly fewer bugs, easy to maintain and keep up with the InfluxDB upstream, ready for production
influxdata/influxdb-relay GitHub stars Service to replicate InfluxDB data for high availability Officially produced, but not maintained for a long time, many features are missing, like data sharding is not supported (See Note 1)
shell909090/influx-proxy GitHub stars High availability proxy layer to InfluxDB by InfluxData Ele.me produced, inspired by influxdb-relay, but not maintained for a long time, some features are missing (See Note 1)
chengshiwen/influx-proxy GitHub stars InfluxDB Proxy with High Availability and Consistent Hash Forked from shell909090/influx-proxy, more features supported with high availability and consistent hash (See Note 1), ready for production
freetsdb/freetsdb GitHub stars Open-Source replacement for InfluxDB Enterprise Low quality and complex code with many bugs (See Note 2) based on InfluxDB 1.7.4, only 10% of InfluxDB Cluster features are implemented, hard to maintain and unable to keep up with the InfluxDB upstream, not ready for production
spring-avengers/influx-proxy GitHub stars A proxy for InfluxDB like codis Not maintained for a long time, many features are missing, not ready for production
derek0377/influxdb-cluster GitHub stars InfluxDB cluster proxy like codis Not maintained for a long time, many features are missing, not ready for production
jasonjoo2010/chronus GitHub stars InfluxDB cluster based on version 1.8.3 Program design and behavior differ from InfluxDB Enterprise, many features are missing, not ready for production

Note 1: shell909090/influx-proxy/issues/111

Note 2: freetsdb/freetsdb/issues/7, freetsdb/freetsdb/issues/8, freetsdb/freetsdb/issues/9

Community & Communication

wechat-group-eng
Clone this wiki locally