First touch Telegraf + InfluxDB

2022/05/11

Telegraf has a large number of plugins and its configuration is very flexible; consult the documentation when setting it up.

Before writing to InfluxDB, the database's RP (Retention Policy) has to be configured manually: https://docs.influxdata.com/influxdb/v1.8/query_language/manage-database/#retention-policy-management

CREATE RETENTION POLICY "7d" on "telegraf" DURATION 168h REPLICATION 1 DEFAULT
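After running the statement above, it may be worth checking that the policy exists and was made the default. A minimal sketch, assuming it is run in the influx CLI against an InfluxDB 1.x server and that the database is named telegraf as in the statement above:

```sql
-- List the retention policies of the database; "7d" should appear with default = true
SHOW RETENTION POLICIES ON "telegraf"
```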

Telegraf config

[agent]
flush_interval = "3s"
precision = "1ms" # set the timestamp precision

[[inputs.kafka_consumer]]
brokers = [ "kafka0:9092", "kafka1:9092", "kafka2:9092" ]
topics = ["uplog_log_hub"]
consumer_group = "telegraf_log_hub_20220511"
offset = "newest"
name_override = "log-hub" # override the measurement name
data_format = "json"
json_time_key = "@timestamp"
json_time_format = "unix_ms"
tag_keys = ["log_time", "ip"]
json_string_fields = ["error", "date", "h", "m", "path"]

[[outputs.influxdb]]
urls = ["http://influxdb.prometheus.svc.cluster.fud3:8086"]
retention_policy = "7d" # name of the retention policy
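Once Telegraf is running with this config, the parsed result can be inspected from the InfluxDB side. The queries below are a sketch under two assumptions: the output plugin writes to its default database, telegraf (the config above does not set one), and the statements are run in the influx CLI on InfluxDB 1.x.

```sql
-- Tag keys of the measurement; should include log_time and ip from tag_keys
SHOW TAG KEYS ON "telegraf" FROM "log-hub"

-- Field keys and their types; the keys listed in json_string_fields should be strings
SHOW FIELD KEYS ON "telegraf" FROM "log-hub"

-- A few of the most recent points written under the "7d" retention policy
SELECT * FROM "telegraf"."7d"."log-hub" ORDER BY time DESC LIMIT 5
```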

Continuous Query

To make alerting in Grafana easier, use a continuous query to aggregate the data into 5-minute buckets and write the result to a new measurement.

CREATE CONTINUOUS QUERY "query5m" ON "telegraf"
BEGIN
  SELECT count(date) 
  INTO telegraf."7d"."log-hub-count-5m" 
  FROM telegraf."7d"."log-hub" 
  GROUP BY time(5m) fill(0) 
END
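To check that the continuous query is registered and producing data, something like the following could be used. This is a sketch, assuming the influx CLI on InfluxDB 1.x; the 1h window is an arbitrary choice for illustration.

```sql
-- Confirm that "query5m" is listed under the telegraf database
SHOW CONTINUOUS QUERIES

-- Read back the last hour of 5-minute buckets; count(date) is stored in a field named "count"
SELECT "count" FROM "telegraf"."7d"."log-hub-count-5m" WHERE time > now() - 1h
```

A Grafana alert rule can then query the "count" field of this smaller measurement instead of aggregating the raw log-hub data on every evaluation.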

Errors

After running for three days, Telegraf reported the following errors:

2022-05-15T02:21:33Z E! [outputs.influxdb] E! [outputs.influxdb] Failed to write metric (will be dropped: 400 Bad Request): partial write: max-series-per-database limit exceeded: (1000000) dropped=11
2022-05-15T02:21:36Z E! [outputs.influxdb] E! [outputs.influxdb] Failed to write metric (will be dropped: 400 Bad Request): partial write: max-series-per-database limit exceeded: (1000000) dropped=13
2022-05-15T02:21:39Z E! [outputs.influxdb] E! [outputs.influxdb] Failed to write metric (will be dropped: 400 Bad Request): partial write: max-series-per-database limit exceeded: (1000000) dropped=15
2022-05-15T02:21:42Z E! [outputs.influxdb] E! [outputs.influxdb] Failed to write metric (will be dropped: 400 Bad Request): partial write: max-series-per-database limit exceeded: (1000000) dropped=13

Temporary workaround: lift InfluxDB's max-series-per-database limit by adding the environment variable INFLUXDB_DATA_MAX_SERIES_PER_DATABASE=0 at startup.
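Since lifting the limit only hides the series growth, it may help to look at where the series come from. The queries below are a sketch, assuming InfluxDB 1.x and the telegraf database. The idea that the log_time tag is the main contributor is an assumption, not something confirmed in this setup: each distinct combination of tag values creates a new series, so a tag that changes with almost every message would grow the series count quickly.

```sql
-- Estimated number of series in the database
SHOW SERIES CARDINALITY ON "telegraf"

-- Estimated number of distinct values for each tag key configured in tag_keys
SHOW TAG VALUES CARDINALITY ON "telegraf" WITH KEY = "log_time"
SHOW TAG VALUES CARDINALITY ON "telegraf" WITH KEY = "ip"
```

If one tag key turns out to dominate, storing that value as a field rather than a tag is one option to evaluate before relying on the unlimited setting long term.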

Ref