下载二进制包:Download | Prometheus
前提条件
Prometheus
alertmanager
设置告警规则
已有监控节点/服务
创建告警机器人
创建群聊
添加机器人
配置安全设置为加签,并记录Webhook和加签密钥
安装dingtalk-webhook
下载地址:Releases · timonwong/prometheus-webhook-dingtalk (github.com)
安装
tar zxvf prometheus-webhook-dingtalk-2.1.0.linux-amd64.tar.gz -C /usr/local/prometheus
mv /usr/local/prometheus/prometheus-webhook-dingtalk-2.1.0.linux-amd64 /usr/local/prometheus/dingtalk
修改配置文件
配置告警消息
vim /usr/local/prometheus/dingtalk/default.tmpl
{{ define "__subject" }}
[{{ .Status | toUpper }}{{ if eq .Status "firing" }}:{{ .Alerts.Firing | len }}{{ end }}]
{{ end }}
{{ define "__alert_list" }}{{ range . }}
---
{{ if .Labels.owner }}@{{ .Labels.owner }}{{ end }}
**告警主题**: {{ .Annotations.summary }}
**告警类型**: {{ .Labels.alertname }}
**告警级别**: {{ .Labels.severity }}
**告警主机**: {{ .Labels.instance }}
**告警信息**: {{ index .Annotations "description" }}
**告警时间**: {{ dateInZone "2006.01.02 15:04:05" (.StartsAt) "Asia/Shanghai" }}
{{ end }}{{ end }}
{{ define "__resolved_list" }}{{ range . }}
---
{{ if .Labels.owner }}@{{ .Labels.owner }}{{ end }}
**告警主题**: {{ .Annotations.summary }}
**告警类型**: {{ .Labels.alertname }}
**告警级别**: {{ .Labels.severity }}
**告警主机**: {{ .Labels.instance }}
**告警信息**: {{ index .Annotations "description" }}
**告警时间**: {{ dateInZone "2006.01.02 15:04:05" (.StartsAt) "Asia/Shanghai" }}
**恢复时间**: {{ dateInZone "2006.01.02 15:04:05" (.EndsAt) "Asia/Shanghai" }}
{{ end }}{{ end }}
{{ define "default.title" }}
{{ template "__subject" . }}
{{ end }}
{{ define "default.content" }}
{{ if gt (len .Alerts.Firing) 0 }}
**====侦测到{{ .Alerts.Firing | len }}个故障====**
{{ template "__alert_list" .Alerts.Firing }}
---
{{ end }}
{{ if gt (len .Alerts.Resolved) 0 }}
**====恢复{{ .Alerts.Resolved | len }}个故障====**
{{ template "__resolved_list" .Alerts.Resolved }}
{{ end }}
{{ end }}
{{ define "ding.link.title" }}{{ template "default.title" . }}{{ end }}
{{ define "ding.link.content" }}{{ template "default.content" . }}{{ end }}
{{ template "default.title" . }}
{{ template "default.content" . }}
钉钉机器人集成
vim /usr/local/prometheus/dingtalk/config.yml
## Request timeout
# timeout: 5s
## Uncomment following line in order to write template from scratch (be careful!)
#no_builtin_template: true
## Customizable templates path
templates:
- /usr/local/prometheus/dingtalk/default.tmpl
## You can also override default template using `default_message`
## The following example to use the 'legacy' template from v0.3.0
#default_message:
# title: '{{ template "legacy.title" . }}'
# text: '{{ template "legacy.content" . }}'
## Targets, previously was known as "profiles"
targets:
webhook1:
# token
url: https://oapi.dingtalk.com/robot/send?access_token=xxxxxxxxxxxxxxxxxxxxxxxxxx
# 加签密钥
secret: xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
修改alertmanager配置文件
vim /usr/local/prometheus/alertmanager/alertmanager.yml
route:
group_by: ['dingtalk']
group_wait: 1s
group_interval: 5m
repeat_interval: 1h
receiver: 'dingtalk.webhook1'
routes:
- receiver: "dingtalk.webhook1"
match_re:
altername: ".*"
receivers:
- name: 'dingtalk.webhook1'
webhook_configs:
- url: 'http://localhost:8060/dingtalk/webhook1/send'
send_resolved: true
inhibit_rules:
- source_match:
severity: 'critical'
target_match:
severity: 'warning'
equal: ['alertname', 'dev', 'instance']
重启alertmanager
systemctl restart alertmanager
验证
访问alertmanager地址:http://ip:9093/#/status
,验证配置生效。
测试
找一个节点或服务,我这里停掉当前节点的node_exporter服务
systemctl stop node_exporter
触发以下告警规则
- alert: 服务器宕机
expr: up == 0
for: 1s
labels:
severity: 严重告警
annotations:
summary: "{{$labels.instance}} 服务器宕机, 请尽快处理!"
description: "{{$labels.instance}} 服务器node_exporter服务被关闭,当前状态{{ $value }}. "
稍等一下,收到