Myluzh Blog

Prometheus Blackbox (Black-Box Monitoring)

Published: 2024-10-28 Author: myluzh Category: Kubernetes


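0x00 Preface
White-box monitoring: observes system performance mainly through metrics exposed from inside the application (such as a Prometheus /metrics endpoint), giving deep technical insight.
Black-box monitoring: collects data mainly through external checks such as HTTP requests and TCP tests. It focuses on the system's externally visible behavior and functionality, and usually needs no access to the application's internals.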
---
A kube-prometheus deployment ships blackbox-exporter by default. If yours does not have it, deploy it manually.

0x01 Deploying blackbox_exporter manually
GitHub: https://github.com/prometheus/blackbox_exporter
The blackbox.yml used below can also be downloaded and imported as a ConfigMap directly:
# curl -o blackbox.yml https://raw.githubusercontent.com/prometheus/blackbox_exporter/master/blackbox.yml
# kubectl create configmap blackbox-exporter-config --from-file=blackbox.yml --namespace=monitoring
apiVersion: apps/v1
kind: Deployment
metadata:
  name: blackbox-exporter
  namespace: monitoring
  labels:
    app: blackbox-exporter
spec:
  replicas: 1
  selector:
    matchLabels:
      app: blackbox-exporter
  template:
    metadata:
      labels:
        app: blackbox-exporter
    spec:
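      containers:
      - name: blackbox-exporter
        image: quay.io/prometheus/blackbox-exporter:latest
        # The image defaults to --config.file=/etc/blackbox_exporter/config.yml,
        # so point it at the blackbox.yml key mounted from the ConfigMap.
        args:
        - --config.file=/etc/blackbox_exporter/blackbox.yml
        ports:
        - containerPort: 9115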
        # livenessProbe:
        #   httpGet:
        #     path: /-/healthy
        #     port: 9115
        #   initialDelaySeconds: 30
        #   periodSeconds: 10
        # readinessProbe:
        #   httpGet:
        #     path: /-/ready
        #     port: 9115
        #   initialDelaySeconds: 30
        #   periodSeconds: 10
        volumeMounts:
        - name: config-volume
          mountPath: /etc/blackbox_exporter
      volumes:
      - name: config-volume
        configMap:
          name: blackbox-exporter-config
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: blackbox-exporter-config
  namespace: monitoring
data:
  blackbox.yml: |
    modules:
      http_2xx:
        prober: http
        http:
          preferred_ip_protocol: "ip4"
      http_post_2xx:
        prober: http
        http:
          method: POST
      tcp_connect:
        prober: tcp
      pop3s_banner:
        prober: tcp
        tcp:
          query_response:
          - expect: "^+OK"
          tls: true
          tls_config:
            insecure_skip_verify: false
      grpc:
        prober: grpc
        grpc:
          tls: true
          preferred_ip_protocol: "ip4"
      grpc_plain:
        prober: grpc
        grpc:
          tls: false
          service: "service1"
      ssh_banner:
        prober: tcp
        tcp:
          query_response:
          - expect: "^SSH-2.0-"
          - send: "SSH-2.0-blackbox-ssh-check"
      ssh_banner_extract:
        prober: tcp
        timeout: 5s
        tcp:
          query_response:
          - expect: "^SSH-2.0-([^ -]+)(?: (.*))?$"
            labels:
            - name: ssh_version
              value: "${1}"
            - name: ssh_comments
              value: "${2}"
      irc_banner:
        prober: tcp
        tcp:
          query_response:
          - send: "NICK prober"
          - send: "USER prober prober prober :prober"
          - expect: "PING :([^ ]+)"
            send: "PONG ${1}"
          - expect: "^:[^ ]+ 001"
      icmp:
        prober: icmp
      icmp_ttl5:
        prober: icmp
        timeout: 5s
        icmp:
          ttl: 5
---
apiVersion: v1
kind: Service
metadata:
  name: blackbox-exporter
  namespace: monitoring
  labels:
    app: blackbox-exporter
spec:
  ports:
  - port: 9115
    targetPort: 9115
    protocol: TCP
  selector:
    app: blackbox-exporter
  type: ClusterIP
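Test whether Blackbox Exporter can probe the target URL successfully (in this cluster the exporter serves HTTP on 19115; the manual deployment above listens on 9115):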
root@k8s-master:~/kube-prometheus/manifests# kubectl get svc -n  monitoring | grep black
blackbox-exporter       ClusterIP   10.43.9.81     <none>        9115/TCP,19115/TCP           6d1h
root@k8s-master:~/kube-prometheus/manifests# curl 'http://10.43.9.81:19115/probe?target=http://www.baidu.com&module=http_2xx'
# HELP probe_success Displays whether or not the probe was a success
# TYPE probe_success gauge
probe_success 1
...
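The same check can be scripted. The following is a minimal Python sketch, not part of the original setup: the function names build_probe_url, parse_probe_success and probe are made up for illustration, and the exporter address is the one from this cluster. It URL-encodes the target and module parameters (which also avoids the shell treating an unquoted & as a background operator) and reads probe_success from the text output.

```python
from urllib.parse import urlencode
from urllib.request import urlopen


def build_probe_url(exporter: str, target: str, module: str = "http_2xx") -> str:
    """Return the /probe URL for an exporter address such as '10.43.9.81:19115'."""
    query = urlencode({"target": target, "module": module})
    return f"http://{exporter}/probe?{query}"


def parse_probe_success(metrics_text: str) -> bool:
    """Find the probe_success sample in Prometheus text-format output."""
    for line in metrics_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        name, _, value = line.partition(" ")
        if name == "probe_success":
            return float(value) == 1.0
    raise ValueError("probe_success not found in exporter output")


def probe(exporter: str, target: str, module: str = "http_2xx") -> bool:
    """Call the exporter over HTTP and report whether the probe succeeded."""
    with urlopen(build_probe_url(exporter, target, module), timeout=10) as resp:
        return parse_probe_success(resp.read().decode("utf-8"))
```

Calling probe("10.43.9.81:19115", "http://www.baidu.com") from a host that can reach the Service would correspond to the curl command above.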

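0x02 Creating the scrape configuration (Additional Scrape Configs)
1. Write the scrape configuration file prometheus-additional.yaml. Remember to change the blackbox exporter address below: in my cluster the blackbox exporter serves HTTP on 19115, while other setups use 9115.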
root@k8s-master:~/prom# vi prometheus-additional.yaml
- job_name: 'blackbox'
  metrics_path: /probe
  params:
    module: [http_2xx]
  static_configs:
    - targets:
        - https://www.baidu.com
        - http://dxp.test.sxhlcloud.com/
        - http://dxp.test.sxhlcloud.com/prod-api/code
        - http://dxp.sxhlcloud.com/
        - http://dxp.sxhlcloud.com/prod-api/code
  relabel_configs:
    - source_labels: [__address__]
      target_label: __param_target
    - source_labels: [__param_target]
      target_label: instance
    - target_label: __address__
      replacement: blackbox-exporter:19115
    #- source_labels: [instance]
    #  regex: 'https?://([^/]+)/.*'
    #  target_label: target
    #  replacement: '$1'
    - source_labels: [instance]
      target_label: target

- job_name: 'blackbox_exporter'
  static_configs:
    - targets: ['blackbox-exporter:19115']
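To make the relabel_configs of the blackbox job easier to follow, here is a small Python sketch that applies the four active rules by hand to one target. It is only an illustration of the label flow, not how Prometheus implements relabeling, and the function names are hypothetical.

```python
def relabel_blackbox_target(address: str,
                            exporter: str = "blackbox-exporter:19115") -> dict:
    """Apply the four active relabel rules of the 'blackbox' job to one target."""
    labels = {"__address__": address}
    # Rule 1: copy the listed URL into the ?target= query parameter.
    labels["__param_target"] = labels["__address__"]
    # Rule 2: keep the probed URL as the instance label.
    labels["instance"] = labels["__param_target"]
    # Rule 3: send the actual scrape request to the exporter instead.
    labels["__address__"] = exporter
    # Rule 4: copy instance into an extra label named target.
    labels["target"] = labels["instance"]
    return labels


def visible_labels(labels: dict) -> dict:
    """Labels starting with '__' are dropped once relabeling has finished."""
    return {k: v for k, v in labels.items() if not k.startswith("__")}
```

For the target https://www.baidu.com, Prometheus ends up scraping blackbox-exporter:19115/probe with target=https://www.baidu.com and module=http_2xx, while the stored series carry instance and target labels set to the probed URL.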
2. Create a Secret from the file you just wrote:
root@k8s-master:~/prom# kubectl create secret generic additional-scrape-configs --from-file=prometheus-additional.yaml --dry-run=client -o yaml > additional-scrape-configs.yaml
root@k8s-master:~/prom# kubectl create -f additional-scrape-configs.yaml -n monitoring
secret/additional-scrape-configs created
root@k8s-master:~/prom# kubectl get secret -n monitoring | grep additional
additional-scrape-configs         Opaque                                1      4m12s
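As a rough picture of what the generated manifest contains: kubectl create secret generic --from-file stores the file under a data key equal to the file name, with the content base64-encoded. The Python sketch below rebuilds that shape; the helper name secret_manifest is made up, and the real kubectl output contains a few more metadata fields.

```python
import base64


def secret_manifest(name: str, filename: str, content: bytes) -> dict:
    """Build a dict shaped like the Secret that --from-file generates."""
    return {
        "apiVersion": "v1",
        "kind": "Secret",
        # No namespace here: the article passes -n monitoring at create time.
        "metadata": {"name": name},
        # The data key is the file name; this is the key that the Prometheus
        # resource has to reference later.
        "data": {filename: base64.b64encode(content).decode("ascii")},
    }


def read_back(manifest: dict, key: str) -> bytes:
    """Decode one data entry, as Prometheus Operator would see it."""
    return base64.b64decode(manifest["data"][key])
```

This is why the key used in the next step must be prometheus-additional.yaml (the file name), while the name is additional-scrape-configs (the Secret name).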
3. Edit the Prometheus resource named k8s and add the additionalScrapeConfigs field, which references the prometheus-additional.yaml key of the Secret.
A successful edit prints: prometheus.monitoring.coreos.com/k8s edited
root@k8s-master:~/kube-prometheus/manifests# kubectl get Prometheus -n monitoring
NAME   VERSION   REPLICAS   AGE
k8s    2.26.0    2          6d18h
root@k8s-master:~/kube-prometheus/manifests# kubectl edit Prometheus k8s -n monitoring
spec:
  additionalScrapeConfigs:
    key: prometheus-additional.yaml
    name: additional-scrape-configs
4. The new targets should now appear under Status -> Targets in Prometheus. If they do not, restart Prometheus and the operator, or reload Prometheus:
kubectl rollout restart deployment prometheus-operator -n monitoring 
kubectl rollout restart statefulset prometheus-k8s -n monitoring
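# Prometheus may need its configuration reloaded manually. You can trigger a reload through the /-/reload API (available only when Prometheus runs with --web.enable-lifecycle). Depending on the version, the Web UI may also offer a "Reload Configuration" option in the top-right corner.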
curl -X POST http://172.30.233.87:30926/-/reload
5. You can also query the probe_success metric from Prometheus with curl. All metrics produced by black-box probes start with probe_.
root@k8s-master:~/prom# curl -G 'http://10.43.162.41:9090/api/v1/query' --data-urlencode 'query=probe_success'
{"status":"success","data":{"resultType":"vector","result":[{"metric":{"__name__":"probe_success","instance":"http://prometheus.io","job":"blackbox"},"value":[1730171008.843,"1"]},{"metric":{"__name__":"probe_success","instance":"http://secret.cn","job":"blackbox"},"value":[1730171008.843,"1"]},{"metric":{"__name__":"probe_success","instance":"https://prometheus.io","job":"blackbox"},"value":[1730171008.843,"1"]},{"metric":{"__name__":"probe_success","instance":"https://www.itho.cn","job":"blackbox"},"value":[1730171008.843,"0"]}]}}
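The JSON returned by /api/v1/query is easier to read after a little processing. The following Python sketch, with the hypothetical helper name failed_probes, lists the instances whose probe_success value is 0. It assumes the instant-vector response shape shown above.

```python
import json


def failed_probes(response_body: str) -> list:
    """Return the instance labels whose probe_success sample is 0."""
    payload = json.loads(response_body)
    if payload.get("status") != "success":
        raise RuntimeError(f"query failed: {payload}")
    failed = []
    for sample in payload["data"]["result"]:
        # Each sample value is a [timestamp, "value-as-string"] pair.
        _, value = sample["value"]
        if float(value) == 0.0:
            failed.append(sample["metric"].get("instance", "<unknown>"))
    return failed
```

Fed with the response above, it would report only https://www.itho.cn, the one instance whose value is "0".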
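6. Common probe_ metrics
probe_success: whether the probe succeeded. 1 means success, 0 means failure.
probe_duration_seconds: how long the probe took, in seconds. Useful for measuring response time.
probe_http_status_code: the HTTP status code returned by the target. It indicates the health of the target service (for example, 200 is success, 404 is not found, 500 is a server error).
probe_ssl_earliest_cert_expiry: expiry time, as a Unix timestamp, of the certificate in the presented chain that expires first.
probe_ssl_last_chain_expiry_timestamp_seconds: expiry time, as a Unix timestamp, of the last verified certificate chain, i.e. when the target's TLS chain stops being valid.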
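Certificate expiry is reported as a Unix timestamp (for example by the probe_ssl_earliest_cert_expiry metric), so turning it into "days left" takes a small calculation. The Python sketch below shows that arithmetic; the function names and the 30-day threshold are assumptions chosen for illustration.

```python
import time
from typing import Optional

SECONDS_PER_DAY = 86400


def days_until_expiry(expiry_unix: float, now: Optional[float] = None) -> float:
    """Convert a certificate expiry timestamp into days remaining."""
    if now is None:
        now = time.time()
    return (expiry_unix - now) / SECONDS_PER_DAY


def expires_soon(expiry_unix: float, threshold_days: float = 30,
                 now: Optional[float] = None) -> bool:
    """True when fewer than threshold_days remain (or the cert has expired)."""
    return days_until_expiry(expiry_unix, now) < threshold_days
```

The same idea in PromQL would be (probe_ssl_earliest_cert_expiry - time()) / 86400, which can be compared against a threshold in an alerting rule.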

0x03 Adding Blackbox dashboards to Grafana
https://grafana.com/grafana/dashboards/14928-prometheus-blackbox-exporter/


Tags: Prometheus, blackbox, additional, monitoring, black-box monitoring, white-box monitoring, metrics monitoring
