Less is more

Setting Up Telegram Alerts for Prometheus - A Journey

Today I added something useful and long overdue to my homelab. My Spectrum router does not cover the whole house (it sits not in the middle of the house but in a side room that doubles as my home office), so I bought an additional WiFi pod from Spectrum to create a mesh network. Unfortunately, the outlet the pod is plugged into is a bit loose, so the pod occasionally drops off the network and Netflix starts to misbehave.

I have a small k3s cluster at home, built as an experiment on top of two Raspberry Pis (a Pi 5 and a Pi Zero 2 W) and an old laptop, and I thought it would be a good idea to use the cluster to solve this problem.

What We Wanted

I use Telegram a lot, so I decided I wanted to get a Telegram notification when our Spectrum pod (192.168.1.135) goes down.

Plan

Add the pod to the blackbox-exporter probe targets, write an alert rule for failed ICMP probes, and point Alertmanager at Telegram.

The Discovery Phase

Did Helm install Alertmanager with Prometheus?

$ kubectl get pods -n monitoring | grep alert
prometheus-alertmanager-0    1/1     Running   0      6d

Cool, it is running, but completely useless without proper configuration.

Adding the Probe Target

We are already using blackbox-exporter to ICMP ping a bunch of hosts (routers, laptops, servers, DNS servers, etc.), so adding our Spectrum pod was easy. Just drop it into the targets list:

- 192.168.1.135    # Spectrum pod
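
For reference, the scrape job behind these probes is the standard blackbox relabeling pattern. The sketch below is roughly what it looks like; the module name, the exporter address, and the extra target are assumptions, not a copy of my actual config.

scrape_configs:
  - job_name: blackbox
    metrics_path: /probe
    params:
      module: [icmp]            # blackbox module that does ICMP pings
    static_configs:
      - targets:
          - 192.168.1.1         # router (example)
          - 192.168.1.135       # Spectrum pod
    relabel_configs:
      # pass the original target to the exporter as ?target=
      - source_labels: [__address__]
        target_label: __param_target
      # keep the probed host as the instance label
      - source_labels: [__param_target]
        target_label: instance
      # actually scrape the blackbox exporter itself (address is an assumption)
      - target_label: __address__
        replacement: blackbox-exporter:9115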

Creating the Alert Rule

Then we needed an actual alert rule. A simple one that fires when any of our monitored hosts fails the ICMP probe for more than 2 minutes:

- alert: HostDown
  expr: probe_success{job="blackbox"} == 0
  for: 2m
  labels:
    severity: critical
  annotations:
    summary: "Spectrum POD {{ $labels.instance }} is down"
    description: "ICMP probe to {{ $labels.instance }} has failed for more than 2 minutes."

Pretty straightforward: if probe_success is 0, something is wrong.
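
As far as I can tell, the prometheus-community/prometheus chart picks alerting rules up from the serverFiles section of the values file, so the rule goes into prometheus-values.yaml roughly like this (the group name is made up):

serverFiles:
  alerting_rules.yml:
    groups:
      - name: homelab-hosts
        rules:
          - alert: HostDown
            # ... the full rule from above ...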

The Telegram Integration (aka The Fun Part)

Here is where it got interesting. We wanted to send alerts to Telegram, but we had a problem: how do you configure secrets without committing them to GitHub?

First, you need a bot. Hit up @BotFather on Telegram, send it /newbot, and it hands you a bot token.

Then get your chat ID.
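
One low-tech way to find it: send your new bot any message, then ask the Bot API who wrote to it (this assumes you have jq handy; otherwise just read the JSON):

# replace <TOKEN> with the token BotFather gave you
curl -s "https://api.telegram.org/bot<TOKEN>/getUpdates" | jq '.result[].message.chat.id'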

The Alertmanager Telegram Config

Next, create an Alertmanager config that sends nicely formatted HTML messages to Telegram.

global:
  resolve_timeout: 5m
route:
  group_by: ['alertname', 'cluster', 'service']
  group_wait: 10s
  group_interval: 10s
  repeat_interval: 12h
  receiver: 'telegram'
receivers:
  - name: 'telegram'
    telegram_configs:
      - bot_token: 'YOUR_BOT_TOKEN_HERE'
        chat_id: YOUR_CHAT_ID_HERE
        parse_mode: 'HTML'
        message: |
          {{ range .Alerts }}
          <b>{{ .Status | toUpper }}</b>
          <b>Alert:</b> {{ .Labels.alertname }}
          <b>Instance:</b> {{ .Labels.instance }}
          <b>Severity:</b> {{ .Labels.severity }}
          <b>Summary:</b> {{ .Annotations.summary }}
          <b>Description:</b> {{ .Annotations.description }}
          {{ end }}          

The message template uses Go templating to loop through alerts and format them with HTML tags that Telegram understands.
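
Before this goes anywhere near the cluster, it is worth a local sanity check. If you have amtool installed, it will catch YAML and config mistakes early (the file name here is just whatever you saved the config as):

amtool check-config alertmanager.yml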

Keeping secrets out of Git

  1. Added the actual config file with real credentials to .gitignore.
  2. Created a Kubernetes Secret from that config (a sketch of the manifest follows this list):
    kubectl apply -f prometheus/alertmanager-config.yaml
    
  3. Configured Helm to use this secret by adding to prometheus-values.yaml:
    alertmanager:
      enabled: true
      configFromSecret: "alertmanager-telegram-config"
    
  4. Applied with Helm:
    helm upgrade prometheus prometheus-community/prometheus \
    -n monitoring -f prometheus/prometheus-values.yaml
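
The Secret from step 2 is nothing exotic. A sketch of what prometheus/alertmanager-config.yaml could look like (the real file, with the token and chat ID filled in, is the one that stays out of Git):

apiVersion: v1
kind: Secret
metadata:
  name: alertmanager-telegram-config
  namespace: monitoring
type: Opaque
stringData:
  alertmanager.yml: |
    # ... the full Alertmanager config from above,
    # with the real bot token and chat ID filled in ...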
    

Seemed perfect. Except…

The Plot Twist

After applying everything with helm upgrade, Alertmanager was still using the default receiver. What gives?

It turned out the configFromSecret parameter we set in the Helm values was not actually working. The Helm chart kept creating and using a ConfigMap with the default config, completely ignoring our fancy Secret.

Classic Kubernetes moment.
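
If you want to see what Alertmanager is actually being fed, look at the ConfigMap the chart manages (the same one used in the fix below); in our case it still held the chart's default receiver:

kubectl get configmap prometheus-alertmanager -n monitoring -o jsonpath='{.data.alertmanager\.yml}'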

The Fix

Sometimes you have to be pragmatic. Instead of fighting with Helm, we just:

  1. Pulled the config from our Secret.
  2. Manually updated the ConfigMap.
  3. Restarted the Alertmanager pod.

In commands:

kubectl get secret alertmanager-telegram-config -n monitoring -o jsonpath='{.data.alertmanager\.yml}' | base64 -d > /tmp/alertmanager-telegram.yml

kubectl create configmap prometheus-alertmanager --from-file=alertmanager.yml=/tmp/alertmanager-telegram.yml -n monitoring --dry-run=client -o yaml | kubectl apply -f -

kubectl delete pod prometheus-alertmanager-0 -n monitoring

Boom. Then we verified the config:

$ curl -s http://localhost:9093/api/v2/status | jq -r '.config.route.receiver'
telegram
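
The curl commands here assume the Alertmanager API is reachable on localhost:9093; the quickest way to get that from a laptop is a port-forward straight to the pod:

kubectl port-forward -n monitoring pod/prometheus-alertmanager-0 9093:9093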

Testing Time

We sent a test alert via the Alertmanager API:

curl -X POST http://localhost:9093/api/v2/alerts -H "Content-Type: application/json" -d '[{
  "labels": {
    "alertname": "HostDown",
    "instance": "192.168.1.135",
    "severity": "critical",
    "job": "blackbox"
  },
  "annotations": {
    "summary": "Spectrum POD 192.168.1.135 is down",
    "description": "ICMP probe to 192.168.1.135 has failed for more than 2 minutes."
  }
}]'

And the Telegram notification came through.
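
If it had not come through, the next thing to check would be whether the alert at least made it into Alertmanager:

curl -s http://localhost:9093/api/v2/alerts | jq '.[].labels.alertname'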

Lessons Learned

  1. Helm charts do not always work the way you expect – sometimes you have to work around them.
  2. ConfigMaps get overwritten by Helm – every helm upgrade puts the default config back.
  3. Testing with the API is much faster – do not wait 2 minutes for real alerts to fire.
  4. Never commit secrets to Git – even when it seems convenient.

What’s Next?

Now that we have the basic setup working, there is plenty more we could do.

But for now, alerts are going to Telegram, and that is a win.

P.S. Don’t forget to re-run the ConfigMap fix from above after any helm upgrade, because the chart will reset it to the default config.

#homelab #alertmanager #prometheus #devops