fix spellcheck

Robert Kaussow 2022-01-31 23:03:44 +01:00
parent 54585619b9
commit 306e73b654
Signed by: xoxys
GPG Key ID: 4E692A2EAECC03C0
2 changed files with 5 additions and 2 deletions


@@ -40,3 +40,6 @@ semver
 CLI
 PyPi
 readme
+SSL
+Telegraf
+OSCP


@@ -1,6 +1,6 @@
 ---
 title: "SSL certificate monitoring pitfalls"
-date: 2022-01-31T22:45:00+01:00
+date: 2022-01-31T23:00:00+01:00
 authors:
 - robert-kaussow
 tags:
@@ -18,7 +18,7 @@ resources:
 Certificates are a fundamental part of the Internet's security. At least since Let's Encrypt, a free and automated Certificate Authority, has started its service, SSL is nearly used everywhere. To avoid Certificate issues and possible service outages, it's a good idea to monitor the SSL certificates used by your services, especially as Let's Encrypt certificates have a short lease time of 90 days.
-I'm using Prometheus to monitor my infrastructure, and for Prometheus there are multiple ways to get started. Most of the tutorials and posts of the internet will cover the case of expired certificates, and it's pretty easy to achieve. I prefer to use Telegraf, a plugin based metrics collector that also provides Prometheus compatible outputs, instead of dedicated Prometheus exporters. To monitor SSL certificates, I'm using Telegraf's `x509_cert` input plugin that provides a metric called `x509_cert_expiry` that can be utilized to write simple alerting rules. That's actually pretty cool already, as Prometheus will send out alerts a few weeks before the certificates would expire in case there is a problem within the automatic renewal process.
+I'm using Prometheus to monitor my infrastructure, and for Prometheus there are multiple ways to get started. Most of the tutorials and posts of the internet will cover the case of expired certificates, and it's pretty easy to achieve. I prefer to use Telegraf, a plugin based metrics collector that also provides Prometheus compatible outputs, instead of dedicated Prometheus exporters. To monitor SSL certificates, I'm using the `x509_cert` input plugin of Telegraf that provides a metric called `x509_cert_expiry` which can be utilized to write simple alerting rules. That's actually pretty cool already, as Prometheus will send out alerts a few weeks before the certificates would expire in case there is a problem within the automatic renewal process.
 A week ago, Let's Encrypt has informed affected users that they need to [revoke faulty certificates](https://community.letsencrypt.org/t/questions-about-renewing-before-TLS-ALPN-01-revocations/170449) issued and validated with the `TLS-ALPN-01` challenge. Even if I'm using the `DNS-01` for almost all of my certificates, I have also received a mail and started to look into it. Sadly, the notification mail only contained a "random" ACME registration ID, and I was not able to find the matching client. As mentioned, I don't really use `TLS-ALPN-01`, so I decided to stop the research and leave it to my monitoring to tell me which forgotten service is the evil one after the certificates were revoked. Nothing happened after the revocation, and the monitoring was not complaining. Good - well no, a user reported that one of the services is not reachable anymore and of course this was the one missing client that was using `TLS-ALPN-01` verified certificates - dang. While the issue itself was easy to resolve by a force renew of the certificate, I was still wondering why the monitoring has not caught it.