Disaster Dissected
EPISODE + ARTIFACT DROP

Cert Expiry Timebomb

Not a hack. Not a DDoS. Just a date. This page ships the exact operator artifacts from the episode: a one-page battle card and a PowerShell starter pack to inventory endpoints and flag risky expiry windows.

What broke

A certificate hit its NotAfter timestamp. That sounds small, but it can cascade: VPN drops, APIs return 503s, and browsers show “Not Secure” because trust collapses at the handshake.
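An expiry is nothing more than that NotAfter timestamp going past. The episode's starter pack is PowerShell, but as a language-neutral illustration, here is a stdlib-Python sketch that turns the `notAfter` string a TLS peer certificate carries into a concrete UTC deadline:

```python
import ssl
from datetime import datetime, timezone

def not_after_utc(not_after: str) -> datetime:
    # Parses the "May 20 12:00:00 2025 GMT" form that
    # ssl.SSLSocket.getpeercert() returns for 'notAfter'.
    return datetime.fromtimestamp(ssl.cert_time_to_seconds(not_after), tz=timezone.utc)

print(not_after_utc("May 20 12:00:00 2025 GMT").isoformat())
# 2025-05-20T12:00:00+00:00
```

Past that instant, every client that validates the chain starts refusing the handshake at once, which is why the failure looks like a sudden cliff rather than a gradual degradation.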

The failure mode

  • Leaf cert expires → single endpoint breaks (usually recoverable fast).
  • Intermediate CA expires → everything it signed becomes untrusted (this is the “global outage” version).
  • Caching and staggered refresh make symptoms intermittent and hard to correlate.
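The intermediate case is worth internalizing: a chain is only trusted while every link in it is valid, so an endpoint's effective expiry is the earliest NotAfter anywhere in its chain, not the leaf's. A toy Python illustration (all dates hypothetical):

```python
from datetime import datetime, timezone

def effective_expiry(chain):
    """A chain is trusted only until its EARLIEST NotAfter:
    an expired intermediate takes down every leaf it signed."""
    return min(chain.values())

chain = {  # hypothetical NotAfter dates for one endpoint's chain
    "leaf": datetime(2026, 1, 15, tzinfo=timezone.utc),
    "intermediate": datetime(2025, 6, 1, tzinfo=timezone.utc),
    "root": datetime(2035, 1, 1, tzinfo=timezone.utc),
}
print(effective_expiry(chain).date())  # 2025-06-01: the intermediate, not the leaf
```

A monitor that only reads the leaf's expiry would report seven safe months here while the whole chain dies in June.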

Why it pages people

  • Dashboards may show “Up” while clients refuse the connection.
  • Monitoring agents may stop reporting, so you go blind during your own incident.
  • If you don’t have an owner mapped to endpoints, fixes stall on access and tribal knowledge.

What you get

These are the same artifacts referenced in the episode — designed to be copied into a real environment.

  • Cert Expiry Battle Card (PDF, 1 page; tags: outage, pki, certs): detection signals, fastest checks, chain verification, and prevention controls, formatted for incident-time use.
  • PowerShell Starter Pack (ZIP, scripts; tags: outage, powershell, pki): inventory endpoints, perform a fast TLS handshake, pull remote X.509 data, and flag expiring certs with clear thresholds.
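The core loop of that kind of check is simple: handshake, read the leaf cert, compute days left. The actual pack is PowerShell; here is a hedged stdlib-Python sketch of the same idea (host names are placeholders, not real inventory):

```python
import socket
import ssl
from datetime import datetime, timezone

def days_from_not_after(not_after, now=None):
    """Whole days until a cert's notAfter string (negative = already expired)."""
    now = now or datetime.now(timezone.utc)
    expires = datetime.fromtimestamp(ssl.cert_time_to_seconds(not_after), tz=timezone.utc)
    return (expires - now).days

def check_endpoint(host, port=443, timeout=5.0):
    """TLS handshake against the endpoint, then days left on the leaf cert.
    Note: an ALREADY-expired cert raises SSLCertVerificationError during the
    handshake, which is itself the signal you are looking for."""
    ctx = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with ctx.wrap_socket(sock, server_hostname=host) as tls:
            cert = tls.getpeercert()
    return days_from_not_after(cert["notAfter"])

if __name__ == "__main__":
    inventory = ["example.com"]  # placeholder: your endpoint inventory goes here
    for host in inventory:
        try:
            print(host, check_endpoint(host), "days left")
        except (OSError, ssl.SSLError) as exc:
            print(host, "FAILED:", exc)
```

Splitting the date math out of the network call (as above) also makes the thresholds unit-testable without touching production endpoints.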

Emergency playbook

If you’re in the middle of it right now, don’t freestyle. Do the boring, reliable steps.

During the outage

  • Verify the full chain (leaf + intermediate + root) — don’t stop at the leaf.
  • Issue a replacement certificate immediately (AD CS or your CA platform).
  • Deploy to every termination point: load balancers, ingress, gateways, and DR.
  • Validate outside-in from a public vantage point.

After the outage

  • Map owners to every endpoint (no owner = no production).
  • Set alerting thresholds in days to expiry: <30 warn, <14 critical, <7 incident.
  • Automate renewals where possible — but keep a manual audit tool as a backstop.
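Those thresholds translate directly into an alert-level function your monitoring can call. A minimal Python sketch (function name and labels are illustrative, matching the thresholds above):

```python
def expiry_severity(days_left: int) -> str:
    """Map days-to-expiry onto the alerting thresholds:
    <7 incident, <14 critical, <30 warn, otherwise ok."""
    if days_left < 7:
        return "incident"
    if days_left < 14:
        return "critical"
    if days_left < 30:
        return "warn"
    return "ok"

print(expiry_severity(21))  # warn
```

Checking the tightest threshold first keeps the bands non-overlapping, so a cert with 5 days left pages as an incident rather than a warning.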

Sources & further reading

Primary sources used in this episode. Confirmed vs inferred is kept explicit throughout — that’s the Disaster Dissected standard.

Method

Credibility > vibes. Rule: no speculation presented as fact. Confirmed vs likely vs unknown stays explicit.

Research standards

  • Primary sources first (status pages, incident reports, postmortems).
  • Clear separation: confirmed vs hypothesis vs unknown.
  • If it’s not sourced, it’s labeled (or removed).

Corrections

  • Wrong info gets corrected fast.
  • Corrections noted in write-up + pinned comment (when relevant).
  • Email: contact@disasterdissected.com

About

Disaster Dissected is video-first: concise breakdowns of modern IT failures, with receipts and practical lessons.

Why this exists

Incidents are inevitable. Repeating the same ones is optional. The goal is to make postmortem thinking accessible — without dumbing it down.

Topics

  • Outages (cloud, SaaS, network)
  • Operational failures (deploys, configs, capacity)
  • Breaches (public + confirmed facts only)

Contact

Tips, corrections, sponsorship inquiries, or “please dissect this incident” — send it over.

Email

contact@disasterdissected.com

If you have links/sources, include them. “Receipts” speed everything up.

Submit an incident

Open submission form