Cloudflare has disclosed the technical culprit behind Tuesday's massive outage, which knocked ChatGPT offline for hours. CEO Matthew Prince's detailed post-mortem reveals that a changed ClickHouse database query produced duplicate data that overwhelmed the Bot Management system, and the failure cascaded across a network that carries roughly 20% of global web traffic. The infrastructure giant calls it its worst disruption since 2019.
Cloudflare CEO Matthew Prince didn't mince words in his late-night technical breakdown of Tuesday's catastrophic outage. What was first suspected to be a DDoS attack or cyberwarfare turned out to be something far more mundane but equally devastating: a database query that began returning duplicate rows.
The chaos began in Cloudflare's Bot Management system, the AI-powered gatekeeper that's supposed to distinguish between legitimate users and automated crawlers scraping data for OpenAI and other AI training operations. The system relies on a machine learning model whose configuration file is constantly regenerated to keep up with bot behavior patterns. But a change to the underlying ClickHouse database query caused that file to fill up with a large number of duplicate "feature" rows.
"A change in our underlying ClickHouse query behaviour that generates this file caused it to have a large number of duplicate 'feature' rows," Prince explained in the technical post-mortem. As the configuration file ballooned beyond preset memory limits, it brought down the core proxy system that processes customer traffic.
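The failure mode Prince describes can be illustrated with a minimal Python sketch. This is not Cloudflare's code. The names `MAX_FEATURES`, `load_features_strict`, and `load_features_defensive` are hypothetical, and the limit of 200 is an illustrative value that does not come from the article. The sketch only shows how duplicate rows can push an otherwise valid file past a preset limit, and how a loader could deduplicate and fall back to a last known-good configuration instead of failing outright.

```python
# Hypothetical sketch: how duplicate rows can trip a preset size limit.
# None of these names or values come from Cloudflare's actual system.

MAX_FEATURES = 200  # illustrative preset limit (assumption)


class FeatureLimitExceeded(Exception):
    """Raised when a feature file holds more rows than the preset limit."""


def load_features_strict(rows, limit=MAX_FEATURES):
    """Treat an oversized file as a fatal error, with no deduplication."""
    if len(rows) > limit:
        raise FeatureLimitExceeded(
            f"{len(rows)} feature rows exceeds the limit of {limit}"
        )
    return list(rows)


def load_features_defensive(rows, last_good, limit=MAX_FEATURES):
    """Drop duplicate rows first; keep the previous config if still too large."""
    deduped = list(dict.fromkeys(rows))  # removes duplicates, preserves order
    if len(deduped) > limit:
        return list(last_good)  # fall back rather than crash
    return deduped


if __name__ == "__main__":
    base = [f"feature_{i}" for i in range(120)]  # fits within the limit
    duplicated = base + base                     # same features, twice the rows

    try:
        load_features_strict(duplicated)
    except FeatureLimitExceeded as err:
        print("strict loader failed:", err)

    recovered = load_features_defensive(duplicated, last_good=base)
    print("defensive loader kept", len(recovered), "features")
```

In this sketch the strict loader fails on the duplicated input even though the underlying set of features is unchanged, which mirrors the article's account of a file ballooning past its limit. The defensive variant is just one possible mitigation and says nothing about how Cloudflare actually fixed the problem.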
The timing couldn't have been worse. Cloudflare powers roughly 20% of the global web, making it one of the internet's most critical single points of failure. When the Bot Management module crashed, it set off a cascade that knocked major services offline, including ChatGPT, X, and, ironically, the popular outage tracker Downdetector.