Amazon Web Services suffered a massive DNS resolution failure Monday morning that cascaded across the internet, taking down everything from WhatsApp to ChatGPT to British government sites. The outage, stemming from the company's critical US-EAST-1 region in Virginia, exposed how cloud concentration has created dangerous single points of failure across the web's infrastructure.
The internet just had another wake-up call about putting too many eggs in one basket. Amazon Web Services suffered a DNS resolution meltdown Monday morning that toppled services across the web like dominoes, taking down some of the world's most critical digital platforms in the process.
The cascade started around 3 AM ET when something went wrong with domain name resolution in AWS's US-EAST-1 region - that massive data center hub in northern Virginia that powers a shocking chunk of the internet. Within hours, users couldn't access WhatsApp, ChatGPT was down, PayPal's Venmo payment system went dark, and even British government websites vanished from the web.
"Based on our investigation, the issue appears to be related to DNS resolution of the DynamoDB API endpoint in US-EAST-1," Amazon wrote in status updates as engineers scrambled to fix the problem. The company's own Ring doorbells and Alexa smart assistants joined the casualty list, along with Epic Games services and countless other platforms.
To understand why this matters so much, think of DNS as the internet's phone book - it translates human-readable website names like "techbuzz.ai" into the numeric IP addresses that computers actually use to find each other. When that system breaks, it's like every phone number in the directory suddenly connecting to wrong numbers or dead lines.
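To make the phone-book analogy concrete, here's a minimal Python sketch of what that lookup step looks like from a client's point of view, using the standard library's resolver. The DynamoDB hostname is the service's public endpoint, and the error handling is purely illustrative; this isn't a reconstruction of AWS's internal systems.

```python
import socket


def resolve(hostname: str) -> str:
    """Translate a hostname into an IP address, like looking up a phone number."""
    try:
        # Ask the operating system's DNS resolver for an address record.
        return socket.gethostbyname(hostname)
    except socket.gaierror as err:
        # If resolution fails, the client never even reaches the server.
        # Roughly this kind of error is what callers of an affected
        # endpoint would have surfaced during the outage.
        raise RuntimeError(f"DNS resolution failed for {hostname}: {err}") from err


if __name__ == "__main__":
    # dynamodb.us-east-1.amazonaws.com is DynamoDB's public endpoint in
    # the affected region; on a normal day this prints an IP address.
    print(resolve("dynamodb.us-east-1.amazonaws.com"))
```

When that lookup fails, every service that depends on the endpoint fails with it, which is how a single region's DNS problem snowballs into a global outage.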
"When the system couldn't correctly resolve which server to connect to, cascading failures took down services across the internet," Davi Ottenheimer, a security operations veteran and vice president at data infrastructure company Inrupt, tells us. "Today's AWS outage is a classic availability problem, and we need to start seeing it more as data integrity failure."
The technical fix came relatively quickly - AWS applied "initial mitigations" by 5:22 AM ET and declared the underlying issues resolved by 6:35 AM. But the damage was done, with some services needing hours more to work through backlogs and restore full functionality.
This isn't Amazon's first rodeo with major outages. The company has suffered previous large-scale failures that followed similar patterns, and each time these events happen, they underscore the same uncomfortable truth: the internet has become dangerously dependent on a handful of cloud giants.