We know what caused the recent massive Google Cloud outage – and it’s a bit embarrassing

Google Cloud’s API service was to blame for the widespread outage
Most regions were back online in 40 minutes, but some took even longer
The company has promised to protect against future outages and improve communication

Following Google Cloud’s recent widespread outage, which took sites like Spotify, Cloudflare and Discord offline, the company has released a detailed incident report explaining exactly why it failed customers.

The company says the root cause was a code issue in Service Control, part of its API management and policy checking system.

Specifically, an invalid automated quota update and a lack of proper error handling triggered a global crash loop, with 503 errors seen not only across Google Cloud services, but also across services using its APIs.

Google Cloud outage caused by API issue

The outage affected Google Cloud infrastructure as well as popular Google Workspace apps like Drive, Docs, Gmail and Calendar. Third-party sites that rely on Google Cloud’s APIs, including popular music streaming platform Spotify, which boasts 678 million users, as well as some Cloudflare services, were also affected.

“On May 29, 2025, a new feature was added to Service Control for additional quota policy checks,” the company wrote in its incident report. “The issue with this change was that it did not have appropriate error handling nor was it feature flag protected.”
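To illustrate the two safeguards Google says were missing, here is a minimal Python sketch of a quota check that sits behind a feature flag and handles bad input instead of crashing. It is not Google's code: the names `QuotaPolicy`, `FLAGS`, `within_quota` and `allow_request` are hypothetical, and both the missing `limit` used to model an invalid update and the choice to let requests through on error are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class QuotaPolicy:
    # Hypothetical policy record; limit=None models an invalid update.
    limit: Optional[int]


# Hypothetical feature flag: the new check ships switched off by default.
FLAGS = {"extra_quota_checks": False}


def within_quota(policy: QuotaPolicy, usage: int) -> bool:
    """The new check. It rejects malformed policies with an explicit error."""
    if policy.limit is None:
        raise ValueError("quota policy has no limit set")
    return usage <= policy.limit


def allow_request(policy: QuotaPolicy, usage: int) -> bool:
    """Run the new check only when flagged on, and never crash on bad data."""
    if not FLAGS["extra_quota_checks"]:
        return True  # new code path disabled, behave as before
    try:
        return within_quota(policy, usage)
    except ValueError:
        # Assumed design choice for this sketch: let the request through
        # rather than take the whole service down over one bad policy.
        return True
```

With the flag off, the new code path is never exercised. With it on, a malformed policy produces a handled error rather than a crash.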

Google Cloud boasted that its Site Reliability Engineering team had started triaging the incident within two minutes and identified the root cause within 10 minutes. “The red-button [to disable the serving path] was ready to roll out ~25 minutes from the start of the incident,” Google said, with the rollout complete within 40 minutes.

Although smaller regions recovered relatively quickly, larger regions like us-central-1 took longer to come back online – around two hours and 40 minutes in the case of this particular region.

In its mini incident report issued on the day of the outage, Google Cloud promised to “do better.” Its more detailed report promises the usual responses going forward, such as improving static analysis and testing practices, and auditing and modularizing Service Control’s architecture to contain future incidents. The company has also pledged to “improve [its] external communications” to better inform customers, and to ensure that its communications infrastructure remains online even during such outages.
