Cloud Outage: When it rains, it pours
Entering my favorite coffee shop before work Tuesday morning in desperate need of caffeine, I watched an employee come out from behind the counter and tape a handwritten sign to the door.
“No cards accepted today. Sorry for the inconvenience.”
Mildly inconvenienced but wildly frustrated, I headed less than a block to another coffee shop, one seemingly able to accept credit cards. Coffee acquired with minimal effort. Just the way I like it.
After unlocking the door to my unassuming Upper West Side clothing store and robotically following the opening procedures, I checked my email, and the day from hell began.
Today, the harbinger of death (or in this case, inconvenience) was a G-chat.
“Hey team!” the chat read sunnily, “looks like NewStore is out at the moment. Please make your teams aware to switch to offline mode to record transactions. Should be back up shortly, but for the time being no returns or exchanges.”
NewStore, the platform our fleet of 48 stores across the country uses as a central POS system to ring up customers, find merchandise, and look up old sales to process returns or exchanges, was down company-wide. Another minor inconvenience to start the day. They said it would be back up shortly, right?
A customer entered with one of our brightly branded bags, the telltale sign of a return. As I apologized profusely, the confused customer said something interesting.
“Funny. Warby Parker couldn’t do anything either.”
A second, more irate customer came in, saying she “took the morning off work” to return something at our store. The bizarro world in which someone would take time off to return a product at a store with a one-year return policy notwithstanding, I realized today would be one of those days, and wondered why the technology gods had cursed me so.
The reality of the situation was much scarier than the simple possibility that a single sales/POS platform was down.
Amazon Web Services (AWS), the cloud service that deals with hundreds of clients and an uncountable number of data points and lines of code, was completely down.
Disney+, Venmo, DoorDash, Slack, the United States Social Security Administration and my store. All of these web-based platforms (and more) had the same problem: none of them were working.
While the outage mildly inconvenienced us as workers, the toll it took on consumers, accustomed to getting everything they want immediately, seemed to be much more significant.
I was scolded multiple times and yelled at once for my inability to process a return. Smiling and giving out discount cards for a later date could not dissipate the animosity directed toward me, a store supervisor in a retail store, as if I personally were the coder responsible for a nationwide outage.
AWS is the largest cloud-computing provider in the United States. The tracking website downdetector.com reported 24,000 outages along the East Coast Tuesday morning. That is 24,000 individual businesses, platforms, apps, etc. with no ability to function.
We got by using an “offline mode” that allowed us to manually enter the barcodes of items and save credit card details for a later date. But we were the lucky ones.
I say “we” not just as a clothing retailer looking at $100 million in revenue in 2021. I say that we, as a country, were lucky.
We were extremely lucky this was a simple bug in AWS, and not something like a cyberattack, to which experts warn the US is dangerously vulnerable.
On multiple occasions in the last few years, real and credible attacks on US infrastructure have occurred. In 2018, the Los Angeles Department of Water and Power was hacked in only six hours. That hack was a test, performed by hackers-for-hire to reveal vulnerabilities within the system. In reporting on the issue, Bloomberg called our country’s vulnerabilities “shocking.”
I’m no cybersecurity expert, but the fact that a small bug can take down most of the East Coast’s ability to function on the internet is probably indicative of a massive problem within our system.
The pandemic’s hastening of our overreliance on Amazon to do everything for us has clearly bled over into computing and storage, and when that goes wrong, the results can be catastrophic.
While continuing to smile and apologize profusely for the mild inconvenience of not being able to return an item on a certain day (“I came ALL THE WAY from the EAST SIDE,” said one beleaguered customer), I couldn’t help but think that this might be a best-case scenario.
Clouds bring rain, and when it rains, it pours.