The Day I *Almost* Rage-Quit Amazon
“No, no, no! This is not at all what I wanted!!!” Frowning, X put my whitepaper face down on the table, refusing to read it past the first page.
I had worked on it for months, done hours and hours of research and data gathering, and solicited feedback from dozens of engineers around me. I thought I had a solid writeup and started taking it up the food chain. That did not go well. I walked to a 7-Eleven and bought myself a pack of cigarettes. I don’t even smoke, but God, I needed one that afternoon to calm my nerves as I waited for the bus to take me home. I sent my resume around, had a triple shot of Lagavulin and crawled into bed. It was 4 pm.
Amazon has a bad reputation in the industry as a somewhat harsh work environment. There are many horror stories of senior leaders being extremely blunt or lacking empathy. Is this true? Somewhat. I did thrive in that environment for over a decade (minus that day!!!). A lot of ex-Amazonians miss it after they leave. This directness helped me grow in my career, and it was refreshing (but there’s often a very blurry line between directness and being a jerk). Am I crazy or weird? Maybe.
I left Amazon in 2020 and have been at Google for 15 months. I’m a direct communicator and I prefer it when others around me are too, but my time at Google has opened my eyes to the fact that not everybody communicates the same way. Google fosters a much more inclusive and respectful culture, which welcomes valuable perspectives that may be overshadowed in the Amazon culture. I wanted to take a minute to highlight that there is a downside to the Amazon directness: it does create an environment where one personality type thrives and other personality types do not.
You could say that a lot of Amazon senior leaders lack empathy, but they also do have extremely high judgement. I compartmentalized. I was able to separate the actual substance from the style used to deliver a message. This skill has been invaluable in many situations.
This story tries to give you a little window into a part of the Amazon culture that is often misunderstood (the directness). Another takeaway: don’t let one bad day dictate how you feel about a job; keep your eye on the long run. Sometimes, a critic can turn out to be an advocate. And lastly, sometimes, crashing and burning can put you on an unexpected new journey that is better than the original one.
It’s not that X had read the doc and then hated it; he literally refused to read it at all. He scanned the first page, circled a bunch of things in red ink, and then just gave up. He put the document down, crossed his arms and impatiently checked his email on his phone while others around him awkwardly continued reading it. He then proceeded to berate me in front of a dozen Directors and Principal Engineers about how much he hated it.
I was somewhat vindicated later: among other things, my 6-pager predicted the 2018 Prime Day outage that caused millions of dollars in losses, in eerily exact detail, 18 months before it happened. Acting on it could have prevented the outage.
My first exposure to the “Blunt Amazon Leader”
Back in 2012, a different person, L, pushed me harder than I’ve ever been pushed, in an almost painfully direct manner, and did not spend much energy sugar-coating his thoughts. But to this day, I continue to think of him as a role model, and those days as some of my best work.
Let me illustrate the sort of dialogue we regularly had. At the time, I had just launched what would later become the load and performance testing solution for the entire company. I launched it in July, and by October I had maybe 30 customers. I was stoked. I showed up to my 1:1 with L with a graph of customer growth, beaming with pride. He frowned, unimpressed. “This is linear growth. Come back when it’s exponential!” he barked.
I know what you’re thinking: that was rude! But hear me out. L was just very direct and when you understood that, you understood he was genuinely trying to help you. It was never ever a personal attack. There were no petty politics. What you saw was what you got.
And he was also right. At Amazon, linear growth is always unimpressive. We expected to grow all our businesses at non-linear rates. When you’re pursuing exponential growth, you have to think about your product differently than when you’re pursuing linear growth. I knew he was right. I started applying different techniques to growing my customer base, and a couple of years later I reached 10,000 Amazon services using my product for load testing. When I had that 1:1 with L and I showed him 30, I was thinking too small: I never even imagined I could reach 10,000. But he did. He saw potential in my work that I hadn’t yet seen myself.
I came to really appreciate the directness. I knew exactly where I stood with him at all times. No need to read between the lines. It was liberating and refreshing. L was four levels above me yet I felt perfectly comfortable challenging his ideas and having a heated yet data-driven debate — there was no ego. For all his gruffness, L was a strong mentor and a huge influence in my path to Principal Engineer at Amazon. I continue to seek mentors and managers who have the same disregard for politeness and just give it to me straight: data and facts.
The ill-fated meeting with X
Back in 2016, I was obsessed with a problem I saw getting worse and worse at Amazon. We did not properly fund centralized developer tooling, which led to dozens of bespoke tools popping up left and right to fill random needs. This problem is not unique to Amazon; I’ve also seen it at Microsoft and Google (I wrote this to give a little glimpse into it). You might think letting a thousand flowers bloom is a positive thing, but the problem with proliferation of custom tools is that they’re often unfunded and unsupported. We needed more deliberate headcount investment to secure all these business-critical things that were essentially ticking time bombs. What if that one person maintaining it left the company? I spent months gathering up a list of ticking time bombs, analyzing their likelihood of blowing up, potential timing for said explosions, and ideas for preventing them. Rather than a dozen different subpar ways to do something, I wanted to create well-lit paths for our developers. I showed it to my manager, and he agreed. I started sharing more broadly, and everybody around me agreed. So I continued escalating.
This is why I was so shocked when I got to that ill-fated meeting. I was expecting it to be a slam dunk, and instead, my doc ended up crashing and burning spectacularly.
When I woke up the next morning, and after a cup of coffee improved my foul mood, I started dissecting the conversation from the day before.
Nothing justified X’s behavior. It was legitimate for me to feel upset about how I was treated: it was inappropriate (he did apologize a few days later in a private 1:1). And when a senior leader behaves like that in front of many, they normalize that behavior and others emulate it, leading to a toxic culture.
Setting aside the behavior, was there validity to his actual feedback?
My paper was too narrow, too tactical and too focused on the short term: the ticking time bombs waiting to go off. X wanted me to think bigger and more strategically, and to focus on the long term. What were the bigger problems lurking around the corner that would surface in two, three, five years? My whitepaper proposed band-aids for wounds we had today. I needed to be thinking about a treatment plan for keeping us from even getting wounds down the road, and the elephant in the room was moving Amazon to AWS.
Moving Amazon to AWS
Most of Amazon runs on AWS. But there’s some nuance to that. This didn’t happen overnight, and it didn’t happen without a significant amount of effort and risk. And it happened in stages.
When I joined Amazon in 2009, the bulk of amazon.com ran on bare metal, in data centers. It was clear that the retail site needed to run on EC2, since its resource consumption varied wildly from day to night, and from season to season. The website also needed to scale up to handle unexpected traffic. And we didn’t want to waste millions on over-provisioning.
So we started a Move to AWS, or MAWS, in 2010. Laura Grit, a fantastic Distinguished Engineer at Amazon, has a great talk about this: Drinking Our Own Champagne. Moving tens of thousands of pre-existing legacy services to the Cloud was going to be extremely expensive. Running a large-scale migration in a software company is a painful and thankless job. If each service owner needed to spend time changing configuration and maybe code, it would easily be in the millions of dollars of productivity, and it would be years of herding cats.
Additionally, Amazon had a wealth of finely tuned internal developer tools to build, deploy and test code, monitor health, etc, that worked extremely well together. You couldn’t use these on AWS, because they were all built with all kinds of assumptions about a non-cloud world. So by moving to AWS, you were leaving the comfort of the ecosystem, and everything was just a little bit harder.
Laura, who was running the MAWS program, understood they couldn’t boil the ocean and needed to be pragmatic about getting something done quickly rather than running an open-ended migration. MAWS resorted to simply putting an abstraction layer on top of EC2 to make it look a little bit more like the old bare metal so that it could work with all our old internal tools, and it didn’t require any service changes. For most service owners, things just “magically” moved to EC2 one day. One day the deployment system was deploying to bare metal; the next day, it was deploying to EC2 hosts that sort of looked like bare metal. Werner Vogels, Amazon CTO, writes about this in The Story of Apollo — Amazon’s Deployment Engine.
There were tradeoffs made. Because it was an abstraction layer, MAWS did not allow us to truly interact with native EC2 the way our external customers did. And worse, the easiest way to give EC2 hosts access to our corporate and production networks was to create them in one giant VPC (Virtual Private Cloud). One giant VPC for 50k engineers and hundreds of thousands of EC2 hosts is a HUGE blast radius. Eric Brandwine, another fantastic Amazon Distinguished Engineer, talks about that in A Day in the Life of a Billion Packets.
Fast forward five or six years. Every day it was a little clearer that we needed to move from one giant VPC to each team having its own VPC, to reduce the blast radius and allow teams to operate faster and more independently. We also wanted to take advantage of all the cool innovations that had happened in AWS, like Fargate and Lambda. But we were hostage to the limitations of MAWS. The internal developer tools were so easy to use, and the external developer tools so hard to use, that engineers simply weren’t motivated to deal with that pain. That was slowing down Amazon’s move from MAWS to native AWS.
Investing in MAWS was tactical; investing in native AWS was strategic. X saw this. And I started seeing it too. In 2016, he wanted me to be a thought leader about the Amazon developer experience in 2021, not the one in 2016. I needed to get that going so that when the time came, we had anticipated this future, and we were ready to meet it.
I went back to the drawing board and feverishly wrote a 6-pager (in a format called PRFAQ) to articulate that future, particularly in the testing aspect of the software development life cycle. A Press Release & Frequently Asked Questions (PRFAQ) is a format that Amazon uses everywhere to socialize any product you want to build and secure funding to actually build it (some good places to learn more are here, here or here). I partnered with the Head of Software Test Automation at Amazon (AndyK). We went through hundreds of iterations of the doc, sharing more and more broadly with Directors, VPs, Principal and Distinguished Engineers. And yes, X too. And this time, he even read the doc — with a smile.
And we made it to The Chop
X turned out to be a surprising advocate for my vision. He sponsored the document all the way up to The Chop, which was much further than I had envisioned. Stealing from here, Andy Jassy, Amazon’s CEO, has a conference room nicknamed The Chop, “where ideas, and sometimes employees, go to get chopped down to size.”
Showing up with your PRFAQ at The Chop is one of the pinnacles of working at Amazon. It’s kind of like the TV show Shark Tank. You have a vision that is being dissected and scrutinized by Jassy’s Senior Leadership team to decide if they want to fund it or not, and that is an extremely tough audience. They are also extremely, extremely smart. And their time is valuable, so that meeting alone probably costs tens of thousands of dollars. The Chop was probably one of the scariest things AndyK and I ever did, but it was exhilarating too. We probably put at least 100 hours of work into the whitepaper before it got to The Chop.
The story has a happy ending. I didn’t rage-quit Amazon (I did move to Google four years later, but not in anger!). We did get our vision approved. We did get funding to build this thing. A team of top-notch engineers executed and improved the vision after me. Several promotions came out of this work, including a Senior and a Principal Engineer who are dear friends. Today it is used internally by thousands.