Two or three years into a role, you may find that your personal rate of learning has trailed off. You know your team well, the industry particulars are no longer quite as intimidating, the mystery of getting things done at your company solved. This can be a sign to start looking for your next role, but it’s also a great opportunity to build experience with succession planning.| lethain.com
As organizations grow more complex, the folks running them interface with reality through increasingly incorporeal abstractions. On the smallest teams, leadership might be deep in the code on a daily basis. A bit larger, and you’re talking about tasks in sprints. Larger still, and you’re talking about collections of tasks, and adopting fancy terminology like ’epics.’ At a hundred plus engineers, you’re likely talking primarily in themes of work with focus on several key initiatives....| lethain.com
There are many important meetings in your first ninety days as a new engineering leader, but one that’s both easy to forget and surprisingly important is your first meeting with the finance team. There’s a lot to learn from the finance team, particularly drilling into your profit and loss statement, but there’s one narrow topic that causes a surprising amount of frustration between engineering and finance teams: how do you capitalize software engineering costs?| lethain.com
Once again we’re experimenting with a paper reading group for our engineering team, this time with more success than previously, albeit in an unintended direction.| lethain.com
An agent to use Notion docs as prompts to comment on Notion docs.| lethain.com
In Work on what matters, I wrote about Hunter Walk’s idea of snacking: doing work that is easy to complete but low impact. The best story of my own snacking behaviors comes from my time at Stripe. I was focused on revamping the engineering organization’s approach to operating reliable software, and decided that it might also make sense to start an internal book club. It was, dear reader, not the right time to start a book club. Once you start looking for this behavior, it is everywhere, i...| lethain.com
Last weekend, I wrote a bit about using Zapier to load Notion pages as prompts to comment on other Notion pages. That worked well enough, but not that well. This weekend I spent some time getting the next level of this working, creating an agent that runs as an AWS Lambda. This, among other things, allowed me to rely on agent tool usage to support both page and block-level comments, and altogether I think the idea works extremely well.| lethain.com
Uber’s best known corporate value is probably Super Pumped, which, in addition to being a one-time company value, is also the title of Mike Isaac’s account of Uber and the subsequent television show. However, for me personally, the value I remember most is Let Builders Build. Working in Uber’s infrastructure engineering organization, I once chatted with a product engineering manager who wanted to continue rolling out a new feature that was hammering the production database. I was concer...| lethain.com
Most growth companies are starved for experienced leadership. As they expand, continued growth builds up pressure on their existing leadership. This gets quite stressful! The rare executive manages to build an effective organization solely by investing in their existing team, but most supplement their organization with some external hires to maintain a balance of folks who’ve seen it before and folks who’re actively learning their role.| lethain.com
In management we often find ourselves balancing the freedoms of the few against the freedoms of the many. This is, as you might imagine, a tricky business.| lethain.com
Some years ago, I was explaining to my manager that I was feeling a bit bored, and they told me to learn how to read a Profit & Loss (P&L) statement. At the time, that sounded suspiciously like, “Stop wasting my time,” but operating in an executive role has shifted my perspective a bit: this is actually a surprisingly useful thing to learn. The P&L statement is a map of a company’s operation and is an effective tool for pointing you towards the most pressing areas to dig in.| lethain.com
One of my side quests at work is to get a simple feedback loop going where we can create knowledge bases that comment on Notion documents. I was curious if I could hook this together following these requirements: no custom code hosting; the prompt is editable within Notion rather than requiring an understanding of Zapier; and it should be fairly quick. Ultimately, I was able to get it working. So a quick summary of how it works, some comments on why I don’t particularly like this approach, then some mo...| lethain.com
For managers who have spent a long time reporting to a specific leader or working in an organization with well‑understood goals, it’s easy to develop skill gaps without realizing it. Usually this happens because those skills were not particularly important in the environment you grew up in. You may become extremely confident in your existing skills, enter a new organization that requires a different mix of competencies, and promptly fall on your face. There are a few common varieties of t...| Irrational Exuberance
I’m turning forty in a few weeks, and there’s a listicle archetype along the lines of “Things I’ve learned in the first half of my career as I turn forty and have now worked roughly twenty years in the technology industry.” How do you write that and make it good? Don’t ask me. I don’t know! As I considered what I would write to summarize my career learnings so far, I kept thinking about updating my post Advancing the industry from a few years ago, where I described using that co...| lethain.com
While Stripe is a widely admired company for things like its creation of the Sorbet typer project, I personally think that Stripe’s most interesting strategy work is also among its most subtle: its willingness to significantly prioritize API stability. This strategy is almost invisible externally. Internally, discussions around it were frequent and detailed, but mostly confined to dedicated API design conversations. API stability isn’t just a technical design quirk, it’s a foundational ...| lethain.com
I recently had the opportunity to present to a small group of early-stage founders about evolving their engineering organization as their company scaled. While preparing, I realized that the most relevant piece I’ve written about organization design was about running reorganizations.| lethain.com
There’s a lot of excitement about what AI (specifically the latest wave of LLM-anchored AI) can do, and how AI-first companies are different from the prior generations of companies. There are a lot of important and real opportunities at hand, but I find that many of these conversations occur at such an abstract altitude that they border on meaningless. Sort of like saying that your company could be much better if you merely adopted more software. That’s certainly true, but it’s not a pa...| lethain.com
If you end up working in an engineering team that wants to accelerate hiring, at some point you’ll hear the dreaded statement, “We need to grow our eng brand.” The method to accomplish that aim isn’t always clear, but the goal: what can we do so that candidates enter our process thinking highly of our engineering efforts?| lethain.com
Have you ever worked at a company where the same one or two people consistently got the most important projects? Me too. It’s not only frustrating, I think it’s a key indicator of poor organizational health. Over the past few years, I’ve started to use a very deliberate approach to selecting project leads, which has led to a much wider set of leaders, and has been a great learning opportunity to evolve.| lethain.com
Long term, I believe that your career will be largely defined by getting lucky and the rate at which you learn. I have no advice about luck, but to speed up learning I have two suggestions: work at a rapidly expanding company, and make your peers your first team.| lethain.com
For a long time, I found the micromanager CEO archetype very frustrating to work with. They would often pop out of nowhere, jab holes in the work I had done without understanding the tradeoffs, and then disappear when I wanted to explain my decisions. In those moments, I wished they would trust me based on my track record of doing good work. If they didn’t trust my track record, could they at least take the time to talk through the situation so I could explain my decisions?!| lethain.com
Over the past 19 months, I’ve written Crafting Engineering Strategy, a book on creating engineering strategy. I’ve also been working increasingly with large language models at work. Unsurprisingly, the intersection of those two ideas is a topic that I’ve been thinking about a lot. What, I’ve wondered, is the role of the author, particularly the long-form author, in a world where an increasingly large percentage of writing is intermediated by large language models?| lethain.com
Since 2020, I’ve been working on my desk setup, and I think I finally have it mostly pulled together at this point. I don’t really think my desk setup is very novel, and I’m sure there are better ways to pull it together, but I will say that it finally works the way I want since I added the CalDigit TS5 Plus, which has been a long time coming. My requirements for my desk are:| lethain.com
Today’s my last day at Carta, where I got the chance to serve as their CTO for the past two years. I’ve learned so much working there, and I wanted to end my chapter there by collecting my thoughts on what I learned. (I am heading somewhere, and will share news in a week or two after firming up the communication plan with my new team there.) The most important things I learned at Carta were:| lethain.com
Back in 2018, I wrote lethain/systems as a domain-specific language for writing runnable systems models, and introduced it with this blog post modeling a hiring funnel. While it’s far from a perfect system, I’ve gotten a lot of value out of it over the last seven years, because it allows me to maintain systems models in version control. As I’ve been playing with writing Model Context Protocol (MCP) servers, one I’ve been thinking about frequently is one to help writing systems syntax,...| Irrational Exuberance
In some pockets of the industry, an axiom of software development is that deploying software quickly is at odds with thoroughly testing that software. One reason that teams believe this is because a fully automated deployment process implies that there’s no opportunity for manual quality assurance. In other pockets of the industry, the axiom is quite different: you can get both fast deployment and manual quality assurance by using feature flags to decouple deployment (shipping the code) and...| lethain.com
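The decoupling described above can be illustrated with a minimal sketch. It assumes a hypothetical in-memory `FeatureFlags` class and a hypothetical `new_checkout` flag, neither of which comes from the post: the new code path is deployed but stays dark until the rollout percentage is raised.

```python
import hashlib


class FeatureFlags:
    """Hypothetical in-memory flag store; a real system would likely
    load rollout percentages from a config service or database."""

    def __init__(self):
        self._rollout = {}  # flag name -> rollout percentage (0..100)

    def set_rollout(self, flag, percent):
        if not 0 <= percent <= 100:
            raise ValueError("percent must be between 0 and 100")
        self._rollout[flag] = percent

    def is_enabled(self, flag, user_id):
        # Unknown flags default to 0%, so deployed code stays unreleased.
        percent = self._rollout.get(flag, 0)
        # Hash flag and user together so each user lands in a stable bucket.
        digest = hashlib.sha256(f"{flag}:{user_id}".encode("utf-8")).hexdigest()
        bucket = int(digest, 16) % 100
        return bucket < percent


def checkout(user_id, flags):
    # Both code paths ship in the same deployment; the flag picks the release.
    if flags.is_enabled("new_checkout", user_id):
        return "new checkout flow"
    return "old checkout flow"
```

Raising the percentage with `flags.set_rollout("new_checkout", 5)` would release the path to roughly 5% of users without another deployment, which leaves room for manual quality assurance before a wider release.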
Fostering an inclusive organization is intimidating work. Not only is it intimidating, for the longest time I struggled to find a framework to even think about it effectively. Over time, I’ve found a framework for thinking about inclusion efforts that is simple but that I’ve found very useful. An inclusive organization is one where folks have access to membership and opportunity. Membership is participating as a version of themselves they feel comfortable with. Opportunity is having access ...| lethain.com
A few years ago I wrote about reading a Profit & Loss statement, which is a foundational executive skill. I also subsequently wrote about ways to measure your engineering organization. Despite having written those, I still spend a lot of time wondering about effective ways to represent an engineering organization to your board of directors. Over the past few years, one of the most useful charts I’ve found for explaining an R&D organization is a scatterplot of R&D spend as a % of margin vers...| lethain.com
I’ve been reading Steven Sinofsky’s Hardcore Software, and particularly enjoyed this quote from a memo discussed in the Zero Defects chapter: You can improve the quality of your code, and if you do, the rewards for yourself and for Microsoft will be immense. The hardest part is to decide that you want to write perfect code. If I wrote that in an internal memo, I imagine the engineering team would mutiny, but software quality is certainly an interesting topic where I continue to refine my ...| lethain.com
Once you become an engineering executive, an invisible timer starts ticking in the background. Tick tick tick. At some point that timer will go off, at which point someone will rush up to you demanding an engineering strategy. It won’t be clear what they mean, but they will want it, really, really badly. If we just had an engineering strategy, their eyes will implore you, things would be okay. For a long time, those imploring eyes haunted me, because I simply didn’t know what to give them...| lethain.com
Many hypergrowth companies of the 2010s battled increasing complexity in their codebase by decomposing their monoliths. Stripe was somewhat of an exception, largely delaying decomposition until it had grown beyond three thousand engineers and had accumulated a decade of development in its core Ruby monolith. Even now, significant portions of their product are maintained in the monolithic repository, and it’s safe to say this was only possible because of Sorbet’s impact.| lethain.com
In How should Stripe deprecate APIs?, the diagnosis depends on the claim that deprecating APIs is a significant cause of customer churn. While there is internal data that can be used to correlate deprecation with churn, it’s also valuable to build a model to help us decide if we believe that correlation and causation are aligned in this case. In this chapter, we’ll cover: What we learn from modeling API deprecation’s impact on user retention Developing a system model using the lethain/s...| lethain.com
As part of my work on #eng-strategy-book, I’ve been editing a bunch of stuff. This morning I wanted to work on two editing problems. First, I wanted to ensure I was referencing strategies evenly across chapters (and not relying too heavily on any given strategy). Second, I wanted to make sure I was making references to other chapters in a consistent, standardized way. Both of these involve collecting Markdown links from files, grouping those links by either file or url, and then outputting the ...| lethain.com
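The link-collection step described above might be sketched as follows. This is not the author's script: the regular expression, the function names, and the in-memory `documents` mapping (file name to Markdown text) are all assumptions made for illustration.

```python
import re
from collections import defaultdict

# Matches inline Markdown links such as [label](/some-url).
# Assumption: no nested brackets or parentheses inside a link.
# Note that image links (![alt](src)) match this pattern too.
LINK_PATTERN = re.compile(r"\[([^\]]+)\]\(([^)\s]+)\)")


def extract_links(text):
    """Return a list of (label, url) pairs for each inline link in text."""
    return LINK_PATTERN.findall(text)


def group_links(documents, by="url"):
    """Group links found across documents.

    documents maps a file name to its Markdown text.
    by="url" maps each url to the files that reference it;
    by="file" maps each file to the urls it references.
    """
    if by not in ("url", "file"):
        raise ValueError("by must be 'url' or 'file'")
    grouped = defaultdict(list)
    for filename, text in documents.items():
        for _label, url in extract_links(text):
            if by == "url":
                grouped[url].append(filename)
            else:
                grouped[filename].append(url)
    return dict(grouped)
```

Grouping by url shows which strategies are referenced most often across chapters, while grouping by file lists each chapter's references so inconsistent link formats are easier to spot.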
An important tenet of organizational design is minimizing cross-team coordination to achieve your goals. Teams that can move with little cross-team coordination finish projects while others are still ratifying their implementation proposal.| lethain.com
The most unnatural stage of hiring someone is when you’ve extended an offer, know they have other offers, and you are trying to give them advice on which offer to take. The obvious answer is that they should take your offer, but you end up in some interesting discussions toeing the line between objectivity and outcome. Typically the offer numbers are already out in the open, so the topics tend to get more abstract. I recently had one of these discussions that hinged on an unexpected question, ...| lethain.com
Along with slow technical migrations, I believe reorganizations are the second largest activity which cause quickly growing companies to slow down. Here is a framework for running an engineering reorg effectively.| lethain.com
Building on the framework in “Designations, levels and calibrations”, I wanted to discuss a number of special topics related to designing and running performance systems. These topics are particularly interesting to me because they tend to be the emergent, accidental properties that emerge from common performance management systems and behaviors. Because they’re accidental, they surprise many managers early in their careers, and surprise is the cardinal sin of performance management.| lethain.com
Yesterday, the tj-actions repository, a popular tool used with GitHub Actions, was compromised (for more background read one of these two articles). Watching the infrastructure and security engineering teams at Carta respond highlighted to me just how much LLMs can’t meaningfully replace many essential roles of software professionals. However, I’m also reading Jennifer Pahlka’s Recoding America, which makes an important point: decision-makers can remain irrational longer than you can...| lethain.com
A look at the evolution of data infrastructure over the past four or five years, from the lambda architecture to the kappa architecture and beam paradigm.| lethain.com
Discussions around acquisitions often focus on technical diligence and deciding whether to make the acquisition. However, the integration that follows afterwards can be even more complex. There are few irreversible trapdoor decisions in engineering, but decisions made early in an integration tend to be surprisingly durable. This engineering strategy explores Stripe’s approach to integrating their 2018 acquisition of Index. While a business book would focus on the rationale for the acquisiti...| lethain.com
Once you’ve written your strategy’s exploration, the next step is working on its diagnosis. Diagnosis is understanding the constraints and challenges your strategy needs to address. In particular, it’s about slowing yourself down from jumping to solutions before fully understanding the nuances and constraints of the problem. If you ever find yourself wanting to skip the diagnosis phase–let’s get to the solution already!–then maybe it’s worth acknowledging that every strategy tha...| lethain.com
A surprising number of strategies are doomed from inception because their authors get attached to one particular approach without considering alternatives that would work better for their current circumstances. This happens when engineers want to pick tools solely because they are trending, and when executives insist on adopting the tech stack from their prior organization where they felt comfortable. Exploration is the antidote to early anchoring, forcing you to consider the problem widely b...| lethain.com
At some point in a startup’s lifecycle, they decide that they need to be ready to go public in 18 months, and a flurry of IPO-readiness activity kicks off. This strategy focuses on a company working on IPO readiness, which has identified a gap in internal controls for managing user data access. It’s a company that wants to meaningfully improve their security posture around user data access, but which has had a number of failed security initiatives over the years.| lethain.com
Most companies believe they are constrained by funding, product market fit or hiring. Books have been written about each of those, and this will be a foray into hiring. In particular, it’ll be a look at how to use the fundamental hiring diagnostic tool: the hiring funnel.| lethain.com
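As a rough illustration of reading a hiring funnel, the sketch below computes stage-to-stage conversion rates. The stage names and counts used with it are hypothetical, not data from the post.

```python
def funnel_conversion(stages):
    """Compute conversion rates between consecutive funnel stages.

    stages is a list of (name, count) pairs ordered from the top of the
    funnel to the bottom. Returns a list of (from_stage, to_stage, rate).
    """
    rates = []
    for (prev_name, prev_count), (name, count) in zip(stages, stages[1:]):
        # Guard against an empty stage to avoid dividing by zero.
        rate = count / prev_count if prev_count else 0.0
        rates.append((prev_name, name, rate))
    return rates


def weakest_stage(stages):
    """Return the transition with the lowest conversion rate."""
    return min(funnel_conversion(stages), key=lambda item: item[2])
```

With hypothetical counts of 200 applicants, 50 phone screens, 20 onsites, 5 offers, and 4 accepts, the offer-to-accept step converts at 0.8 while applied-to-screen and onsite-to-offer each convert at 0.25, which suggests where to look first.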
Most engineering organizations separate engineering and product leadership into distinct roles. This is usually ideal, not only because these roles benefit from distinct skills, but also because they thrive from different perspectives and priorities. It’s quite hard to do both well at the same time. This post takes a look at my high-level approach to product management for when you do happen to find yourself wearing both hats.| lethain.com
Entering 2025, I decided to spend some time exploring the topic of agents. I started reading Anthropic’s Building effective agents, followed by Chip Huyen’s AI Engineering. I kicked off a major workstream at work on using agents, and I also decided to do a personal experiment of sorts. This is a general commentary on building that project. What I wanted to build was a simple chat interface where I could write prompts, select models, and have the model use tools as appropriate. My side goa...| lethain.com
In my career, the majority of the strategy work I’ve done has been in non-executive roles, things like Uber’s service migration. Joining Calm was my first executive role, where I was able to not only propose but also mandate strategy. Like almost all startups, the engineering team was scattered when I joined. Was our most important work creating more scalable infrastructure? Was our greatest risk the failure to adopt leading programming languages? How did we rescue the stuck service decom...| lethain.com
In early 2014, I joined as an engineering manager for Uber’s Infrastructure team. We were responsible for a wide number of things, including provisioning new services. While the overall team I led grew significantly over time, the subset working on service provisioning never grew beyond four engineers. Those four engineers successfully migrated 1,000+ services onto a new, future-proofed service platform. More importantly, they did it while absorbing the majority, although certainly not the ...| lethain.com
At the core of Uber’s service migration strategy (2014) is understanding the service onboarding process, and identifying the levers to speed up that process. Here we’ll develop a system model representing that onboarding process, and exercise the model to test a number of hypotheses about how to best speed up provisioning. In this chapter, we’ll cover: Where the model of service onboarding suggested we focus on efforts Developing a system model using the lethain/systems package on Githu...| lethain.com
In Jim Collins’ Great by Choice, he develops the concept of Fire Bullets, Then Cannonballs. His premise is that you should cheaply test new ideas before fully committing to them. Your organization can only afford firing a small number of cannonballs, but it can bankroll far more bullets. Why not use bullets to derisk your cannonballs’ trajectories? This chapter presents a series of concrete techniques that I have personally used to effectively refine strategies before reaching the cannonb...| lethain.com
In How should you adopt LLMs?, we explore how a theoretical ride sharing company, Theoretical Ride Sharing, should adopt Large Language Models (LLMs). Part of that strategy’s diagnosis depends on understanding the expected evolution of the LLM ecosystem, which we’ve built a Wardley map to better explore. This map of the LLM space is interested in how product companies should address the proliferation of model providers such as Anthropic, Google and OpenAI, as well as the proliferation of ...| lethain.com
The first time I heard about Wardley Mapping was from Charity Majors discussing it on Twitter. Of the three core strategy refinement techniques, this is the technique that I’ve personally used the least. Despite that, I decided to include it in this book because it highlights how many different techniques can be used for refining strategy, and also because it’s particularly effective at looking at the broadest ecosystems your organization exists in.| lethain.com
Gitlab is an integrated developer productivity, infrastructure operations, and security platform. This Wardley map explores the evolution of Gitlab’s users’ needs, as one component in understanding the company’s strategy. In particular, we look at how Gitlab’s strategy of a bundled, all-in-one platform anchors on the belief that build and security tooling is moving from customization to commodity. Reading this document To quickly understand the analysis within this Wardley Map, read f...| lethain.com
The aim of a development group is to build business value. Building technical leverage is the focus on increasing the business value a development group delivers over time.| lethain.com
This is a work-in-progress draft! Even the very best policies fail if they aren’t adopted by the teams they’re intended to serve. In my experience, it’s common for a thoughtful strategy to be ruined by a terrible rollout strategy. Can we persistently change our company’s behaviors with a one-time announcement? No, probably not. The good news is that effectively operating a policy doesn’t have to be magic. There are common patterns that take time and attention, but I’ve seen them w...| lethain.com
Shortly after a senior leader joins a new company, sometimes you’ll notice them quickly steer the organization towards a total architectural rewrite. Perhaps this is a switch from batch to streaming computation, perhaps a switch from a monolith to a services architecture, perhaps it’s a rewrite into a new programming language. If you take a few minutes to reflect, I bet you can identify several times where you’ve had this experience. Regardless of the proposed technical change, it’s a...| lethain.com
A while ago I wrote about modeling a hiring funnel as an example of creating a system model, but that post doesn’t explore how the process of evolving a system model can be helpful. This post does.| lethain.com
In 2020, you could credibly argue that ZIRP explains the world, but that’s an impossible argument to make in 2024 when zero-interest rate policy is only a fond memory. Instead, we’re seeing a number of companies designed for rapid expansion learning to adapt to a world that expects immediate free cash flow rather than accepting the sweet promise of discounted future cash flow. This chapter wants to tackle that problem head-on, taking the role of an engineering organization attempting to n...| lethain.com
While I was probably late to learn the concept of strategy testing, I might have learned about systems modeling too early in my career, stumbling on Donella Meadows’ Thinking in Systems: A Primer before I began my career in software. Over the years, I’ve discovered a number of ways to misuse systems modeling, but it remains the most effective, flexible tool I’ve found for debugging complex problems. In this chapter, we’ll work through:| lethain.com
One of the trademarks of private equity ownership is the expectation that either the company maintains their current margin and grows revenue at 25-30%, or they instead grow slower and increase their free cash flow year over year. In many organizations, engineering costs have a major impact on their free cash flow. There are many costs to reduce, cloud hosting and such, but inevitably part of the discussion is addressing engineering headcount costs directly.| lethain.com
The How should you adopt LLMs? strategy explores how Theoretical Ride Sharing might adopt LLMs. It builds on several models: the first is about LLMs’ impact on Developer Experience. The second model, documented here, looks at whether LLMs might improve a core product and business problem: maximizing active drivers on their ridesharing platform. In this chapter, we’ll cover: Where the model of ridesharing drivers identifies opportunities for LLMs How the model was sketched and developed using...| lethain.com
In How should you adopt Large Language Models? (LLMs), we considered how LLMs might impact a company’s developer experience. To support that exploration, I’ve developed a system model of developing software at the company. In this chapter, we’ll work through: Summary results from this model How the model was developed, both sketching and building the model in a spreadsheet. (As discussed in the overview of systems modeling, I generally would recommend against using spreadsheets to d...| lethain.com
If I could only popularize one idea about technical strategy, it would be that prematurely applying pressure to a strategy’s rollout prevents evaluating whether the strategy is effective. Pressure changes behavior in profound ways, and many of those changes are intended to make you believe your strategy is working while minimizing change to the status quo (if you’re an executive) or get your strategy repealed (if you’re not an executive). Neither is particularly helpful.| lethain.com
From their first introduction in 2005, the debate between adopting a microservices architecture, a monolithic service architecture, or a hybrid between the two, has become one of the least-reversible decisions that most engineering organizations make. Even migrating to a different database technology is generally a less expensive change than moving from monolith to microservices or from microservices to monolith. The industry has in many ways gone full circle on that debate, from most hypersc...| lethain.com
This is a work-in-progress draft! Often you’ll see a disorganized collection of ideas labeled as a “strategy.” Even when they’re dense with ideas, these can be hard to parse, and are a major reason why most engineers will claim their company doesn’t have a clear strategy even though all companies follow some strategy, even if it’s undocumented. This chapter lays out a repeatable, structured approach to creating strategy. In it, we’ll cover:| lethain.com
Even if you believe that strategy is generally useful, it is difficult to decide that today’s the day to start writing engineering strategy. When you do start writing strategy, it’s easy to write so much strategy that your organization is overwhelmed and ignores your strategy rather than investing time into understanding it. Fortunately, these are universal problems, and there are a handful of useful mental models to avoid both extremes. This chapter covers:| lethain.com
Whenever I transition to a new opportunity, I think about how to “start well.” How can I ramp up as effectively as possible? How do I balance the urge to “show value” immediately with making the right decisions?| lethain.com
For a long time, the path to engineering manager began with a prolonged stint of technical leadership. Then you’d transition into an initial management role that balanced people and technical responsibilities. Some companies call this a tech lead manager role. Folks entering those sorts of managerial roles were often the senior-most technical contributor on their team. If they struggled with the transition, many of them would fall back into the familiar habit of technical leadership instead...| lethain.com
As discussed in Components of engineering strategy, a complete engineering strategy has five components: explore, diagnose, refine (map & model), policy, and operation. However, it’s actually quite challenging to read a strategy document written that way. That’s an effective sequence for creating a strategy, but it’s a challenging sequence for those trying to quickly read and apply a strategy without necessarily wanting to understand the complete thinking behind each decision. This post...| lethain.com
Some governmental agencies have started to adopt No Wrong Door policies, which aim to provide help–often health or mental health services–to individuals even if they show up to the wrong agency to request help. The core insight is that the employees at those agencies are far better equipped to navigate their own bureaucracies than an individual who knows nothing about the bureaucracy’s internal function. For the most part, technology organizations are not complex bureaucracies, but some...| lethain.com
Whether you’re a product engineer, a product manager, or an engineering executive, you’ve probably been pushed to consider using Large Language Models (LLMs) to extend your product or enhance your processes. 2023-2024 is an interesting era for LLM adoption, where these capabilities have transitioned into the mainstream, with many companies worrying that they’re falling behind despite the fact that most integrations appear superficial. That context makes LLM adoption a great topic for a ...| lethain.com
One of the common conceits in leadership is that nobody is truly essential for a company’s continuity. I call it a conceit, but I do mostly agree with it: I’ve felt literally sick after hearing about some peer’s unexpected departure, but I’m continually amazed at how resilient companies are to departures, even of important people. About two-thirds of Digg’s team left in layoffs in 2010, but we found ways to amble on. Much of Uber’s leadership team turned over in the 2017 era, and ...| lethain.com
Pretty much every company I know is looking for a way to benefit from Large Language Models. Even if their executives don’t see much applicability, their investors likely do, so they’re staring at the blank page nervously trying to come up with an idea. It’s straightforward to make an argument for LLMs improving internal efficiency somehow, but it’s much harder to describe a believable way that LLMs will make your product more useful to your customers.| lethain.com
One of the trickiest, and most common, leadership scenarios is leading without authority, and I’ve written about one of the styles that I’ve found surprisingly effective in those conditions. I call it Model, Document, and Share.| lethain.com
When you’re driving a car down a road, you might get a bit stuffy and decide to roll your windows down. The air will flow in, the wind will get louder, and the sensation of moving will intensify. Your engine will start working a bit harder–and louder–to maintain the same speed. Every sensation will tell you that you’re moving faster, but lowering the window has increased your car’s air resistance, and you’re actually going slower. Or at minimum you’re using more fuel to maintain...| lethain.com
Planning the work for an infrastructure engineering organization can be a challenge, in part due to a lack of clarity around what value such an organization contributes to the company it operates within. I have thoughts, and a simple thinking aid, for that.| lethain.com
Folks are sometimes surprised to learn that I started out working as a frontend engineer. I’d like to imagine it’s because I’m so terribly knowledgeable about infrastructure, but I suspect it’s mostly grounded in my unconscionably poor design aesthetic. Something that has stuck with me from that experience was feeling treated as a second-tier engineer: folks were unwilling to do any frontend work, but were careful to categorize it as trivial.| lethain.com
I’ve come to believe that most organizational design questions can be answered by recursively applying a framework for sizing teams. Over the past year I’ve refined my approach to team sizing into a bit of a framework, and even changed my mind on several aspects, especially the viability of small teams. This post describes how I now size teams.| lethain.com
Technical infrastructure is never complete. System processes can always run with less overhead or be bin-packed onto fewer machines. Data can be retrieved more quickly and stored at a cheaper cost per terabyte. System design can broaden the gap between failure and user impact. Transport layers can be more secure.| lethain.com
Recently a bunch of teams I work with have turned the corner, having paid down technical debt to a long-term sustainable level. The future unfurls with possibility. Many of the infrastructure engineering teams I’ve been a part of have struggled to make the transition from maintenance to innovation, and I wanted to write down some of the ideas that we’re exploring to ease this shift.| lethain.com
I’m speaking at Velocity on June 12th on ‘How Stripe invests in technical infrastructure’, and this is the rough outline of the content the talk will cover. I hope to see y’all there.| lethain.com
The tech lead manager role is often presented as an easy on-ramp to Team Manager, but my experience is that being a tech lead manager is considerably harder to do well than Team Management, to the extent that I believe the tech lead manager role is a trap for new managers.| lethain.com
A peculiar challenge of management is trying to invest in someone’s career development when they themselves are uncertain about their goals. As a manager, you may have more experience and more access to opportunities within the company, but that represents a small slice of their career possibilities. Our schooling often rewards us for being methodical, linear thinkers, but that approach is less effective outside the intentionally constrained possibility spaces.| lethain.com
Perceiving the layers of context in problems will unlock another stage of career progression as a Staff-plus engineer, but there’s at least one essential skill to develop afterwards: navigating ambiguity. In my experience, navigating deeply ambiguous problems is the rarest skill in engineers, and doing it well is a rarity. It’s sufficiently rare that many executives can’t do it well either, although I do believe that all long-term successful executives find at least one toolkit for thes...| lethain.com
Recently I was chatting with a Staff-plus engineer who was struggling to influence his peers. Each time he suggested an approach, his team agreed with him, but his peers in the organization disagreed and pushed back. He wanted advice on why his peers kept undermining his approach. After our chat, I followed up by talking with his peers about some recent disagreements, and they kept highlighting missing context from the engineer’s proposals. As I spoke with more peers, the engineer’s probl...| lethain.com
Big Ball of Mud was published twenty years ago, and rings just as true today: the most prominent architecture in successful, growth-stage companies is non-architecture. Crisp patterns are slowly overgrown by the chaotic tendrils of quick fixes, and productivity creeps towards zero.| lethain.com
As an organization grows beyond fifty people or so, you’ll feel a building pressure to add a third layer of management, and eventually you will. This ought to be a benign event: what’s the difference between supporting some managers and supporting their managers? It shouldn’t be too different, but for me it was when my previous mechanisms of alignment stopped working very well. Two of the most effective tools I’ve found are strategy and vision documents, and this post introduces how t...| lethain.com
There are few things more exciting than being at a company during hypergrowth, but it’s easy to let hypergrowth get away from you, and to end up reacting instead of planning. It’s hard to steer when you’re rebuilding a plane mid-flight, but you can always nudge it in the right direction.| lethain.com
About a year ago I started sending public weekly updates to a relevant public (within the company) mailing list. I’ve found the practice useful enough to write a few words on the how and why. This practice is sometimes called a 5-15 report, reflecting the goal of spending fifteen minutes a week writing a report that can be read in five minutes.| lethain.com
This was an eventful year. My son went to preschool, I joined Carta, left Calm, and wrote my third book. It was also a logistically intensive year, with our toddler heading to preschool, more work travel, and a bunch of other little bits and pieces. Here is my year in review summary. I love to read other folks’ year-in-review writeups – if you write one, please send it my way!| lethain.com
Occasionally folks tell me that I should “write full time.” I’ve thought about this a lot, and have rejected that option because I believe that writers who operate (e.g. write concurrently with holding a non-writing industry role) are best positioned to keep writing valuable work that advances the industry. This is a lightly controversial view, so I wanted to pull together my full set of thoughts on the topic. The themes I want to work through are:| lethain.com
Early in my career, I navigated most decisions by simple hill climbing: if it was a more prestigious opportunity and paid more, I took it. As I got further, and my personal obligations grew, I started to think about navigating a 40-year career, where a given job might value pace rather than prestige. Over the last few years, what I’ve come to appreciate is that there’s another phase: purpose. Purpose isn’t intrinsically the third phase of a career, but it certainly has been for me, as I...| lethain.com
We often try to force all use cases onto a single internal platform, and I think doing this causes our platforms to age poorly and with an excess of accidental complexity. This post suggests some alternatives.| lethain.com
In Staff Engineer’s chapter on Managing Technical Quality, one of the very last suggestions is creating a centralized process to curate technical changes: Curate technology change using architecture reviews, investment strategies, and a structured process for adopting new tools. Most misalignment comes from missing context, and these are the organizational leverage points to inject context into decision-making. Many organizations start here, but it’s the last box of tools that I recommend...| lethain.com
The Silicon Valley narrative centers on entrepreneurial protagonists who are poised one predestined step away from changing the world. A decade ago they were heroes, and more recently they’ve become villains, but either way they are absolutely the protagonists. Working within the industry, I’ve worked with quite a few non-protagonists who experience their time in technology differently: a period of obligatory toil required to pry open the gate to the American Dream.| lethain.com
In small organizations, it’s easy for folks to be aware of what others are doing and to recollect how you’ve previously approached similar problems. This hive mind and memory creates a consistency to decision making that correlates strongly with quality. The subtle slide into inconsistency is often one of the most challenging aspects of evolving from a small team into a much larger one.| lethain.com
There is a moment in every company’s growth when top-level planning shifts from discussing specific projects to talking about goals. This can be a very empowering moment because goals decouple the “what” from the “how”, but it can also be a confusing transition for everyone involved: writing clear goals takes a bit of practice. This post takes a look at how to write effective goals and how to use them during planning.| lethain.com
I wrote a book, An Elegant Puzzle, which will be available in late May, 2019. This is something I’ve been working on over the past year, and which I’m extraordinarily excited to share!| lethain.com
Tidy First? by Kent Beck captures the spirit of Ousterhout’s A Philosophy of Software Design while also recognizing the inherent tensions of developing software within a team and business. You can also read it in about two hours. Recommended! A Philosophy of Software Design by John Ousterhout is one of my favorite books on software design. When I heard that Kent Beck had a new book out, Tidy First?, that was deliberately engaging with similar content but a markedly different pedagogy, I kne...| lethain.com
The Value Flywheel Effect is a worthwhile read. It’s imperfect, but a fascinating look into real-world application of Wardley mapping, and a rare view of a company’s engineering strategy. I’m currently diving into the topic of engineering strategy, and a sub-topic that I’ve not previously spent much time on is Wardley maps. As I dug into it a bit more, The Value Flywheel Effect by Anderson, McCann, and O’Reilly was recommended as a primer, so I bought a copy and spent some time work...| lethain.com