The Five Layers of Durable AI Leverage
In the early 1400s, the city of Florence had a problem. The stunning Santa Maria del Fiore cathedral was complete except for one thing: a 143-foot-wide hole, wider than the dome of the Pantheon, sitting 180 feet up on the existing walls.
It sat roofless for decades because nobody could figure out how to span it. Traditional wooden centering would have required more timber than existed in Tuscany.
In 1418, Filippo Brunelleschi, a goldsmith by training, proposed building two nested domes with no centering at all and was hired to do it. Nothing like it had ever been built. He spent 16 years finishing what is now known as “Il Duomo.”
Brunelleschi’s dome, Florence Cathedral. Photo: Thomas Roessler / CC-BY-SA 3.0
His biggest innovations weren’t in his original proposal. They were invented on site in response to problems discovered during construction.
He developed a herringbone brickwork pattern that let each course transfer its weight laterally to the nearest rib, solving the problem of laying masonry on a curve without any support underneath. He designed an ox-powered hoist with the first known reverse gear so the animals never had to turn around. It lifted 37,000 tons of material over the course of the project. He embedded stone-and-iron chains inside the dome like barrel hoops to resist the outward thrust that would otherwise push the walls apart.
Brunelleschi’s use of a herringbone brick pattern in the outer dome.
In the way we think about construction roles today, he was simultaneously architect, engineer, and construction supervisor. He was part of the master builder tradition that had existed for millennia: Imhotep, the medieval cathedral masons, Vitruvius. The split between designer, engineer, and builder didn’t yet exist as a distinct concept, much less as a set of contracts or roles.
The split wasn’t preordained. It was the result of specific technological and economic conditions. The Industrial Revolution changed both what we were building and how big it was. Cast iron, then steel, then reinforced concrete were materials whose structural behavior couldn’t be worked out by rule of thumb. Railway networks, factories, skyscrapers were projects whose scale couldn’t be absorbed by a single master-builder. You needed specialist calculation, and the depth of that specialization grew past anything a generalist could feasibly handle.
At the same time, the coordination technology needed to transact across those new specialist boundaries arrived: blueprints (Herschel’s cyanotype, 1842) let a builder work from the same drawing the architect drew, and standardized contracts (the AIA Uniform Contract, 1888) gave the handoffs legal structure without case-by-case negotiation. Once specialist quality mattered more than integration, and transacting between specialists got cheap, the split became the dominant arrangement.
Ronald Coase argued that a firm exists because coordinating work inside it is sometimes cheaper than transacting for every task on the market. You split one job into two when one person can’t do both well enough and the quality gain is worth the cost of coordinating between two specialists. The Industrial Revolution tipped that calculation decisively toward fragmentation.
We now have a new technological paradigm: agentic AI. AI is no longer merely a chatbot but something that has both the capacity to act in the digital world and the ability to write and read its own memory.
If AI tools make you 90th percentile at design, product thinking, and engineering, then the gap between you and a dedicated specialist narrows. Maybe a specialist designer is still better. But is she enough better to justify the coordination costs? Every handoff between people costs context. Every sync meeting is time not spent building. When one person can cover 80% of the quality across three roles, the coordination savings from not splitting the work start to dominate.
This is already happening. Boris Cherny, the head of Claude Code, mentioned that everyone on his team at Anthropic, including designers, product managers, and finance people, now codes. “The title of software engineer is going to start to go away not because engineers aren’t needed but because what was traditionally three roles (engineering, product, design) are going to become a single role.”
Software is the canary in the coal mine. It’s where AI tools are most mature, so it’s where role boundaries are shifting first. But there’s no reason this stops at engineering. Anywhere the bottleneck has been “I need a specialist who knows how to do X” rather than “I need someone with good judgment about what X to do” — that boundary is going to move. The unit of “one person’s worth of work” is changing shape. Brunelleschi’s mode of working is coming back, because the economic conditions that pulled those roles apart are unwinding.
What becomes scarce is the integration of the roles: holding the vision together, knowing which subcontractor to call for which job, making the trade-offs when the plumbing runs into the wall. At the margin, the value is less in the specialized work and more in holding vision and execution together the way a master builder does: in the mix, adapting as you go, solving problems as they emerge rather than handing off blueprints and walking away.
We are entering an era of freestyle work. It’s a human-AI collaboration where the machine proposes and the human disposes. Claude Code suggests an edit; I accept or reject. It drafts a contractor proposal; I revise and send. The skill isn’t domain expertise alone. It’s knowing how to direct the AI, when to push back, and how to structure the collaboration.
The people who figure out how to structure their work around them early will have a compounding advantage over those who wait. Not because they’ll be using better models (everyone gets the same models), but because they’ll have a better way of thinking about using them and building with them. There is a new knowledge work stack.
The AI Knowledge Work Stack
Layer 1: The Model
At the bottom of the stack is the model: a type of AI known as a large language model (LLM), trained on next-token prediction. Given a sequence of text (the “context window”), it uses a statistical model to predict what comes next.
The model is what you interact with when you type something into ChatGPT or Claude on the web, and it responds with some text.
What’s astonishing about LLMs is that, given three paragraphs about your business, your market, and your resources, the same next-most-likely-word prediction process often produces something that reads like uncannily good strategic advice. It’s as if auto-complete chugged four Red Bulls, went Super Saiyan, and got access to the nuclear codes.
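To make “next-token prediction” concrete, here is a toy sketch. The lookup table below stands in for the billions of learned parameters in a real model; everything here (the table, the function names) is illustrative, not how any actual LLM is implemented.

```python
# Toy next-token predictor: a lookup table of "what word tends to follow
# what word", standing in for the learned parameters of a real LLM.
TOY_MODEL = {
    "the": {"cat": 0.6, "dog": 0.4},
    "cat": {"sat": 0.9, "ran": 0.1},
    "sat": {"down": 1.0},
}

def predict_next(context_window):
    """Return the most likely next token given the context so far."""
    candidates = TOY_MODEL.get(context_window[-1], {})
    if not candidates:
        return None  # the toy model has nothing to say
    return max(candidates, key=candidates.get)

def generate(prompt, max_tokens=5):
    """Repeatedly append the most likely next token -- the whole LLM loop."""
    tokens = list(prompt)
    for _ in range(max_tokens):
        nxt = predict_next(tokens)
        if nxt is None:
            break
        tokens.append(nxt)
    return " ".join(tokens)

print(generate(["the"]))  # the cat sat down
```

That loop, run over a vastly larger and learned probability table, is the entire mechanism. Everything else in the stack is built on top of it.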
The model is often visualized as a “big blob of compute”: AI progress has primarily been driven by scaling up raw computing power, data, and training duration, rather than by clever algorithmic tweaks.
Layer 2: The Harness
In its pure chat-based form, the model can’t do anything other than produce text. One of the large unhobblings of the last few years has been giving models tools: the ability to write software and use the filesystem on your computer.
The harness is the layer of agentic CLI tools like Claude Code and Codex that give the model hands: filesystem access, terminal commands, the ability to read and write files. Without it, the model is a chatbot. With it, the model can actually do things on your computer.
One of the obvious (and most widely used) versions of that is to ask it to write software: it can build your own CRM, call booking tool, or financial analysis template.
But, it can also do pretty much anything else we would categorize as knowledge work: from building project plans to drafting email replies and writing call summaries.
Layer 3: Personal Scaffolding
While the harness allows the model to act, the model itself is a generalist. Talking with AI models often feels like talking to someone who is perpetually on their first day at the job. Their background is impressive: they have the equivalent of a PhD in mathematics, physics, and history. But they don’t really know anything about your specific work or situation.
The innovation of Claude Code was that by introducing the ability for the AI to write files, it could create what amounted to persistent memory rather than relying on transient (session) memory. The two most important types of files are CLAUDE.md files and Skill files.
CLAUDE.md files are instructions the AI reads at every session start. Think of it as an employee handbook everyone reads on day one: what applies universally to this workspace. Because the AI starts each chat session with no memory of prior sessions, the CLAUDE.md file gives it that context.
Claude Code automatically loads the CLAUDE.md file from every directory level above whatever file you’re working on. That means that if you organize it intelligently, it always has the right context at the right time.
Say you’re running a business and writing a newsletter for a marketing campaign. A logical hierarchy of folders with CLAUDE.md files would be:
- A general CLAUDE.md on your computer with how you like Claude to work, your voice, and the tools you have wired up.
- A vault CLAUDE.md describing how all your notes and projects are organized.
- A folder for your business with a CLAUDE.md explaining what the business does, who the clients are, how you price, etc.
- A folder for marketing inside the business with a CLAUDE.md detailing the channels you run, what’s working, and your brand voice.
- The specific marketing project with its own CLAUDE.md giving the goals and current status.
When you open anything in the marketing project folder, Claude Code walks up the tree and pulls all five into context automatically. Global → vault → business → marketing → project.
That means the very first message in a chat about that marketing project is going to be like talking to someone who knows you, your business, your marketing history, etc.
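The walk-up behavior is simple enough to sketch in a few lines. This is an illustrative reimplementation of the loading rule described above, not Claude Code’s actual source: for a given file, gather every CLAUDE.md from its folder up to the root, and apply them outermost first.

```python
from pathlib import Path

def collect_claude_md(working_file: str, root: str) -> list[str]:
    """Return the CLAUDE.md paths that would load for a file, outermost first.
    Illustrative sketch of the walk-up behavior, not Claude Code's real code."""
    root_path = Path(root).resolve()
    directory = Path(working_file).resolve().parent
    found = []
    while True:
        candidate = directory / "CLAUDE.md"
        if candidate.exists():
            found.append(str(candidate))
        # Stop at the configured root (or the filesystem root as a backstop).
        if directory == root_path or directory == directory.parent:
            break
        directory = directory.parent
    # Reverse so context applies global -> vault -> business -> marketing -> project.
    return list(reversed(found))
```

For a file in `vault/business/marketing/project/`, this returns the five CLAUDE.md files in exactly the global-to-project order listed above (skipping any level that doesn’t have one).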
CLAUDE.md files load every session, so you only include things that apply broadly.
For domain knowledge or workflows that are only relevant sometimes, you would use skills instead. Claude loads them on demand without bloating every conversation. Skills extend Claude’s knowledge with information or define repeatable workflows.
You can create a skill with instructions for how to write a newsletter, close the monthly books, or format a client proposal.
A skill can pull in supporting files like templates, examples, reference material to do the task better.
The AI model uses the instructions in this file like someone at your company might use a standard operating procedure. CLAUDE.md files represent the always-on context — what’s true about this place. Skills are the specialist knowledge — what the AI should know when it’s doing a specific kind of task. Together they cover the two questions that scaffolding has to answer: “Where am I?” and “What am I doing?”
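Concretely, a skill is just a folder containing a SKILL.md file with a short frontmatter header; the description tells Claude when to load the rest. Here is a hypothetical newsletter skill; the frontmatter fields follow the documented format, but the paths and steps in the body are purely illustrative:

```markdown
---
name: write-newsletter
description: Drafts the weekly client newsletter in our house voice.
---

# Writing the newsletter

1. Read `templates/newsletter.md` for the structure.
2. Pull this week's highlights from `marketing/notes/`.
3. Match the tone of the examples in `examples/past-issues/`.
4. Draft, then flag anything that needs a human decision.
```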
These are all just text files on your computer that any LLM can read and use. The analogy to how knowledge workers work should be pretty obvious.
Personal scaffolding is where people start to differentiate. Everyone running Claude Code gets the same models and the same harness, but the same model will give me different results than it gives you, because our scaffolding differs.
This is the layer where people should be focusing: how to architect their scaffolding in a way that makes them better (and, as we’ll see later, makes their agents better).
Layer 4: Utilities and Materials
So far, everything we’ve talked about involves using files on your local computer. But a lot of your work and data lives on a server somewhere — your email, calendar, and CRM all live in external services like Google or Salesforce.
There are tools that can connect those services to agentic CLIs like Claude Code and Codex so that you can access them. Three kinds of connector do this work.
- APIs — application programming interfaces — are the underlying plumbing: every major service like Gmail, Salesforce, or Stripe publishes one, which is a defined set of requests any program with the right credentials can make on your behalf.
- CLIs — command-line interfaces — are small programs that let you use an API by typing commands in the terminal instead of writing code.
- MCP servers — built on the Model Context Protocol — are the newest kind, designed specifically for AI tools: they give Claude Code or Codex a standardized way to see what a service can do, so the model can use a new tool the moment it’s connected.
Once you’ve connected these tools, you get a workflow that combines all the components: the intelligence of the model, the hands of the harness, and the context of the personal scaffolding.
A simple example: when an email comes in requesting a meeting, the model checks my calendar, drafts a reply, and creates the event. I never open a browser.
Or here’s a more robust workflow: Say you run a B2B company and you want to know how a customer is doing. Right now, maybe you’d open HubSpot to check deal history, then Mixpanel for product usage, then Stripe for invoicing, then Gmail for recent support threads. You’d shuffle through four tabs trying to piece together the story.
With these connectors in place and a well-architected scaffold, you might create a skill that triggers when you say “Give me a full picture of Acme Corp.”
The model pulls from all four APIs, merges the context, and gives you one coherent answer. That merged output can become the starting point for more work. You could have the model build a simple dashboard for your account managers showing profitability and status per client. You could set it up to flag at-risk accounts: a client whose product usage has dropped while support tickets have spiked is probably about to churn and you could reach out to them proactively to see if you can help fix their issue.
It could merge all client accounts and generate a quarterly business review deck from the combined data. Each of these would once have been a custom integration project or a manual workflow; now it’s a natural-language conversation with an agentic CLI.
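The shape of that merge-and-flag step can be sketched in a few lines. Everything here is hypothetical: the `fetch_*` functions are stand-ins for real HubSpot, Mixpanel, Stripe, and Gmail connector calls, and the stubbed numbers exist only to make the example run.

```python
# Hypothetical sketch: merge four service views into one client picture.
# Each fetch_* function stands in for a real connector call.

def fetch_deals(client):      # stand-in for a HubSpot query
    return {"open_deals": 2, "last_close": "2024-11-03"}

def fetch_usage(client):      # stand-in for a Mixpanel query
    return {"weekly_active_users": 14, "trend": "down"}

def fetch_invoices(client):   # stand-in for a Stripe query
    return {"outstanding": 0, "mrr": 4200}

def fetch_support(client):    # stand-in for a Gmail search
    return {"open_tickets": 5, "trend": "up"}

def client_picture(client):
    """One coherent answer from four sources, plus a simple churn flag."""
    picture = {
        "deals": fetch_deals(client),
        "usage": fetch_usage(client),
        "billing": fetch_invoices(client),
        "support": fetch_support(client),
    }
    # The at-risk heuristic from the text: usage falling while tickets spike.
    picture["at_risk"] = (
        picture["usage"]["trend"] == "down"
        and picture["support"]["trend"] == "up"
    )
    return picture

print(client_picture("Acme Corp"))
```

In practice you wouldn’t write this code yourself; the point of the skill is that the agent assembles the equivalent on demand. But the structure — fetch from each source, merge, apply your judgment as a rule — is the part worth designing deliberately.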
Once you have ways of connecting to external applications, the model can, in principle, do anything a human can do on a computer.1
Layer 5: Agents
Once the infrastructure is in place, you can deploy agents that use it autonomously. An agent is just a Claude Code session running on its own: you define the goal; it executes using the same tools and context it would if you were sitting there.
Let’s say you have a plan to update a few hundred blog posts on your website. A typical and effective workflow: a (thoughtful) human spends time manually building out a skill or process, tests it on a small subset (say, five blog posts), and once it’s reliably producing good output, lets the agents rip on the rest.
In the case of building a client dashboard, you might spend a while manually figuring out how to build one client’s dashboard in a way that makes sense for your use case. Then you’d use agents to do the other clients and the monthly update.
An agent-based workflow is something like: you write a project document then leave your computer or let it run in the background. While you’re gone, the agent works through the document, building features, updating the project doc as it goes. You come back, review the output, and refine it. Hopefully, the first pass gets 80–90% right; you give it a bit of feedback and let it rip again or do the last bit yourself.
Agents are sexy in the same way terms like “passive income” are sexy: they suggest you can get something for nothing. The truth about most things that look like “passive income” is that they are more accurately “front-loaded income”: someone did a lot of work up front to set up a system that could run without them (for a time).
In the same way, agents are only as good at their task as the model and instructions allow them to be. Just like over-hiring and under-training employees can lead to lots of people doing useless (or actively harmful things), so too with agents.
Agents that are able to use a thoughtfully architected scaffolding system are vastly more effective at actually doing things that produce real value than those that aren’t. Using a bunch of agents doesn’t necessarily make you more productive if those agents aren’t actually doing economically valuable or otherwise useful work.
I see many people who seem to be caught in the trap Eli Goldratt described in The Goal, his book on manufacturing efficiency. Factories were optimizing for machine utilization, aiming to keep every machine running 24 hours a day. Goldratt pointed out that what a manufacturing facility really cares about isn’t machine utilization but finished products it can sell.
Imagine a three-station assembly line: stations 1 and 2 can each crank out 200 widgets an hour, but station 3 can only handle 100. Running stations 1 and 2 flat-out doesn’t ship more product, it just piles up half-finished widgets in front of station 3 and creates more work managing the pile. Running fifty agents in parallel is the same fallacy at higher resolution: if the bottleneck in your work is figuring out what’s worth doing, more agents just generate more output for you to wade through.
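The arithmetic of that assembly line is worth making explicit, because it is exactly the arithmetic of running agents in parallel. A tiny sketch (function names are mine, the numbers are from the example above):

```python
def line_output(hours, rates):
    """Finished widgets from a serial line: throughput is capped by the
    slowest station, no matter how fast the others run."""
    return hours * min(rates)

def wip_pileup(hours, rates, bottleneck_index):
    """Half-finished widgets stacking up in front of the bottleneck."""
    upstream = min(rates[:bottleneck_index])  # feed rate into the bottleneck
    return hours * (upstream - rates[bottleneck_index])

rates = [200, 200, 100]          # widgets per hour at stations 1, 2, 3
print(line_output(8, rates))     # 800 finished widgets in a shift, not 1600
print(wip_pileup(8, rates, 2))   # 800 half-finished widgets piling up
```

Swap “station 3” for “your attention reviewing agent output” and the numbers tell the same story: speeding up the upstream stations only grows the pile.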
Just as a well-run factory produces widgets effectively, a well-architected scaffold generates real leverage.
The Master Builder
With the rise of agentic AI, I think we are coming back to a paradigm of work like Brunelleschi’s master builder mode. The most valuable component is the person solving problems as they emerge while holding both the vision and execution components together.2
The pure architect approach doesn’t work. “Claude, rebuild my CRM, make no mistakes” is not effective. You need to be on site. You need to know which subcontractor (agent/skill) to call for which job, and whether the work is good when it’s done (QA/QC). You’re not producing most of the work, but you’re sequencing it, inspecting it, and managing the scope. When someone (an agent) says “while we’re at it, can we move this wall?” you need to know that the answer is “that’s load-bearing, and here’s what that actually entails.”
At the same time, it would be foolish not to recognize the tremendous leverage of being able to use both the AI models and their agentic “hands.” While there is a strain of Luddite romanticization around refusing AI, the interesting question for those of us who don’t want to play the martyr is what to delegate and how to work with AI.
Two components of the Master Builder analogy are worth emphasizing here: the importance of local knowledge and meta-rational work.
Local Knowledge: What’s Not in the Training Data
First is the importance of local knowledge. A good master builder has to know the sub-contractors, the building codes, the soil conditions, and the inspector’s quirks for their area.
That knowledge doesn’t transfer easily. A master builder who spent twenty years on commercial projects in Southern California can’t just move to residential work in Minnesota. Different climates, different building codes, different subcontractor networks, different client expectations. The broad skills translate, but the deep local knowledge doesn’t.
This is why good general intelligence isn’t the same thing as intelligence becoming a commodity. Out of the box, Claude Code can do a bit of everything and can do it pretty well. But, two people with the same model and the same harness will produce wildly different results depending on the local knowledge they’ve built around it.
Your job as a knowledge worker using these tools is to understand what knowledge it needs and when and to use tools like CLAUDE.md or skill files to supply that.
Economist Friedrich Hayek’s most famous insight about economies was that the knowledge required to coordinate them can’t be held by any central planner; it’s dispersed across millions of people who each know something about their particular circumstances — what they value, what’s scarce, what works.
Hayek was writing at a time when this was an unpopular view. In the early decades of the Cold War, it was intellectually fashionable to believe that the USSR’s centrally planned economy would eventually outcompete market economies.
In “On the Economic Theory of Socialism,” published in the late 1930s, Oskar Lange proposed a “market socialism” in which a central planning board iteratively sets prices by trial and error. Abba Lerner extended the idea, and the Lange-Lerner solution was widely acclaimed by economists.
The Soviet collapse was read as Hayek’s vindication, and mainstream economics quietly moved back toward the Hayekian position.
Central planning failed not because planners were dumb but because the knowledge they’d need to plan well is structurally not available to them. Markets work because prices compress that distributed knowledge without needing to aggregate it. If a tin mine collapses somewhere in the world, the price of tin rises and every user of tin economizes without any of them needing to know the mine collapsed, or why. The local information never has to travel; the price carries the signal.
Foundation models like Claude, ChatGPT and Gemini can eat the trunks of knowledge work: the routine, easily codifiable parts of legal review, financial analysis, standard coding tasks. What they can’t eat are the twigs: the local, contextual, fast-changing specifics that drive a lot of value in many particular situations.
A model trained on the internet doesn’t have your clients, your voice, your constraints, or your past decisions and why you made them. That knowledge lives in your head, your files, and your past work; it was never available to the training corpus. The model also doesn’t (necessarily) have great judgement about how to use that information.
Meta-Rationality: Coding as the Canary
Second, the master builder analogy emphasizes the importance of what might be called meta-rational work. Rational work is problem-solving inside a given problem statement. If you can write down the problem and what good looks like, AI can usually solve it. Math, formal logic, syntax checking, and standard financial analysis are all things it excels at.
Meta-rational work is looking at all the context, complexity, and nebulosity around a domain. It’s figuring out what the problem actually is, whether the problem statement is sensible, and which parts of the situation matter. Brunelleschi was a goldsmith by training, but a meta-rational master builder in practice.
For some knowledge work, “good” is relatively objective. Math most obviously, and to some extent disciplines like standard financial modeling where formal criteria for correctness exist. AI is very good where the criteria are fairly objective.
But, a lot of knowledge work is meta-rational. What makes a good house, a good article, or a good strategic decision? These are all highly context dependent where purposes shape what “good” means, and figuring out the purposes is part of the work.
Many software engineers are now fully meta-rational workers who haven’t written any code by hand for months. Coding is the canary in the coal mine. A few years ago, writing code was the job. Increasingly, software engineering looks more like being a master builder: engineers specify what they want, review what the agents produce, decide which trade-offs matter, and figure out which failure modes are load-bearing. That’s meta-rational work.
The rational core of coding got eaten, and what remains are the more meta-rational components of architecting the requirements, scope, and QA. Engineers don’t disappear, but the rational layer of the role has been largely automated, and what’s left is meta-rational. Every other knowledge-work domain is next in line.
Invest in Your Scaffolding
The practical implication is that the highest-leverage investment is your scaffolding: the local knowledge layer that encodes how you work, what you mean by good, which constraints matter for your particular clients and projects, which failure modes you’ve already seen.
If it’s well constructed then each skill you build, each workflow you refine, each piece of context you add should aim to make every future session or agent more capable.
We should expect, at a bare minimum, that the models will keep getting smarter at rational work and will continue to shrink the rational portion of every domain until it looks the way coding already does.
Will AI eventually close those gaps too? Maybe! In principle, there’s no reason models can’t handle context, purposes, and integration. Arguments that start “AI can never do X” usually rely on romantic hand-waving about something being “fundamentally human.” But models aren’t very good at this right now, and to the extent that “good” is not easily measured, it may be some time before they are.
Brunelleschi’s Back, alright
The master builder is an old tradition. Imhotep, the medieval cathedral masons, Vitruvius, Brunelleschi himself — for most of history, the person who held the vision and the person who solved the problem on site were the same person. What was unique about Brunelleschi wasn’t any single innovation; it was his ability to hold a dozen domains in his head at once and see, in real time, how they had to fit together. The herringbone brickwork, the iron tension chains, the ox-hoist, the sequencing of the inner and outer shells — none of those were the dome. The dome was the pattern of how they fit.
That pattern wasn’t in his original proposal. It was invented on site, in response to problems that only became visible once construction was underway. It wasn’t a brilliant up-front plan executed faithfully; it was the integrative judgment to keep adapting the plan as the situation revealed what the plan didn’t anticipate.
Any individual skill or CLAUDE.md note is a brick — a rational unit, increasingly something a model could write for you. The leverage isn’t in the bricks. It’s in the pattern of how they fit together: which context loads when, which skill calls which, which constraint shows up where, what happens when an agent hits the wall it was always going to hit.
That pattern is meta-rational work, and it’s almost entirely yours, because it’s specific to your clients, your constraints, your past decisions, and the failure modes you’ve already lived through. The models will keep eating the trunks, but the local knowledge and judgement of how to use it will remain yours. We are all master builders again.
Footnotes
1 In practice, there are lots of things that models do not seem to be good at, at least not yet. ↩
2 I think director or showrunner is a potentially better way of thinking about it, because it encompasses both the creative element of the project and the operational element of getting the project done. I’ve never worked in Hollywood or lived in Los Angeles, so I don’t really have a good sense of how that industry works. I live in a house that occasionally has things break that need to get fixed, so I have had a part-time job as a general contractor for a while. ↩