Goldenberry

Writing  ·  Essay  ·  12 min

Why senior engineers are the hardest to upskill on AI.

12 April 2026


In the third week of an embedded engagement at a Berlin fintech, I sat with a Principal Engineer named Andrei while he tried, for eleven minutes, to write a prompt that would extract structured data from a PDF. Twelve years at the same company. Ph.D. in distributed systems. He had shipped, two years before, a real-time fraud engine that handled fourteen thousand requests per second under a strict latency budget. He could not, that morning, get a model to read an invoice.

Andrei is not unusual. Across the four embedded engagements we ran last year, the senior engineers were uniformly the slowest to get traction with AI tools. The junior engineers picked it up in days. The mid-levels in a couple of weeks. The seniors took two months, often three, sometimes longer than the engagement lasted.

The simple explanation is that they don’t trust the output. That is true but trivially true. Anyone with a background in correctness, invariants, and tested code finds the probabilistic behavior of language models uncomfortable. Discomfort, though, is not the bottleneck. The deeper problem is the mental model.

A senior engineer has spent fifteen, twenty, twenty-five years sharpening one skill. The skill is reasoning from first principles about a system you can fully observe. Given a function with a bug, you read the function. You trace the input. You inspect the state. You hold the entire control flow in your head. The system has no hidden behavior. If your tests pass and your reasoning is sound, the code works. If it doesn’t work, you find the disconnect between your reasoning and the system, and you fix one of them.

This skill does not transfer. A language model is a function, technically, but it is not a function you can read. It has hidden state. It has hidden capabilities. It has failure modes you discover the way you discover the failure modes of a person, by working with it for long enough. You don’t reason about what it will do. You probe it, observe its behavior, build an empirical sense of where it works and where it falls over. The shape of the skill is closer to running an experiment than writing software.

That is hard for seniors in a way it isn’t for juniors. Juniors are used to working with systems they don’t fully understand. They debug by running things and seeing what happens. They have a high tolerance for the ambiguity of partial knowledge. The shift to working with a model that doesn’t always do the same thing twice is smaller for them, because they were already doing something close to that. The senior is being asked to abandon the practice they got hired for.

Two failure modes show up over and over.

The first is over-engineering. A senior engineer is asked to extract entity types from a stream of customer emails. They reach for tools they trust. A grammar. A parsing library. A small state machine. They build something elegant. It handles forty percent of the cases that a five-line prompt to a model would have caught. They ship it because shipping is what they do, and because the elegant version is the one they would defend in a code review. Six months later the team is maintaining a parser that nobody on the team understands except the senior who built it, and the senior has moved to another project, and the model would have, by then, gotten significantly better.
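For contrast, the prompt version is barely code at all. Here is a minimal sketch, assuming the OpenAI Python SDK; the model name, the entity types, and the `extract_entities` helper are all illustrative, not what any team in these engagements shipped.

```python
# A minimal sketch of the prompt approach, assuming the OpenAI Python SDK.
# The model name, entity types, and this helper are illustrative only.
import json

from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def extract_entities(email_body: str) -> dict:
    """Extract entities from one email as JSON. No grammar, no state machine."""
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # illustrative; any capable model will do
        messages=[
            {
                "role": "system",
                "content": (
                    "Extract entities from the customer email. Return JSON "
                    "with keys: people, companies, products, amounts, dates. "
                    "Use empty lists for anything absent."
                ),
            },
            {"role": "user", "content": email_body},
        ],
        response_format={"type": "json_object"},  # constrain output to valid JSON
    )
    return json.loads(response.choices[0].message.content)
```

The sketch is not good engineering by the senior’s standards, and that is the point: it ships in an afternoon, and it improves every time the model does, which is exactly the bet the hand-built parser loses.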

The second is over-evaluation. A senior engineer is asked to wire a model into the support tool. They start with the evaluation framework. They want a regression suite, a held-out set, baselines, ablations. They have a hundred-page document on what “good” looks like before they have talked to a single support agent. Three months in they have a beautiful eval. They have not shipped a single thing the agents use. They will tell you, correctly, that without an eval you cannot iterate responsibly. They will not tell you, because they cannot see it, that without a thing in production there is nothing to iterate on.

Both failures are the same failure. Both are the senior engineer trying to do the work the way the work used to be done.

We have tried to fix this with training. Training does not work. The senior engineers sit in the back row. They take notes. They ask sharp questions about edge cases that won’t come up. They go back to their desks and do not change their behavior. Training assumes the gap is knowledge. The gap is not knowledge. The gap is identity. A workshop on retrieval augmentation does not address the question, sitting under the surface, of whether the work is still craft.

What works, the only thing that works in our experience, is putting a senior engineer in a small group of three or four people, one of whom is a peer who has already crossed over. A real problem the team owns. A four-week constraint, with a definition of done that means a thing in production. The senior watches the peer prompt the model, fail, prompt again, accept a result that is seventy percent of what they would have wanted, and ship it anyway. They watch a person they respect doing work that violates the standards they hold themselves to. And then, after a few iterations, the senior starts to do the same.

The unlock is permission. The senior engineer knew, at some level, that the new work would require shipping things they didn’t fully understand. They did not know that they were allowed to. Watching a peer do it is the permission.

There is a second piece. Most senior engineers I have worked with had to abandon, or at least reframe, something they considered part of their craft. The principal who took longest at the Berlin fintech, six months in, said something I have heard a version of since. He said: “I always thought my job was to write code that I would be proud to read in five years. Now my job is to ship something that solves a problem this quarter, knowing I’ll probably rewrite it in eight months when the model gets better.” That was not a happy sentence for him to say. He said it the same week he stopped being slow.

If you are running a team that has an AI mandate and a roster of senior engineers who aren’t moving, the move is not another workshop. The move is a small problem, a four-week window, a peer they respect who is already across, and visible permission from leadership to ship things that look unfinished. The seniors will not love any of this. They will get there. The teams where they don’t get there are the ones where leadership didn’t give the permission, didn’t back the peer, didn’t protect the four-week window. Those teams’ AI initiatives stall. Not because the engineers can’t do it. Because the conditions for them to do it weren’t built.

The thing that took us by surprise, the first time we noticed this, was how fast it flips once it does. The senior engineer who was your bottleneck for three months is, in month four, the one running circles around the rest of the team. They have the judgment. They have the mental map of the codebase. They have the relationships. They were always going to be the team’s most valuable person on this work. The barrier was never capability. It was almost always permission, and a peer.