TDD Buddy - What 'Senior' Means When Typing Is Free

The traits the industry used to credential seniority were proxies all along, and agents just made the proxies free.

This is the second post in a two-part series, “The PR Is Eating Itself.” The first post, Where the Review Point Moved, traces how the arrival of agentic workflows shifted the meaningful review surface from line-level syntax to scenario coverage and contract shape. This post names what that shift exposes about which skills were always real and which ones we were mistaking for the real thing.

The Old Seniority Stack Was Mostly Proxies

Be honest about what senior compensation was actually buying.

It was buying typing speed. The ability to produce correct syntax faster than a junior, without looking things up, without second-guessing the semicolons. It was buying framework recall: knowing where Spring wires its dependencies, knowing which NuGet packages to reach for, knowing the twelve ways ASP.NET MVC resolves an action method. It was buying debugger fluency: the ability to set a breakpoint, read a stack trace, and trace an exception back to its origin without a tutorial. It was buying library knowledge: knowing that this class of problem has already been solved and knowing the name of the package.

These things were observable. A senior developer produced clean, compilable code faster. A senior developer reviewed a PR and caught the misconfigured dependency in twelve seconds. A senior developer traced a production outage to a thread-safety issue that a junior would have filed as “intermittent and unclear.” The observables were real. The behaviors were real. The speed gap was real.

What was never real was the assumption that those observables were the senior skill itself.

Typing speed was a proxy. Framework recall was a proxy. Debugger fluency was a proxy. They correlated with good judgment in the same way that good penmanship correlated with good writing before typewriters: strongly enough in the era when the skill mattered to make the correlation look causal, weakly enough that the correlation collapsed the moment the underlying constraint changed.

The constraint changed.

Those Proxies Were Always Standing In For Judgment

What the industry was actually trying to buy, and could not directly observe, was judgment.

Judgment about what to build. The ability to receive a requirement and ask, before typing a single line, whether the requirement was solving the right problem or encoding an organizational dysfunction into a codebase that would outlast the dysfunction by a decade. That question takes five seconds to ask and five years of scar tissue to ask well.

Judgment about where the seams go. The ability to look at a domain and sense, before the abstraction is named, where the natural fault lines are. Not every service needs its own repository. Not every domain concept needs its own microservice. Not every collaboration needs a message queue. The developer who could feel which cuts would age well and which ones would produce a migration in eighteen months was worth something concrete to a team. The skill was not visible in the commit. It was visible in the absence of a certain class of problem two years later.

Judgment about which abstractions earn their cost. Every abstraction in a codebase is a bet that the complexity it introduces is outweighed by the flexibility it creates. Most abstractions lose that bet. The developer who could evaluate a proposed abstraction and say “this encodes an assumption that will be wrong in six months” was saving the team from an entire category of future pain. That evaluation happened in a design conversation, not in a PR comment on a method signature.

Judgment about which tests would catch what the team actually feared. Coverage numbers are easy to manufacture and nearly meaningless. The question under them is harder: are the tests naming the behaviors that matter, or are they naming the behaviors that were easy to test? A test suite that covers sixty percent of the implementation but catches eighty percent of the production incidents is more valuable than a suite at ninety percent coverage that misses the incidents. Knowing the difference, and being willing to invest test effort in proportion to risk rather than in proportion to coverage tooling, was a judgment call. It looked like “they write good tests.” It was actually “they model risk correctly.”

None of those judgments were directly observable. They could not be screened for in an interview. They were not visible in a diff. They showed up as outcomes, eventually, in a codebase that aged gracefully instead of one that needed a rewrite every three years. The industry used the proxies because the proxies were visible and the thing they were proxying for was not.

The Proxies Got Free

Agents type. They type faster than any human ever typed, without fatigue, without wrist strain, without looking up syntax or tab-completing package names. Framework recall is free: any agent that has been trained on the ecosystem knows the APIs, the configuration patterns, and the runtime behaviors of every major framework in any language a team is likely to use. Debugger fluency translates directly: show an agent a stack trace and it will locate the failing assumption, propose a fix, and write the test that would have caught it. Library knowledge: trivially free. Not only does the agent know the library, it knows three alternatives and has an opinion about which one fits the current context.

Every skill that was observable about seniority, every behavior that got a developer to the higher compensation band, is now available at the cost of a prompt.

This is not a prediction about a future state. It is a description of what any reasonably capable coding agent does today, in 2026, in production use, across teams that have already reorganized their workflows around it. The developers in those teams who were paid senior rates for typing speed and framework recall have already had their value proposition quietly repriced. Most of them have not updated their self-models to match.

The market rate for the proxies is collapsing toward zero. Whether any individual developer has updated their resume yet does not change the direction of travel. The proxy skills are becoming table stakes, the way basic literacy became table stakes: still required, no longer differentiating.

What stayed differentiating is not in any job posting yet, because the industry has not finished figuring out what it is.

What Stayed Scarce Was Taste

Taste is the word for it. It is imprecise, and that imprecision is load-bearing.

Taste in seams: the instinct for where a boundary should be drawn before the cost of the wrong boundary becomes visible. The developer who looks at a proposed service boundary and says “that cut is going to force two deployments every time this feature changes” is exercising taste. There is no algorithm for that judgment. It draws on accumulated pattern-matching across every system the developer has ever watched age.

A nose for which abstractions will rot: the ability to read a proposed design and sense, without full analysis, that it is encoding an assumption the business will violate within two quarters. Abstractions that rot have a smell. They over-specify the wrong thing. They create a seam that looks clean today and forces a hack tomorrow. Developers with taste smell this before writing a line. Developers without it write the line and wonder why the maintenance burden grew.

The ability to name a domain concept correctly the first time. Naming is not a soft skill. It is a precision engineering task. A name that captures what a concept is, not what it does, not what it looks like from the outside but what it actually is in the domain model, prevents years of confusion. A wrong name does not just make code harder to read. It causes the wrong abstractions to get built around it, because the wrong name pulls the concept into the wrong semantic neighborhood and the developers who follow inherit a model that was shaped by a word rather than by the domain.

The willingness to delete code. Deletion is a senior act. Junior developers add. Mid-level developers add better. Senior developers delete. The discipline to look at a feature, a method, a service, an abstraction, and say “this no longer earns its place, the cost of maintaining it exceeds the value it produces, and the right action is removal” is rare. It is uncomfortable. It generates no visible output. It shows up in future metrics as faster velocity, lower incident rate, shorter onboarding time. It is entirely invisible in the week it happens.

The refusal to add a feature that pollutes the model. Every codebase has a moment when a product request arrives that would require encoding an exception to the domain model. The feature is reasonable in isolation. The implementation is straightforward. The business case is clear. The senior developer who says “if we add this the way it’s specified, we introduce an inconsistency in the model that will cost us more than the feature is worth, so let’s find a different shape” is doing the highest-value work available to them that day. That work is invisible in any sprint velocity chart. It shows up as a codebase that still makes sense two years later.

None of that is free. None of it is within reach of an agent that has not been given the right constraints, the right vocabulary, the right scenario coverage to work from. All of it requires a human who has cultivated the pattern-matching through accumulated exposure to consequences.

Consider naming in concrete terms. A customer returns a physical item. The warehouse receives it. The unit becomes available for sale again. Something has happened in the domain that the codebase needs to track. What do you call it?

Three candidates:

InventoryReturn is the obvious choice. It names the direction: something went back into inventory. It is neutral, broadly understandable, and will confuse nobody. It will also confuse nobody into thinking hard about what it means. In two years, when the team needs to distinguish between a customer return that goes back to primary inventory and a return that goes to a secondary liquidation pool, InventoryReturn provides no guidance. The team will add InventoryReturnType, an enum with values named for whatever felt obvious at the moment, and the model will quietly fracture.

StockRestitution is the reach for precision. It encodes the financial dimension: something was taken out, now it is put back, the ledger balances. This is accurate for some returns and wrong for others. A defective item returned to inventory for destruction is not a restitution. A warranty replacement is not a restitution. The name has imported a financial claim into a physical event, and the claim will be wrong in exactly the cases that matter.

UnitRestored names what the domain actually cares about. A unit that was unavailable is now available. The event is about the state of a unit, not about who moved it or why. It composes: aUnit().restored() reads naturally in a test. It tolerates extension: a restored unit can be tagged with a RestorationReason without the name becoming incoherent. It stays honest about what it knows: it knows a unit is available again, and it does not claim to know whether the return was a customer refund, a vendor recall, a warehouse audit correction, or a defect disposition.

The difference between InventoryReturn and UnitRestored is not a matter of taste in aesthetics. It is a matter of which concept the name will organize the model around for the next three years. InventoryReturn organizes around the movement. UnitRestored organizes around the state. The model that grows from InventoryReturn will have methods that ask “where did this return come from” because the name frames the event as a movement with an origin. The model that grows from UnitRestored will have methods that ask “what is this unit’s current availability state” because the name frames the event as a state change with consequences. One of those models is easier to extend when the business adds a second inventory pool. The other will need a refactor at that moment, and the refactor will touch everything downstream.

Naming is not a style preference. It is the first design decision, and it shapes every decision after it. The developer who gets the name right the first time, without extensive deliberation, by having internalized enough of the domain to feel what the concept actually is: that is the skill that was always rare and is still rare.

Taste is the thing that does not get cheaper when typing is free.

The New Senior Surface Does Not Look Like Coding

What a senior developer actually does in a morning in 2026 looks different from what it looked like in 2024.

The 2024 whiteboard had a dependency graph, a list of classes to write, a reminder to add the migration. The work was: open the IDE, type until the tests pass, open a PR. A senior developer’s value showed up in how fast that loop ran and how clean the result was.

The 2026 whiteboard has something else. It has a contract for a service that does not exist yet, sketched before any agent has touched the keyboard. It has a set of scenario names, written in domain language, that define what the new capability must do, what it must not do, and what happens at the edges. It has two or three candidate names for a new domain concept, with a paragraph under each one arguing for why the name is or is not right.

The 2026 senior developer reviews what the agent produced against the scenario coverage that was specified before the agent ran. The review is not about syntax. It is not about whether a lambda could be simplified. It is about whether the agent encoded the right behavior for the cases the team was actually afraid of, and whether it invented vocabulary that will drift from the team’s shared domain language over the next six months.

The 2026 senior developer kills builders that encode incidental complexity. When the test API grows a method that exists because the agent needed a shortcut, not because the domain concept it represents is real, that method gets deleted before it becomes the vocabulary the next hundred tests are written against.

The 2026 senior developer refactors the test API. The builders, factories, and domain types that agents compose against are the highest-leverage surface in the codebase. An improvement to aLoyaltyMember() propagates through every future test that uses it. Neglect in that API propagates the same way. Someone has to own it. Someone has to recognize when aLoyaltyMember().withFirstYearStatus().duringPromotion() is getting awkward because the underlying domain model does not actually have a clean separation between membership tier and promotion eligibility. That recognition, and the refactoring that follows, is senior work.

None of it produces a large commit. None of it is visible in velocity metrics. All of it determines whether the codebase is getting better or worse.

Mid-Level Developers Are at the Sharpest Edge

Do not soften this.

Juniors are being coached directly into the new stack. The developers who are entering the profession now have never known a workflow where typing speed was the differentiator. They are learning to specify before they implement, to work with agents, to own the scenario coverage and the domain vocabulary. The transition is not comfortable, but it is clean: they have no habits to unlearn.

Senior developers who actually had the judgment they were being paid for are in reasonable shape. They discovered years ago that naming was hard, that seam design was consequential, that deletion was discipline. The agent is a new collaborator. The work is faster. The judgment still applies. Some of what they knew by feel now has to be made explicit, and that is an adjustment, but not a dislocation.

Mid-level developers are facing something different.

The mid-level developer built a career by being better at the proxy skills than juniors, faster at framework recall, cleaner at syntax, more fluent with the debugger, broader in library knowledge. Those skills produced visible output. They justified the compensation band. They made the mid-level developer genuinely more valuable in the workflow that existed before agents.

That workflow is gone. The mid-level developer who coasted on the proxies, who never cultivated a nose for abstractions, who never invested in naming as a precision skill, who relied on speed-of-typing as the signal of competence: that developer’s value proposition has been automated. Not partially. Not in a narrow use case. In the workflow that their team is now running or will run within eighteen months.

The labor-market dislocation from the agent transition is not going to land evenly across levels. Juniors get scaffolded into the new stack. Seniors redirect their existing judgment. Mid-levels who were carrying the proxies are the ones who find the proxies are the part that got automated.

This is where the exposure is sharpest and where the retraining imperative is most urgent.

Train For It Like a Kata, Not a Crash Course

The skill that stayed scarce does not come from a weekend workshop or a certification program. It comes from reps.

The post Katas Are Rehearsal, Not Performance made the case for deliberate practice as the gap between knowing TDD and doing TDD under pressure. The same logic applies here, with the target shifted. The target is no longer “type the loop correctly.” The target is “name the concept correctly, feel where the seam should go, recognize when an abstraction is about to rot.”

These skills can be practiced in controlled conditions. Design-from-tests exercises, where the constraint is to specify the full scenario coverage before writing any implementation, build the habit of thinking in named behaviors instead of typed code. Run them against known katas so the problem is not the variable. The tenth time through the same exercise, when the problem is boring, is when the brain has bandwidth to watch itself naming.

Refactoring exercises against an explicit constraint build the deletion muscle. Take a working codebase and refactor it to reduce its vocabulary by twenty percent without removing any behavior. The first time through this exercise, the constraint feels arbitrary. By the fifth time, the developer starts to see where synonyms have crept into the model, where two concepts that were given different names are actually the same concept, where an abstraction that served a purpose three months ago is now pure weight.

Naming exercises, three candidates with a paragraph defending each, build the precision that the naming case study above illustrated. These do not have to be long. They do not have to be graded by anyone. The act of writing down why one name is better than another, in domain terms rather than stylistic terms, is the practice. It surfaces the reasoning that was previously implicit and makes it available for review.

None of this is the kind of training that produces a certificate or a badge. It does not show up on a LinkedIn profile. It shows up in the codebase two years from now, in a vocabulary that held together instead of fragmenting, in a model that absorbed new requirements instead of cracking under them.

The practice is not glamorous. It is the same advice the kata post gave about TDD reps: the value is in the repetition of the process, not in the novelty of the problem. The developer who practices naming ten times a week gets better at naming. The developer who reads about naming and considers the matter settled does not.

There is no shortcut from knowing that naming matters to knowing, by feel, which name is right.

The Skill Was Always There

The industry called it seniority and measured it wrong.

The actual senior skill, naming the right problem and shaping the seams that make it solvable, was always the rare thing. The proxies (typing speed, framework recall, debugger fluency) were load-bearing in an era when those skills were genuinely scarce and genuinely correlated with the judgment that was always the underlying value. The correlation looked like causation for long enough that the industry built compensation structures, interview rubrics, and career ladders around the proxies rather than the thing the proxies were pointing at.

Agents made the proxies free. The thing they were proxying for is now visible, because the scaffolding that obscured it has been removed. The gap between developers who had the judgment all along and developers who were paid as if they did is no longer hidden under a layer of typing speed and framework recall. It is in the open, in the shape of the abstractions that get built, in the vocabulary that either holds together or fragments, in the tests that either name what the team fears or name what was easy to test.

The traits the industry credentialed as seniority were proxies all along, and agents just made the proxies free. What the proxies were pointing at is still scarce, still hard to cultivate, and now the only thing that differentiates. The work of becoming genuinely senior was always the work of developing judgment, not the work of getting faster at typing it in.

That work is still available. It is just no longer optional.