The Pull Request Your Coding Agent Opened That Closed a Real One

June 3, 2026 · 11 min read

Software Engineer

Your coding agent opened a pull request at 3:14 on a Tuesday afternoon. The PR description was clean, the diff was small, the CI was green. It got squash-merged twenty minutes later. The teammate who came back from lunch at 1:20 the next day saw a notification: "PR #1247 was closed." Not merged. Closed. The branch was gone. The seventy-two review comments she'd left on it the previous week were gone too — collapsed under an "outdated" label on a PR that no longer existed in any active list. A senior engineer's design decisions, two rounds of back-and-forth with the security reviewer, and a careful migration plan that took a week to negotiate, all vanished into a footnote on a different PR that nobody had read closely. The squash commit's only trace of what happened was a one-line tag at the bottom: Closed by #1893.

This is the failure mode of trusting a coding agent to write its own pull request metadata. Not the code — the metadata. The diff was fine. The agent did good work. What it could not do was distinguish a fresh discussion from a stale one, and GitHub's auto-close machinery treats every closing keyword the agent writes as a load-bearing instruction. Your agent reads the comments to gather context, infers from a six-month-old reply that its work supersedes an older PR, writes Closes #1247 in the description it generates, and the merge does the rest — silently, mechanically, irrevocably from the perspective of anyone who wasn't watching the diff at the moment of squash.

The keywords your agent wields without understanding

GitHub's auto-close mechanism is a feature most engineers never think about because it sits in their muscle memory: type Closes #42 in your PR description, merge, watch the issue close. The documentation lists nine canonical keywords — close, closes, closed, fix, fixes, fixed, resolve, resolves, resolved — and the rules sound simple. The PR must target the default branch. The keyword must precede an issue or PR reference. The merge must be a real merge, not a closure-without-merge. When all three conditions hold, the linked artifact closes automatically the instant the merge lands.
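To make "the keyword is just text" concrete, here is a minimal sketch of a detector for the nine keywords listed above. It only handles the same-repository `#N` form. The function name `find_closing_refs` and the tolerance for an optional colon after the keyword are assumptions of this sketch, not a description of GitHub's own parser.

```python
import re

# The nine closing keywords listed in GitHub's documentation.
CLOSING_KEYWORDS = (
    "close", "closes", "closed",
    "fix", "fixes", "fixed",
    "resolve", "resolves", "resolved",
)

# Keyword, optional colon (an assumption of this sketch), whitespace, "#<number>".
# Cross-repo references (owner/repo#N) and full URLs are deliberately not handled.
CLOSING_REF = re.compile(
    r"\b(" + "|".join(CLOSING_KEYWORDS) + r")\b:?\s+#(\d+)",
    re.IGNORECASE,
)


def find_closing_refs(description: str) -> list[tuple[str, int]]:
    """Return (keyword, number) pairs that look like closing instructions."""
    return [
        (m.group(1).lower(), int(m.group(2)))
        for m in CLOSING_REF.finditer(description)
    ]
```

A plain mention such as "see #1893" or "addresses #1247" yields nothing, while "Closes #1247" yields `("closes", 1247)`. The difference between a reference and an instruction is a single word.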

What makes this dangerous in the hands of a coding agent is that the keyword is just text. There is no separate confirmation, no preview, no "are you sure you want to close PR #1247?" dialog. The keyword executes by virtue of having been written. The agent producing the PR description has no way to verify that #1247 is in fact superseded; it has only the input it was fed — usually a thread of comments, an issue body, and whatever the user typed in the prompt. If anywhere in that input a previous human wrote "let me try a different approach, I'll supersede this with a new PR," the agent will faithfully encode that intent in the description it generates, even if the supersession never actually happened and the original PR was the one that got the green light.

The native GitHub keywords nominally close issues, not PRs. But the same convention has been ported to pull-request supersession through community workflows — the most widely installed of which scans merged PR descriptions for the literal string supersedes #N and closes the referenced PRs. Once that workflow is in your repo, you have effectively granted every PR author — including your agent — the power to close any other PR by writing one sentence. The action does not check whether the author of the closed PR consents. It does not check whether the closed PR is in a different team's queue. It does not check whether the diffs are actually related. It checks the keyword and acts.

Stale context, parsed as current intent

The agent's input is a haystack. It is the issue description, the comment history, the linked PRs, the prior conversation in the thread, the contents of a roadmap doc somebody pasted six months ago. Coding agents are good at reading large contexts and bad at dating them. A comment from October that said "this is moot now, I'll roll it into the next architectural pass" reads in November like a fresh thought; in February it reads like ancient history; but the agent does not know that February is February. It reads tokens. The temporal stamp on each comment is metadata the agent's prompt template usually flattens or omits. To the model, the comment is just there, sitting in the context, indistinguishable from a comment posted ten minutes ago.
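One partial mitigation is to stop flattening the timestamps. The sketch below renders a comment thread with each comment's date and age relative to an explicit "now", so the dating information at least reaches the model. The `Comment` type and `render_thread` function are hypothetical names for illustration, and nothing here guarantees the model will weigh the dates correctly; it only removes one reason it cannot.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass
class Comment:
    author: str
    created_at: datetime  # expected to be timezone-aware
    body: str


def render_thread(comments: list[Comment], now: datetime) -> str:
    """Render comments oldest-first, each prefixed with its date and age."""
    lines = [f"Current date: {now:%Y-%m-%d}"]
    for c in sorted(comments, key=lambda c: c.created_at):
        age_days = (now - c.created_at).days
        lines.append(
            f"[{c.created_at:%Y-%m-%d}, {age_days} days before now] "
            f"{c.author}: {c.body}"
        )
    return "\n".join(lines)


if __name__ == "__main__":
    now = datetime(2026, 2, 10, tzinfo=timezone.utc)
    old = Comment("alice", datetime(2025, 10, 10, tzinfo=timezone.utc),
                  "this is moot now, I'll roll it into the next pass")
    print(render_thread([old], now))
```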

This is the same failure mode that bites agents elsewhere — memory that doesn't know the world has changed, runbooks that encode a now-fixed workaround, eval cases that certify a product scope nobody ships anymore. Stale context is a category, and "stale comment that triggered a stale auto-close keyword" is one of its sharpest expressions, because the consequence is not a wrong answer the user can re-prompt. The consequence is a destructive action against shared state that was already in motion. A week of human review is exactly the kind of work that lives only in the comments — it never made it to the code, because it was steering decisions about the code, and decisions don't show up in diffs.

The way out is not to ask the agent to be more careful. The agent will be exactly as careful as the input lets it be. The way out is to treat the auto-close keyword as an irreversible action — even though it is technically reversible — and put a gate in front of it that does not depend on the model's judgment.

The "closed not merged" label is a faint signal in a flood

In 2024, a senior engineer reviewing the PR queue every Monday morning would have noticed a closed-not-merged PR with two reviewers and seventy comments. They would have asked: what happened to this? They would have asked because the queue was small enough that anomalies stood out. By 2026, the platform looks different. GitHub itself has noted that automated PR creation from coding agents has surged enough to force a thirtyfold redesign of internal pipelines. Your team's queue is no longer a queue of human-authored PRs; it is a queue dominated by agent-opened drafts, agent-opened follow-ups, agent-opened fix-up PRs, and a thin layer of human-authored work mixed in. The signal-to-noise ratio on "PRs closed without merging" has collapsed.

A closed-not-merged label means a hundred different things in 2026. It can mean the agent's CI never went green and the agent closed its own draft. It can mean a duplicate the agent opened against a stale branch and then withdrew. It can mean an experimental PR that was never meant to land. The actual human-relevant case — "this was a real PR with real review work that got closed against its author's intent" — is now a needle in a stack of needles, and the visual treatment is the same for all of them.

The recovery story is technically benign and operationally brutal. GitHub restores deleted branches indefinitely from the closed PR's "Restore branch" button, and once the branch is back, the PR can be reopened. But reopening the PR does not reopen the discussion. The review comments anchored to commits that have since been displaced get marked outdated. The CI runs that took two hours each don't re-trigger automatically. The reviewers who already approved the now-closed version don't re-approve the reopened one — they have to be asked again, and the request comes after they have already context-switched to other work. The branch comes back. The week does not.

What an agent-aware PR pipeline actually looks like

The intuitive fix is to ban agents from writing auto-close keywords. This is the wrong fix. Auto-close keywords are useful, and the agent doesn't need to be banned from using them — it needs to be constrained to a safe subset of the syntax. The category of operations to gate is not "the agent writes a PR description" but "the agent writes a PR description that takes destructive action on other open work."

A handful of concrete patterns are emerging:

Strip closing keywords at the boundary. Before the agent's PR description gets posted, run it through a transform that detects closes #N, fixes #N, resolves #N, and supersedes #N for any N the agent did not explicitly receive as a confirmed target. The agent can write "this work addresses the concerns raised in #1247" — that's a reference, not an instruction — and a reviewer can convert it to a closing keyword by hand if appropriate. The instruction form is the thing humans should issue; the reference form is what agents can issue safely.
Required preview for destructive metadata. Before the merge button is clickable on an agent-authored PR, surface the list of issues and PRs that will close on merge as a checkbox grid that the person merging must confirm item by item. This is the same UX pattern as Stripe's confirmation modal for irreversible operations: the action itself is unchanged, but the consent surface forces a human to read what is about to happen.
Stale-context detection. When an agent's PR description references another PR by number, check the timestamp of the most recent activity on that PR. If the PR has been actively reviewed in the last seven days, refuse to apply a closing keyword and surface a warning. This is not a perfect heuristic — active review is not the same as still-wanted — but it inverts the default from "auto-close anything you can" to "auto-close only what looks abandoned."
Audit the closed-not-merged list daily. Treat closed-not-merged PRs with more than a threshold of review comments as a paging signal. A PR with seventy review comments and a closed-not-merged status is almost never a routine cleanup; it is almost always somebody's work being lost. The threshold can be tuned per repo, but the principle is that human-invested PRs deserve a different alarm than agent-opened drafts.
Disable the supersedes action for agent-authored PRs. If your repo has the supersedes #N community workflow installed, the simplest immediate mitigation is to scope it to PRs that carry a human-authored label. Agents can still write the word; the workflow ignores it unless that label is present.
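The first pattern in the list, stripping closing keywords at the boundary, can be sketched as a small text transform. The function name `neutralize`, the `confirmed` set, and the replacement phrase "relates to" are all assumptions made for illustration. The sketch handles only the same-repository `#N` form.

```python
import re

# Closing keywords plus the community "supersedes" convention, followed by "#<number>".
INSTRUCTION = re.compile(
    r"\b(close[sd]?|fix(?:e[sd])?|resolve[sd]?|supersedes)\b:?\s+#(\d+)",
    re.IGNORECASE,
)


def neutralize(description: str, confirmed: set[int]) -> tuple[str, list[int]]:
    """Turn closing instructions for unconfirmed targets into plain references.

    Returns the rewritten description and the list of numbers that were
    downgraded, so a reviewer can be shown what was changed.
    """
    downgraded: list[int] = []

    def repl(match: re.Match) -> str:
        number = int(match.group(2))
        if number in confirmed:
            return match.group(0)  # a human confirmed this target; keep as written
        downgraded.append(number)
        return f"relates to #{number}"  # a reference, not an instruction

    return INSTRUCTION.sub(repl, description), downgraded
```

Called with `confirmed={1893}` on a description containing "Fixes #1893. Closes #1247", this keeps the first instruction and rewrites the second to "relates to #1247". Returning the downgraded numbers lets the pipeline show a reviewer what was changed, so a human can restore the keyword by hand where it is warranted.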

None of these patterns require a new platform. They require treating the PR description as a place where the agent can write destructive instructions and the merge button as a place where those instructions execute without further review.
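As a rough illustration of how little machinery is involved, the stale-context check and the daily audit from the list above each reduce to a single comparison. The `PullRequest` fields are hypothetical stand-ins for whatever your tooling fetches from the platform. The seven-day window comes from the text, and the comment threshold of 20 is an arbitrary placeholder to tune per repo.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone

ACTIVE_WINDOW = timedelta(days=7)  # "actively reviewed in the last seven days"
COMMENT_THRESHOLD = 20             # hypothetical value; tune per repo


@dataclass
class PullRequest:
    number: int
    last_review_activity: datetime  # expected to be timezone-aware
    review_comments: int
    state: str                      # "open" or "closed"
    merged: bool


def may_auto_close(target: PullRequest, now: datetime) -> bool:
    """Allow a closing keyword only when the target looks abandoned."""
    return now - target.last_review_activity > ACTIVE_WINDOW


def should_page(pr: PullRequest) -> bool:
    """Flag closed-not-merged PRs that carry substantial review investment."""
    return (
        pr.state == "closed"
        and not pr.merged
        and pr.review_comments > COMMENT_THRESHOLD
    )
```

Neither function consults the model. Both are deterministic checks on platform state, so the gate holds even when the agent's reading of the thread is wrong.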

The pattern, beyond pull requests

The shape of this failure is general, not specific to GitHub. Anywhere an agent produces machine-readable metadata that other systems then act on without human intermediation, you have an instance of the same problem. Agent-authored email subject lines that trigger filter rules. Agent-authored Jira ticket descriptions with closes JIRA-1234 markers. Agent-authored Slack messages with @here mentions that page a team. Agent-authored database labels that downstream pipelines route on. Each of these is a coding-agent-opens-PR-that-closes-a-real-one scenario waiting to surface.

The defense is the same defense in every case. The agent writes prose freely. It writes references freely. It does not write instructions to other systems unless a human has confirmed the specific target. The dividing line is whether the text the agent produces, on its own, causes a side effect somewhere else. If it does, the text needs a gate. If it doesn't — if it is just a comment, just a reference, just a paragraph — the agent can produce it without constraint.

What you lose by drawing that line is some convenience. The agent can no longer one-shot a PR that closes the issue it was working from; a human has to click the link or type the keyword. What you gain is the difference between a week of careful review surviving to merge and a week of careful review vanishing into a footnote on a PR nobody read. That trade is not close.

About Tian Pan
