How does AI write code, run tests, and fix bugs? From Copilot to Claude Code
AI-assisted programming has evolved through three distinct generations. Each one gave AI more autonomy and a wider range of capabilities, while reshaping the human developer's role.
Gen 1 (autocomplete): Predicts the next line of code based on context. Pioneered by GitHub Copilot (early versions).
Gen 2 (chat assistant): Ask questions about code, generate functions from descriptions, explain logic. Powered by ChatGPT, Copilot Chat.
Gen 3 (autonomous agent): AI reads the codebase, plans changes, writes code, runs tests, fixes errors, creates PRs. Claude Code, Cursor Agent, OpenAI Codex.
From spell-checker (Gen 1) to translation assistant (Gen 2) to simultaneous interpreter (Gen 3) — AI's role in programming grows increasingly proactive with each generation.
A code agent doesn't generate code in one shot. It follows an iterative loop — much like a senior developer's daily workflow. It cycles through "read - plan - code - test - fix" until the task is complete.
The code agent's loop mirrors a senior developer's workflow — read code to understand context, plan the approach, write code, run tests, fix bugs. The difference: the agent never gets tired and can iterate relentlessly.
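The loop described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not how any particular product is implemented: `run_agent_loop`, `CheckResult`, `fake_model`, and `fake_tests` are hypothetical names, and the "model" is a stub that produces a buggy function first and a corrected one once it sees a failure message.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class CheckResult:
    passed: bool
    output: str


def run_agent_loop(
    read_context: Callable[[], str],
    propose_change: Callable[[str, str], str],
    run_tests: Callable[[str], CheckResult],
    max_iterations: int = 5,
) -> tuple[bool, int]:
    """Cycle read -> plan/code -> test -> fix until tests pass or the budget runs out."""
    context = read_context()                        # read: gather relevant code
    feedback = ""                                   # no test output before the first attempt
    for attempt in range(1, max_iterations + 1):
        code = propose_change(context, feedback)    # plan + code
        result = run_tests(code)                    # test
        if result.passed:
            return True, attempt
        feedback = result.output                    # fix: the failure feeds the next attempt
    return False, max_iterations                    # budget exhausted, hand back to a human


def fake_model(context: str, feedback: str) -> str:
    """Stub standing in for a language model (hypothetical)."""
    if "AssertionError" in feedback:
        return "def add(a, b):\n    return a + b\n"
    return "def add(a, b):\n    return a - b\n"      # deliberate bug on the first try


def fake_tests(code: str) -> CheckResult:
    """Stub test runner: executes the candidate code and checks one case."""
    namespace: dict = {}
    exec(code, namespace)
    got = namespace["add"](2, 3)
    if got != 5:
        return CheckResult(False, f"AssertionError: add(2, 3) returned {got}, expected 5")
    return CheckResult(True, "ok")


success, attempts = run_agent_loop(lambda: "def add(a, b): ...", fake_model, fake_tests)
print(success, attempts)  # True 2
```

The `max_iterations` cap is a design choice worth noting: without a budget, a loop like this has no natural stopping point when the fixes stop converging.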
What makes a code agent powerful is that it doesn't just "generate text" — it can invoke tools to interact with real development environments. File systems, terminals, browsers, Git... these tools give the agent true "hands-on" capability.
A code agent's tools are like a developer's IDE + Terminal + Browser — except AI is the one operating them. More tools = more capability.
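One way to picture tool use is as a registry of named functions that the model can ask the host program to call. The sketch below is an assumption about the general shape, not the interface of Claude Code or any other product: the `tool` decorator, the `dispatch` function, and the `{"tool": ..., "args": ...}` call format are all hypothetical.

```python
import subprocess
import sys
import tempfile
from pathlib import Path
from typing import Callable

# Registry mapping tool names to plain Python functions (hypothetical design).
TOOLS: dict[str, Callable[..., str]] = {}


def tool(name: str):
    """Register a function under a name the model can request."""
    def register(fn):
        TOOLS[name] = fn
        return fn
    return register


@tool("read_file")
def read_file(path: str) -> str:
    return Path(path).read_text(encoding="utf-8")


@tool("write_file")
def write_file(path: str, content: str) -> str:
    Path(path).write_text(content, encoding="utf-8")
    return f"wrote {len(content)} characters to {path}"


@tool("run_command")
def run_command(argv: list[str]) -> str:
    # Capture exit code and output so the model can read the result as text.
    done = subprocess.run(argv, capture_output=True, text=True, timeout=30)
    return f"exit={done.returncode}\n{done.stdout}{done.stderr}"


def dispatch(call: dict) -> str:
    """Execute one tool call of the assumed form {"tool": name, "args": {...}}."""
    fn = TOOLS.get(call["tool"])
    if fn is None:
        return f"error: unknown tool {call['tool']!r}"
    return fn(**call["args"])


# Usage: write a small script into a scratch directory, then run it.
with tempfile.TemporaryDirectory() as workdir:
    target = str(Path(workdir) / "hello.py")
    print(dispatch({"tool": "write_file", "args": {"path": target, "content": "print('hi')\n"}}))
    print(dispatch({"tool": "run_command", "args": {"argv": [sys.executable, target]}}))
```

Every tool returns a string because, in this sketch, the result is meant to be appended to the model's context as text. Adding a capability means registering one more function.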
Let's walk through a real example: a user tells Claude Code "Add dark mode to the settings page". The agent works all the way from understanding the requirement to committing the code.
The key moment in a run like this comes when a test fails: the agent doesn't give up. It reads the error message, diagnoses the root cause, fixes the code, and re-runs the tests. This is precisely what distinguishes an "agent" from a "code generator."
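The fail, read, fix, re-run cycle can be reproduced on a toy scale. The sketch below is only an illustration of the mechanism, assuming the agent sees a test run as an exit code plus captured output. The file names, the `next_theme` function, and the hard-coded `FIXED` candidate (standing in for the agent's repair) are all hypothetical and are not taken from an actual Claude Code session.

```python
import subprocess
import sys
import tempfile
from pathlib import Path

TEST_CODE = """\
import unittest
from theme import next_theme

class ThemeTest(unittest.TestCase):
    def test_toggle(self):
        self.assertEqual(next_theme("light"), "dark")

if __name__ == "__main__":
    unittest.main()
"""

BUGGY = 'def next_theme(current):\n    return "light"\n'
FIXED = 'def next_theme(current):\n    return "dark" if current == "light" else "light"\n'


def run_tests(workdir: Path) -> tuple[bool, str]:
    """Run the test file; return (passed, combined output) for the agent to read."""
    done = subprocess.run(
        [sys.executable, "-B", "test_theme.py"],   # -B: skip bytecode caching between runs
        cwd=workdir, capture_output=True, text=True, timeout=60,
    )
    return done.returncode == 0, done.stdout + done.stderr


def demo() -> list[bool]:
    outcomes = []
    with tempfile.TemporaryDirectory() as tmp:
        workdir = Path(tmp)
        (workdir / "test_theme.py").write_text(TEST_CODE, encoding="utf-8")
        for candidate in (BUGGY, FIXED):           # second candidate stands in for the fix
            (workdir / "theme.py").write_text(candidate, encoding="utf-8")
            passed, output = run_tests(workdir)
            outcomes.append(passed)
            if passed:
                break
            # A real agent would place `output` (the traceback) into its next prompt.
    return outcomes


print(demo())  # [False, True]
```

A plain code generator would stop after writing the first candidate. Here the failing exit code and traceback are what drive the second attempt.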
Agentic coding is changing how software is built — the developer's role shifts from writing every line of code to reviewing AI-generated changes.
Claude Code is like a tireless junior developer — you describe the requirement, it writes the code, runs the tests, and fixes the bugs. Your job is to review.
Code agents are powerful, but far from perfect. Understanding their limitations helps you collaborate with them more effectively. At the same time, this field is evolving at breakneck speed.
Agents can get stuck in "fix bug, introduce new bug" loops, wasting tokens and time.
Large codebases can't fit entirely in context, causing the agent to miss important dependencies.
Agents sometimes overwrite working code or make changes beyond the scope of the request.
Running agents safely requires sandboxed execution, permission systems, and human approval for destructive actions.
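Human approval for destructive actions can be as simple as a check that runs before any command is executed. The following is a rough sketch under stated assumptions: the `DESTRUCTIVE` and `DESTRUCTIVE_GIT` sets are an illustrative policy invented for this example, and `needs_approval` and `gate` are hypothetical helpers, not part of any real agent.

```python
import shlex
from typing import Callable

# Hypothetical policy: these commands require a human to approve them first.
DESTRUCTIVE = {"rm", "mv", "dd", "chmod", "chown"}
DESTRUCTIVE_GIT = {"push", "reset", "clean", "rebase"}


def needs_approval(command: str) -> bool:
    """Decide whether a shell command string falls under the approval policy."""
    argv = shlex.split(command)
    if not argv:
        return False
    if argv[0] in DESTRUCTIVE:
        return True
    return argv[0] == "git" and len(argv) > 1 and argv[1] in DESTRUCTIVE_GIT


def gate(command: str, approve: Callable[[str], bool]) -> str:
    """Return 'run' or 'blocked'; this sketch never executes the command itself."""
    if needs_approval(command) and not approve(command):
        return "blocked"
    return "run"


def deny_all(command: str) -> bool:
    """Stand-in for a human reviewer who rejects every request."""
    return False


for cmd in ["git status", "rm -rf build", "git push --force"]:
    print(cmd, "->", gate(cmd, deny_all))
# git status -> run
# rm -rf build -> blocked
# git push --force -> blocked
```

A denylist like this is easy to bypass (for example through a shell wrapper or a script), so it is best read as one layer on top of sandboxed execution rather than a replacement for it.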
Multiple agents working in parallel — one on frontend, one on backend, one on tests.
Agents that learn from past mistakes, gradually reducing repeat errors and becoming more efficient.
From requirements doc to complete project — architecture, code, tests, deployment, all generated.
From "write code" to "review code" and "define requirements" — developers become AI navigators.
AI coding evolved from passive autocomplete to active autonomous agents, each generation granting AI greater autonomy.
Code agents complete tasks through iterative cycles, not one-shot generation.
An agent's power is defined by the tools it can invoke — files, shell, Git, browser.
The developer's core value shifts to requirement definition, architecture decisions, and code review.
Code agents won't replace programmers — they'll replace programmers who don't use code agents.