Code is the ideal domain for agentic systems because verification is cheap and objective. When a human reviews prose, they bring subjective judgment—is this paragraph clear? Is the argument convincing? When a test suite runs against code, it returns pass or fail. This binary signal creates tight feedback loops that drive rapid improvement. A code agent can generate, test, discover failures, fix, and test again in seconds. No human judgment required until the tests pass.

This chapter builds a code agent that uses verification as its primary feedback signal. The agent takes specifications, plans implementations, writes code, runs tests, and iterates until either the tests pass or it exhausts its retry budget. Along the way, it learns patterns that improve future coding tasks.

13.1 Why Code Is Special

Several properties make code uniquely suitable for agentic automation.

Verifiable correctness. Tests provide ground truth. Code either passes or it doesn’t. This eliminates the ambiguity that plagues other domains. A research report might be “good enough” by various subjective standards, but code has clear success criteria.

Structured feedback. Test failures aren’t just binary signals—they explain what went wrong. “Expected 5, got 4 on line 42” tells the agent exactly where to look. Compiler errors pinpoint syntax problems. Type checkers identify interface mismatches. This structured feedback guides iteration effectively.

Incremental verification. Code can be tested at multiple granularities. Unit tests verify individual functions. Integration tests verify component interactions. End-to-end tests verify complete workflows. This hierarchy allows agents to build confidence incrementally.

Rich context. Codebases contain extensive documentation through types, comments, tests, and the code itself. An agent can read existing patterns and match them. Function signatures specify contracts. Test cases demonstrate expected behavior. This context grounds generation in reality.

These properties make code agents highly effective. The challenge isn’t whether agents can write code—they demonstrably can. The challenge is building systems that write code reliably, handle edge cases gracefully, and improve over time.

13.2 System Architecture

The code agent has four main components organized around the verification loop.
┌──────────────────────────────────────────────────────────────────┐
│                            Code Agent                            │
│                                                                  │
│   Specification                                                  │
│        │                                                         │
│        ▼                                                         │
│   ┌─────────┐                                                    │
│   │ Planner │                                                    │
│   └─────────┘                                                    │
│        │                                                         │
│        ▼                                                         │
│   ┌─────────────┐    ┌──────────┐    ┌────────────────┐          │
│   │ Implementer │───▶│ Verifier │───▶│ Error Analyzer │          │
│   └─────────────┘    └──────────┘    └────────────────┘          │
│          ▲                │                  │                   │
│          │                ▼                  │                   │
│          │           ┌─────────┐             │                   │
│          │           │  Pass?  │             │                   │
│          │           └─────────┘             │                   │
│          │        Yes │       │ No           │                   │
│          │            ▼       └──────────────┤                   │
│          │        ┌────────┐                 ▼                   │
│          │        │ Output │               Retry                 │
│          │        └────────┘                 │                   │
│          │                                   │                   │
│          └───────────────────────────────────┘                   │
└──────────────────────────────────────────────────────────────────┘
The Planner analyzes the specification and codebase context to create an implementation plan. It identifies which files need changes, what the approach should be, and what edge cases to consider.

The Implementer writes code according to the plan. It generates or modifies files, following codebase conventions and respecting existing patterns.

The Verifier runs tests and other checks against the implementation. It captures pass/fail status plus detailed error information.

The Error Analyzer processes verification failures to understand what went wrong. It transforms raw error output into actionable guidance for the next implementation attempt.

The loop continues until verification passes or the retry budget is exhausted.

13.3 The Codebase Artifact

Code agents operate on a codebase—a semantic artifact with structure, conventions, and relationships.
interface Codebase {
  rootPath: string;
  language: string;
  files: Map<string, CodeFile>;
  dependencies: Dependency[];
  testFramework: string;
  conventions: CodeConventions;
}

interface CodeFile {
  path: string;
  content: string;
  language: string;
  exports: Export[];
  imports: Import[];
  tests: TestFile | null;  // Associated test file if any
}
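
// Dependency and TestFile are referenced above but not defined elsewhere
// in the chapter; minimal assumed shapes:
interface Dependency {
  name: string;
  version: string;
}

interface TestFile {
  path: string;
  content: string;
}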

interface CodeConventions {
  indentation: 'spaces' | 'tabs';
  indentSize: number;
  namingStyle: 'camelCase' | 'snake_case' | 'PascalCase';
  testPattern: string;  // e.g., "*.test.ts"
  importStyle: 'relative' | 'absolute' | 'alias';
}
To let the agent navigate and modify the project safely, the codebase artifact exposes operations for reading, writing, and searching files while enforcing repository conventions:
export default class CodebaseManager extends AgenticSystem {
  @field codebase: Codebase;

  async readFile(path: string): Promise<string> {
    const file = this.codebase.files.get(path);
    if (!file) {
      throw new Error(`File not found: ${path}`);
    }
    return file.content;
  }

  async writeFile(path: string, content: string) {
    const existing = this.codebase.files.get(path);

    // Validate content matches conventions
    this.validateConventions(content);

    // Update or create file
    this.codebase.files.set(path, {
      path,
      content,
      language: this.detectLanguage(path),
      exports: this.extractExports(content),
      imports: this.extractImports(content),
      tests: existing?.tests || null
    });
  }

  async findRelatedFiles(path: string): Promise<string[]> {
    const file = this.codebase.files.get(path);
    if (!file) return [];

    const related: string[] = [];

    // Find test file
    if (file.tests) {
      related.push(file.tests.path);
    }

    // Find files that import this one
    for (const [otherPath, otherFile] of this.codebase.files) {
      if (otherFile.imports.some(i => i.source === path)) {
        related.push(otherPath);
      }
    }

    // Find files this imports
    for (const imp of file.imports) {
      if (this.codebase.files.has(imp.source)) {
        related.push(imp.source);
      }
    }

    // Dedupe: a file can appear both as an importer and as an import target
    return [...new Set(related)];
  }

  async searchCode(query: string): Promise<SearchResult[]> {
    const results: SearchResult[] = [];

    for (const [path, file] of this.codebase.files) {
      const lines = file.content.split('\n');
      for (let i = 0; i < lines.length; i++) {
        if (lines[i].toLowerCase().includes(query.toLowerCase())) {
          results.push({
            path,
            line: i + 1,
            content: lines[i],
            context: lines.slice(Math.max(0, i - 2), i + 3).join('\n')
          });
        }
      }
    }

    return results;
  }

  private validateConventions(content: string) {
    const conventions = this.codebase.conventions;

    // Check indentation
    const lines = content.split('\n');
    for (const line of lines) {
      const indent = line.match(/^(\s*)/)?.[1] || '';
      if (conventions.indentation === 'spaces' && indent.includes('\t')) {
        throw new Error('Code uses tabs but codebase convention is spaces');
      }
    }
  }

  private detectLanguage(path: string): string {
    const ext = path.split('.').pop()?.toLowerCase();
    const languageMap: Record<string, string> = {
      'ts': 'typescript',
      'tsx': 'typescript',
      'js': 'javascript',
      'jsx': 'javascript',
      'py': 'python',
      'rs': 'rust',
      'go': 'go'
    };
    return languageMap[ext || ''] || 'unknown';
  }

  private extractExports(content: string): Export[] {
    // Simplified - real implementation would use AST parsing
    const exports: Export[] = [];
    const exportRegex = /export\s+(const|function|class|interface|type)\s+(\w+)/g;
    let match;
    while ((match = exportRegex.exec(content)) !== null) {
      exports.push({ name: match[2], type: match[1] as Export['type'] });
    }
    return exports;
  }

  private extractImports(content: string): Import[] {
    const imports: Import[] = [];
    const importRegex = /import\s+.*\s+from\s+['"](.+)['"]/g;
    let match;
    while ((match = importRegex.exec(content)) !== null) {
      imports.push({ source: match[1] });
    }
    return imports;
  }
}

interface Export {
  name: string;
  type: 'const' | 'function' | 'class' | 'interface' | 'type';
}

interface Import {
  source: string;
}

interface SearchResult {
  path: string;
  line: number;
  content: string;
  context: string;
}
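
Even this simplified manager supports the navigation the agent needs. A brief usage sketch, with hypothetical paths and query:

// Hypothetical usage. Assumes a CodebaseManager whose `codebase` field
// has already been populated by scanning a repository.
const manager = new CodebaseManager();

// Read a file, then find everything connected to it: its test file,
// the files that import it, and the files it imports.
const content = await manager.readFile('src/auth/session.ts');
const related = await manager.findRelatedFiles('src/auth/session.ts');

// Search the whole codebase for a symbol; each hit carries a line
// number plus two lines of context on either side.
const hits = await manager.searchCode('validateToken');
for (const hit of hits) {
  console.log(`${hit.path}:${hit.line}  ${hit.content.trim()}`);
}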

13.4 The Verification Loop

The heart of the code agent is the verification loop—generate, test, analyze errors, retry.
export default class CodeAgent extends AgenticSystem {
  @field codebase: CodebaseManager;
  @field thinking: stream<string>('');
  @field code: stream<string>('');
  @field testOutput: stream<string>('');
  @field promptGuidance: { do: string[]; avoid: string[] } = { do: [], avoid: [] };

  async implement(spec: Specification): Promise<ImplementationResult> {
    this.thinking.append(`Starting implementation: ${spec.title}\n`);

    // Plan
    const plan = await this.plan(spec);
    this.thinking.append(`Plan created with ${plan.steps.length} steps\n`);

    // Implementation loop
    let attempt = 0;
    const maxAttempts = 5;
    let lastError: VerificationError | null = null;

    while (attempt < maxAttempts) {
      attempt++;
      this.thinking.append(`\n--- Attempt ${attempt}/${maxAttempts} ---\n`);

      // Generate code
      const code = await this.generateCode(spec, plan, lastError);

      // Apply to codebase
      await this.applyCode(code);
      this.code.set(code.content);

      // Verify
      const verification = await this.verify(spec);
      this.testOutput.set(verification.output);

      if (verification.passed) {
        this.thinking.append(`Verification passed!\n`);
        return {
          success: true,
          code,
          attempts: attempt,
          spec
        };
      }

      // Analyze error for next attempt
      this.thinking.append(`Verification failed: ${verification.summary}\n`);
      lastError = await this.analyzeError(verification, code);
    }

    this.thinking.append(`Max attempts reached. Implementation failed.\n`);
    return {
      success: false,
      code: null,
      attempts: attempt,
      lastError,
      spec
    };
  }

  private async plan(spec: Specification): Promise<ImplementationPlan> {
    // Gather context
    const relevantFiles = await this.findRelevantFiles(spec);
    const existingPatterns = await this.analyzePatterns(relevantFiles);

    const response = await llm.complete([
      { role: 'system', content: `You are an expert software developer.
        Create an implementation plan for the given specification.
        Consider existing code patterns and conventions.
        When possible, follow these best practices:
        ${this.promptGuidance.do.map(p => `- ${p}`).join('\n')}
        
        Avoid these failure patterns:
        ${this.promptGuidance.avoid.map(p => `- ${p}`).join('\n')}` },
      { role: 'user', content: `Specification: ${spec.description}

        Relevant existing code:
        ${relevantFiles.map(f => `--- ${f.path} ---\n${f.content}`).join('\n\n')}

        Codebase patterns:
        ${JSON.stringify(existingPatterns, null, 2)}

        Create a plan with:
        1. Files to create or modify
        2. Implementation approach
        3. Edge cases to handle
        4. Tests to add or modify

        Format your response as JSON:
        {
          "steps": ["..."],
          "files": ["..."],
          "approach": "High-level explanation of the strategy"
        }` }
    ]);

    const content = response.choices[0].message.content;
    return this.parsePlan(content);
  }

  private async generateCode(
    spec: Specification,
    plan: ImplementationPlan,
    previousError: VerificationError | null
  ): Promise<GeneratedCode> {
    const errorContext = previousError
      ? `\n\nPrevious attempt failed with:\n${previousError.message}\n\nSpecifically:\n${previousError.details}\n\nFix this issue in your implementation.`
      : '';

    const response = await llm.complete([
      { role: 'system', content: `You are an expert software developer.
        Generate code that implements the specification.
        Follow the plan and codebase conventions exactly.
        When possible, follow these best practices:
        ${this.promptGuidance.do.map(p => `- ${p}`).join('\n')}
        
        Avoid these failure patterns:
        ${this.promptGuidance.avoid.map(p => `- ${p}`).join('\n')}
        
        ${
          this.codebase.codebase.conventions
            ? `Codebase conventions (JSON): ${JSON.stringify(this.codebase.codebase.conventions)}`
            : ''
        }

        Format your response as strict JSON:
        {
          "files": [
            { "path": "string", "content": "string" }
          ]
        }` },
      { role: 'user', content: `Specification: ${spec.description}

        Plan:
        ${JSON.stringify(plan, null, 2)}

        ${errorContext}

        Generate the complete code. Include all necessary imports.
        Do not include backticks or any extra commentary—only valid JSON matching the specified schema.` }
    ]);

    const content = response.choices[0].message.content;
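    // The prompt demands strict JSON; a production agent would validate
    // the parse and re-prompt on failure rather than let this throw.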
    return JSON.parse(content);
  }

  private async applyCode(code: GeneratedCode) {
    for (const file of code.files) {
      await this.codebase.writeFile(file.path, file.content);
    }
  }

  private async verify(spec: Specification): Promise<VerificationResult> {
    const results: VerificationResult = {
      passed: true,
      output: '',
      failures: [],
      summary: ''
    };

    // Run type checking
    const typeCheck = await this.runTypeCheck();
    results.output += `Type Check:\n${typeCheck.output}\n\n`;
    if (!typeCheck.passed) {
      results.passed = false;
      results.failures.push({ type: 'typecheck', details: typeCheck.output });
    }

    // Run tests
    const testRun = await this.runTests(spec.testPatterns);
    results.output += `Tests:\n${testRun.output}\n\n`;
    if (!testRun.passed) {
      results.passed = false;
      results.failures.push({ type: 'test', details: testRun.output });
    }

    // Run linting (if configured)
    const lint = await this.runLint();
    results.output += `Lint:\n${lint.output}\n\n`;
    if (!lint.passed) {
      results.passed = false;
      results.failures.push({ type: 'lint', details: lint.output });
    }

    results.summary = results.passed
      ? 'All checks passed'
      : `Failed: ${results.failures.map(f => f.type).join(', ')}`;

    return results;
  }

  private async analyzeError(
    verification: VerificationResult,
    code: GeneratedCode
  ): Promise<VerificationError> {
    const response = await llm.complete([
      { role: 'system', content: `Analyze this verification failure.
        Identify the root cause and suggest a specific fix.
        Be precise about what line or logic needs to change.` },
      { role: 'user', content: `Generated code:
        ${code.files.map(f => `--- ${f.path} ---\n${f.content}`).join('\n\n')}

        Verification output:
        ${verification.output}

        What specifically went wrong and how should it be fixed?` }
    ]);

    const content = response.choices[0].message.content;

    return {
      message: verification.summary,
      details: content,
      failures: verification.failures
    };
  }

  @tool({ description: 'Run TypeScript type checking' })
  private async runTypeCheck(): Promise<{ passed: boolean; output: string }> {
    // In a real implementation, this would run the TypeScript compiler
    // against the current codebase and interpret the result.
    //
    // For example, in a Node environment you could use:
    //
    //   const { execa } = await import('execa');
    //   const subprocess = await execa('npx', ['tsc', '--noEmit'], {
    //     cwd: this.codebase.codebase.rootPath,
    //     reject: false  // tsc exits non-zero on type errors; don't throw
    //   });
    //   const passed = subprocess.exitCode === 0;
    //   const output = subprocess.stdout + '\n' + subprocess.stderr;
    //
    //   return { passed, output };
    //
    // The agent then uses `passed` as a boolean signal and `output` as
    // structured text for error analysis.
    return { passed: true, output: 'No type errors' };
  }

  @tool({ description: 'Run tests matching pattern' })
  private async runTests(patterns: string[]): Promise<{ passed: boolean; output: string }> {
    // In a real implementation, this would invoke your test runner CLI
    // with the provided patterns and surface its results.
    //
    // For example, you might run Vitest with JSON reporting, passing the
    // patterns as positional file filters:
    //
    //   const { execa } = await import('execa');
    //   const args = ['vitest', 'run', '--reporter', 'json', ...patterns];
    //   const subprocess = await execa('npx', args, {
    //     cwd: this.codebase.codebase.rootPath,
    //     reject: false  // failing tests exit non-zero; don't throw
    //   });
    //   const passed = subprocess.exitCode === 0;
    //   const output = subprocess.stdout + '\n' + subprocess.stderr;
    //
    //   return { passed, output };
    //
    // The `output` string becomes the raw material for parsing which tests
    // failed and why.
    return { passed: true, output: 'All tests passed' };
  }

  @tool({ description: 'Run linter' })
  private async runLint(): Promise<{ passed: boolean; output: string }> {
    // In a real implementation, this would run the configured linter,
    // mirroring the two tools above. For example, with ESLint:
    //
    //   const { execa } = await import('execa');
    //   const subprocess = await execa('npx', ['eslint', '.'], {
    //     cwd: this.codebase.codebase.rootPath,
    //     reject: false  // lint violations exit non-zero; don't throw
    //   });
    //   const passed = subprocess.exitCode === 0;
    //   const output = subprocess.stdout + '\n' + subprocess.stderr;
    //
    //   return { passed, output };
    return { passed: true, output: 'No lint errors' };
  }

  private async findRelevantFiles(spec: Specification): Promise<CodeFile[]> {
    const relevant: CodeFile[] = [];

    // Search for keywords from spec
    const keywords = this.extractKeywords(spec.description);
    for (const keyword of keywords) {
      const results = await this.codebase.searchCode(keyword);
      for (const result of results) {
        const file = this.codebase.codebase.files.get(result.path);
        if (file && !relevant.includes(file)) {
          relevant.push(file);
        }
      }
    }

    return relevant.slice(0, 10); // Limit context size
  }

  private async analyzePatterns(files: CodeFile[]): Promise<CodePatterns> {
    // Hardcoded for illustration. A real implementation would inspect the
    // given files (or ask the model to) and derive these patterns from them.
    return {
      errorHandling: 'try-catch with specific error types',
      asyncStyle: 'async/await',
      testStyle: 'describe/it blocks with jest',
      importStyle: 'named imports from relative paths'
    };
  }

  private extractKeywords(text: string): string[] {
    // Simple keyword extraction
    return text
      .toLowerCase()
      .split(/\W+/)
      .filter(w => w.length > 3)
      .slice(0, 10);
  }

  private parsePlan(response: string): ImplementationPlan {
    // Parse LLM response into structured plan. The system prompt requires
    // JSON of the form:
    //
    //   {
    //     "steps": ["..."],
    //     "files": ["..."],
    //     "approach": "..."
    //   }
    //
    // To keep the example realistic, we at least attempt to interpret these
    // fields while falling back to sensible defaults when missing.
    try {
      const plan = JSON.parse(response);
      return {
        steps: Array.isArray(plan.steps) ? plan.steps : [],
        files: Array.isArray(plan.files) ? plan.files : [],
        approach: typeof plan.approach === 'string' ? plan.approach : response
      };
    } catch {
      return {
        steps: [],
        files: [],
        approach: response
      };
    }
  }
}

interface Specification {
  title: string;
  description: string;
  testPatterns: string[];
  requirements: string[];
}

interface ImplementationPlan {
  steps: string[];
  files: string[];
  approach: string;
}

interface GeneratedCode {
  files: { path: string; content: string }[];
}

interface VerificationResult {
  passed: boolean;
  output: string;
  failures: { type: string; details: string }[];
  summary: string;
}

interface VerificationError {
  message: string;
  details: string;
  failures: { type: string; details: string }[];
}

interface ImplementationResult {
  success: boolean;
  code: GeneratedCode | null;
  attempts: number;
  lastError?: VerificationError;
  spec: Specification;
}

interface CodePatterns {
  errorHandling: string;
  asyncStyle: string;
  testStyle: string;
  importStyle: string;
}
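
Putting the pieces together, a single task runs end to end as below. This is a minimal sketch with illustrative specification values; how the framework initializes the codebase field is assumed:

// Minimal sketch of driving the verification loop. The specification
// values are illustrative; testPatterns selects which tests verify()
// runs. Assumes the framework has wired up agent.codebase.
const agent = new CodeAgent();

const spec: Specification = {
  title: 'Rate-limit the login endpoint',
  description: 'Reject more than 5 login attempts per minute per IP.',
  testPatterns: ['src/auth/rate-limit.test.ts'],
  requirements: [
    'Return HTTP 429 once the limit is exceeded',
    'Reset the counter after 60 seconds'
  ]
};

const result = await agent.implement(spec);
if (result.success) {
  console.log(`Passed verification in ${result.attempts} attempt(s)`);
} else {
  console.log(`Failed after ${result.attempts} attempts: ${result.lastError?.message}`);
}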

13.5 Error Analysis as Gradient

The quality of error analysis determines how effectively the agent improves between attempts. Raw error output is often verbose and confusing. Good error analysis extracts the actionable signal.
async analyzeErrorDetailed(
  verification: VerificationResult,
  code: GeneratedCode
): Promise<DetailedErrorAnalysis> {
  const analysis: DetailedErrorAnalysis = {
    category: 'unknown',
    rootCause: '',
    location: null,
    suggestedFix: '',
    confidence: 0
  };

  for (const failure of verification.failures) {
    if (failure.type === 'typecheck') {
      const typeError = this.parseTypeError(failure.details);
      if (typeError) {
        analysis.category = 'type';
        analysis.rootCause = typeError.message;
        analysis.location = { file: typeError.file, line: typeError.line };
        analysis.suggestedFix = await this.suggestTypeFix(typeError, code);
        analysis.confidence = 0.9; // Type errors are usually clear
      }
    } else if (failure.type === 'test') {
      const testError = this.parseTestError(failure.details);
      if (testError) {
        analysis.category = 'logic';
        analysis.rootCause = `Expected ${testError.expected}, got ${testError.actual}`;
        analysis.location = { file: testError.testFile, line: testError.line };
        analysis.suggestedFix = await this.suggestLogicFix(testError, code);
        analysis.confidence = 0.7; // Logic errors require more interpretation
      }
    } else if (failure.type === 'lint') {
      const lintError = this.parseLintError(failure.details);
      if (lintError) {
        analysis.category = 'style';
        analysis.rootCause = lintError.rule;
        analysis.location = { file: lintError.file, line: lintError.line };
        analysis.suggestedFix = lintError.fix || 'Apply auto-fix';
        analysis.confidence = 0.95; // Lint errors are very specific
      }
    }
  }

  return analysis;
}

private parseTypeError(output: string): TypeErrorInfo | null {
  // Parse TypeScript error format: "src/file.ts(10,5): error TS2322: ..."
  const match = output.match(/(.+)\((\d+),(\d+)\):\s*error\s+TS(\d+):\s*(.+)/);
  if (match) {
    return {
      file: match[1],
      line: parseInt(match[2]),
      column: parseInt(match[3]),
      code: `TS${match[4]}`,
      message: match[5]
    };
  }
  return null;
}

private parseTestError(output: string): TestErrorInfo | null {
  // Parse Jest error format
  const expectedMatch = output.match(/Expected:\s*(.+)/);
  const actualMatch = output.match(/Received:\s*(.+)/);
  const locationMatch = output.match(/at\s+.+\((.+):(\d+):\d+\)/);

  if (expectedMatch && actualMatch) {
    return {
      expected: expectedMatch[1],
      actual: actualMatch[1],
      testFile: locationMatch?.[1] || 'unknown',
      line: locationMatch ? parseInt(locationMatch[2]) : 0
    };
  }
  return null;
}

private async suggestTypeFix(error: TypeErrorInfo, code: GeneratedCode): Promise<string> {
  const file = code.files.find(f => f.path.includes(error.file));
  if (!file) return 'Check type definitions';

  const lines = file.content.split('\n');
  const errorLine = lines[error.line - 1];

  const response = await llm.complete([
    { role: 'system', content: 'Suggest a specific fix for this TypeScript type error.' },
    { role: 'user', content: `Error: ${error.message}
      Line: ${errorLine}
      Context: ${lines.slice(Math.max(0, error.line - 5), error.line + 5).join('\n')}

      What specific change fixes this?` }
  ]);

  return response.choices[0].message.content;
}
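
The analyzer above also calls two helpers not shown so far: parseLintError and suggestLogicFix. Minimal sketches follow; the lint format assumed here is generic, and real reporter output varies by linter:

interface LintErrorInfo {
  file: string;
  line: number;
  rule: string;
  fix?: string;
}

private parseLintError(output: string): LintErrorInfo | null {
  // Assumed layout: "src/file.ts:10:5 no-unused-vars 'x' is never used".
  // Adapt the regex to your linter's actual reporter format.
  const match = output.match(/(.+?):(\d+):(\d+)\s+([\w/@-]+)\s+(.+)/);
  if (match) {
    return {
      file: match[1],
      line: parseInt(match[2]),
      rule: match[4]
    };
  }
  return null;
}

private async suggestLogicFix(error: TestErrorInfo, code: GeneratedCode): Promise<string> {
  // Mirrors suggestTypeFix: give the model the failed expectation and the
  // generated sources, and ask for a targeted change.
  const response = await llm.complete([
    { role: 'system', content: 'Suggest a specific fix for this failing test.' },
    { role: 'user', content: `Expected: ${error.expected}
      Received: ${error.actual}
      Failing test: ${error.testFile}:${error.line}

      Implementation:
      ${code.files.map(f => `--- ${f.path} ---\n${f.content}`).join('\n\n')}

      What specific change makes the test pass?` }
  ]);

  return response.choices[0].message.content;
}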

interface DetailedErrorAnalysis {
  category: 'type' | 'logic' | 'style' | 'runtime' | 'unknown';
  rootCause: string;
  location: { file: string; line: number } | null;
  suggestedFix: string;
  confidence: number;
}

interface TypeErrorInfo {
  file: string;
  line: number;
  column: number;
  code: string;
  message: string;
}

interface TestErrorInfo {
  expected: string;
  actual: string;
  testFile: string;
  line: number;
}

13.6 Learning from Code Tasks

The code agent improves by learning from both successes and failures. Patterns that lead to passing tests get reinforced; patterns that cause failures get flagged. On later tasks, the agent injects promptGuidance.do as explicit “best practices” in the system prompt (for example, “prefer pure functions and dependency injection”), and includes promptGuidance.avoid as explicit warnings (for example, “do not mutate shared global state in tests”).
@field learnings: CodeLearning[] = [];

async learnFromResult(result: ImplementationResult) {
  const learning: CodeLearning = {
    spec: result.spec,
    success: result.success,
    attempts: result.attempts,
    timestamp: Date.now(),
    insights: []
  };

  if (result.success) {
    // Extract successful patterns
    const patterns = await this.extractSuccessPatterns(result);
    learning.insights.push(...patterns.map(p => ({
      type: 'success' as const,
      pattern: p,
      context: result.spec.description
    })));
  } else if (result.lastError) {
    // Extract failure patterns to avoid
    const antiPatterns = await this.extractFailurePatterns(result);
    learning.insights.push(...antiPatterns.map(p => ({
      type: 'failure' as const,
      pattern: p,
      context: result.spec.description,
      error: result.lastError?.message
    })));
  }

  this.learnings.push(learning);

  // Apply learnings to future prompts
  await this.updatePromptGuidance();
}

private async extractSuccessPatterns(result: ImplementationResult): Promise<string[]> {
  if (!result.code) return [];

  const response = await llm.complete([
    { role: 'system', content: 'Extract reusable patterns from this successful implementation.' },
    { role: 'user', content: `Specification: ${result.spec.description}

      Implementation (passed all tests):
      ${result.code.files.map(f => `--- ${f.path} ---\n${f.content}`).join('\n\n')}

      What patterns or approaches made this implementation successful?
      List them as brief, reusable insights, one per line.` }
  ]);

  const content = response.choices[0].message.content;
  return content.split('\n').filter(line => line.trim().length > 0);
}

private async extractFailurePatterns(result: ImplementationResult): Promise<string[]> {
  const response = await llm.complete([
    { role: 'system', content: 'Identify patterns that led to implementation failure.' },
    { role: 'user', content: `Specification: ${result.spec.description}

      Final error: ${result.lastError?.message}
      Details: ${result.lastError?.details}

      What patterns or approaches should be avoided?
      List them as brief warnings, one per line.` }
  ]);

  const content = response.choices[0].message.content;
  return content.split('\n').filter(line => line.trim().length > 0);
}

private async updatePromptGuidance() {
  // Aggregate recent learnings into prompt guidance that influences
  // future planning and code generation prompts.
  const recentLearnings = this.learnings.slice(-50);

  const successPatterns = recentLearnings
    .flatMap(l => l.insights.filter(i => i.type === 'success'))
    .map(i => i.pattern);

  const failurePatterns = recentLearnings
    .flatMap(l => l.insights.filter(i => i.type === 'failure'))
    .map(i => i.pattern);

  this.promptGuidance = {
    do: [...new Set(successPatterns)].slice(0, 10),
    avoid: [...new Set(failurePatterns)].slice(0, 10)
  };
}

interface CodeLearning {
  spec: Specification;
  success: boolean;
  attempts: number;
  timestamp: number;
  insights: {
    type: 'success' | 'failure';
    pattern: string;
    context: string;
    error?: string;
  }[];
}
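
Wired together, each completed task feeds the next. A short sketch, continuing the example from Section 13.4 and assuming these methods live on CodeAgent alongside implement:

// Sketch of the learning cycle. Every result, pass or fail, updates the
// guidance that future plan and generation prompts will include.
const result = await agent.implement(spec);
await agent.learnFromResult(result);

// promptGuidance now holds up to ten deduplicated practices and ten
// warnings drawn from the last fifty learnings.
console.log(agent.promptGuidance.do);
console.log(agent.promptGuidance.avoid);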

13.7 Handling Stuck States

Not every implementation succeeds. The agent must recognize when it’s stuck and handle failure gracefully.
async implementWithEscalation(spec: Specification): Promise<ImplementationResult | EscalationRequest> {
  const result = await this.implement(spec);

  if (result.success) {
    return result;
  }

  // Analyze why we're stuck
  const stuckAnalysis = await this.analyzeStuckState(result);

  if (stuckAnalysis.canRetryWithChanges) {
    // Try a different approach
    this.thinking.append(`Trying alternative approach: ${stuckAnalysis.alternativeApproach}\n`);
    const alternativeSpec = {
      ...spec,
      description: `${spec.description}\n\nApproach: ${stuckAnalysis.alternativeApproach}`
    };
    return this.implement(alternativeSpec);
  }

  // Escalate to human
  return {
    type: 'escalation',
    spec,
    attempts: result.attempts,
    lastError: result.lastError,
    analysis: stuckAnalysis,
    suggestedActions: stuckAnalysis.humanActions
  };
}

private async analyzeStuckState(result: ImplementationResult): Promise<StuckAnalysis> {
  const response = await llm.complete([
    { role: 'system', content: `Analyze why this implementation failed after ${result.attempts} attempts.
      Determine if an alternative approach might work, or if human intervention is needed.` },
    { role: 'user', content: `Specification: ${result.spec.description}

      Final error: ${result.lastError?.message}
      Details: ${result.lastError?.details}

      Analyze:
      1. Is this a fundamental problem with the specification?
      2. Is there an alternative implementation approach?
      3. What human actions might unblock this?` }
  ]);

  const content = response.choices[0].message.content;

  // Parse analysis
  const canRetry = content.toLowerCase().includes('alternative approach');
  const alternativeMatch = content.match(/alternative approach[:\s]+(.+?)(?:\n|$)/i);

  return {
    reason: content,
    canRetryWithChanges: canRetry,
    alternativeApproach: alternativeMatch?.[1] || null,
    humanActions: this.extractHumanActions(content)
  };
}

private extractHumanActions(analysis: string): string[] {
  const actions: string[] = [];

  if (analysis.includes('specification')) {
    actions.push('Clarify or modify the specification');
  }
  if (analysis.includes('dependency') || analysis.includes('library')) {
    actions.push('Install or update required dependencies');
  }
  if (analysis.includes('permission') || analysis.includes('access')) {
    actions.push('Grant necessary permissions or access');
  }
  if (analysis.includes('unclear') || analysis.includes('ambiguous')) {
    actions.push('Provide more detailed requirements');
  }

  return actions.length > 0 ? actions : ['Review implementation approach'];
}

interface StuckAnalysis {
  reason: string;
  canRetryWithChanges: boolean;
  alternativeApproach: string | null;
  humanActions: string[];
}

interface EscalationRequest {
  type: 'escalation';
  spec: Specification;
  attempts: number;
  lastError?: VerificationError;
  analysis: StuckAnalysis;
  suggestedActions: string[];
}
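
Callers can tell the two outcomes apart through the type discriminant that only EscalationRequest carries. A sketch:

// EscalationRequest has a 'type' field; ImplementationResult does not,
// so an 'in' check narrows the union.
const outcome = await agent.implementWithEscalation(spec);

if ('type' in outcome) {
  console.log('Escalating to a human. Suggested actions:');
  for (const action of outcome.suggestedActions) {
    console.log(`  - ${action}`);
  }
} else if (outcome.success) {
  console.log(`Implemented in ${outcome.attempts} attempt(s)`);
}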

Key Takeaways

  • Code is ideal for agentic automation because verification is cheap and objective
  • The verification loop—generate, test, analyze, retry—drives rapid improvement
  • Structured error feedback (type errors, test failures) provides clear guidance
  • Error analysis transforms raw output into actionable fix suggestions
  • Learning from both successes and failures improves future implementations
  • Recognizing stuck states and escalating gracefully prevents infinite loops

Transition

Chapter 13 showed verification-driven development where tests provide an objective feedback signal and the agent can rely on pass/fail results plus structured error messages. Chapter 14: Customer System addresses a different challenge—handling human-facing interactions where feedback is subjective, quality is harder to measure, and the system must balance automation with appropriate escalation to humans.