Chapter 6: Artifacts

The previous chapter ended with a system that looks almost social. You have multiple agents, each with a role, passing messages and tools and partial results. From the outside, it feels like a team. From the inside, it is still one thing at a time: a model in a context, producing the next token. That picture omits the shared objects that human teams use to coordinate work.

Human teams rarely coordinate by talking only to each other. They work through shared objects: a task board, a draft document, a code repository, a design file. These objects are not just passive data. They remember decisions, constrain what happens next, and make progress visible. When a person leaves the room, the kanban board is still on the wall. When a reviewer goes off‑line, the pull request still exists, with comments and status.

In LLM systems, we tend to forget this. We wire agents together with messages and hope the conversation itself will carry all necessary state. It does not. Messages evaporate as soon as the next prompt is built. The team appears to be collaborating, but the shared object is repeatedly reconstructed from JSON in context windows rather than maintained as a stable, external structure.

In this architecture, you need to decide where the “thing being worked on” lives. If each agent is stateless between calls and messages are transient, some other component must carry the work forward. Your design has to specify where reports, plans, and codebases persist within the system. You can answer “in a database” or “in a JSON field,” but that ducks the real issue. The object is not just bytes; it has structure, constraints, and a lifecycle. It moves from outline to draft to review to final. Tasks move from backlog to in‑progress to done. Code moves from change to review to merge. That evolution is where much of the apparent intelligence lives.

The first design issue is whether to introduce a shared document when agents can already talk to each other. What changes when you stop thinking in terms of messages and start thinking in terms of artifacts?

6.1 Shared Documents Instead of Messages

[DEMO: Two “agents” alternately update a shared report. The UI shows their message exchange on one side and the evolving document on the other. When you kill one agent mid‑run, the messages stop but the document remains, showing accumulated structure and progress.]

You can build a multi‑agent system entirely with messages. A researcher sends bullet points to a writer. The writer sends a draft back. The reviewer sends comments. Each message contains everything the recipient needs at that moment: context, instructions, and some representation of the “current” work. The work itself is not clearly located: it might appear to live in the last message, the union of all prior messages, or a database row that you overwrite on each step.

Introducing a shared document into a system whose components already exchange messages raises questions of its own. If the document holds the state, agents may reduce to pure functions over that state. And if agents are stateless while artifacts persist, you need to be explicit about where you consider the “intelligence” to reside.

Artifacts externalize progress. Messages are transient; artifacts accumulate. When agents modify a shared document, the document captures all progress—if an agent fails, the document remains. Agents become stateless functions that read current state and produce updates. Intelligence is distributed: agents provide reasoning; artifacts provide memory and structure.

A minimal artifact can be represented by a simple data structure:

interface ReportSection {
  id: string;
  heading: string;
  content: string;
  status: 'empty' | 'draft' | 'reviewed';
}

interface ReportArtifact {
  title: string;
  sections: ReportSection[];
  status: 'outline' | 'drafting' | 'review' | 'final';
}

@field report: ReportArtifact = {
  title: '',
  sections: [],
  status: 'outline'
};

Now imagine two agents that never message each other directly. They only read and update this report.

// Researcher: fills in empty sections with findings
async function researcherStep(system: AgenticSystem) {
  const empty = system.report.sections.filter(s => s.status === 'empty');
  if (empty.length === 0) return;

  const section = empty[0];
  const findings = await system.callModel({
    role: 'system',
    content: `You are a researcher. Produce bullet-point findings for: ${section.heading}`
  });

  section.content = findings;
  section.status = 'draft';
}

// Writer: rewrites draft sections into prose
async function writerStep(system: AgenticSystem) {
  const drafts = system.report.sections.filter(s => s.status === 'draft');
  if (drafts.length === 0) return;

  const section = drafts[0];
  const prose = await system.callModel({
    role: 'system',
    content: `You are a writer. Turn these findings into well-written prose:\n\n${section.content}`
  });

  section.content = prose;
  section.status = 'reviewed';
}

No message queues, no “sendToWriter,” no “sendToResearcher.” Each step looks at the current artifact, performs one small transformation, and writes the result back. This design has a few important consequences.

First, progress is localized in one place. If the system crashes after the researcher updates a section and before the writer runs, the report still contains the new content. You do not reconstruct state from a log of messages; the artifact is the state.

Second, agents become stateless. They do not need to remember which sections were processed; they infer that from the artifact itself (status === 'draft' versus status === 'reviewed'). That makes them easier to scale, restart, and reason about.

Third, the “intelligence” you perceive shifts. The model still does the thinking inside each step, but the pattern of steps (the fact that research precedes writing, that only empty sections get filled, that reviewed sections are left alone) is encoded in the artifact’s structure and the operations you define over it.

The illusion of a persistent, thoughtful assistant emerges from this combination. The artifact records past changes, and agents compute the next updates to that state.
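The first two consequences can be made concrete with a small standalone sketch. It is not the chapter's framework: plain functions stand in for agents, string templates stand in for model calls, and JSON.stringify stands in for whatever persistence the system provides. All of those are assumptions made to keep the example runnable. Each step is written as a function from the current artifact to an updated copy, and a simulated crash after one step shows that work resumes from the artifact alone.

```typescript
// Standalone sketch: plain functions and JSON stand in for the chapter's
// AgenticSystem, @field persistence, and callModel (all assumptions here).
type SectionStatus = 'empty' | 'draft' | 'reviewed';

interface ReportSection {
  id: string;
  heading: string;
  content: string;
  status: SectionStatus;
}

interface ReportArtifact {
  title: string;
  sections: ReportSection[];
}

// A step reads the artifact and returns an updated copy, or null if idle.
type Step = (report: ReportArtifact) => ReportArtifact | null;

// Rewrite the first section found in `from` status and move it to `to`.
function transformFirst(
  report: ReportArtifact,
  from: SectionStatus,
  to: SectionStatus,
  rewrite: (s: ReportSection) => string
): ReportArtifact | null {
  const target = report.sections.find(s => s.status === from);
  if (!target) return null;
  const sections = report.sections.map((s): ReportSection =>
    s.id === target.id ? { ...s, content: rewrite(s), status: to } : s
  );
  return { ...report, sections };
}

// Hypothetical stand-ins for the researcher and writer model calls.
const researcher: Step = r =>
  transformFirst(r, 'empty', 'draft', s => `- finding about ${s.heading}`);
const writer: Step = r =>
  transformFirst(r, 'draft', 'reviewed', s => `Prose from: ${s.content}`);

// Cycle through the steps until none has work or the step budget runs out.
function run(report: ReportArtifact, steps: Step[], budget = Infinity): ReportArtifact {
  let current = report;
  let used = 0;
  while (used < budget) {
    let progressed = false;
    for (const step of steps) {
      if (used >= budget) break;
      const next = step(current);
      if (next) {
        current = next;
        used++;
        progressed = true;
      }
    }
    if (!progressed) break;
  }
  return current;
}

const initial: ReportArtifact = {
  title: 'Demo',
  sections: [
    { id: 'a', heading: 'Background', content: '', status: 'empty' },
    { id: 'b', heading: 'Results', content: '', status: 'empty' },
  ],
};

// Simulated crash: stop after one step, persist, then resume from storage.
const saved = JSON.stringify(run(initial, [researcher, writer], 1));
const resumed = run(JSON.parse(saved) as ReportArtifact, [researcher, writer]);
```

Neither step keeps a cursor or a list of processed sections. The resumed run picks up exactly where the saved artifact says the work stands, which is the property the messaging-only design lacks.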

6.2 Structure as Coordination: Avoiding Conflicting Writers

[DEMO: A shared task board with two simulated workers. When the board is a simple list, they step on each other—both pick the same task. When the board has columns, WIP limits, and explicit “claim” operations, they naturally distribute work without messaging.] Once you have a shared document, a new problem appears. If multiple agents can edit it, what stops them from colliding? What prevents the researcher from rewriting a section while the writer is mid‑edit? If the artifact is the source of truth, what happens to each agent’s local understanding when the artifact changes under it? At first glance, this suggests you might need locks, leases, or distributed consensus to coordinate updates. Or, more subtly, it suggests having each agent keep its own local view of the document and reconcile later, which puts you back in the world of messaging and conflict resolution. There is also a more philosophical question: if a task board coordinates many workers without explicit messages, is the board itself starting to look like an agent? It certainly constrains behavior and drives decisions. Where do you draw the line?

Artifacts coordinate through structure. A task board with columns and WIP limits prevents conflicts by design—only one worker can claim a task. Agents don’t maintain local state; they read current artifact state each time. The artifact isn’t an agent (it doesn’t reason), but it encodes coordination logic: what’s allowed, what’s blocked, what transitions are valid.

Consider a basic task board artifact:

interface Task {
  id: string;
  title: string;
  status: 'ready' | 'in-progress' | 'review' | 'done';
  assignee?: string;
}

interface TaskBoardArtifact {
  tasks: Task[];
}

Two naive workers might try to “take” work like this:

async function naiveWorkerStep(system: AgenticSystem, workerId: string) {
  const task = system.taskBoard.tasks.find(t => t.status === 'ready');
  if (!task) return;

  // Race: once workers run in separate processes, or an await lands between
  // the find above and these writes, two workers can grab the same task
  task.assignee = workerId;
  task.status = 'in-progress';

  await doWorkOn(task);
}

You could try to fix this with low-level locking, but that misses the point. The board itself has no concept of “claiming” a task; workers are just mutating fields. You need to move the semantics into the artifact.

interface TaskBoardArtifact {
  tasks: Task[];
}

class TaskBoardSystem extends AgenticSystem {
  @field board: TaskBoardArtifact = { tasks: [] };

  claimNextTask(workerId: string): Task | null {
    const task = this.board.tasks.find(t => t.status === 'ready' && !t.assignee);
    if (!task) return null;

    task.assignee = workerId;
    task.status = 'in-progress';
    return task;
  }

  completeTask(taskId: string, workerId: string): void {
    const task = this.board.tasks.find(t => t.id === taskId);
    if (!task) throw new Error('Task not found');
    if (task.assignee !== workerId) {
      throw new Error('Only the assignee can complete this task');
    }
    task.status = 'done';
  }
}

Workers now interact only through these operations:

async function workerStep(system: TaskBoardSystem, workerId: string) {
  const task = system.claimNextTask(workerId);
  if (!task) return;

  await doWorkOn(task);
  system.completeTask(task.id, workerId);
}

Several things changed:

The artifact’s structure encodes that tasks have status and assignee.
The artifact’s operations encode the valid transitions: only ready, unassigned tasks can be claimed; only the assignee can complete a task.
There is no local worker state about “what I own” that needs reconciling. A worker can always re-derive its responsibilities by scanning the board for tasks with assignee === workerId.
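The third point can be sketched in isolation. The class below is a standalone stand-in for TaskBoardSystem, and inProgressFor is a hypothetical helper name, not something the chapter defines. It shows two workers claiming through the same operation and a "restarted" worker recovering its responsibilities purely by reading the board.

```typescript
// Standalone sketch: a plain class stands in for the chapter's TaskBoardSystem.
type TaskStatus = 'ready' | 'in-progress' | 'review' | 'done';

interface BoardTask {
  id: string;
  title: string;
  status: TaskStatus;
  assignee?: string;
}

class TaskBoard {
  constructor(private tasks: BoardTask[]) {}

  // Claim the first ready, unassigned task. The check and the write happen
  // in one synchronous call, so no other claim can interleave with them.
  claimNextTask(workerId: string): BoardTask | null {
    const task = this.tasks.find(t => t.status === 'ready' && !t.assignee);
    if (!task) return null;
    task.assignee = workerId;
    task.status = 'in-progress';
    return task;
  }

  // Hypothetical helper: everything a worker currently owns, read from the board.
  inProgressFor(workerId: string): BoardTask[] {
    return this.tasks.filter(
      t => t.assignee === workerId && t.status === 'in-progress'
    );
  }
}

const board = new TaskBoard([
  { id: 't1', title: 'Collect sources', status: 'ready' },
  { id: 't2', title: 'Draft summary', status: 'ready' },
]);

const first = board.claimNextTask('worker-a');  // claims t1
const second = board.claimNextTask('worker-b'); // claims t2, never t1
const third = board.claimNextTask('worker-a');  // null: nothing left to claim

// A restarted worker-a holds no memory; it asks the board what it owns.
const owned = board.inProgressFor('worker-a').map(t => t.id);
```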

The board is not an agent. It does not decide which worker should act. But it does constrain what counts as a valid action. That is coordination logic, moved out of prompts and into code. You can go further by encoding constraints like WIP (work in progress) limits into the artifact:

interface Column {
  name: string;
  tasks: Task[];
  wipLimit?: number;
}

interface KanbanBoardArtifact {
  columns: Column[];
  assignments: Record<string, string>; // taskId -> workerId
}

class KanbanBoardSystem extends AgenticSystem {
  @field board: KanbanBoardArtifact = {
    columns: [
      { name: 'ready', tasks: [] },
      { name: 'in-progress', tasks: [], wipLimit: 3 },
      { name: 'done', tasks: [] }
    ],
    assignments: {}
  };

  claimTask(workerId: string): Task | null {
    const readyColumn = this.getColumn('ready');
    const inProgress = this.getColumn('in-progress');

    if (inProgress.wipLimit !== undefined &&
        inProgress.tasks.length >= inProgress.wipLimit) {
      return null; // Board enforces WIP; no new claims
    }

    const task = readyColumn.tasks.shift();
    if (!task) return null;

    this.board.assignments[task.id] = workerId;
    inProgress.tasks.push(task);
    return task;
  }
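
  // Look up a column by name; fail loudly if the board is misconfigured
  private getColumn(name: string): Column {
    const column = this.board.columns.find(c => c.name === name);
    if (!column) throw new Error(`Unknown column: ${name}`);
    return column;
  }
}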

Now conflicts are avoided not by careful prompt engineering but by the shape of the artifact. As long as every change goes through these operations, and each operation runs to completion without interleaving (as a synchronous method does on a single-threaded event loop, or as a database transaction does across processes), two workers cannot “own” the same task or exceed the WIP limit. Agents remain simple: they repeatedly read the current board, ask for a claim, do work, and update. The artifact remains simple as well: a data structure plus a small set of operations that define what “claim,” “move,” and “complete” mean. The interesting behavior emerges from their interaction.
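The listing above shows only the claim operation. As a rough sketch of how a matching complete operation might look, the standalone class below adds completeTask, which moves a task to the done column and thereby frees a WIP slot. The class name KanbanBoard, the completeTask signature, and the choice to keep the assignment record after completion are assumptions for illustration, not the chapter's API.

```typescript
// Standalone sketch; KanbanBoard and completeTask are assumptions layered
// on the chapter's Column and claimTask shapes.
interface KanbanTask {
  id: string;
  title: string;
}

interface KanbanColumn {
  name: string;
  tasks: KanbanTask[];
  wipLimit?: number;
}

class KanbanBoard {
  private assignments: Record<string, string> = {}; // taskId -> workerId

  constructor(private columns: KanbanColumn[]) {}

  private getColumn(name: string): KanbanColumn {
    const column = this.columns.find(c => c.name === name);
    if (!column) throw new Error(`Unknown column: ${name}`);
    return column;
  }

  claimTask(workerId: string): KanbanTask | null {
    const ready = this.getColumn('ready');
    const inProgress = this.getColumn('in-progress');
    if (inProgress.wipLimit !== undefined &&
        inProgress.tasks.length >= inProgress.wipLimit) {
      return null; // WIP limit reached; no new claims
    }
    const task = ready.tasks.shift();
    if (!task) return null;
    this.assignments[task.id] = workerId;
    inProgress.tasks.push(task);
    return task;
  }

  // Only the assigned worker may complete; completion frees a WIP slot.
  completeTask(taskId: string, workerId: string): void {
    if (this.assignments[taskId] !== workerId) {
      throw new Error('Only the assignee can complete this task');
    }
    const inProgress = this.getColumn('in-progress');
    const task = inProgress.tasks.find(t => t.id === taskId);
    if (!task) throw new Error('Task is not in progress');
    inProgress.tasks = inProgress.tasks.filter(t => t.id !== taskId);
    this.getColumn('done').tasks.push(task);
  }
}

const kanban = new KanbanBoard([
  {
    name: 'ready',
    tasks: [
      { id: 't1', title: 'Outline' },
      { id: 't2', title: 'Draft' },
      { id: 't3', title: 'Edit' },
    ],
  },
  { name: 'in-progress', tasks: [], wipLimit: 2 },
  { name: 'done', tasks: [] },
]);

const c1 = kanban.claimTask('w1');      // t1
const c2 = kanban.claimTask('w2');      // t2
const blocked = kanban.claimTask('w3'); // null: the limit of 2 is reached
kanban.completeTask('t1', 'w1');        // frees one slot
const c3 = kanban.claimTask('w3');      // t3
```

The worker that was turned away does not need to be told that a slot opened. It simply asks again, and the board's current state answers.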

6.3 Designing Artifacts That Capture Workflow

[DEMO: A document builder where you can change the artifact schema on the fly. When the document is a plain string, agents constantly break it. As you add sections, per‑section status, and lifecycle states, the same agents suddenly behave “intelligently” because their actions become constrained and meaningful.] Once you treat artifacts as coordination surfaces, you need criteria for what counts as good artifact design. It is easy to add fields and call it structure. It is harder to encode domain semantics in a way that actually helps your agents. If structure constrains what is representable, you need to understand how that helps rather than limits your system. If operations encode domain semantics, you need to decide what domain knowledge you are actually capturing. If lifecycle states gate which operations are valid, you need to specify who enforces those gates and where that enforcement lives. You can answer “the code enforces it,” but that is too vague. You want a systematic way to go from “how work happens in this domain” to “what the artifact looks like and what you can do with it.”

Artifacts encode domain workflow. Structure defines what exists (sections, tasks, findings). Operations define what you can do (addSection, claimTask). Lifecycle defines valid states and transitions (draft → review → final). Good artifact design captures the actual workflow of the domain—agents operate through meaningful actions rather than raw data manipulation.

Take a document artifact as a concrete pattern.

interface Section {
  id: string;
  heading: string;
  content: string;
  status: 'empty' | 'draft' | 'reviewed' | 'approved';
}

interface DocumentArtifact {
  title: string;
  sections: Section[];
  status: 'outline' | 'drafting' | 'review' | 'final';
}

The structure alone tells you a lot:

Documents have a title and sections.
Sections have headings, content, and their own status.
The document itself also has a status, separate from its sections.

Now, define operations that match how people actually work with documents:

class DocumentSystem extends AgenticSystem {
  @field doc: DocumentArtifact = {
    title: '',
    sections: [],
    status: 'outline'
  };

  addSection(heading: string): string {
    if (this.doc.status === 'final') {
      throw new Error('Cannot modify a finalized document');
    }

    const section: Section = {
      id: crypto.randomUUID(),
      heading,
      content: '',
      status: 'empty'
    };

    this.doc.sections.push(section);
    return section.id;
  }

  editSection(sectionId: string, newContent: string): void {
    this.requireEditable();

    const section = this.findSection(sectionId);
    section.content = newContent;
    section.status = 'draft'; // Any edit returns to draft
  }

  markReviewed(sectionId: string): void {
    this.requireReviewing();

    const section = this.findSection(sectionId);
    if (section.status !== 'draft') {
      throw new Error('Only draft sections can be reviewed');
    }
    section.status = 'reviewed';
  }

  approveSection(sectionId: string): void {
    this.requireReviewing();

    const section = this.findSection(sectionId);
    if (section.status !== 'reviewed') {
      throw new Error('Only reviewed sections can be approved');
    }
    section.status = 'approved';
  }

  advanceLifecycle(): void {
    if (this.doc.status === 'outline') {
      if (this.doc.sections.length === 0) {
        throw new Error('Cannot start drafting without sections');
      }
      this.doc.status = 'drafting';
    } else if (this.doc.status === 'drafting') {
      const anyEmpty = this.doc.sections.some(s => s.status === 'empty');
      if (anyEmpty) {
        throw new Error('Cannot move to review with empty sections');
      }
      this.doc.status = 'review';
    } else if (this.doc.status === 'review') {
      const unapproved = this.doc.sections.some(s => s.status !== 'approved');
      if (unapproved) {
        throw new Error('Cannot finalize while sections are unapproved');
      }
      this.doc.status = 'final';
    }
  }

  private requireEditable() {
    if (this.doc.status === 'final') {
      throw new Error('Document is finalized');
    }
  }

  private requireReviewing() {
    if (this.doc.status !== 'review') {
      throw new Error('Can only review sections during review stage');
    }
  }

  private findSection(id: string): Section {
    const section = this.doc.sections.find(s => s.id === id);
    if (!section) throw new Error('Section not found');
    return section;
  }
}

Notice how each part of the design corresponds to pieces of the workflow:

Structure: Sections, section statuses, document status.
Operations: Add, edit, review, approve, advance lifecycle.
Lifecycle: Outline → drafting → review → final, with explicit rules at each step.
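One way to make the lifecycle row easier to audit is to express it as data rather than as an if/else chain. The sketch below is an alternative formulation, not the chapter's implementation: the names lifecycle, LifecycleRule, DocSnapshot, and advance are hypothetical, while the gate conditions and error messages are copied from advanceLifecycle above. Like that method, it leaves a final document unchanged.

```typescript
// Sketch only: the table and helper names below are assumptions,
// not part of the chapter's DocumentSystem.
type DocStatus = 'outline' | 'drafting' | 'review' | 'final';
type SectionStatus = 'empty' | 'draft' | 'reviewed' | 'approved';

interface DocSnapshot {
  status: DocStatus;
  sectionStatuses: SectionStatus[];
}

interface LifecycleRule {
  next: DocStatus;
  // Returns an error message when the gate is closed, or null when open.
  blockedBy: (doc: DocSnapshot) => string | null;
}

// One row per state: where it leads and what must hold before leaving it.
const lifecycle: Record<DocStatus, LifecycleRule | null> = {
  outline: {
    next: 'drafting',
    blockedBy: d =>
      d.sectionStatuses.length === 0
        ? 'Cannot start drafting without sections'
        : null,
  },
  drafting: {
    next: 'review',
    blockedBy: d =>
      d.sectionStatuses.some(s => s === 'empty')
        ? 'Cannot move to review with empty sections'
        : null,
  },
  review: {
    next: 'final',
    blockedBy: d =>
      d.sectionStatuses.some(s => s !== 'approved')
        ? 'Cannot finalize while sections are unapproved'
        : null,
  },
  final: null, // terminal state
};

function advance(doc: DocSnapshot): DocSnapshot {
  const rule = lifecycle[doc.status];
  if (!rule) return doc; // already final: nothing to advance
  const problem = rule.blockedBy(doc);
  if (problem) throw new Error(problem);
  return { ...doc, status: rule.next };
}

const drafted = advance({ status: 'outline', sectionStatuses: ['empty'] });
// drafted.status is 'drafting'
```

With the rules in a table, adding a state means adding a row, and the full set of transitions can be read at a glance or rendered into documentation.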

An agent that works with this artifact does not need to internalize the entire workflow in its prompt. It just needs to know which operations are available. For example, a “drafting agent” might run this loop:

async function draftingAgentStep(system: DocumentSystem) {
  if (system.doc.status !== 'drafting') return;

  const emptySections = system.doc.sections.filter(s => s.status === 'empty');
  if (emptySections.length === 0) return;

  const section = emptySections[0];
  const content = await system.callModel({
    role: 'system',
    content: `Write a detailed section for heading: ${section.heading}`
  });

  system.editSection(section.id, content);
}

The agent is simple. It relies on the artifact to tell it what’s left to do (status === 'empty') and on the artifact’s operations to enforce invariants. The rule that any edit returns a section to draft, for example, lives in editSection; if you later change that rule, you change it in one place: the artifact. Good artifact design is similar to modeling a small application. You are deciding:

Which entities exist (sections, tasks, findings).
Which actions are meaningful in that domain (approve, escalate, claim, merge).
Which states the entities can occupy and how they move between them.

That design will outlive any particular agent; you can swap out prompts or models and the workflow remains.
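If an agent only needs to know which operations are available, one option is to compute that list from the artifact's current state and place it in the prompt. The sketch below assumes two hypothetical helpers, availableSectionOps and describeOps; the conditions mirror the guards in the DocumentSystem listing above, but the helpers themselves are not part of it.

```typescript
// Sketch only: availableSectionOps and describeOps are hypothetical helpers
// that mirror the guards in the chapter's DocumentSystem.
type DocStatus = 'outline' | 'drafting' | 'review' | 'final';
type SectionStatus = 'empty' | 'draft' | 'reviewed' | 'approved';
type SectionOp = 'editSection' | 'markReviewed' | 'approveSection';

// List the section-level operations the artifact would accept right now.
function availableSectionOps(
  docStatus: DocStatus,
  sectionStatus: SectionStatus
): SectionOp[] {
  const ops: SectionOp[] = [];
  if (docStatus !== 'final') ops.push('editSection');
  if (docStatus === 'review' && sectionStatus === 'draft') ops.push('markReviewed');
  if (docStatus === 'review' && sectionStatus === 'reviewed') ops.push('approveSection');
  return ops;
}

// Turn the list into a line that can be appended to an agent's prompt.
function describeOps(heading: string, ops: SectionOp[]): string {
  return ops.length === 0
    ? `Section "${heading}" accepts no operations right now.`
    : `Section "${heading}" accepts: ${ops.join(', ')}.`;
}

const promptLine = describeOps('Methods', availableSectionOps('review', 'draft'));
// promptLine: Section "Methods" accepts: editSection, markReviewed.
```

The caveat is that this duplicates the guard conditions. In a larger system you would probably want the operations and this listing to derive from a single source, so the prompt cannot drift from what the artifact actually enforces.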

6.4 When to Use Artifacts (and When Not To)

[DEMO: A control that lets you switch between “single‑pass generation” and “artifact‑based incremental generation” for creating a multi‑section report. For simple one‑paragraph responses, the single pass wins. For a complex, multi‑section report with review steps, the artifact path produces more coherent, inspectable progress.] With all this structure on the table, you need to decide which parts of your system should be modeled as artifacts. If the artifact captures all progress, what work is left for explicit coordination mechanisms? If the document is the product, why not have a single agent write it in one shot? And if an artifact has structure, operations, and lifecycle, what makes one artifact design better than another? You should not model every piece of state as an artifact.

Use artifacts when the output is complex, multi-part, and benefits from incremental progress. A report with sections, a codebase with files, a plan with steps—these naturally fit artifact-centric design. Single-pass generation works for simple outputs; artifacts work for outputs that accumulate through multiple contributions and revisions.

A simple, direct generation might look like this:

async function generateOneShotReport(system: AgenticSystem, topic: string): Promise<string> {
  const response = await system.callModel({
    role: 'system',
    content: `Write a comprehensive report on: ${topic}`
  });

  return response;
}

For questions like “Explain quicksort” or “Summarize this article,” this is entirely appropriate. The output is a single blob of text, produced in one pass, with no intermediate structure that needs sharing. Compare that with an artifact‑based approach:

async function generateReportWithArtifact(system: DocumentSystem, topic: string) {
  // Step 1: Outline
  const headings = await system.callModel({
    role: 'system',
    content: `Propose 5-7 section headings for a report on: ${topic}`
  });

  const lines = headings.split('\n').filter(Boolean);
  for (const line of lines) {
    system.addSection(line.replace(/^\d+[\.\)]\s*/, '').trim());
  }
  system.advanceLifecycle(); // outline → drafting; throws if no sections were added

  // Step 2: Draft each section over time
  while (system.doc.sections.some(s => s.status === 'empty')) {
    await draftingAgentStep(system);
  }

  // Step 3: Move to review and run a reviewing agent
  system.advanceLifecycle(); // drafting → review; throws if any section is still empty

  while (system.doc.sections.some(s => s.status !== 'approved')) {
    await reviewAgentStep(system);
  }

  // Step 4: Finalize
  system.advanceLifecycle();
}
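The loop above calls reviewAgentStep, which the chapter does not show. A minimal sketch of what it might look like follows. The ReviewTarget interface is a hypothetical slice of DocumentSystem, declared here only so the example stands alone, and the APPROVE/REVISE reply convention is an assumption about how the model is prompted, not something the chapter specifies.

```typescript
// Sketch of a possible reviewAgentStep. ReviewTarget is a hypothetical
// slice of DocumentSystem; the APPROVE/REVISE protocol is an assumption.
type SectionStatus = 'empty' | 'draft' | 'reviewed' | 'approved';

interface ReviewSection {
  id: string;
  heading: string;
  content: string;
  status: SectionStatus;
}

interface ReviewTarget {
  doc: { status: string; sections: ReviewSection[] };
  callModel(message: { role: string; content: string }): Promise<string>;
  editSection(sectionId: string, newContent: string): void;
  markReviewed(sectionId: string): void;
  approveSection(sectionId: string): void;
}

async function reviewAgentStep(system: ReviewTarget): Promise<void> {
  if (system.doc.status !== 'review') return;

  // Reviewed sections only need sign-off; no model call is made for them.
  const reviewed = system.doc.sections.find(s => s.status === 'reviewed');
  if (reviewed) {
    system.approveSection(reviewed.id);
    return;
  }

  const draft = system.doc.sections.find(s => s.status === 'draft');
  if (!draft) return;

  const reply = await system.callModel({
    role: 'system',
    content:
      `Review the section "${draft.heading}". Reply APPROVE if it is ready, ` +
      `or REVISE followed by a corrected version.\n\n${draft.content}`
  });

  if (reply.trim().startsWith('APPROVE')) {
    system.markReviewed(draft.id);
  } else {
    // editSection leaves the section in draft, so a later step reviews it again.
    system.editSection(draft.id, reply.replace(/^\s*REVISE\s*/, ''));
  }
}
```

Two design choices here are worth questioning in a real system. Approval is automatic once a section is reviewed, where you might prefer a separate agent or a human. And a model that keeps answering REVISE would keep the surrounding while loop running, so a retry cap per section is probably worth adding.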

This is overkill for a one‑paragraph answer. It shines when:

The output is naturally divided into parts that can be worked on independently (sections, files, tasks).
Multiple agents, or multiple runs of the same agent, will contribute over time.
You care about partial results, intermediate states, and auditability.
You expect the workflow to evolve, and you want the evolution expressed in code, not buried in prompts.

You can also mix approaches. A “simple Q&A” path returns a string directly. A “project” path instantiates an artifact and a set of agents that work on it over hours or days. A useful heuristic:

If the question you are answering fits comfortably in one context window and one sentence like “return X,” use single‑pass generation.
If you find yourself wishing you could “come back later and keep working on this thing,” you probably want an artifact.
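The heuristic and the list of conditions above can be written down as a small routing function. This is only an illustration: WorkRequest and its fields are hypothetical ways to describe an incoming request, and where you draw each line will depend on your system.

```typescript
// Sketch only: WorkRequest and its fields are hypothetical; the chapter
// does not define how a request describes itself.
interface WorkRequest {
  parts: number;        // independent pieces of output expected
  contributors: number; // agents or runs that will touch the output
  needsReview: boolean; // whether intermediate states must be inspectable
  resumable: boolean;   // whether someone will come back and keep working
}

type Strategy = 'single-pass' | 'artifact';

function chooseStrategy(req: WorkRequest): Strategy {
  const incremental = req.parts > 1 || req.contributors > 1;
  return incremental || req.needsReview || req.resumable
    ? 'artifact'
    : 'single-pass';
}

const quickAnswer = chooseStrategy({
  parts: 1, contributors: 1, needsReview: false, resumable: false,
}); // 'single-pass'

const longReport = chooseStrategy({
  parts: 6, contributors: 2, needsReview: true, resumable: true,
}); // 'artifact'
```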

Artifacts vs. plain state

It is also worth distinguishing artifacts from plain state. Not all persistent data in your system is an artifact. Plain state:

@field lastHeartbeatAt: number = 0;
@field pendingJobIds: string[] = [];
@field retryCount: number = 0;

These fields help the system run. They do not represent a domain object that multiple agents collaborate on. You will rarely build a UI around them. Artifacts:

@field report: DocumentArtifact;
@field board: TaskBoardArtifact;
@field research: ResearchArtifact;

These are the domain objects. They are what the user cares about. They are what your agents discuss in natural language. They are the things that have lifecycles and histories. Reserve explicit lifecycles and artifact modeling for state that represents user-facing work products or multi-step workflows: the things that need to be shared, evolved, and inspected in their own right. Forcing every bit of state into artifact form will slow you down and blur this distinction.

Bridge to Chapter 7

Artifacts give your agents something concrete to work on together. Instead of flinging messages back and forth and hoping that context windows remain coherent, you plant a shared object in the middle of the system and let agents orbit it. The illusion of a persistent assistant emerges from the artifact’s structure and lifecycle as much as from the model’s next‑token predictions. Coordination (Chapter 5) and artifacts (this chapter) now give you two complementary tools: explicit routing between agents, and implicit coordination through shared objects. So far, though, everything has been reactive. Agents wake up when you call them. Systems move when you push them. Some work should not wait for a user click or a manual trigger. A monitoring system should run checks on its own schedule. A research system should follow up on leads overnight. A workflow should advance when its artifact’s state allows it, not when a human operator remembers to press “next.” The next chapter looks at autonomy: how to make these systems initiate action, maintain long‑running goals, and keep working even when nobody is watching. The question shifts from “how do I structure this work?” to “who decides when to take the next step?”

Elements of Agentic System Design
