Flagship case study / 2026

Shipped

Transforming a Design System into a Living Knowledge Platform

Our AI could access the entire codebase, but it still couldn’t make design decisions. I built a machine-readable knowledge layer that helps agents choose components, patterns, and flows and make UX decisions across design and development workflows.

Role: [Lead Product Designer]
Focus: Design Systems + AI
Duration: [duration needed]
Outcome: Knowledge platform

Documentation → Knowledge → Decisions

The project was not a chatbot or a single MCP (Model Context Protocol) server. It was a shift from publishing information to operationalizing design judgment.

01 / The gap

AI understood the code. Not the design.

The company already had an official MCP that could navigate the codebase, find components, and explain their APIs. That solved access to implementation. It did not solve design judgment.

An agent could discover that a dialog component existed, inspect its props, and generate valid code. It still could not tell whether a dialog was the right interaction, when a destructive action needed an extra confirmation step, what should happen after failure, or which exception had already been approved. Those answers lived across long guides, examples, review conversations, and the memories of design-system contributors.

Code-aware MCP

How to implement it

Component names and source code
Properties, types, and examples
Existing implementation references
Technical constraints

Design knowledge layer

What should be built

Patterns and complete flows
Semantic rules and principles
Required states and edge cases
Exceptions and decision rationale

The two systems were complementary: implementation context made answers executable; design context made them intentional.

This changed the problem definition. The goal was not to duplicate the official MCP or build a better search box. It was to model the layer of knowledge that turns available components into coherent product decisions.

02 / Maturity

Design systems keep evolving.

The familiar maturity story ends with documentation. In practice, documentation was only another passive layer people could ignore, misread, or never discover.

Components

Reusable interface building blocks.

Tokens

Shared visual decisions encoded as variables.

Documentation

Guidance that explains APIs and usage.

Patterns

Repeatable solutions to product problems.

Flows

Complete scenarios, states, and recovery paths.

Agent-ready knowledge

Structured judgment that software can retrieve and apply.

Each stage preserves the previous one, but moves the system closer to reusable judgment rather than reusable pixels.

Components made interfaces reusable. Tokens made visual decisions reusable. Documentation explained how both worked. Patterns and flows then captured larger product decisions: not only which control to use, but how an entire scenario should guide a user through uncertainty, risk, error, and recovery.

The AI layer was the next maturity step. It adapted those assets for a new class of consumer: agents that need explicit, retrievable, bounded instructions. The design system became capable of participating in decisions instead of waiting for a person to open a documentation page.

03 / Knowledge model

Components were only one layer of the system.

To make design knowledge useful to agents, I separated it by the kind of decision it could support. This prevented every question from loading the entire design system into context.

Components

Button, dialog, navigation, form field

Patterns

Search, filtering, confirmation, progressive disclosure

Flows

Onboarding, creation, deletion, recovery

States

Empty, loading, error, success, overflow

Principles

Hierarchy, clarity, interruption cost

Exceptions

Approved deviations and the reason they exist

This distinction matters because a correct component can still create the wrong experience. A deletion journey is not solved by finding the destructive button. It includes consequence copy, confirmation, permissions, progress, failure, recovery, and the state left behind. Similarly, onboarding is not a collection of tooltips; it is a flow shaped by progressive disclosure and the user’s current level of context.
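
A minimal sketch of this separation, assuming each piece of knowledge is tagged with the layer it belongs to. The `Layer`, `Entry`, and `retrieve` names and the sample entries are hypothetical illustrations, not the shipped schema. The point is that a question about a flow loads only the relevant layers instead of the whole design system.

```python
from dataclasses import dataclass
from enum import Enum


class Layer(Enum):
    COMPONENT = "component"
    PATTERN = "pattern"
    FLOW = "flow"
    STATE = "state"
    PRINCIPLE = "principle"
    EXCEPTION = "exception"


@dataclass(frozen=True)
class Entry:
    layer: Layer
    name: str
    guidance: str


# Hypothetical sample entries; real content would come from the approved source.
KNOWLEDGE = [
    Entry(Layer.COMPONENT, "dialog", "Modal surface for a blocking decision."),
    Entry(Layer.PATTERN, "confirmation", "Ask before an irreversible action."),
    Entry(Layer.FLOW, "deletion", "Impact, confirmation, progress, failure, recovery."),
    Entry(Layer.STATE, "error", "Explain what failed and how to retry."),
    Entry(Layer.PRINCIPLE, "interruption cost", "Interrupt only when the risk justifies it."),
    Entry(Layer.EXCEPTION, "sandbox delete", "Approved deviation with a recorded reason."),
]


def retrieve(layers: set[Layer]) -> list[Entry]:
    """Return only the entries in the requested layers, not the whole system."""
    return [entry for entry in KNOWLEDGE if entry.layer in layers]


# A flow-level question pulls flows and states, and nothing else.
flow_context = retrieve({Layer.FLOW, Layer.STATE})
print([entry.name for entry in flow_context])
```

Filtering by layer is the simplest possible retrieval rule. A real system would likely combine it with search or routing instructions, but the cost argument is the same: the agent pays only for the kind of decision it is making.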

04 / Platform

MCP was the first interface, not the final product.

The durable asset was a shared knowledge layer. MCP exposed it to agents, while skills, chat, Figma, Slack, and a future CLI could reuse the same rules without creating new knowledge silos.

01 / Shipped

Knowledge layer

Components, patterns, flows, principles, and exceptions.

02 / Shipped

MCP interface

Typed access to design-system knowledge for development agents.

03 / Experiment

Point-of-work clients

Chat and design-tool prototypes reuse the same source.

04 / Next

CLI and wider distribution

Cheaper retrieval and more specialized workflows.

Current status is explicit: the core knowledge and MCP shipped; point-of-work clients were explored; lower-cost distribution remains the next platform step.

Shipped

Structured design-system knowledge, typed MCP tools, and task-specific guidance available to development agents.

Experiment

In-context question answering for designers and team communication surfaces, tested as prototypes or partial flows.

Next

Skills, routing instructions, evaluation suites, and a focused CLI to reduce repeated context and token consumption.

This platform framing also reduced bus-factor risk: knowledge no longer lived only in the memories of a few design-system contributors. Routine questions could be answered from approved knowledge at the point of work. Ambiguous or policy-level decisions could still escalate to a design-system owner with the relevant context attached. The system scaled access to expertise without pretending that every design decision could be automated.
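
The answer-or-escalate split can be sketched as a small routing function. This is an illustration under stated assumptions: the `Answer` shape, the `route` function, and the confidence threshold are hypothetical, and a real system would tune the threshold against evaluation data.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Answer:
    text: str
    source: str        # where the approved guidance lives
    confidence: float  # 0.0 to 1.0, a hypothetical retrieval score


# Assumed value for illustration only.
ESCALATION_THRESHOLD = 0.7


def route(question: str, answer: Optional[Answer], policy_level: bool) -> dict:
    """Answer routine questions directly; escalate ambiguous or policy-level ones."""
    if answer is None or policy_level or answer.confidence < ESCALATION_THRESHOLD:
        # Escalations carry their context so the owner does not start from zero.
        return {
            "action": "escalate",
            "to": "design-system owner",
            "context": {"question": question, "draft": answer},
        }
    return {"action": "answer", "text": answer.text, "source": answer.source}


routine = route(
    "Which button style confirms a save?",
    Answer("Use the primary button.", "components/button", 0.9),
    policy_level=False,
)
print(routine["action"])
```

The important design choice is that escalation is a first-class result, not an error path. Policy-level questions escalate even when retrieval is confident.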

05 / Tool design

One goal. Three tools.

Icon selection exposed an important AI product-design principle: the same user goal can require different levels of certainty, context, and cost.

recommend_icon

Turns an intent such as “export data” into a short semantic candidate list.

Context cost

Low

Use when

The agent needs direction but has little product context.

match_icon

Checks whether a candidate already exists and identifies the closest approved asset.

Context cost

Medium

Use when

A likely symbol is known and duplication must be avoided.

select_icon

Makes a final contextual choice and returns rationale, constraints, and usage notes.

Context cost

High

Use when

Meaning depends on workflow, neighboring actions, or exceptions.

A single “find an icon” tool looked simpler, but it forced the agent to pay for deep context even when a lightweight semantic suggestion was enough. Splitting the workflow made the trade-off explicit. The agent could stop after a recommendation, verify that an asset existed, or spend more context on a final decision only when the product situation demanded it.
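
A minimal sketch of the three-stage split, assuming a tiny in-memory icon library. The function bodies, the `APPROVED_ICONS` set, and the intent map are hypothetical stand-ins for the real tools. Only the tool names and their roles come from the project.

```python
# Hypothetical data: an approved library and a small intent-to-candidates map.
APPROVED_ICONS = {"Download", "Save"}
SEMANTIC_CANDIDATES = {"export data": ["Download", "FileDown", "Export"]}


def recommend_icon(intent: str) -> list[str]:
    """Low context cost: map an intent to a short semantic shortlist."""
    return SEMANTIC_CANDIDATES.get(intent.lower(), [])


def match_icon(candidates: list[str]) -> list[str]:
    """Medium context cost: keep only candidates that exist in the approved set."""
    return [name for name in candidates if name in APPROVED_ICONS]


def select_icon(approved: list[str], established: dict[str, str], intent: str) -> dict:
    """High context cost: make the final choice using product conventions."""
    convention = established.get(intent)
    if convention in approved:
        return {"icon": convention, "rationale": "matches the established convention"}
    if approved:
        return {"icon": approved[0], "rationale": "first approved candidate"}
    return {"icon": None, "rationale": "no approved asset; escalate"}


# The agent can stop after any stage; here it runs all three.
shortlist = recommend_icon("export data")
existing = match_icon(shortlist)
choice = select_icon(existing, {"export data": "Download"}, "export data")
print(shortlist, existing, choice["icon"])
```

Each function takes only the input its stage needs, so an agent that just wants direction never loads the library or the product conventions.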

Reconstructed and anonymized tool trace

User

Choose an icon for exporting a table as CSV.

recommend_icon

Shortlist: Download, FileDown, Export. Exclude Save because the action creates an external file rather than persisting an in-session edit.

[tokens]

match_icon

Download already exists in the approved set. FileDown is not part of the current library.

[latency]

select_icon

Use Download. It matches the established export convention and avoids introducing a second symbol for the same intent.

[cost]

Real icon-tool benchmark

Prove that specialized routes changed token use, retries, or first-pass acceptance.

Required source: Anonymized agent traces for comparable recommend, match, and select tasks.
Anonymization: Remove company names, repository paths, proprietary icon names, and user identifiers.
Recommended format: 1600 × 1000 px transcript and a compact table covering tokens, latency, retries, and outcome.

06 / Prototyping

The highest leverage appeared before designs existed.

When an agent was implementing an existing mockup, the design system mostly improved accuracy. When no mockup existed, the knowledge layer helped shape the experience itself.

Starting prompt

“Design a flow for deleting a workspace that contains active projects and multiple members.”

Component-level response

A dialog and a red button

Finds an Alert Dialog component
Adds a destructive primary action
Asks the user to confirm
Stops at the happy path

System-guided response

A complete destructive flow

Explains impact before confirmation
Checks permissions and blocking conditions
Includes progress, failure, and recovery
Defines the post-deletion empty or redirect state

The difference was not visual polish. It was scenario coverage and the quality of the product decisions made before implementation.

Retrieved decision package

Pattern

Destructive action with explicit consequences

States

Blocked, confirming, processing, failed, complete

Components

Inline warning, dialog, progress, notification

Rules

Do not rely on color; preserve a safe exit

Exception

Skip re-auth only for low-risk sandbox data

Recovery

Explain retention and restoration policy

This moved the design system upstream. Instead of checking compliance after a screen had been designed, it could guide early exploration: compare several visual approaches, choose an interaction model, identify missing states, and explain how the product should lead a user through the flow.
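
The decision package above can be sketched as a plain data structure, with one check an agent might run against a proposed flow. The `DecisionPackage` class and `missing_states` helper are hypothetical. The field values are taken from the package shown above.

```python
from dataclasses import dataclass


@dataclass
class DecisionPackage:
    pattern: str
    states: list[str]
    components: list[str]
    rules: list[str]
    exception: str
    recovery: str


package = DecisionPackage(
    pattern="Destructive action with explicit consequences",
    states=["blocked", "confirming", "processing", "failed", "complete"],
    components=["inline warning", "dialog", "progress", "notification"],
    rules=["Do not rely on color", "Preserve a safe exit"],
    exception="Skip re-auth only for low-risk sandbox data",
    recovery="Explain retention and restoration policy",
)


def missing_states(package: DecisionPackage, designed: set[str]) -> list[str]:
    """List required states that a proposed flow has not covered yet."""
    return [state for state in package.states if state not in designed]


# A happy-path proposal covers only confirmation and completion.
print(missing_states(package, {"confirming", "complete"}))
```

Running the check against the component-level response makes the gap concrete: the blocked, processing, and failed states are exactly what a dialog and a red button leave out.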

07 / Feedback

The design system started talking back.

Traditional documentation broadcasts guidance and waits for people to report problems. Agent interactions can produce structured evidence about where that guidance succeeds, conflicts, or disappears.

Interaction

An agent applies a component, pattern, or flow.

Feedback

It records uncertainty, conflict, modification, and outcome.

Pattern update

A maintainer adds a rule, example, or approved exception.

Evaluation

With-guidance and baseline outputs are compared.

Release

Human-approved improvements return to the shared layer.

The loop accelerates maintenance without delegating policy changes to an autonomous system.

Short feedback records could reveal questions that surveys rarely captured: which rule was missing, where two sources contradicted each other, what an agent changed after a user request, and which workaround repeatedly appeared. This made design-system quality observable at a scale that depended less on people remembering to send feedback.
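
A short sketch of what such a record and a first clustering pass could look like. The field names, the allowed values, and the sample records are hypothetical and exist only to show the shape of the data. They are not project results.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class FeedbackRecord:
    topic: str    # guidance area the agent was applying
    kind: str     # "uncertainty", "conflict", "modification", or "workaround"
    outcome: str  # "accepted", "changed", or "escalated"


# Hypothetical records for illustration only.
records = [
    FeedbackRecord("deletion flow", "conflict", "escalated"),
    FeedbackRecord("deletion flow", "conflict", "changed"),
    FeedbackRecord("empty state", "uncertainty", "accepted"),
    FeedbackRecord("deletion flow", "workaround", "changed"),
]


def recurring_gaps(records: list[FeedbackRecord], min_count: int = 2) -> list[tuple]:
    """Group records by topic and kind, keeping only issues that repeat."""
    counts = Counter((record.topic, record.kind) for record in records)
    return [
        (topic, kind, count)
        for (topic, kind), count in counts.most_common()
        if count >= min_count
    ]


print(recurring_gaps(records))
```

Counting by topic and kind is deliberately crude. It surfaces candidates for a maintainer to review. It does not decide what the guidance should say.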

Human governance remains explicit.

AI can identify clusters, propose guidance, and run comparisons. Design-system owners approve changes that alter principles, policy, or product behavior.

Feedback taxonomy and evaluation result

Show one concrete case where recurring agent feedback exposed missing or conflicting design-system guidance.

Required source: Anonymized interaction logs plus one before-and-after guidance update.
Anonymization: Aggregate by topic and remove prompts, product names, repository identifiers, and personal data.
Recommended format: Two 1400 × 900 px visuals: a feedback-cluster view and a with-guidance versus baseline evaluation.

08 / Adoption

It looked like vibe coding. Then it became a standard.

The strongest resistance came from engineers who saw AI-generated design-system guidance as less trustworthy than conventional tooling. Adoption changed when the system proved that it constrained agents rather than giving them more freedom to improvise.

Skepticism

AI output was associated with generic UI and fragile code.

Focused pilot

Narrow tools answered concrete design-system questions.

Visible proof

Decisions became more consistent and easier to review.

Default workflow

The approach moved from experiment toward company standard.

Replace this qualitative timeline with dated internal milestones before publication.

The project did not remove disagreement. It made disagreement more useful. The system could answer established questions immediately, expose the source behind an answer, and escalate low-confidence or disputed cases. Design-system experts spent less time repeating settled guidance and more time deciding the cases the system did not yet understand.

For design leadership, this was the larger organizational change. Adoption no longer depended only on training sessions, office hours, or individual reviewers catching the same mistakes. Approved guidance could travel with the work, while unresolved questions arrived with enough evidence to improve the system rather than disappear inside another one-off conversation.

Adoption proof and team quotation

Substantiate the shift from resistance to routine use without overstating company-wide adoption.

Required source: Dated rollout milestones, usage records, and an approved quote from an engineer or design leader.
Anonymization: Remove names and product identifiers unless written permission is available.
Recommended format: 1600 × 700 px timeline with one short quotation and a source note.

09 / Outcomes

The platform made design-system impact measurable.

Exact production numbers still need to be cleared for publication. The measurement model is already defined, so the final case can distinguish demonstrated impact from future potential.

Decision quality

[metric needed]

First-pass design-system approval or review correction rate.

Self-service speed

[metric needed]

Median time from a design-system question to a usable answer.

Token efficiency

[metric needed]

Tokens per resolved task, split by tool and confidence level.

State coverage

[metric needed]

Share of required default, empty, error, success, and recovery states included in a flow.

Expert interruptions

[metric needed]

Share of routine questions resolved without involving a design-system owner.

Feedback volume

[metric needed]

Structured gaps and conflicts captured from agent interactions.

These measures connect design-system work to decisions rather than page views. Documentation traffic says that someone opened a guide. First-pass acceptance, state coverage, token cost, escalation rate, and recurring conflict topics show whether the system actually changed how products were designed and built.
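
To show how these measures could be computed once data is cleared, here is a minimal sketch over per-task results. The `TaskResult` fields and the sample values are hypothetical placeholders, not production numbers. Tokens per resolved task is assumed here to mean total tokens spent divided by the number of resolved tasks.

```python
from dataclasses import dataclass


@dataclass
class TaskResult:
    accepted_first_pass: bool
    resolved: bool
    escalated: bool
    tokens: int
    required_states: int
    covered_states: int


def summarize(tasks: list[TaskResult]) -> dict[str, float]:
    """Aggregate per-task results into the decision-level measures."""
    resolved_count = sum(task.resolved for task in tasks)
    if not tasks or resolved_count == 0:
        raise ValueError("need at least one task and one resolved task")
    total = len(tasks)
    return {
        "first_pass_acceptance": sum(t.accepted_first_pass for t in tasks) / total,
        "escalation_rate": sum(t.escalated for t in tasks) / total,
        "tokens_per_resolved_task": sum(t.tokens for t in tasks) / resolved_count,
        "state_coverage": sum(t.covered_states for t in tasks)
        / sum(t.required_states for t in tasks),
    }


# Hypothetical sample values, chosen only to exercise the arithmetic.
sample = [
    TaskResult(True, True, False, 1200, 5, 5),
    TaskResult(False, True, False, 1800, 5, 4),
    TaskResult(False, False, True, 900, 5, 3),
    TaskResult(True, True, False, 600, 5, 4),
]
summary = summarize(sample)
print(summary)
```

Splitting the same summary by tool and confidence level, as the token-efficiency measure calls for, would mean grouping the tasks first and calling `summarize` once per group.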

10 / Reflection

“The design system stopped being a place people had to visit. It became an active participant in how products were designed and built.”

The project expanded the role of a design-system team beyond components and documentation. It required knowledge architecture, AI tool design, routing, evaluation, observability, and governance. It also created a practical path for carrying the same expertise into engineering agents, design tools, chat, and future workflows.

The next design-system interface is not necessarily a website. It is the decision layer available wherever work happens.