Baseline Design System — Implementation Plan

Updated 21 March 2026 against the actual current state. Phased approach where complexity is earned through building experience. Designed so that stopping after any phase still leaves something real and useful.


Guiding principle

Baseline's own progressive enhancement principle applies to building Baseline itself. The base layer is complete and useful on its own. Each subsequent layer enriches without being required.


Current state (what already exists)

Before the phases: an honest inventory. Baseline is further along than a typical v1 plan would assume.

Token file: aleris-tokens.css, roughly 600 lines of CSS custom properties in a three-layer architecture (primitives → semantic → component). It covers:

  • Color: brand, feedback, and chart palette
  • Spacing: perfect fifth ratio, 3xs–2xl
  • Typography: Museo Sans, 7 sizes, 4 weights, compositions for h1–h4/lead/body/label/small
  • Radius: 5 levels
  • Shadows
  • Grid: 12-col, 4 breakpoints
  • Motion: 4 durations, 2 easings
  • Z-index: 8 levels
  • Images: 4 aspect ratios, 3 radii

Component-level token clusters exist for buttons (6 variants, including confirm), tabs (18 tokens), cards, inputs (22 tokens), badges, and tables (16 tokens).

18 markdown documents across five folders (governance, reference, voice, source, planning), all with YAML frontmatter carrying type, status, depends_on, propagates_to, tokens_referenced, open_questions, and last_verified.

Governance infrastructure: Update protocol with dependency mapping, five update rules, and verification checklist. Anti-patterns document (10 behavioral guardrails). Open questions tracker (19 open, 15 resolved, with blocking relationships mapped to workstreams).

Web front-end: index.html dogfooding the token system. Built in Cowork sessions. Uses the instrumental surface (sand-50) for the documentation itself. Currently a single-file front-end, not yet a rendered-from-markdown site.

Design concepts documented: Surface temperature (communicative vs. instrumental), scanning vs. attending density, three-tier radius, confirm green distinction, progressive enhancement as foundational principle, animation tiers, two page surfaces.

What doesn't exist yet: Generated token JSON. AI instruction file for consuming repos. Component spec files with the rigid template. Pattern/composition documentation. Manifest index. Audit scripts.


Phase 1: Close the AI-readability gap

Goal: Make the system instantly consumable by AI coding tools. The web view is already underway. The missing pieces are the machine-readable token layer and the instruction file that tells Claude Code how to use it.

Context: The web view is being built in Cowork. The incoming front-end developer is settling into his role — early AI-driven building sessions will be the first real tests. These deliverables need to be ready for those sessions.

1.1 Generate the token JSON

What: baseline-tokens.json — auto-generated from aleris-tokens.css. The CSS file remains the single source of truth. The JSON is a derived artifact for machine consumption.

Structure:

{
  "meta": {
    "version": "0.1.0",
    "generated_from": "aleris-tokens.css",
    "last_updated": "2026-03-21"
  },
  "color": {
    "petrol-500": {
      "value": "#004851",
      "usage": "Primary text, headings, structural elements",
      "constraint": "Default text color. Only specify when deviating."
    },
    "orange-500": {
      "value": "#F58C61",
      "usage": "Primary CTAs, accent, active indicators",
      "constraint": "One primary CTA per screen maximum. Never use for decoration."
    },
    "sand-100": {
      "value": "#F2ECE4",
      "usage": "Page background for communicative surfaces",
      "constraint": "Communicative pages only. Instrumental uses sand-50."
    },
    "sand-50": {
      "value": "#faf8f6",
      "usage": "Page background for instrumental surfaces",
      "constraint": "Tools, admin, dashboards. Cards optional on this surface."
    },
    "confirm": {
      "value": "#27ae60",
      "usage": "Interactive confirm actions ('I'm done')",
      "constraint": "Distinct from goal-achieved green. Confirm is interactive, goal-achieved is indicator."
    }
  },
  "spacing": { },
  "typography": { },
  "radius": { },
  "motion": { },
  "elevation": { },
  "grid": { },
  "z-index": { }
}

The constraint field encodes design judgment as machine-readable rules. Prioritize writing constraints for the tokens where AI errors are most costly: color (wrong surface background, status colors used decoratively), spacing (fabricated values), and radius (wrong tier). Typography and motion constraints can follow.

How to generate it: Write a Node script that parses the CSS custom properties and outputs the JSON skeleton. The usage and constraint fields should live as structured comments in the CSS file itself (e.g. /* @usage Primary text | @constraint Default text color */) so that there's one place to update. The script extracts them. If the comment is missing, the JSON field is empty — visible gap rather than invisible omission.
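A minimal sketch of the extraction step described above, not the final script. It assumes each token is declared on a single line and that the structured comment sits on that same line after the declaration; the source only fixes the comment syntax, not its placement. The function names are hypothetical, and the output is a flat map: grouping into color, spacing, and so on is left out because the CSS naming convention for categories isn't settled here.

```javascript
// Sketch only. Assumes: one token per line, optional trailing comment on the same line.
const TOKEN_LINE = /^\s*--([\w-]+)\s*:\s*([^;]+);\s*(?:\/\*(.*?)\*\/)?/;

// Turns " @usage ... | @constraint ... " into { usage, constraint }.
// Missing fields stay as empty strings so the gap is visible in the JSON.
function parseAnnotation(comment) {
  const fields = { usage: "", constraint: "" };
  if (!comment) return fields;
  for (const part of comment.split("|")) {
    const match = part.trim().match(/^@(usage|constraint)\s+(.*)$/);
    if (match) fields[match[1]] = match[2].trim();
  }
  return fields;
}

// Returns a flat map of token name -> { value, usage, constraint }.
function extractTokens(css) {
  const tokens = {};
  for (const line of css.split("\n")) {
    const match = line.match(TOKEN_LINE);
    if (!match) continue;
    const [, name, value, comment] = match;
    tokens[name] = { value: value.trim(), ...parseAnnotation(comment) };
  }
  return tokens;
}

const sampleCss = `
:root {
  --petrol-500: #004851; /* @usage Primary text, headings | @constraint Default text color. */
  --sand-50: #faf8f6;
}`;

const tokens = extractTokens(sampleCss);
console.log(JSON.stringify(tokens, null, 2));
```

In this sample, sand-50 has no comment, so its usage and constraint come out as empty strings. One limitation to be aware of: a usage or constraint text that itself contains a "|" character would be split incorrectly by this sketch.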

Governance: Generated on build. Never hand-edited. The CSS file is the authority. If the JSON and CSS disagree, the JSON is wrong and should be regenerated.

Effort: Half a day for the script + populating constraint comments for the highest-risk tokens (~30–40 tokens across color, spacing, radius).

1.2 Write the AI instruction file

What: BASELINE.md — the file that lives in any repo building with the system. Claude Code (or Cursor, or any AI coding tool) reads it at session start.

Content — derived from existing governance and anti-patterns documents:

# Baseline — AI Instructions

Before writing or modifying any UI code, read the relevant
section of the design system and use only tokens from
aleris-tokens.css. When uncertain, flag as an open question
rather than guessing.

## Hard rules (from aleris-anti-patterns.md)
- Never use raw hex colors. Use CSS custom properties.
- Never use raw pixel values for spacing. Use spacing tokens.
- Never use raw Tailwind color classes.
- Page background: sand-100 (communicative) or sand-50 (instrumental). Never white.
- Card/surface background: white. Never sand.
- One primary CTA (orange) per screen.
- Status colors for state only, never decoration.
- Never use all caps. Sentence case for everything.
- Only animate transform and opacity. Respect prefers-reduced-motion.

## Surface temperature
Communicative (patient-facing, marketing): sand-100 page background,
generous spacing, full type scale, cards mandatory for content structure.

Instrumental (tools, admin, dashboards): sand-50 page background,
tighter layout, compressed type range, cards optional.

The mode is determined by the situation, not the user role.

## Component shapes
- Containers (cards): radius-s (4px)
- Interactive elements (buttons, inputs): radius-m (8px)
- Indicators (badges): radius-full (100px)
- No full-pill buttons exist in Aleris interfaces.

## Typography
- Font: Museo Sans 500/700 → Arial fallback
- Body: 18px / 1.6 line-height
- Lead text: 27px, regular weight, secondary color
- Two standard weights: 500 (regular) and 700 (bold)
- 300 and 900 are exceptional use only

## When you don't know
If a design decision isn't covered by the tokens or documents,
add it as a comment: /* OPEN: [description of the decision] */
A visible question is better than an invisible assumption.

Governance: Version-controlled in the repo. Updated when anti-patterns or token rules change. The update protocol's propagates_to field should include this file for any governance document.

Effort: One hour. The content exists across the anti-patterns document and the governance document — this concentrates it for AI consumption.
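The /* OPEN: ... */ convention in the instruction file only pays off if the questions get collected somewhere. A possible sketch of that collection step, assuming the comments appear in CSS or other files that use block comments. The function name, the sample file name, and the --space-m token are hypothetical.

```javascript
// Matches the /* OPEN: ... */ convention from the instruction file.
const OPEN_COMMENT = /\/\*\s*OPEN:\s*([\s\S]*?)\*\//g;

// Returns one entry per OPEN comment, with the line it starts on.
function collectOpenQuestions(source, file) {
  const questions = [];
  for (const match of source.matchAll(OPEN_COMMENT)) {
    // Line number = newlines before the match, plus one.
    const line = source.slice(0, match.index).split("\n").length;
    const question = match[1].trim().replace(/\s+/g, " ");
    questions.push({ file, line, question });
  }
  return questions;
}

// Hypothetical input: --space-m is an assumed token name.
const sample = `.banner {
  background: var(--sand-50);
  /* OPEN: should the banner use a feedback color here? */
  padding: var(--space-m);
}`;

const found = collectOpenQuestions(sample, "banner.css");
console.log(found);
```

The output could feed the open questions tracker by hand at first. Automating that link is not something the plan commits to.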

1.3 Continue the web view

What's happening: The HTML front-end is being built in Cowork sessions, dogfooding the token system on the instrumental surface (sand-50).

Target site structure (adapted to what Baseline actually contains):

/                          → Homepage routing three audiences
/foundations/              → Color, typography, spacing, grid, animation, images
/governance/               → Anti-patterns, privacy JTBD, update protocol
/principles/               → Progressive enhancement, surface temperature, den nära experten
/voice/                    → Core voice guide, document types, digital voice
/tokens/                   → Browsable reference rendered from baseline-tokens.json
/status/                   → Open questions, maturity levels, what's stable vs hypothesis

The homepage routes three audiences:

  • Building with Baseline → Tokens + foundations + AI instruction file
  • Understanding Baseline → Principles + governance + voice
  • Raw files for AI tools → Direct links to markdown, CSS, and JSON

The /status/ page shows what's fixed, what's anchor, what's hypothesis, and what's living. It is derived from frontmatter. This is the honesty mechanism: it tells consumers where they can trust the system and where they need judgment.
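As a rough illustration of deriving the status view from frontmatter, the sketch below groups documents by their status field. It assumes flat "key: value" frontmatter and the four status values named above. Anything else, including a missing field, lands in an "unknown" bucket so it stays visible. The function names and two of the sample paths are hypothetical.

```javascript
// Sketch only: reads flat "key: value" frontmatter, nothing nested.
function readFrontmatter(markdown) {
  const match = markdown.replace(/\r\n/g, "\n").match(/^---\n([\s\S]*?)\n---/);
  const data = {};
  if (!match) return data;
  for (const line of match[1].split("\n")) {
    const sep = line.indexOf(":");
    if (sep > 0) data[line.slice(0, sep).trim()] = line.slice(sep + 1).trim();
  }
  return data;
}

const STATUSES = ["fixed", "anchor", "hypothesis", "living"];

// Groups document paths by status; unrecognized or missing status goes to "unknown".
function groupByStatus(docs) {
  const groups = { fixed: [], anchor: [], hypothesis: [], living: [], unknown: [] };
  for (const { path, content } of docs) {
    const { status } = readFrontmatter(content);
    const bucket = STATUSES.includes(status) ? status : "unknown";
    groups[bucket].push(path);
  }
  return groups;
}

const docs = [
  { path: "governance/aleris-anti-patterns.md", content: "---\ntype: governance\nstatus: fixed\n---\nBody" },
  // The two entries below are hypothetical files, used only to exercise the grouping.
  { path: "reference/example.md", content: "---\ntype: reference\nstatus: hypothesis\n---\nBody" },
  { path: "planning/untagged.md", content: "No frontmatter here" },
];

const groups = groupByStatus(docs);
console.log(groups);
```

A real build would read the files from disk and render the groups into the page. Both steps are omitted here to keep the grouping logic in view.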

Governance: The site should eventually render from markdown source, not maintain separate content. The current HTML is a legitimate v1 — it works and it dogfoods the system. The transition to build-from-markdown can happen when the developer is available.

Effort: Ongoing in Cowork sessions. The IA above is a target, not a blocker.

Phase 1 summary

| Deliverable | Status | Effort remaining |
|---|---|---|
| Token CSS file (3-layer architecture) | ✅ Done | |
| YAML frontmatter on all docs | ✅ Done | |
| Update protocol + dependency mapping | ✅ Done | |
| Anti-patterns (10 guardrails) | ✅ Done | |
| Web front-end (HTML, dogfooding tokens) | 🔄 In progress (Cowork) | Ongoing |
| baseline-tokens.json (generated) | ⬜ Not started | Half day |
| BASELINE.md (AI instruction file) | ⬜ Not started | 1 hour |

Phase 1 is functional when: An AI coding session can start by reading BASELINE.md, reference baseline-tokens.json for token lookup, and produce output that uses the correct surface background, spacing tokens, and component shapes without correction.


Phase 2: Component specs — documenting what the tokens already define

When to start: After the first AI-driven building sessions using Phase 1 artifacts. The errors Claude Code makes — and the errors it doesn't — tell you which components need spec files first.

Gate: Don't write component specs speculatively. Start when an AI session produces a wrong result that a spec file would have prevented.

Revised component readiness rule: The token file already defines stable component boundaries — buttons have 6 variants with dedicated tokens, inputs have 22 tokens, tabs have 18. The token cluster is the evidence of a stable boundary. A component gets a spec file when its token cluster is stable in the CSS, regardless of how many products use it.
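One way to make "token cluster" concrete is to count declared custom properties per name prefix. The sketch below assumes component tokens share a leading segment, as --btn-primary-bg suggests. The other token names in the sample are hypothetical, and the prefix convention itself is an assumption about aleris-tokens.css rather than a documented rule.

```javascript
// Counts declared custom properties per leading name segment.
// Only declarations ("--name:") are counted, not var(--name) references.
function clusterSizes(css) {
  const names = new Set();
  for (const match of css.matchAll(/--([a-z0-9-]+)\s*:/gi)) names.add(match[1]);
  const sizes = {};
  for (const name of names) {
    const prefix = name.split("-")[0];
    sizes[prefix] = (sizes[prefix] ?? 0) + 1;
  }
  return sizes;
}

// Hypothetical excerpt: only --btn-primary-bg is a name taken from the plan.
const sampleCss = `:root {
  --orange-500: #F58C61;
  --btn-primary-bg: var(--orange-500);
  --btn-primary-text: #ffffff;
  --input-border: var(--petrol-500);
}`;

const sizes = clusterSizes(sampleCss);
console.log(sizes);
```

Primitives such as orange-500 show up as their own prefix, so the result needs filtering to component prefixes before it says anything about readiness.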

2.1 Component spec files

What: One markdown file per component with a token cluster in the CSS. Rigid template so the structure is learnable by both humans and AI.

Template:

---
name: Button
category: component
surface: [communicative, instrumental]
status: stable
variants: [primary, secondary, outline, ghost, confirm, small]
related: [Input, Card]
---

Prose sections (in order):

  1. When to use — one sentence, the decision point
  2. When NOT to use — explicit anti-patterns (prevents misapplication)
  3. Variants — each variant with its purpose and when to choose it (including the confirm/goal-achieved distinction)
  4. Surface behavior — how it adapts between communicative and instrumental
  5. States — default, hover, active, focus, disabled, error
  6. Token reference — which CSS custom properties govern this component
  7. Code example — minimal, copy-pasteable
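A rigid template is easiest to keep rigid if something checks it. The sketch below validates a spec file against the frontmatter fields and the section order listed above. It assumes sections are written as "## Title" headings, which the plan does not specify, and the function name is hypothetical.

```javascript
const REQUIRED_KEYS = ["name", "category", "surface", "status", "variants", "related"];
const SECTIONS = [
  "When to use", "When NOT to use", "Variants", "Surface behavior",
  "States", "Token reference", "Code example",
];

// Returns a list of problems; an empty list means the spec follows the template.
function validateSpec(markdown) {
  const problems = [];
  const text = markdown.replace(/\r\n/g, "\n");
  const fm = text.match(/^---\n([\s\S]*?)\n---\n?/);
  const keys = fm ? fm[1].split("\n").map((line) => line.split(":")[0].trim()) : [];
  for (const key of REQUIRED_KEYS) {
    if (!keys.includes(key)) problems.push(`missing frontmatter field: ${key}`);
  }
  // Assumption: each prose section is a level-two heading.
  const body = fm ? text.slice(fm[0].length) : text;
  const headings = [...body.matchAll(/^## (.+)$/gm)].map((m) => m[1].trim());
  if (headings.join("|") !== SECTIONS.join("|")) {
    problems.push(`sections must be, in order: ${SECTIONS.join(", ")}`);
  }
  return problems;
}

const goodSpec = [
  "---",
  "name: Button",
  "category: component",
  "surface: [communicative, instrumental]",
  "status: stable",
  "variants: [primary, secondary, outline, ghost, confirm, small]",
  "related: [Input, Card]",
  "---",
  ...SECTIONS.map((title) => `## ${title}\n\n(placeholder)\n`),
].join("\n");

const badSpec = goodSpec.replace("status: stable\n", "").replace("## States", "## Misc");

console.log(validateSpec(goodSpec));
console.log(validateSpec(badSpec));
```

The check is deliberately strict about order. If optional sections are ever allowed, the comparison would need to loosen.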

Priority order (by token cluster size and AI error risk):

| Component | Token count | Priority | Reason |
|---|---|---|---|
| Button | 6 variants | High | Most common AI error: wrong variant, wrong radius, decorative orange |
| Input | 22 tokens | High | Complex state management, easy to get wrong |
| Card | Token cluster | High | Surface-dependent behavior (mandatory on communicative, optional on instrumental) |
| Table | 16 tokens | Medium | Density model (scanning vs. attending) is unusual |
| Tab navigation | 18 tokens | Medium | New component, spec prevents drift |
| Badge | Token cluster | Lower | Simpler, less error-prone |

Governance: Spec files live in a components/ folder. Frontmatter is indexable by the manifest (2.2). The spec is authoritative for "when to use" guidance; the CSS is authoritative for token values. They reference each other, they don't duplicate.

Effort: 1–2 hours per component. Button and Card first. Content is largely extractable from existing patient product design guidelines and governance — needs restructuring into the rigid template, not writing from scratch.

2.2 The baseline manifest

What: baseline-manifest.json — auto-generated index of all documents and component specs, queryable by frontmatter fields.

{
  "documents": [
    {
      "name": "Anti-patterns",
      "type": "governance",
      "status": "fixed",
      "path": "governance/aleris-anti-patterns.md",
      "open_questions": []
    }
  ],
  "components": [
    {
      "name": "Button",
      "category": "component",
      "status": "stable",
      "surface": ["communicative", "instrumental"],
      "variants": ["primary", "secondary", "outline", "ghost", "confirm", "small"],
      "spec_file": "components/button.md",
      "tokens_referenced": ["--btn-primary-bg", "..."]
    }
  ]
}

Why it matters: The manifest is the lookup table. Claude Code scans it to find relevant specs before reading prose, the web view builds filtered navigation from it, and a future MCP server could expose it as a queryable resource.

Governance: Generated by script from frontmatter. Runs on build. Never hand-edited.
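A sketch of what the generation script might do, under stated assumptions: frontmatter is flat, list values use the inline [a, b] form shown in the spec template, and a file counts as a component when its category is "component". This version copies every frontmatter field into the manifest, whereas the example above shows a selected subset. Function names are hypothetical.

```javascript
// Parses "[a, b, c]" into an array; everything else stays a trimmed string.
function parseValue(raw) {
  const value = raw.trim();
  if (value.startsWith("[") && value.endsWith("]")) {
    const inner = value.slice(1, -1).trim();
    return inner === "" ? [] : inner.split(",").map((item) => item.trim());
  }
  return value;
}

// Sketch only: flat frontmatter, one "key: value" per line.
function parseFrontmatter(markdown) {
  const match = markdown.replace(/\r\n/g, "\n").match(/^---\n([\s\S]*?)\n---/);
  const data = {};
  if (!match) return data;
  for (const line of match[1].split("\n")) {
    const sep = line.indexOf(":");
    if (sep > 0) data[line.slice(0, sep).trim()] = parseValue(line.slice(sep + 1));
  }
  return data;
}

// Routes each file into "components" or "documents" based on its category field.
function buildManifest(files) {
  const manifest = { documents: [], components: [] };
  for (const { path, content } of files) {
    const fm = parseFrontmatter(content);
    if (fm.category === "component") {
      manifest.components.push({ ...fm, spec_file: path });
    } else {
      manifest.documents.push({ ...fm, path });
    }
  }
  return manifest;
}

const files = [
  {
    path: "governance/aleris-anti-patterns.md",
    content: "---\nname: Anti-patterns\ntype: governance\nstatus: fixed\nopen_questions: []\n---\nBody",
  },
  {
    path: "components/button.md",
    content: "---\nname: Button\ncategory: component\nstatus: stable\nvariants: [primary, secondary, outline, ghost, confirm, small]\n---\nBody",
  },
];

const manifest = buildManifest(files);
console.log(JSON.stringify(manifest, null, 2));
```

If the real frontmatter uses multi-line YAML lists, a proper YAML parser would be the safer choice than this hand-rolled reader.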

When to build: After 5+ component specs exist. Before that, the token JSON and instruction file provide adequate machine-readable structure.

Effort: Half day for the generation script.

2.3 Expand the web view

Add a /components/ section. Each page renders the spec file with the standard anatomy. Group by function (layout, input, feedback, navigation, data display).

Add the /status/ page, showing maturity across the system: what's fixed, anchor, hypothesis, and living. It is derived from frontmatter plus the open questions tracker.

Effort: 1–2 days of developer time once specs exist.

Phase 2 summary

| Deliverable | Gate | Effort |
|---|---|---|
| Component spec files (6 components) | AI session produces correctable error | 1–2 hours each |
| baseline-manifest.json | 5+ specs exist | Half day |
| Component pages in web view | Specs + manifest ready | 1–2 days |
| /status/ page | Frontmatter on all docs (✅ done) | Half day |

Phase 3: Patterns, composition rules, and enforcement

When to start: After someone other than you has built with Phase 1 + Phase 2 artifacts. Their mistakes reveal where the system transfers knowledge and where it fails.

Gate: Three composition patterns have repeated across products or AI sessions. If every layout is bespoke, there's nothing to formalize.

3.1 Pattern files

What: Composition recipes for recurring scenarios. Which components compose it, in what order, with what spacing, under what decision rules.

Frontmatter graduates emotional_mode and surface to structured fields — this is where Baseline's most distinctive metadata becomes machine-readable.

---
name: Pre-visit checklist
category: pattern
emotional_mode: anxiety-driven
surface: communicative
components_used: [PhaseIndicator, ChecklistItem, StatusBanner, ChatAccess]
---

Three conditions before a pattern is documented:

  1. Implemented at least twice
  2. Someone other than you has needed to implement it
  3. You can describe it without referencing a specific product

3.2 Emotional mode and privacy metadata on components

What: Promote emotional_mode and privacy_behavior from prose to frontmatter fields on component and pattern specs.

Why Phase 3: These fields are the most likely to be wrong if formalized early. The emotional mode framework has been tested against only two product contexts. Getting it wrong in prose is a documentation issue. Getting it wrong in a machine-readable field is a system error that propagates silently.

Signal that you're ready: The vocabulary has stabilized — you're applying existing terms, not inventing new ones. And at least one AI session has produced a layout using the wrong emotional mode default.

3.3 Token audit script

What: A script that scans CSS/Tailwind files for hardcoded values and flags each one with the correct Baseline token. It returns exit code 1 when it finds violations, so CI can fail the build.

Why Phase 3: Enforcement infrastructure earns its cost when multiple people or frequent AI sessions are writing CSS and drift is real. Right now, the AI instruction file and code review catch violations. The token JSON provides the lookup table the audit script will need — the data structure is already in place.
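To show how the token JSON could serve as the lookup table, here is a minimal sketch of the color part of such an audit. It assumes the color group of baseline-tokens.json is passed in as an object, and that a JSON key such as petrol-500 maps to the custom property --petrol-500. Spacing and Tailwind class checks are left out. Function names are hypothetical.

```javascript
// Reverse lookup from the token JSON's color group: hex value -> token name.
function buildHexLookup(colorTokens) {
  const lookup = new Map();
  for (const [name, token] of Object.entries(colorTokens)) {
    lookup.set(token.value.toLowerCase(), name);
  }
  return lookup;
}

// Flags raw hex colors outside custom property declarations.
function auditCss(css, colorTokens) {
  const lookup = buildHexLookup(colorTokens);
  const violations = [];
  css.split("\n").forEach((text, index) => {
    // Token declarations are the one place raw values are allowed to live.
    if (/^\s*--[\w-]+\s*:/.test(text)) return;
    for (const match of text.matchAll(/#(?:[0-9a-f]{6}|[0-9a-f]{3})\b/gi)) {
      const token = lookup.get(match[0].toLowerCase());
      violations.push({
        line: index + 1,
        found: match[0],
        // Assumption: JSON key "petrol-500" corresponds to var(--petrol-500).
        suggestion: token ? `var(--${token})` : null,
      });
    }
  });
  return violations;
}

// 1 when there is anything to report, 0 otherwise.
function exitCodeFor(violations) {
  return violations.length > 0 ? 1 : 0;
}

const colorTokens = {
  "petrol-500": { value: "#004851" },
  "sand-50": { value: "#faf8f6" },
};

const sampleCss = `:root {
  --petrol-500: #004851;
}
.card-title { color: #004851; }
.page { background: #FAF8F6; border-color: #123456; }`;

const violations = auditCss(sampleCss, colorTokens);
console.log(violations);
```

In a CI script the last step would be something like process.exitCode = exitCodeFor(violations). The hex pattern is naive: it would also flag an ID selector such as #abc, so a real audit would want a CSS parser rather than line-based regexes.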

3.4 The MCP server

When: Not Phase 3. Possibly Phase 4. Requires stable, accurate, maintained data layers beneath it.

What Phases 1–3 do for MCP readiness: They establish a consistent frontmatter schema, generated manifests, and clean source/derived separation. If the data is structured correctly, wrapping it in an MCP server takes days, not weeks.

Signal that you're ready: Someone who isn't you can query the manifest, find the right spec, and produce correct output without reading the full foundation documents.


What to watch for

Signals to advance

Phase 1 → Phase 2:

  • An AI building session produces a design error that the token JSON and instruction file didn't prevent
  • You correct the same component misuse pattern for the second time
  • Token clusters for buttons, inputs, cards, and tabs are stable — no tokens added or renamed in the last two sessions
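The stability signal in the last bullet can be checked mechanically by comparing token names between two snapshots of the CSS file. A minimal sketch, assuming you keep the previous session's copy around (for example from version control). The function names and the --btn-radius and --btn-corner tokens are hypothetical.

```javascript
// Collects declared custom property names ("--name:"), ignoring var() references.
function tokenNames(css) {
  return new Set([...css.matchAll(/--([a-z0-9-]+)\s*:/gi)].map((m) => m[1]));
}

// A rename shows up as one removed name plus one added name.
function diffTokens(previousCss, currentCss) {
  const before = tokenNames(previousCss);
  const after = tokenNames(currentCss);
  return {
    added: [...after].filter((name) => !before.has(name)).sort(),
    removed: [...before].filter((name) => !after.has(name)).sort(),
  };
}

// Hypothetical snapshots from two consecutive sessions.
const lastSession = `:root {
  --btn-primary-bg: var(--orange-500);
  --btn-radius: var(--radius-m);
}`;
const thisSession = `:root {
  --btn-primary-bg: var(--orange-500);
  --btn-corner: var(--radius-m);
}`;

const drift = diffTokens(lastSession, thisSession);
const stable = drift.added.length === 0 && drift.removed.length === 0;
console.log(drift, stable);
```

This only compares names. A token whose value changed while its name stayed the same would still count as stable here, which matches the wording of the signal but may not be what you want long term.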

Phase 2 → Phase 3:

  • A layout pattern gets rebuilt from scratch that already existed elsewhere (duplication = undocumented pattern)
  • You explain the same composition rule for the third time
  • Emotional mode vocabulary has stabilized

Signals to slow down

  • More time maintaining documentation than building products
  • Metadata describes states no product has reached
  • External attention before the system is stable

Risks

Single-author dependency. Design judgment lives in one head. The developer's ability to catch errors in the metadata — not just consume it — is the test of judgment transfer.

AI as first user, human as second. AI-driven building sessions find mechanical failures (wrong token, missing constraint). Judgment failures (unclear when to use what, ambiguous composition) only surface when a human reads a spec and makes a decision from it. Keep the developer test in mind.

Consensus without commitment. Showing Baseline to stakeholders before it has demonstrated value through shipped products risks the familiar Aleris pattern: agreement without adoption. Ship products first.

Over-engineering the AI layer. The research shows what dedicated design system teams build. Aleris has you. Build what one person can maintain. If infrastructure can't survive two weeks of being ignored, it's too complex.

Governance ahead of content. The update protocol, dependency mapping, and frontmatter schema are Phase 2–level infrastructure supporting Phase 1–level content. That's fine — it means the system absorbs growth without restructuring. But if someone sees more process than substance, the honest answer is: substance comes from building products with it.


The honest framing

Baseline has strong foundations (600-line token file, 18 governed documents, dependency-tracked frontmatter) and a clear gap (no component specs, no composition patterns, no generated machine-readable artifacts). The foundations are more sophisticated than most design systems at this stage. The gap is the normal gap — component documentation comes from building.

Phase 1 closes the AI-readability gap with two small deliverables (token JSON, instruction file) while the web view continues in Cowork. Hours of work, not days.

Phase 2 adds component specs when AI building sessions reveal what needs documenting. The token clusters already define component boundaries — the specs add the judgment layer.

Phase 3 is the ambition: composition patterns, emotional mode metadata, and enforcement infrastructure. It's what makes Baseline genuinely novel. It waits until the simpler layers have proven out.

The phasing isn't a project plan with deadlines. It's a maturity model. Each phase starts when the previous one earns it.