Vibe Coding Reality Check: I Shipped a Design System Update Without Engineering

Last Tuesday at 3pm, I updated our design token system and pushed the changes to production. No Slack messages to engineering. No “hey can you take a look at this?” No two-week sprint backlog wait. Just me, Claude Code, and about 4 hours of focused work. :artist_palette::sparkles:

I’m still not sure if I should feel proud or terrified.

What Actually Happened

I just got back from the AI Design Systems Conference 2026 (shoutout to Into Design Systems—amazing event). The big theme? Vibe coding: designers using AI tools like Cursor IDE, Claude Code, and Figma MCP to ship code directly to production. Not prototypes. Not “design specs.” Actual. Production. Code.

Here’s what I did:

  1. Updated our color tokens in Figma (routine design system maintenance)
  2. Fired up Claude Code and asked it to generate the corresponding CSS variables + React component updates
  3. Ran the changes through Storybook to verify visual consistency
  4. Pushed a PR, watched the CI pass, merged to main
  5. Boom. Design system updated. Tokens propagated to 8 product teams.
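For the curious, the token-to-CSS step (item 2) is roughly this shape. This is a minimal sketch, not what Claude Code actually produced: the input format, token names, and naming scheme are invented, and a real Figma export carries more metadata.

```typescript
// Sketch: flatten a nested token export into CSS custom properties.
// The input shape and naming scheme are assumptions, not a real Figma export.
type TokenTree = { [key: string]: string | TokenTree };

// Depth-first walk: nested keys become dash-separated variable names.
function flattenTokens(tree: TokenTree, prefix: string[] = []): Array<[string, string]> {
  const pairs: Array<[string, string]> = [];
  for (const [key, value] of Object.entries(tree)) {
    const path = [...prefix, key];
    if (typeof value === "string") {
      pairs.push(["--" + path.join("-"), value]);
    } else {
      pairs.push(...flattenTokens(value, path));
    }
  }
  return pairs;
}

// Emit a :root block ready to drop into a stylesheet.
function tokensToCss(tree: TokenTree): string {
  const lines = flattenTokens(tree).map(([name, value]) => `  ${name}: ${value};`);
  return `:root {\n${lines.join("\n")}\n}`;
}

// Emits a :root block containing --color-brand-primary and --spacing-sm.
console.log(tokensToCss({ color: { brand: { primary: "#0055ff" } }, spacing: { sm: "8px" } }));
```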

Total time: 4 hours.

In the old workflow, this would’ve taken two weeks minimum—waiting for an engineer to be available, explaining the context, reviewing their implementation, going back and forth on edge cases. And honestly? The engineer would’ve been bored out of their mind doing what’s essentially token translation work.

The Part That Keeps Me Up At Night

Here’s the thing: it felt empowering. I wasn’t blocked. I didn’t have to justify the work or explain design theory to someone who just wanted a Jira ticket. I owned the full loop: design → implementation → deployment.

But also? Terrifying. :anxious_face_with_sweat:

Because there was no code review from engineering. No “hey, did you think about the bundle size impact?” No “this CSS specificity might conflict with the legacy theme system.” Just me, my design knowledge, and an AI tool that writes really confident code.

What if I broke something critical? What if there’s a performance regression I don’t know how to detect? What if accessibility got worse and I don’t have the testing tools to catch it?

The Questions I’m Wrestling With

1. Is this role collapse or role evolution?

Some folks say we’re eliminating the design-to-engineering handoff. But are we also eliminating the guardrails that engineers provide? The systems thinking? The performance and accessibility expertise?

2. Who should review designer-generated code?

If engineers are busy with “real” architecture work, who’s checking that my tokens don’t blow up the bundle size? Do we need a new role—like “design engineer”—to bridge this gap? (I hear companies like Vercel are paying $200K+ for this.)

3. What happens to frontend engineers in this world?

If designers can ship UI code, what’s left for frontend engineers? Are they shifting to performance optimization and architecture? Or are we just… replacing them? That doesn’t sit right with me.

4. Are we optimizing for speed at the expense of system thinking?

One widely cited randomized study found that experienced developers using AI estimated they were 20% faster but actually took 19% longer once debugging time was factored in. Are we shipping faster while understanding less? Is “move fast” actually just “move recklessly”?

What I Think We Need

I don’t think the answer is “designers shouldn’t touch code”—that ship has sailed. Tools like Claude Code and Cursor are too good, and the velocity gains are too real (some teams report 3x faster feature delivery).

But I also don’t think the answer is “designers can ship whatever they want.” That’s chaos.

I think we need:

  • Automated guardrails: CI checks for performance, bundle size, accessibility
  • New review models: Maybe not manual code review, but design review + automated testing?
  • Clearer ownership boundaries: Designers own tokens and styles, engineers own logic and performance
  • Design engineers as bridges: People who speak both languages and can review designer PRs

I Want Your Take

Am I overthinking this? Is this just the natural evolution of roles in an AI-enabled world? Or are we about to learn some hard lessons about the difference between “shipping code” and “building systems”?

If you’re a frontend engineer, how do you feel about designers merging PRs? Threatened? Relieved? Something else?

If you’re a product person, do you care WHO writes the code as long as we’re shipping faster?

If you’re in engineering leadership, what guardrails would you put in place?

I’m genuinely curious where this goes. Because right now, I’m sitting here with merge permissions and an AI that writes really convincing CSS. And I’m not 100% sure what the right thing to do is. :thinking:


P.S. The changes did NOT break production. Yet. :sweat_smile:

Maya, this is a governance question, not a capability question. And that’s actually good news—it means we can solve it.

I work in financial services, so my perspective might be different from yours. In my world, the workflow you described is literally impossible without triggering compliance alerts. We have SOC 2, PCI-DSS, and internal controls that require separation of duties and code review trails. But here’s the thing: those requirements aren’t about preventing designers from shipping code—they’re about preventing ANY single person from shipping unreviewed changes to production.

The real problem isn’t “should designers write code?” It’s “how do we review code in a world where implementation is cheap and anyone with domain expertise can generate it?”

What Guardrails Actually Look Like

At my company, we’ve started experimenting with a similar model for our design system team. Here’s what we learned:

1. Automated tests are non-negotiable

Before your PR can merge, it must pass:

  • Visual regression tests (we use Percy/Chromatic)
  • Bundle size checks (fail if you increase bundle by >5KB)
  • Accessibility scans (axe-core in CI)
  • Performance budgets (Lighthouse CI)

The beauty of this approach: the “reviewer” is automation, not a person. It catches the stuff engineers would catch (bundle size, a11y) without requiring their time.
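For a sense of scale, the bundle-size gate is only a few lines. A sketch, assuming a stored baseline: the 5KB budget mirrors our rule above, and where the byte counts come from (stat-ing dist/, a baseline artifact) is left out. Real setups usually reach for a tool like size-limit rather than hand-rolling this.

```typescript
// Sketch of a CI bundle-size gate: fail if the new bundle grows more than
// the budget over a stored baseline.
const BUDGET_BYTES = 5 * 1024;

function checkBundleBudget(baselineBytes: number, currentBytes: number) {
  const deltaBytes = currentBytes - baselineBytes;
  return { ok: deltaBytes <= BUDGET_BYTES, deltaBytes };
}

// Simulate a 6 KB regression; a real CI step would exit non-zero here.
const result = checkBundleBudget(120_000, 126_200);
if (!result.ok) {
  console.error(`Bundle grew by ${result.deltaBytes} bytes (budget: ${BUDGET_BYTES})`);
}
```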

2. Design engineers as reviewers, not blockers

We created a “design engineer” role (3 people on a 40-person team). They review designer PRs, but the SLA is 24 hours, not 2 weeks. Their job isn’t to rewrite the code—it’s to check:

  • Does this fit our architectural patterns?
  • Are there performance concerns automation missed?
  • Is this maintainable by the rest of the team?

Companies like Vercel are paying $200K+ for these roles because they’re force multipliers. One design engineer enables 5-8 designers to ship autonomously.

3. Clear ownership boundaries

This is where your instinct about “designers own tokens, engineers own logic” is exactly right. We codified this:

  • Designers own: color tokens, spacing, typography, icon updates, component variants (CSS/styling)
  • Engineers own: state management, data fetching, performance optimization, cross-component orchestration

The boundary is “if it changes the DOM structure or adds business logic, engineering review required.”
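In practice we enforce that boundary with a routing check in CI. A rough sketch: the directory conventions below are invented, and a path heuristic is only a proxy for the real “changes DOM structure or adds logic” test, which would need an AST diff.

```typescript
// Sketch: route a PR to the right review track based on its changed paths.
// Directory layout is an assumption; paths are a proxy, not the real test.
type Review = "design" | "engineering";

function requiredReview(changedFiles: string[]): Review {
  const stylingOnly = changedFiles.every(
    (f) =>
      f.endsWith(".css") ||
      f.includes("/tokens/") ||
      f.includes("/icons/")
  );
  return stylingOnly ? "design" : "engineering";
}

console.log(requiredReview(["src/tokens/color.json", "src/button.css"])); // "design"
console.log(requiredReview(["src/button.css", "src/hooks/useCart.ts"])); // "engineering"
```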

The Bigger Opportunity

Here’s what I think you’re missing: this is an opportunity to redefine what “code review” means.

Traditional code review is synchronous, manual, and bottlenecked by engineering availability. But what if code review becomes:

  1. Design review (does this meet our design standards?) → designers do this
  2. Automated testing (does this meet technical standards?) → CI does this
  3. Architecture review (does this fit our system?) → design engineers do this, asynchronously

In this model, you’re not “bypassing” code review—you’re distributing it across the right people with the right expertise.

My Question Back to You

You mentioned you ran Storybook to verify visual consistency and watched CI pass. What tests were in your CI pipeline? Because if the answer is “just linting and unit tests,” then yeah, you’re right to be nervous.

But if your CI includes:

  • Visual regression tests
  • Bundle size monitoring
  • Accessibility checks
  • Performance budgets

Then you didn’t bypass engineering review—you automated it. And that’s actually better than waiting two weeks for a human to eyeball your CSS.

The question isn’t “should designers touch code?” The question is “what guardrails do we need to make designer-generated code as safe as engineer-generated code?”

And the answer is: the same guardrails we should’ve had for engineering all along. :construction:

This is a velocity unlock at the organizational level, but Luis is right—it’s also a governance challenge. Let me frame this from a product strategy perspective. :bar_chart:

The Handoff Tax Is Real

Research from the design-to-code space shows that traditional handoffs eat nearly 50% of a frontend developer’s time. Not the actual coding—the translation layer. Explaining design intent, going back and forth on edge cases, waiting for designer approval on implementation details.

If Maya’s change genuinely cut two weeks of cycle time down to four hours and didn’t break production, that’s an order-of-magnitude improvement on design system updates, not a marginal one. Multiply that across every token change, icon update, spacing adjustment… you’re talking about compounding gains.

The teams I’ve talked to using tools like Cursor and Claude Code report 3x faster feature delivery. That’s not trivial—that’s a competitive advantage.

The New Division of Labor

Here’s what I think is happening, and why it’s not just “designers replacing engineers”:

Traditional model:

  • Designer creates mockup
  • Engineer translates mockup to code
  • Designer reviews implementation
  • Engineer fixes discrepancies
  • Repeat until pixel-perfect

Vibe coding model:

  • Designer creates mockup and implementation simultaneously
  • Engineer reviews architecture/performance/system impact
  • Designer owns experience layer
  • Engineer owns structure, logic, optimization

The key shift: implementation is cheap now. What matters is strategy and architecture.

The Risk Maya’s Identifying

But here’s where Maya’s instinct is dead-on: a randomized study from METR found that experienced developers using AI tools estimated they were 20% faster but actually took 19% longer when you factor in the subtle bugs and debugging time.

Are we shipping faster while understanding less? Is this a short-term velocity win that becomes a long-term technical debt disaster?

The Question for Product Leaders

As a VP of Product, I honestly care less about who writes the code and more about:

  1. Can we ship faster without sacrificing quality? (guardrails matter)
  2. Are we building the right things? (velocity on the wrong features is waste)
  3. Do we have the right skills on the team? (if designers need code literacy, that’s a hiring/training investment)

If Maya can ship a design system update in 4 hours instead of 2 weeks, and it doesn’t break production, and our customers get a better experience faster… I literally don’t care that she’s not a “frontend engineer.”

But if we’re accumulating technical debt, degrading performance, or creating accessibility issues because the guardrails aren’t there, then we’re optimizing for the wrong metric.

What This Means for Team Structure

From a product org perspective, this suggests we need:

  1. Invest in CI/CD infrastructure: Make the “automated reviewer” as good as a human reviewer
  2. Hire design engineers: Bridge roles are where the ROI is
  3. Redefine frontend engineering: Performance, accessibility, architecture—not pixel-pushing
  4. Measure outcomes, not process: If designer-generated code ships faster and works better, does it matter who wrote it?

The question isn’t “should designers touch code?” The question is: what’s the right balance between velocity and system understanding?

And I don’t think anyone has the answer yet. We’re all figuring it out in real-time. :man_shrugging:

Maya, I want to share a cautionary tale from my time at Twilio. Then I’ll tell you why I’m actually optimistic about where this is going.

The Story Nobody Tells You

About 5 years ago, we experimented with a “designer-friendly” CSS-in-JS tool that generated perfectly valid React components. Designers loved it. Product loved the velocity. Engineering was skeptical but didn’t push back hard.

Six months later, we discovered that one particularly “harmless” design system update caused a 3-second render delay on mobile devices. The generated code looked fine. It passed linting. It worked in desktop Chrome.

But the tool had generated deeply nested styled-components with complex CSS selectors. On mobile, the style recalculation was crushing performance. And none of the designers knew how to profile performance, so it shipped.

We didn’t catch it until customer complaints hit support. By then, it had been in production for 4 weeks.

The Architectural Risks Are Real

Design system changes have cascading impacts that aren’t always obvious:

  1. CSS specificity conflicts with legacy systems
  2. Bundle size impact (token files can balloon quickly)
  3. Performance implications (style recalculation, paint, layout thrashing)
  4. Accessibility regressions (color contrast, focus indicators, screen reader labels)
  5. Browser compatibility (especially for CSS Grid/Flexbox edge cases)

These aren’t hypothetical concerns. These are the things that frontend engineers think about every day, and they’re hard to test for without deep technical knowledge.

But Here’s Why I’m Optimistic

Luis is right about automated guardrails. The difference between 2021 and 2026 is that we now have the infrastructure to make this safe.

Here’s what we’ve implemented at my current company:

1. Performance Budgets in CI

  • Lighthouse CI runs on every PR
  • Fail if Time to Interactive increases >100ms
  • Fail if bundle size increases >10KB
  • Visual regression tests catch unexpected layout shifts
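In code, that gate amounts to comparing a metrics report against the baseline plus a budget. A sketch: the report shape here is invented (it is not Lighthouse CI’s actual schema), and only the thresholds mirror the rules above.

```typescript
// Sketch of a performance-budget check: flag any metric that regressed
// past its allowance. The Metrics shape is an assumption, not Lighthouse's.
interface Metrics { ttiMs: number; bundleBytes: number; }

const BUDGET = { ttiDeltaMs: 100, bundleDeltaBytes: 10 * 1024 };

function budgetViolations(baseline: Metrics, current: Metrics): string[] {
  const violations: string[] = [];
  if (current.ttiMs - baseline.ttiMs > BUDGET.ttiDeltaMs) {
    violations.push("Time to Interactive regressed by more than 100ms");
  }
  if (current.bundleBytes - baseline.bundleBytes > BUDGET.bundleDeltaBytes) {
    violations.push("Bundle grew by more than 10KB");
  }
  return violations; // the PR fails if this is non-empty
}

console.log(budgetViolations(
  { ttiMs: 2000, bundleBytes: 200_000 },
  { ttiMs: 2150, bundleBytes: 205_000 },
)); // one violation: the 150ms TTI regression
```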

2. Accessibility Checks (Automated)

  • axe-core runs in CI and catches ~60% of accessibility issues
  • Color contrast validators
  • Focus management tests
  • Screen reader compatibility checks (we use pa11y)
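The contrast validator is the one check on this list you can write straight from the spec. A sketch of WCAG 2.x relative luminance and contrast ratio, checked against the 4.5:1 AA threshold for normal text (hex parsing is simplified to 6-digit colors):

```typescript
// WCAG 2.x contrast check: linearize sRGB channels, compute relative
// luminance, then the (L1 + 0.05) / (L2 + 0.05) contrast ratio.
function channel(c: number): number {
  const s = c / 255;
  return s <= 0.03928 ? s / 12.92 : Math.pow((s + 0.055) / 1.055, 2.4);
}

function luminance(hex: string): number {
  const n = parseInt(hex.replace("#", ""), 16);
  const [r, g, b] = [(n >> 16) & 0xff, (n >> 8) & 0xff, n & 0xff];
  return 0.2126 * channel(r) + 0.7152 * channel(g) + 0.0722 * channel(b);
}

function contrastRatio(fg: string, bg: string): number {
  const [hi, lo] = [luminance(fg), luminance(bg)].sort((a, b) => b - a);
  return (hi + 0.05) / (lo + 0.05);
}

// AA threshold for normal-size text.
function passesAA(fg: string, bg: string): boolean {
  return contrastRatio(fg, bg) >= 4.5;
}

console.log(contrastRatio("#000000", "#ffffff")); // 21, the maximum
console.log(passesAA("#777777", "#ffffff")); // false: ~4.48:1, just under AA
```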

3. Bundle Size Monitoring

  • We use bundlesize to track every change
  • Token files are split by theme to avoid bloat
  • Tree-shaking is validated in CI

4. Visual Regression Testing

  • Percy runs on every design system change
  • Catches unintended visual side effects across 200+ components
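Under the hood, the core of a visual regression gate is just a pixel diff against a threshold. A toy sketch; real tools like Percy also handle anti-aliasing noise and perceptual color distance, which is most of the hard part:

```typescript
// Toy sketch: fraction of RGBA pixels that changed between a baseline
// screenshot and a new one. Real visual-diff tools are far more forgiving.
function diffRatio(baseline: Uint8ClampedArray, current: Uint8ClampedArray): number {
  if (baseline.length !== current.length) return 1; // size change counts as a full diff
  let changed = 0;
  for (let i = 0; i < baseline.length; i += 4) {
    if (baseline[i] !== current[i] || baseline[i + 1] !== current[i + 1] ||
        baseline[i + 2] !== current[i + 2] || baseline[i + 3] !== current[i + 3]) {
      changed++;
    }
  }
  return changed / (baseline.length / 4);
}

const THRESHOLD = 0.001; // fail the PR if more than 0.1% of pixels changed

// Two 2-pixel "screenshots" where one pixel differs.
const base = new Uint8ClampedArray([0, 0, 0, 255, 255, 255, 255, 255]);
const cand = new Uint8ClampedArray([0, 0, 0, 255, 250, 255, 255, 255]);
console.log(diffRatio(base, cand)); // 0.5
console.log(diffRatio(base, cand) <= THRESHOLD ? "pass" : "fail"); // "fail"
```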

With this infrastructure, I trust designer-generated code as much as engineer-generated code. Because the review isn’t happening in GitHub comments—it’s happening in CI.

What Changes for Engineering

Here’s what I tell my engineering team: you’re not gatekeepers anymore, you’re platform builders.

Your job isn’t to manually review every CSS change. Your job is to build the tools, tests, and infrastructure that enable safe autonomy.

That means:

  • Building CI pipelines with real guardrails
  • Creating design system APIs that are hard to misuse
  • Writing performance budgets and automated tests
  • Monitoring production for regressions
  • Providing architectural guidance when needed

Good frontend engineers were never just “CSS translators.” They’re performance, accessibility, and architecture experts. If that’s not what your frontend engineers are doing, you’ve been wasting their time.

My Question to You, Maya

You mentioned you “watched the CI pass.” What specifically did your CI check?

If the answer is:

  • :white_check_mark: Visual regression tests
  • :white_check_mark: Bundle size monitoring
  • :white_check_mark: Accessibility checks
  • :white_check_mark: Performance budgets

Then you didn’t bypass code review—you automated it. And that’s better than a human eyeballing your CSS for 15 minutes and saying “looks good to me.”

But if your CI only checks linting and unit tests, then yeah, I’d be nervous too. Because you’re essentially flying blind on the things that matter most for design system changes.

The Path Forward

This transition is happening whether we like it or not. Tools like Claude Code and Cursor are too good, and the velocity gains are too real.

The question isn’t “should we allow this?” The question is “how do we make this safe?”

And the answer is: invest in infrastructure. Build the automation that catches what engineers would catch. Create the guardrails that make designer autonomy safe.

If you do it right, designers get autonomy, engineers get interesting work, and products ship faster. If you do it wrong, you ship fast and break things in ways you don’t discover until customers complain.

The choice is yours. :rocket: