🏛️ SF Tech Week Day 1: AI Regulation Reality Check - What Enterprises Are Actually Facing

Just came out of the “AI Regulation & Enterprise Readiness” panel at SF Tech Week (presented by Fenwick & West + PwC), and honestly, the gap between AI hype and AI reality is MASSIVE.

The Panel Setup

Speakers included:

  • Chief Compliance Officer from a Fortune 500 financial services company
  • VP of AI Governance at a major healthcare tech company
  • Partner from Fenwick specializing in AI regulatory law
  • PwC’s AI Risk & Compliance Practice Leader

The room was PACKED. Standing room only. Everyone wants to know: “How do I actually deploy AI without getting sued or fined?”

The Shocking Statistics They Shared

The PwC speaker dropped some bombs:

60% of AI leaders cite legacy system integration as their PRIMARY challenge when adopting agentic AI. Not “nice to have” - PRIMARY.

Source: IBM Think Insights - The 5 Biggest AI Adoption Challenges for 2025

That’s not a technology problem - that’s an architecture problem. Your shiny new AI can’t talk to your 20-year-old mainframe running COBOL.

42% of enterprises lack access to sufficient proprietary data for AI training.

Source: Deloitte AI Trends 2025: Adoption Barriers

Wait, what? We’ve been collecting data for decades, but it’s siloed, dirty, or in formats that AI can’t consume. The “data is the new oil” metaphor breaks down when your oil is contaminated and spread across 47 different storage tanks.

Only 1 in 4 AI initiatives actually deliver their expected ROI

Source: EPAM Study on AI Adoption for Enterprises

This one hit hard: 75% of initiatives fall short of their expected ROI. Imagine pitching ANY other technology with those odds to your CFO.

The Regulatory Minefield

The Fenwick partner walked through what she called “The Regulatory Crescendo of 2025”:

United States:

  • July 23, 2025: White House issued “America’s AI Action Plan” to achieve global AI dominance
  • The plan directed OSTP (Office of Science and Technology Policy) to launch an RFI asking businesses what Federal regulations are HINDERING AI innovation
  • Translation: Government realizes they over-regulated and now wants feedback on how to dial it back

Source: Federal Register - Notice of Request for Information on AI Regulatory Reform

European Union:

  • August 2025: EU AI Act rules for General Purpose AI (GPAI) became effective
  • High-risk AI systems have transition period until August 2, 2027
  • But compliance costs are MASSIVE - estimated at millions of euros for large enterprises

Source: EU AI Act | European Commission

The healthcare tech VP of AI Governance said this:

“We’re anticipating a minimum of 18 months to implement effective AI governance models. That’s not 18 months to deploy AI - that’s 18 months just to build the GOVERNANCE FRAMEWORK to safely deploy AI.”

Source: AI Regulation: What Businesses Need to Know in 2025

18 MONTHS OF OVERHEAD before you can even start building.

The Skills Gap is Brutal

40% of enterprises lack adequate AI expertise internally to meet their goals

43% of companies plan to hire AI-related roles throughout 2025, with machine learning engineers and AI researchers being most in-demand

Source: Stack-AI: The 7 Biggest AI Adoption Challenges for 2025

But here’s the catch: Everyone is hiring for the same roles. The ML engineer who can navigate both technical implementation AND regulatory compliance? Unicorn. And they know it. Salaries are insane.

What Actually Works: The 3 Companies Who Got It Right

The panel highlighted 3 anonymous case studies:

Company A (Financial Services):

  • Started with compliance FIRST, not AI first
  • Built governance framework over 12 months
  • Then gradually introduced AI use cases
  • Result: 95% of AI projects approved by compliance, vs industry average of 40%

Company B (Healthcare Tech):

  • Created a “Red Team” - compliance officers embedded with AI engineering teams
  • Every sprint has compliance review, not just end-of-project
  • Result: Caught 200+ regulatory issues BEFORE production, avoided estimated $15M in fines

Company C (Retail):

  • Took “small bets” approach - lots of low-risk AI pilots
  • Built up expertise and governance gradually
  • Now deploying high-risk AI at scale with confidence
  • Result: 3 years to reach maturity, but now moving 3x faster than competitors

The FTC Warning That Made Everyone Nervous

The Fenwick partner mentioned that the Federal Trade Commission has “clearly signaled its intention to clamp down on exaggerated claims for enterprise AI.”

Source: PwC 2025 AI Business Predictions

Translation: If you’re marketing AI that doesn’t actually work as advertised, you’re getting sued. And the FTC is watching.

My Take: This is a 5-Year Problem, Not a 5-Month Problem

After this panel, I’m convinced that most companies are approaching AI regulation COMPLETELY WRONG.

They’re thinking: “Let’s build AI fast, then figure out compliance.”

The successful companies are thinking: “Let’s build compliance infrastructure first, then scale AI quickly within safe guardrails.”

It’s the difference between:

  • ❌ “Move fast and break things” (2012-2021 startup mentality)
  • ✅ “Move fast within well-defined safety rails” (2025 enterprise reality)

Questions for This Community

  1. Is your company prioritizing AI governance or AI speed? Which is winning in practice?

  2. Have you dealt with EU AI Act compliance? What were the actual costs vs what vendors told you?

  3. What’s your strategy for the skills gap? Hiring, training, or outsourcing?

  4. 18-month governance timeline - realistic or pessimistic? Can it be done faster?

I’m spending Day 2 of SF Tech Week at more AI panels. Will report back on what I learn about open source vs closed source AI models (that’s tomorrow’s big debate).

All cited sources:

  • IBM Think Insights: The 5 Biggest AI Adoption Challenges for 2025
  • Deloitte: AI Trends 2025 - Adoption Barriers and Updated Predictions
  • EPAM: What Is Holding Up AI Adoption for Businesses (2025 Study)
  • Federal Register: Notice of RFI on AI Regulatory Reform (Sept 26, 2025)
  • EU Digital Strategy: EU AI Act Regulatory Framework
  • TechTarget: AI Regulation - What Businesses Need to Know in 2025
  • Stack-AI: The 7 Biggest AI Adoption Challenges for 2025
  • PwC: 2025 AI Business Predictions

@security_sam This is gold. I attended a different panel today - “AI at Scale: From Proof of Concept to Production” - and the message was eerily similar.

The 80/20 Problem Nobody Talks About

The keynote speaker (CTO of a Series C enterprise AI company) said something that made the whole room go quiet:

“AI POCs are easy. Production AI is hard. But nobody warned us that 80% of the work happens AFTER the POC succeeds.”

The brutal reality: Less than 20% of AI initiatives have been fully scaled across the enterprise.

Source: Top 5 AI Adoption Challenges for 2025 - Pellera Technologies

Think about that. Your POC works great. Demo goes well. Executives are excited.

Then you try to deploy it across 47 business units in 12 countries with 8 different legacy systems and 23 compliance frameworks.

Suddenly your 3-month POC becomes a 2-year integration nightmare.

The Legacy System Integration Crisis

@security_sam mentioned 60% cite legacy system integration as the primary challenge. Let me add color from what I heard today:

The speaker broke down WHY legacy integration is so hard:

  1. Data format hell: Legacy systems store data in formats that modern AI can’t consume without extensive ETL
  2. API limitations: Many legacy systems don’t HAVE APIs. You’re literally screen-scraping or using RPA bots
  3. Real-time requirements: AI needs real-time data. Legacy systems batch-process nightly
  4. Security boundaries: Legacy systems weren’t designed for AI access patterns - every integration is a security review
  5. Vendor lock-in: Your legacy vendor doesn’t WANT you using AI with their system (threatens their business model)

One attendee raised her hand and said: “We spent $2M on our AI POC. We’ve now spent $12M trying to integrate it with our ERP system and we’re only 60% done.”

The room nodded. Everyone has this story.
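
To make the “data format hell” point concrete, here's a minimal sketch of the kind of adapter these integrations end up needing: turning a fixed-width nightly extract into records a modern pipeline can consume. The field layout and names below are made up for illustration; real layouts are longer and far messier.

```python
from datetime import datetime

# Hypothetical fixed-width layout exported nightly from a legacy system:
# cols 0-9 = account id, 10-17 = date (YYYYMMDD), 18-29 = amount in cents, 30-31 = status.
LAYOUT = {"account_id": (0, 10), "txn_date": (10, 18), "amount_cents": (18, 30), "status": (30, 32)}

def parse_legacy_line(line: str) -> dict | None:
    """Convert one fixed-width record into a typed dict, or None if it is unusable."""
    fields = {name: line[start:end].strip() for name, (start, end) in LAYOUT.items()}
    try:
        return {
            "account_id": fields["account_id"],
            "txn_date": datetime.strptime(fields["txn_date"], "%Y%m%d").date().isoformat(),
            "amount": int(fields["amount_cents"]) / 100,   # cents -> dollars
            "status": fields["status"],
        }
    except ValueError:
        # Bad dates and non-numeric amounts are common in decades-old data;
        # in a real pipeline these go to a quarantine file, not the bit bucket.
        return None

sample = "ACCT00123420250315000000012599OK"
print(parse_legacy_line(sample))
```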

The Data Quality Problem is Worse Than You Think

@security_sam mentioned 42% lack sufficient proprietary data. The panel today went deeper:

It’s not just QUANTITY of data. It’s QUALITY.

According to the panel:

  • Average enterprise data quality score: 62/100 (source: IBM study cited in presentation)
  • 47% of data records have critical errors
  • Data labeling costs 10x more than expected
  • Ground truth verification requires domain experts (expensive and slow)

Real example from the panel:

A healthcare company tried to train an AI on their patient records. Problems:

  • 30% of records had incorrect diagnosis codes (skewed by billing-optimization gaming)
  • 18% had missing critical fields
  • Data entry errors compounded over 15 years
  • HIPAA compliance meant they couldn’t use third-party data labeling services

Cost to clean the data: $4.5M
Cost to train the AI: $800K

You read that right. Data cleaning cost 5.6x more than AI training.

Nobody budgets for this.
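
For what it's worth, the cheapest defense is a data-quality audit before anyone budgets for training. Here's a minimal sketch of that idea; the field names and validity rules are hypothetical (the healthcare case above would need a real ICD-10 lookup, not a placeholder check):

```python
# Minimal data-quality audit: measure completeness and validity before
# setting a training budget. Records and rules below are illustrative only.
records = [
    {"patient_id": "p1", "dx_code": "E11.9", "age": 54},
    {"patient_id": "p2", "dx_code": "",      "age": None},   # missing fields
    {"patient_id": "p3", "dx_code": "XXXX",  "age": 41},     # invalid code
]

REQUIRED = ["patient_id", "dx_code", "age"]

def is_valid_dx(code: str) -> bool:
    # Placeholder validity check; a real one would consult an ICD-10 table.
    return bool(code) and code[0].isalpha() and "." in code

missing = sum(1 for r in records if any(r.get(f) in (None, "") for f in REQUIRED))
invalid = sum(1 for r in records if r.get("dx_code") and not is_valid_dx(r["dx_code"]))

print(f"records with missing required fields: {missing}/{len(records)}")
print(f"records with invalid diagnosis codes: {invalid}/{len(records)}")
```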

The Skills Gap: It’s Not Just About Hiring

@security_sam mentioned 40% lack AI expertise and 43% are hiring. But here’s what the panel revealed:

Hiring ML engineers doesn’t solve the problem.

Why? Because enterprise AI requires THREE different skill sets:

  1. ML Engineering: Build the models (this is what everyone hires for)
  2. MLOps/Infrastructure: Deploy and maintain the models at scale (severely underestimated)
  3. AI Governance & Compliance: Navigate regulations and risk (almost nobody has this)

One company shared their hiring experience:

  • Posted ML Engineer role: 500 applications
  • Posted MLOps Engineer role: 50 applications
  • Posted AI Governance Specialist role: 3 applications

The market for #3 barely exists. You’re building that expertise from scratch (lawyers + engineers + compliance officers cross-training each other).

The ROI Reality Check

Only 1 in 4 AI initiatives deliver expected ROI.

The panel broke down WHY:

Hidden costs nobody told you about:

  • Data infrastructure: 3x budget
  • Compute costs: 5x initial estimates (inference at scale is EXPENSIVE)
  • Maintenance: AI models degrade, need retraining every 3-6 months
  • False positives: If your AI is 95% accurate, 5% of its decisions are wrong, and since you don't know which 5%, far more than 5% end up needing human review (massive operational overhead)
  • Integration: Already discussed, easily 10x initial estimates
  • Governance: the 18-month overhead @security_sam mentioned

One attendee calculated:

  • AI project budget: $500K
  • Actual total cost over 2 years: $7.2M
  • ROI calculation was based on $500K investment
  • Actual ROI: Negative 40%

This is why executives are getting skeptical.
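
If you redo the attendee's math, the problem is obvious: ROI was pitched against the $500K budget, not the $7.2M actual cost. Quick sketch of that arithmetic; the realized-benefit number is back-solved from the stated -40% and is my assumption, not a figure the attendee shared:

```python
# Rough ROI arithmetic using the figures quoted above.
budgeted_cost = 0.5e6        # $500K original project budget
actual_cost   = 7.2e6        # $7.2M total cost over two years
actual_roi    = -0.40        # stated outcome

# ROI = (benefit - cost) / cost  =>  benefit = cost * (1 + ROI)
realized_benefit = actual_cost * (1 + actual_roi)          # ~$4.3M, back-solved
projected_roi    = (realized_benefit - budgeted_cost) / budgeted_cost

print(f"implied benefit: ${realized_benefit/1e6:.1f}M")
print(f"ROI as pitched against the $500K budget: {projected_roi:+.0%}")
print(f"ROI against the real $7.2M cost:         {actual_roi:+.0%}")
```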

What Success Looks Like: The 30% Club

The panel kept referring to “technology-advanced companies” - the 30% who have successfully implemented AI at scale.

Source: Deloitte AI Trends 2025

What are they doing differently?

  1. They started 3-5 years ago (first-mover advantage in building governance and infrastructure)
  2. They treat AI as infrastructure investment, not project cost (different budget category, different expectations)
  3. They hired compliance people FIRST, then engineers (opposite of what most companies do)
  4. They built internal AI platforms (reusable infrastructure, not one-off projects)
  5. They accepted 18-24 month timelines (no shortcuts)

The Controversial Take from the Panel

The final speaker (CEO of an AI consulting firm) said something that made half the room angry:

“If you’re just starting your enterprise AI journey in 2025, you’re probably too late. The winners have already been working on this for 5 years. You’re not catching up - you’re entering a mature market with immature capabilities.”

Is he right?

I don’t know. But it explains why VCs are only funding AI companies with proven traction, not just ideas.

My Action Items from Today

Based on both @security_sam’s panel and mine:

  1. Stop pretending AI projects are software projects - They’re infrastructure + compliance + operations projects that happen to use AI
  2. Budget 5x what you think it will cost - You’ll still go over budget, but you’ll be closer
  3. Hire for governance roles before engineering roles - Controversial but makes sense given 18-month governance timeline
  4. Start with low-risk use cases - Build expertise before betting the company
  5. Accept that this is a 3-5 year transformation - Not a 6-month project

Questions for @security_sam and Community

  1. Did your panel discuss the FTC AI claims enforcement? What specific cases are they watching?

  2. The 18-month governance timeline - is that for greenfield AI governance or retrofitting existing systems?

  3. For the healthcare company in your Company B example - how did they structure the Red Team? Reporting structure? Metrics?

  4. Has anyone here successfully navigated EU AI Act compliance? What did it actually cost vs budget?

Tomorrow I’m attending the Open Source vs Closed Source AI debate. My hypothesis: Open source has compliance advantages (transparency, auditability) but closed source has safety advantages (vendor responsibility). Let’s see if the panel agrees.

Cited sources:

  • Pellera Technologies: Top 5 AI Adoption Challenges for 2025
  • Deloitte: AI Trends 2025 - Adoption Barriers and Updated Predictions
  • IBM Think Insights (cited in panel presentation)
  • Panel presentations and speaker case studies from SF Tech Week Day 1

This thread is eye-opening. I spent Day 1 at a completely different set of SF Tech Week sessions - focused on product-led AI adoption - and I’m seeing the same issues from a CUSTOMER perspective.

The Session: “AI Features That Users Actually Want (vs What We Think They Want)”

Panel included:

  • VP Product from Notion
  • Head of AI Product at Figma
  • Product Lead for ChatGPT Enterprise (OpenAI)
  • CPO from a major B2B SaaS company

The theme: There’s a HUGE gap between AI capabilities and AI that customers trust and adopt.

The Trust Problem Nobody is Solving

The Notion VP shared data that floored me:

When they launched AI features:

  • Marketing promised “10x productivity”
  • Engineering built sophisticated ML models
  • Compliance approved everything

User adoption after 6 months: 12%

Why? Users don’t trust it.

When you dig into the qualitative feedback:

  • “I don’t understand how it came to this conclusion”
  • “It was wrong once, so now I double-check everything (defeating the purpose)”
  • “My boss won’t accept AI-generated work, so why bother?”
  • “I’m afraid it’s going to leak sensitive data”

The Explainability Challenge

The Figma speaker went deep on this:

“Users don’t want a black box that magically solves their problem. They want a tool they understand and control.”

Example from Figma’s AI features:

Version 1: “AI automatically generates design variations”

  • User clicks button
  • AI does magic
  • User sees results
  • Adoption: 8%

Version 2: “AI suggests design improvements with explanations”

  • User clicks button
  • AI shows 3 options WITH reasoning (“This variation improves contrast for accessibility”)
  • User picks one and can see exactly what changed
  • Adoption: 34%

Same underlying AI. 4x difference in adoption just by adding explainability.

But here's the catch, and it ties straight back to @cto_michelle's 80/20 point: building explainability adds 6-9 months to the development timeline.

Why? Because:

  1. The AI itself doesn’t “know” why it made decisions (black box problem)
  2. You need to build a SEPARATE system to generate human-readable explanations
  3. Those explanations need to be accurate (can’t just make stuff up)
  4. Explanations need compliance review (in regulated industries)

So your 3-month AI POC becomes a 12-month production feature.
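
For anyone wondering what the V1 → V2 shift looks like structurally, here's a rough sketch of the “suggestions with reasoning” pattern. This is not Figma's actual implementation; the names and API shape are hypothetical:

```python
from dataclasses import dataclass

@dataclass
class Suggestion:
    change: str        # what the AI proposes
    rationale: str     # human-readable reason, produced by a separate explanation step
    confidence: float  # 0..1, surfaced so the user can judge it

def suggest_improvements(design_id: str) -> list[Suggestion]:
    # In practice the rationale comes from a second system (templates, feature
    # attributions, or a constrained generation pass), reviewed for accuracy.
    return [
        Suggestion("Increase text/background contrast to 4.5:1",
                   "Improves readability and meets WCAG AA accessibility guidance", 0.92),
        Suggestion("Align button spacing to the 8px grid",
                   "Matches the spacing used elsewhere in this file", 0.81),
        Suggestion("Reduce heading font weight",
                   "Keeps visual hierarchy consistent with your style guide", 0.64),
    ]

for s in suggest_improvements("frame-42"):
    print(f"[{s.confidence:.0%}] {s.change} ({s.rationale})")
```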

The Data Privacy Paradox

The OpenAI speaker (ChatGPT Enterprise) shared fascinating data about enterprise customer concerns:

Top 5 objections from enterprise sales calls:

  1. “Where is our data stored?” (89% of calls)
  2. “Who can access our data?” (76%)
  3. “Will our data train your models?” (71%)
  4. “Can you guarantee GDPR/SOC2 compliance?” (68%)
  5. “What if your AI leaks our trade secrets?” (52%)

Notice: NONE of these are about AI capability. They’re all about trust and compliance.

And these are objections to OPENAI - the most trusted AI brand in the world. Imagine if you’re a no-name startup.

The Cost of AI Features Users Don’t Use

The B2B SaaS CPO dropped some brutal math:

Their company built 12 AI features in 2024:

  • Development cost: $8.4M
  • User adoption rate (average across 12 features): 18%
  • Features with >50% adoption: 2 out of 12

Effective cost per successful feature: $4.2M

And this doesn’t include the opportunity cost of features they DIDN’T build while building AI features nobody wanted.

The post-mortem revealed:

The 2 successful features:

  • Solved a specific, narrow problem
  • Had clear before/after metrics
  • Gave users control and transparency
  • Failed gracefully (when AI was wrong, it didn’t break the workflow)

The 10 failed features:

  • Tried to solve broad, ambiguous problems
  • Used AI because “AI is hot” not because it was the right tool
  • Black box experience
  • When AI failed, it broke the user’s workflow

How This Connects to @security_sam and @cto_michelle’s Points

@security_sam mentioned 60% cite legacy integration as primary challenge.

From the product side: Legacy systems often have better UX because they’ve been refined over decades. Your new AI feature needs to be 10x better to overcome switching costs, not just “better because AI.”

@cto_michelle mentioned 80% of work happens after POC.

From the product side: POC is built for ideal conditions. Production means handling:

  • Edge cases (user does something unexpected)
  • Errors (AI is wrong, network fails, data is missing)
  • Integration (works in isolation, breaks when combined with 47 other features)
  • Accessibility (screen readers, keyboard navigation, color blindness)
  • Internationalization (works in English, breaks in Japanese)

Nobody demos these in the POC. But these are 80% of development time.

The Regulation Impact on Product

This is where @security_sam’s regulatory discussion hits product teams:

The OpenAI speaker said:

EU AI Act compliance meant they had to add:

  • User consent flows (adds friction)
  • Data retention controls (users can delete their data = AI model can “forget” = degrades performance)
  • Explainability features (AI must explain high-risk decisions)
  • Human review processes (for regulated industries)

Each of these reduces user satisfaction metrics (more clicks, more friction, slower experience).

But they’re legally required.

So you’re building a WORSE product (from a pure UX standpoint) because compliance demands it.

This is the paradox:

  • Better AI = More automation = Less user control = More regulatory scrutiny = More compliance requirements = Worse UX = Lower adoption
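
To make the “human review processes” requirement concrete, here's a minimal sketch of the routing layer teams end up adding in front of high-risk decisions. Risk categories, thresholds, and names are assumptions for illustration, not legal guidance:

```python
# High-risk use cases always go to a reviewer; everything else is gated on
# model confidence. Categories and the 0.8 threshold are illustrative only.
HIGH_RISK_USE_CASES = {"credit_decision", "hiring_screen", "medical_triage"}

def route_decision(use_case: str, model_output: dict) -> str:
    """Return 'auto' if the AI result can ship directly, else 'human_review'."""
    if use_case in HIGH_RISK_USE_CASES:
        return "human_review"                      # regulated: always reviewed
    if model_output.get("confidence", 0.0) < 0.8:
        return "human_review"                      # low confidence: reviewed
    return "auto"

print(route_decision("credit_decision", {"confidence": 0.97}))  # human_review
print(route_decision("doc_summary",     {"confidence": 0.91}))  # auto
```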

What Actually Works: The AI Features Users Love

The panel identified common patterns in successful AI features:

  1. Copilot, not Autopilot

    • AI suggests, user decides
    • Adoption: 3x higher than fully automated AI
  2. Narrow and Deep, not Broad and Shallow

    • AI that’s excellent at ONE thing beats AI that’s mediocre at 10 things
    • Example: Grammarly’s tone detector vs general “make my writing better”
  3. Fail Gracefully

    • When AI is unsure, it says “I don’t know” rather than hallucinating
    • Users prefer honest AI to confident-but-wrong AI (see the sketch after this list)
  4. Transparent Pricing

    • If AI costs money to run (inference costs), tell users upfront
    • Hidden costs destroy trust
  5. Local-First When Possible

    • On-device AI (even if less capable) often preferred for sensitive data
    • Example: Apple Intelligence strategy
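
Here's a tiny sketch of pattern 3 (“Fail Gracefully”) in code, assuming the model call returns a confidence score. The threshold and the answer() stub are hypothetical:

```python
# Below a confidence threshold the feature declines to answer instead of guessing.
ANSWER_THRESHOLD = 0.75

def answer(question: str) -> tuple[str, float]:
    # Stand-in for a real model call that returns text plus a confidence score.
    return "Paris", 0.55

def assistant_reply(question: str) -> str:
    text, confidence = answer(question)
    if confidence < ANSWER_THRESHOLD:
        return "I'm not confident enough to answer that. Want me to point you to the source docs?"
    return text

print(assistant_reply("What is the capital of Australia?"))  # declines, rather than guessing wrong
```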

My Controversial Take

After Day 1 of SF Tech Week, I’m convinced:

90% of AI features being built will fail.

Not because the AI is bad. But because:

  • Wrong problem (AI solution in search of a problem)
  • Wrong trust model (users don’t trust black boxes)
  • Wrong timeline (the 18-month governance overhead @security_sam mentioned)
  • Wrong expectations (promising 10x productivity, delivering 1.2x)

The 10% that will succeed:

  • Solve real, specific problems
  • Build trust through transparency
  • Accept compliance as a feature, not a bug
  • Set realistic expectations

Questions for Product People in This Community

  1. What’s your AI feature adoption rate? Are users actually using your AI features or ignoring them?

  2. How do you balance explainability vs performance? Explainable AI is often slower/less accurate.

  3. Have you had to remove AI features due to low adoption? What did you learn?

  4. How are you handling the EU AI Act compliance impact on UX? Is anyone finding creative solutions?

Day 2 agenda: I’m attending sessions on AI monetization and pricing strategies. Hypothesis: Most companies are giving away AI for free because they can’t figure out how to price it profitably. Let’s see if I’m right.

Sources:

  • SF Tech Week “AI Features That Users Actually Want” panel (Day 1)
  • Speaker presentations from Notion, Figma, OpenAI, and B2B SaaS companies
  • Panel case studies and user research data shared
  • EU AI Act compliance discussions from OpenAI speaker

Reading this thread while sitting in the SF Tech Week “AI Infrastructure at Scale” session, and I’m nodding so hard my neck hurts.

The Hidden Infrastructure Costs Nobody Talks About

The speaker (Head of ML Infrastructure at Stripe) just broke down their actual AI infrastructure costs:

For a single production ML model serving 10M requests/day:

Initial estimate (from POC):

  • Compute: $5K/month
  • Storage: $500/month
  • Total: $5.5K/month

Actual production costs:

  • Inference compute: $45K/month (9x higher)
  • Training compute: $25K/month (ongoing retraining)
  • Storage (model versions, training data, logs): $8K/month (16x higher)
  • Monitoring and observability: $12K/month (not budgeted)
  • Data pipeline infrastructure: $15K/month (not budgeted)
  • Redundancy and failover: $20K/month (not budgeted)
  • Total: $125K/month (23x initial estimate)

And that’s just ONE model.
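
Doing the per-request arithmetic on those numbers (assuming a 30-day month) makes the gap clearer:

```python
# Per-request arithmetic on the figures quoted above.
requests_per_month = 10_000_000 * 30           # 10M requests/day
poc_estimate       = 5_500                     # $/month from the POC
production_actual  = 125_000                   # $/month in production

print(f"POC estimate per request:    ${poc_estimate / requests_per_month:.6f}")
print(f"Production cost per request: ${production_actual / requests_per_month:.6f}")
print(f"Multiplier: {production_actual / poc_estimate:.0f}x")
```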

Why @cto_michelle’s 5x Budget Rule is Actually OPTIMISTIC

@cto_michelle said budget 5x what you think. The Stripe speaker said:

“In my experience, production AI costs 10-20x your POC costs. If your finance team won’t accept that multiplier, don’t start the project.”

Why the massive difference?

  1. POC runs on toy data (1000 records)

    • Production runs on real data (100M records)
    • Scale factor: 100,000x
  2. POC has no reliability requirements

    • Production needs 99.99% uptime (4 nines)
    • That means redundancy, failover, chaos testing
    • Cost multiplier: 3-5x
  3. POC doesn’t handle edge cases

    • Production must handle every possible input
    • Cost multiplier: 2-3x (just in defensive coding and error handling)
  4. POC doesn’t retrain models

    • Production models degrade (data drift)
    • Must retrain every 3-6 months
    • Cost multiplier: Ongoing, not just the initial training cost (see the drift-check sketch after this list)
  5. POC doesn’t monitor performance

    • Production needs real-time monitoring, alerting, debugging
    • Tools cost money. Engineers debugging cost WAY more money.
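
On the retraining point: one common way to decide when a model needs retraining is to monitor drift between the training data and live traffic, for example with a population stability index (PSI) on a key feature. A minimal sketch follows; the 0.2 threshold is a common rule of thumb, not a universal standard:

```python
import math

def psi(expected: list[float], actual: list[float], bins: int = 10) -> float:
    """Population stability index between training-time and live distributions."""
    lo, hi = min(expected), max(expected)

    def bucket_shares(values):
        counts = [0] * bins
        for v in values:
            idx = min(int((v - lo) / (hi - lo) * bins), bins - 1) if hi > lo else 0
            counts[max(idx, 0)] += 1
        return [max(c / len(values), 1e-6) for c in counts]   # avoid log(0)

    e, a = bucket_shares(expected), bucket_shares(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))

training_amounts = [20, 25, 30, 35, 40, 45, 50, 55, 60, 65]
live_amounts     = [60, 70, 75, 80, 85, 90, 95, 100, 110, 120]   # behavior has shifted

score = psi(training_amounts, live_amounts)
print(f"PSI = {score:.2f} -> {'retrain' if score > 0.2 else 'ok'}")
```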

The Skills Gap from an Engineering Manager Perspective

@security_sam and @cto_michelle mentioned the skills gap. Let me add the engineering hiring reality:

My team’s open roles (I’m actively hiring at SF Tech Week):

  • ML Engineer: 250 applications, 12 qualified, 3 offers made, 0 accepted (all got better offers)
  • MLOps Engineer: 42 applications, 6 qualified, 2 offers made, 1 accepted
  • AI Infrastructure Engineer: 18 applications, 2 qualified, 1 offer made, 0 accepted

The math doesn’t work:

We need roughly 10 engineers in each of these roles to support our AI roadmap.
At current acceptance rates, we need to:

  • Source 2,500 ML Engineer candidates to hire 10
  • Source 420 MLOps candidates to hire 10
  • Source 180 AI Infrastructure candidates to hire 10

That’s 3,100 candidates to fill 30 roles.

And we’re competing with OpenAI, Anthropic, Google, Meta - companies that can pay 2-3x what we can afford.

The “Just Use OpenAI API” Trap

Multiple people at this conference keep saying: “Why build your own AI? Just use OpenAI’s API!”

The Stripe speaker addressed this directly:

When OpenAI API works:

  • Low-stakes use cases (content generation, summarization)
  • Low volume (<100K requests/month)
  • Non-latency-sensitive (users can wait 2-5 seconds)
  • No sensitive data

When OpenAI API fails:

  • High-stakes decisions (financial, medical, legal)
  • High volume (>1M requests/month = $$$)
  • Latency-sensitive (<100ms required)
  • Sensitive data (can’t send to third party)

Real example:

A fintech company tried using OpenAI API for fraud detection:

  • POC: 1,000 transactions/day, $50/month, works great
  • Production: 500,000 transactions/day
  • Projected cost: $25,000/month
  • Actual cost after 1 month: $47,000/month (usage patterns differed from POC)
  • Latency: 200-400ms (unacceptable for real-time fraud, users timing out)
  • Data compliance: Legal team said “absolutely not, we can’t send customer data to OpenAI”

Solution:

  • Spent 6 months building in-house model
  • Cost: $800K development + $15K/month infrastructure
  • Latency: 12ms average
  • Data compliance: Approved

Break-even timeline: 18 months

But that doesn’t count the 6-month delay to market.
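
The break-even arithmetic is worth showing because it is sensitive to assumptions. At flat volume the quoted numbers cross over around 25 months; the 18-month figure from the session presumably assumes API spend keeps growing with transaction volume:

```python
# Simple break-even arithmetic on the figures quoted above, assuming flat volume.
inhouse_build   = 800_000     # one-time development cost
inhouse_monthly = 15_000      # ongoing infrastructure
api_monthly     = 47_000      # observed API bill at production volume

months_to_break_even = inhouse_build / (api_monthly - inhouse_monthly)
print(f"break-even at flat volume: {months_to_break_even:.0f} months")   # ~25
```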

The Integration Hell @cto_michelle Mentioned

Let me give you a concrete example from my team:

Project: Integrate AI-powered search into our product

Sounds simple, right? AI startup vendors make it sound easy: “Just call our API!”

Actual integration requirements:

  1. Authentication and authorization

    • Our users have 47 different permission levels
    • AI needs to respect those permissions
    • Can’t just “index everything and let AI search it”
    • Took 2 months to build permission-aware indexing
  2. Data synchronization

    • Our data is in 8 different databases
    • Real-time sync vs batch sync trade-offs
    • Took 3 months to build reliable data pipeline
  3. Error handling

    • What if AI API is down? (It will be)
    • Fallback to regular search? Different UX
    • What if AI returns garbage? How do we detect and handle?
    • Took 1 month to build resilient error handling
  4. Performance

    • AI search is slower than regular search
    • Users expect <200ms response
    • Had to add caching layer, prediction, pre-fetching
    • Took 2 months to optimize performance
  5. Monitoring and debugging

    • When AI search is “wrong”, how do we debug?
    • Traditional search: query logs, index stats, straightforward
    • AI search: model version, embedding space, relevance scoring, vector search, black box
    • Took 1 month to build proper monitoring
  6. Cost management

    • AI search costs 10x more than traditional search
    • Can’t let one user’s expensive query bankrupt us
    • Rate limiting, cost tracking, alerting
    • Took 2 weeks to build cost controls

Total: 9.5 months to “just integrate an API”

POC took 2 weeks.
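
To show what items 1, 3, 4, and 6 above add around a bare vendor call, here's a heavily compressed sketch. Every function and name is hypothetical; the point is the shape of the wrapper, not any specific vendor's SDK:

```python
CACHE: dict[tuple[str, str], list[str]] = {}
DAILY_BUDGET_USD = 50.0
spend_today = 0.0

def user_can_see(user_id: str, doc_id: str) -> bool:
    # Stand-in for the permission-aware index check (item 1).
    return not doc_id.startswith("restricted/")

def ai_search(query: str) -> list[str]:
    # Stand-in for the vendor call; may be slow, may fail, costs money.
    raise TimeoutError("vendor API timed out")

def keyword_search(query: str) -> list[str]:
    # Existing deterministic search used as the fallback path (item 3).
    return ["docs/faq.md", "restricted/finance-q3.xlsx"]

def search(user_id: str, query: str, cost_per_call: float = 0.02) -> list[str]:
    global spend_today
    if (user_id, query) in CACHE:                       # item 4: cache hot queries
        results = CACHE[(user_id, query)]
    elif spend_today + cost_per_call > DAILY_BUDGET_USD:
        results = keyword_search(query)                 # item 6: cost guardrail
    else:
        try:
            spend_today += cost_per_call
            results = ai_search(query)
        except (TimeoutError, ConnectionError):
            results = keyword_search(query)             # item 3: graceful fallback
        CACHE[(user_id, query)] = results
    return [d for d in results if user_can_see(user_id, d)]   # item 1: permissions

print(search("u123", "q3 revenue"))   # -> ['docs/faq.md'] via the fallback path
```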

How This Relates to the 80% Post-POC Problem

@cto_michelle’s stat that 80% of work happens after POC is NOT because engineers are slow.

It’s because POC answers: “Can this work in ideal conditions?”

Production answers: “Can this work in ALL conditions, reliably, securely, cost-effectively, at scale, with monitoring, with error handling, with compliance, with integration into existing systems, and with acceptable user experience?”

POC is 1 question. Production is 50 questions.

The Controversial Solution: Don’t Build AI

The Stripe speaker’s most controversial slide:

“The best AI project is the one you DON’T build.”

He showed a decision tree:

  1. Can you solve this with deterministic code? → Don’t use AI
  2. Can you buy a solution? → Don’t build AI
  3. Can you outsource this to a vendor? → Don’t build AI
  4. Is the ROI clear and measurable? → Maybe build AI
  5. Do you have the skills, budget, and timeline? → Maybe build AI
  6. Is this a strategic differentiator? → Consider building AI

By his tree, only ~5% of AI projects should actually be built.

The other 95%? Use existing tools, buy solutions, or accept that the problem doesn’t need an AI solution.
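
The tree is simple enough to write down literally; the question wording is paraphrased from the slide as I heard it:

```python
# The speaker's decision tree, rendered as code. Question wording paraphrased.
def should_build_ai(answers: dict[str, bool]) -> str:
    if answers["solvable_with_deterministic_code"]:
        return "Don't use AI"
    if answers["can_buy_a_solution"]:
        return "Don't build AI"
    if answers["can_outsource_to_vendor"]:
        return "Don't build AI"
    if not answers["roi_clear_and_measurable"]:
        return "Don't build AI (yet)"
    if not answers["have_skills_budget_timeline"]:
        return "Don't build AI (yet)"
    if answers["strategic_differentiator"]:
        return "Consider building AI"
    return "Probably don't build AI"

print(should_build_ai({
    "solvable_with_deterministic_code": False,
    "can_buy_a_solution": False,
    "can_outsource_to_vendor": False,
    "roi_clear_and_measurable": True,
    "have_skills_budget_timeline": True,
    "strategic_differentiator": True,
}))  # -> Consider building AI
```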

My Team’s New Policy (as of Today)

After hearing all these panels, I’m implementing a new policy for my engineering team:

Before we start ANY AI project:

  1. ✅ Prove it works with deterministic code first (if possible)
  2. ✅ Get sign-off on 10x budget multiplier from finance
  3. ✅ Get commitment for 12-18 month timeline (no shortcuts)
  4. ✅ Hire compliance engineer BEFORE ML engineer
  5. ✅ Build full production infrastructure plan before writing ML code
  6. ✅ Define success metrics that aren't “AI accuracy” (actual business metrics)

If we can’t check all 6 boxes, we DON’T START.

Questions for Engineering Leaders

  1. What’s your actual AI cost multiplier? POC vs production, be honest

  2. Has anyone successfully hired an AI Infrastructure Engineer? Where did you find them and what did you pay?

  3. For those using OpenAI API in production - what’s your actual monthly bill and request volume?

  4. What percentage of AI projects have you CANCELLED after POC but before production? Why?

Day 2: I’m attending the MLOps tooling showcase. Hypothesis: There are 100+ MLOps tools and none of them solve the actual problem (integration with legacy systems). Let’s see if I’m wrong.

Sources:

  • SF Tech Week “AI Infrastructure at Scale” session (Day 1)
  • Stripe ML Infrastructure team presentation
  • Real project cost and timeline data from my team and the panel
  • Engineering hiring data from my active recruiting efforts this week