AI-Native Biotech: From Lab Copilots to Generative Therapeutics

Just attended the a16z + Fenwick panel on AI-native biotech at LA Tech Week, and I need to process what I just heard. As someone who’s been in computational biology for over a decade, this is both the future I’ve been working toward AND one that’s arriving faster than I expected.

What ‘AI-Native Biotech’ Actually Means

The panel covered three major areas:

1. High-Throughput Experimental Design
AI systems that design experiments, predict outcomes, and iterate autonomously. We’re not just automating pipelines - we’re having AI propose hypotheses we wouldn’t have thought of.

2. Generative AI in Therapeutic Discovery
Models like AlphaFold3 - whose predecessors earned their creators a share of the 2024 Nobel Prize in Chemistry - are predicting protein structures with unprecedented accuracy. But now we’re going further - generative models are suggesting entirely new molecular scaffolds that have never existed in nature.

3. Lab-Native Copilots
AI assistants embedded directly in lab workflows - designing assays, troubleshooting failed experiments, even suggesting alternative protocols based on success rates from thousands of other labs.

The Excitement

What got me energized:

  • Speed: Drug discovery timelines could compress from 10+ years to 2-3 years
  • Undruggable targets: Diseases we couldn’t tackle before become tractable
  • Personalized medicine: AI makes it economically viable to design therapies for rare diseases or individual patients
  • Isomorphic Labs (the DeepMind spinout) raised $600M in its first external round in March 2025 - one of the largest early-stage raises in biotech history
  • Analysts have projected that AI could generate $350-410B in annual value for pharma

The Terrifying Part

But here’s what keeps me up at night:

We’re trusting predictions we don’t fully understand. In my lab, I’ve seen AI suggest brilliant molecules AND complete nonsense. The models are powerful but not infallible.

Regulatory black holes. The FDA doesn’t know how to evaluate AI-designed drugs. What does “validation” mean when the design process is a neural network black box?

Dual-use concerns. What happens when AI generates a molecule that’s both therapeutic AND could be weaponized? This wasn’t discussed at the panel, but it should have been.

Lab reality check. I’ve spent the last 6 months validating AI predictions with wet lab experiments. Success rate: about 40%. That’s actually GOOD in drug discovery, but we need honest conversations about failure rates.

Real Talk from the Trenches

The panel was optimistic (VCs tend to be), but as someone actually using these tools:

  • AI predictions need rigorous experimental validation
  • We need better mechanistic interpretability - why did the model suggest this molecule?
  • Training data quality matters HUGELY (garbage in, garbage out applies to proteins too)
  • Integration with existing lab workflows is harder than it looks

The industry is moving FAST - billions in funding, major pharma partnerships with AI companies. But are we building the right safety rails?

Questions for the Community

Curious to hear from:

  • Other scientists: Are you using AI in wet lab settings? What’s your validation framework?
  • ML engineers: How do you think about interpretability for life-critical applications?
  • Security folks: How should we approach dual-use risk in generative biology?
  • Product people: What’s the right pace of deployment for tools that could save lives but also create risks?

This technology will revolutionize medicine. I’m certain of that. But we need to be thoughtful about HOW we get there.

#AIBiotech #DrugDiscovery #GenerativeAI #AlphaFold #Therapeutics

@dr_rachel_bio This is fascinating. As someone who builds ML models (though not in biotech), I find your 40% validation success rate both impressive and concerning.

The Data Science Perspective

A few things jump out:

Training Data Quality is Everything

You mentioned “garbage in, garbage out” - in biotech this is even more critical because:

  • Protein interaction datasets are SPARSE compared to vision/language
  • Negative results (failed experiments) are underreported in literature
  • Batch effects and experimental conditions vary wildly between labs

How are you handling training data curation? Are you incorporating failed experiments or only successful structures?

The 40% Success Rate Context

In traditional drug discovery, hit rates for high-throughput screening are often 0.1-1%. So 40% is genuinely revolutionary. But the question is: what’s the failure mode?

  • Does the AI suggest molecules that are synthetically impossible?
  • Do they fail at binding prediction?
  • Are they toxic in ways the model didn’t anticipate?

Understanding WHY the other 60% fail would generate incredibly valuable data to feed back into the model.

Interpretability Challenges

You asked about interpretability for life-critical applications. In my experience with healthcare ML:

  1. Attention mechanisms help (“the model focused on these protein residues”)
  2. Ablation studies show which features matter
  3. Counterfactual explanations (“if we changed this amino acid, prediction changes”)

But honestly? For true mechanistic understanding, you probably need hybrid approaches - ML to narrow the search space, then molecular dynamics simulations to understand the physics.
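To make (3) concrete: a minimal sketch of a single-residue counterfactual scan, assuming you have some scoring function to probe. The `predict_binding` callable below is a placeholder for whatever scorer you trust (a docking score, an affinity regressor), not a specific tool.

```python
# Hypothetical sketch, brute force on purpose: predict_binding is a
# placeholder for whatever scoring model you want to interrogate.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def counterfactual_scan(sequence, predict_binding):
    """Rank single-residue substitutions by how much they move the prediction."""
    baseline = predict_binding(sequence)
    effects = []
    for pos, original in enumerate(sequence):
        for aa in AMINO_ACIDS:
            if aa == original:
                continue
            mutant = sequence[:pos] + aa + sequence[pos + 1:]
            effects.append((f"{original}{pos + 1}{aa}", predict_binding(mutant) - baseline))
    # The largest absolute shifts point at the residues the model is leaning on.
    return sorted(effects, key=lambda e: abs(e[1]), reverse=True)
```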

Active Learning Could Be Key

Instead of batch training, have you considered active learning loops?

  1. AI suggests 100 candidate molecules
  2. You test the top 10 in wet lab
  3. Feed results back immediately
  4. Model learns what kinds of predictions are reliable vs not

This could improve your 40% much faster than waiting for massive datasets.
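In code terms, one round of that loop might look like the sketch below. All the names (`suggest`, `confidence`, `assay`, `retrain`) are hypothetical stand-ins, since I don't know your stack.

```python
# Minimal sketch of the loop above; model.suggest, model.confidence,
# model.retrain, and assay are placeholders for your generative model,
# ranking heuristic, training pipeline, and wet-lab validation step.
def active_learning_round(model, assay, n_candidates=100, n_tested=10):
    candidates = model.suggest(n_candidates)                   # 1. AI proposes molecules
    ranked = sorted(candidates, key=model.confidence, reverse=True)
    results = {mol: assay(mol) for mol in ranked[:n_tested]}   # 2. test the top few in the lab
    model.retrain(results)                                     # 3. feed outcomes straight back
    return sum(results.values()) / n_tested                    # 4. hit rate (assay returns 1 for a hit, 0 otherwise)
```

Running one round per wet-lab cycle and watching the hit rate trend tells you whether the model is actually learning which of its predictions are reliable.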

My Concern: Overfitting to “Druggable” Space

If AlphaFold and generative models are trained primarily on existing drugs and known proteins, are we just getting sophisticated interpolation within chemical space we already know?

The real breakthroughs might require TRUE exploration - molecules that look weird to the model AND to us. But those are exactly the ones that will fail validation most often.

How do you balance exploitation (refining known good approaches) vs exploration (genuinely novel scaffolds)?

Question: What’s your protocol for deciding when an AI prediction is “confident enough” to spend wet lab resources on? Do you have calibrated uncertainty estimates?

@dr_rachel_bio As an engineer who’s built data pipelines (but never touched a protein), I’m both amazed and slightly terrified by the infrastructure challenges this must create.

The Engineering Reality Check

Computational Requirements Must Be Insane

AlphaFold3 predictions require massive compute. Are you running this on-prem or cloud? How do you handle:

  • Model inference latency (waiting hours/days for predictions?)
  • Storage for molecular simulations (terabytes of trajectory data?)
  • Version control for models AND the predictions they generated
  • Reproducibility when the model gets updated

I’m guessing you can’t just git clone and npm install this stuff.

The “Lab Copilot” Integration Problem

You mentioned AI assistants embedded in lab workflows. Having built internal tools for engineers, I can tell you: the integration is always harder than the AI.

  • Are labs using standardized data formats? (My guess: no)
  • How do you connect AI predictions to LIMS (Lab Information Management Systems)?
  • When an experiment fails, how does that feedback loop work? Manual data entry?
  • Do researchers trust the AI enough to act on suggestions without paranoia?

The last point is huge. In software, we can deploy incrementally and roll back. In wet lab… you just wasted expensive reagents and weeks of time.

Questions I Have

1. Deployment Strategy

How do you deploy new model versions when researchers have experiments in flight?

If you’re halfway through validating 50 AI-suggested molecules and the model gets updated, do you:

  • Finish with old predictions (inconsistent)
  • Re-run everything (expensive)
  • Keep both versions running (operational nightmare)

2. Failure Modes

@data_rachel asked about the 60% failure cases - I’m wondering about the INFRASTRUCTURE failure modes:

  • What happens if the model goes down mid-experiment?
  • How do you monitor for model drift? (is the AI getting worse over time?)
  • Do you have fallback protocols when the AI is uncertain?

3. The Human-in-the-Loop Challenge

You said “AI proposes hypotheses we wouldn’t have thought of” - but how do you prevent it from:

  • Suggesting things that violate basic chemistry (even if they’re mathematically plausible)
  • Going down rabbit holes that waste lab time
  • Creating dependencies where researchers can’t work without the AI

In software, we have “GitHub Copilot fatigue” where junior engineers copy AI suggestions without understanding them. Is there “AlphaFold fatigue” where researchers just trust the model blindly?

The Bright Side

If you solve these problems, the infrastructure you’re building could be transformative:

  • Shared compute infrastructure for biotech startups (AWS for drug discovery?)
  • Standardized APIs for molecular predictions
  • Open datasets of validated AI predictions (positive AND negative results)

But man, the operational complexity of keeping this running reliably while people are trying to cure diseases? That’s a lot of pressure.

Real question: Do you have on-call rotations for when the AI models break? What’s the SLA for “scientist waiting for protein prediction”?

@dr_rachel_bio You mentioned dual-use concerns briefly, but I think this deserves WAY more attention than it got at the panel. As someone in security, the bioweapon implications of generative biology are terrifying.

The Security Nightmare Scenario

AI-Generated Pathogens

If AI can design novel therapeutic proteins, it can design novel pathogens. We’re talking about:

  • Optimizing viruses for transmissibility + lethality
  • Designing toxins that bypass existing treatments
  • Creating synthetic organisms that don’t exist in nature (no natural immunity)

The barrier to entry is dropping fast. What used to require a PhD and a BSL-4 lab might soon be possible with a gaming GPU and some wet lab equipment you can buy on eBay.

The Publication Dilemma

In security, we have “responsible disclosure” - you tell the vendor privately, give them time to patch, then publish.

But in biotech:

  • Publishing AI models for drug discovery ALSO makes them useful for bioweapons
  • You can’t “patch” biology like you patch software
  • Withholding research slows down legitimate medicine

How do we handle this? AlphaFold is open-source, which is great for democratizing science but also democratizes risk.

Access Control is Hard in Biology

In software security, we can:

  • Rate-limit API access
  • Monitor for suspicious queries
  • Require authentication

But with AI drug discovery:

Problem 1: How do you define “suspicious”?

  • Is the person designing a novel neurotoxin a researcher working on Alzheimer’s, or a bioterrorist?
  • Legitimate research into pandemic preparedness looks identical to weaponization research

Problem 2: Detection is reactive

  • By the time you notice someone is designing dangerous molecules, they already have the sequences
  • DNA synthesis companies screen orders, but what about in-house synthesis?
  • Once a sequence is known, it’s information - can’t be deleted from the world

Problem 3: Decentralization

  • You might implement good security at Isomorphic Labs, but what about the 1000 other AI biotech startups?
  • Open-source models mean anyone can run them locally (no centralized monitoring)

What I Think We Need

1. Red-Teaming for Biotech AI

Just like we have penetration testing in cybersecurity, we need “biological red teams” that:

  • Try to generate dangerous molecules using public AI tools
  • Document what’s possible with what resources
  • Inform policy BEFORE someone weaponizes this

2. Synthetic Biology Security Standards

Similar to responsible AI deployment frameworks, but for biology:

  • Mandatory safety reviews before model release
  • Tiered access based on lab certifications
  • International agreements (like nuclear non-proliferation, but for AI biotech)

3. Detection, Not Just Prevention

We can’t prevent all misuse, so we need:

  • Environmental monitoring for novel pathogens (early warning systems)
  • Forensic tools to trace engineered organisms back to their source
  • Rapid-response therapeutic design (fight AI-designed threats with AI-designed countermeasures)

The Uncomfortable Truth

The same AI that could cure cancer could create the next pandemic. And unlike nuclear weapons (which require enriched uranium and specialized facilities), biology is:

  • Easier to hide
  • Cheaper to produce
  • Self-replicating once released

I’m not saying we should stop AI drug discovery. The medical benefits are too important. But the lack of security discussion at the LA Tech Week panel is concerning.

@cto_michelle - how do you think about threat modeling for life sciences AI in your organization?

@data_rachel @alex_dev - would your companies even know if an employee was using internal AI tools to design something dangerous?

This isn’t hypothetical. We’re building the tools right now.

@dr_rachel_bio This discussion has been incredible - from validation frameworks to infrastructure to biosecurity. As someone who’s had to make build-vs-buy decisions for AI, the strategic implications of AI-native biotech are fascinating and complex.

The Leadership Perspective

Build vs Buy is Even Harder in Biotech

In traditional software:

  • Build: Hire ML engineers, train models
  • Buy: Use OpenAI API, Anthropic, etc.

In biotech AI:

  • Build: Hire computational biologists + wet lab scientists + ML engineers + regulatory experts + biosecurity specialists
  • Buy: License AlphaFold derivatives? Partner with Isomorphic Labs? But then you’re dependent on vendors for life-saving discoveries

The talent requirements are insane. You need people who understand biology AND deep learning AND can work with experimental scientists. These folks are rare and expensive.

Organizational Readiness

@alex_dev’s point about lab integration is huge. From a leadership perspective:

Culture clash: Wet lab scientists work on 6-month experiment cycles. ML engineers deploy daily. How do you bridge that?

Risk tolerance: In software, we can A/B test and iterate. In drug discovery, one mistake could kill patients. The organizational pressure is different.

Funding timeline: VCs expect software companies to scale fast. Drug discovery takes YEARS even with AI acceleration. How do you manage investor expectations?

Responding to @security_sam’s Question

You asked about threat modeling for life sciences AI. Here’s how I’d approach it:

1. Internal Access Controls

  • Role-based access to generative biology models (not everyone needs access)
  • Audit logs for all model queries (who designed what molecule, when - see the sketch after this list)
  • Two-person integrity for high-risk predictions (peer review for dangerous molecules)

2. External Partnerships

  • Vet all research collaborations for biosecurity expertise
  • Include biosecurity experts on scientific advisory boards
  • Partner with DNA synthesis screening companies (prevent weaponized sequences from being made)

3. Responsible Deployment

  • Staged rollout: Internal-only → Trusted partners → Public API
  • Rate limiting and query monitoring (similar to how OpenAI detects jailbreaks)
  • Publish safety frameworks ALONGSIDE model releases
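To make the access-control bullets in (1) concrete, here’s a minimal sketch of an audit-logged query gate. The roles, field names, and logger are illustrative assumptions, not a description of any real system.

```python
import json
import logging
from datetime import datetime, timezone

audit_log = logging.getLogger("genbio.audit")
ALLOWED_ROLES = {"comp-bio", "med-chem"}   # illustrative role names

def gated_model_query(user, role, target, run_model):
    """Log every generative-biology query and enforce role-based access."""
    audit_log.info(json.dumps({
        "user": user,
        "role": role,
        "target": target,
        "timestamp": datetime.now(timezone.utc).isoformat(),
    }))
    if role not in ALLOWED_ROLES:
        raise PermissionError(f"{role} is not cleared to run generative designs")
    return run_model(target)   # the actual model call is a placeholder
```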

But honestly? @security_sam is right that this is incredibly hard. If an employee wanted to misuse internal tools, detection is reactive. We need industry-wide standards.

The Strategic Bet

Here’s the uncomfortable calculus for any CTO/CEO in biotech:

Moving too fast = Risk deploying unsafe/ineffective drugs, potential biosecurity incidents, regulatory backlash

Moving too slow = Competitors get to market first, patients die waiting for treatments, investors lose patience

With AI-native biotech, the timeline compression is real:

  • Drug discovery: 10 years → 3 years
  • But FDA approval processes haven’t changed
  • And biosecurity frameworks don’t exist yet

So you have this weird situation where the SCIENCE is racing ahead but the SYSTEMS (regulatory, safety, security) are catching up.

What I’d Want to Know

@dr_rachel_bio - a few leadership questions:

1. Regulatory Strategy
How is the FDA actually handling AI-designed drugs? Are they creating new evaluation frameworks? Or forcing AI designs through traditional validation (which negates the speed advantage)?

2. Talent Retention
Your best computational biologists could go work at Isomorphic Labs for equity at a freshly $600M-funded company. How do you keep them when you’re a scrappy startup?

3. Long-term vs Short-term
Are you optimizing for:

  • Fast wins (incremental improvements on existing drugs)
  • Moonshots (totally novel therapeutic modalities)
  • Both?

Because the organizational structure and funding model are totally different for each.

Final Thought

This is the most exciting AND terrifying frontier in AI. Unlike LLMs (where the worst case is bad advice or misinformation), biotech AI has life-or-death consequences - both positive (curing diseases) and negative (bioweapons, failed drugs).

Companies that get this right will save millions of lives. Companies that get it wrong could cause catastrophic harm.

The pressure on leadership teams in AI-native biotech is IMMENSE. Respect to everyone building in this space responsibly.

WOW. This thread has become exactly the kind of cross-disciplinary conversation we NEED but rarely have. Let me respond to everyone’s thoughtful points:

@data_rachel - Data Science & Validation

Training Data Quality:
You nailed it. We ARE incorporating failed experiments, but it’s messy. Most labs don’t publish negative results, so we’re building our own proprietary dataset of “things that didn’t work.” It’s tedious but critical.

Failure Modes (the 60%):
Breakdown:

  • ~25% synthetically challenging (theoretically possible, practically expensive/slow)
  • ~20% binding prediction failures (protein flexibility issues AlphaFold doesn’t fully capture)
  • ~10% toxicity/ADME issues (absorption, distribution, metabolism, excretion)
  • ~5% just weird AI hallucinations (chemically implausible)

Active Learning:
YES! We’re doing exactly this. Every Friday we review the week’s wet lab results and retrain. Our hit rate has improved from 28% → 40% in 6 months using this loop.

Exploration vs Exploitation:
This is the hardest tradeoff. Right now we’re 70% exploitation (refining known scaffolds) / 30% exploration (wild ideas). The exploration stuff fails more but has higher ceiling.

Uncertainty Estimates:
We use ensemble predictions (5 model variants) and only pursue molecules where all 5 agree within a threshold. Still imperfect but better than single-model confidence.
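Roughly, in pseudocode, the filter looks like this (variant count, thresholds, and the `score()` method are illustrative, not our production values):

```python
import statistics

def confident_enough(molecule, models, max_spread=0.1, score_floor=0.7):
    """Only spend wet-lab resources when the whole ensemble agrees.

    models is a list of independently trained variants whose score()
    returns a normalized prediction in [0, 1]; thresholds are illustrative.
    """
    scores = [m.score(molecule) for m in models]
    return (max(scores) - min(scores)) <= max_spread and statistics.mean(scores) >= score_floor
```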


@alex_dev - Infrastructure Reality

Compute:
We’re hybrid:

  • Cloud (AWS with GPU instances) for model training
  • On-prem cluster for routine predictions (too expensive to run 1000s of predictions in cloud)
  • Storage: ~40TB and growing fast (molecular dynamics trajectories are HUGE)

Latency:

  • AlphaFold3 structure prediction: 30min - 4 hours depending on protein size
  • Generative model suggestions: Minutes
  • Full molecular dynamics validation: 1-3 days

So yes, researchers are literally waiting days sometimes.

Lab Integration:
You’re RIGHT that this is harder than the AI. We built a custom LIMS integration that:

  • Auto-logs AI predictions
  • Links predictions to experimental results
  • Tracks which model version generated which prediction

Manual data entry is still required for nuanced experimental observations. It’s clunky.
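If it helps to picture it, each prediction gets a record roughly shaped like the sketch below - field names are simplified for illustration, not our real LIMS schema.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Optional

@dataclass
class PredictionRecord:
    """One AI suggestion, linked to whatever the wet lab eventually observes."""
    prediction_id: str
    model_version: str                    # which model produced it, e.g. "v2.3"
    molecule_smiles: str                  # the suggested structure
    predicted_affinity: float
    created_at: datetime = field(default_factory=lambda: datetime.now(timezone.utc))
    experiment_id: Optional[str] = None   # filled in once the lab picks it up
    outcome: Optional[str] = None         # "hit", "no-binding", "synthesis-failed", ...
```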

Deployment Strategy:
We version-lock experiments. If you’re validating predictions from model v2.3, you KEEP using v2.3 until that batch completes. New experiments use the latest model. It’s the only sane approach.
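In pseudocode the rule is tiny - the version strings and batch IDs below are made up, but the logic is the whole policy:

```python
# Illustrative sketch of version-locking: in-flight batches stay pinned to the
# model that generated their predictions; only new work gets the latest model.
ACTIVE_MODEL = "v2.4"

def model_for_batch(batch):
    """Return the model version this batch must keep using."""
    return batch.get("pinned_model", ACTIVE_MODEL)

in_flight = {"id": "B-0117", "pinned_model": "v2.3"}
fresh = {"id": "B-0142"}

assert model_for_batch(in_flight) == "v2.3"   # keep validating against v2.3
assert model_for_batch(fresh) == "v2.4"       # new experiments use the latest
```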

“AlphaFold Fatigue”:
THIS IS REAL. We’ve had junior researchers blindly trust predictions that violated basic chemistry. Now we require senior scientist sign-off on any “weird” suggestions.

On-call:
Laughing/crying at this question because YES, we have on-call for the compute infrastructure. Our “SLA” is informal: <24hr for routine predictions, <4hr for urgent (clinical trial deadlines).


@security_sam - Biosecurity Concerns

This is the part of the panel that SHOULD have been 30 minutes but got 5.

You’re absolutely right about the dual-use problem. Our internal protocols:

  1. Pre-screening: Before designing anything, we ask “could this be weaponized?” If yes, extra scrutiny.
  2. Two-person rule: High-risk molecule designs require approval from me + our CSO.
  3. No autonomous synthesis: AI can SUGGEST molecules, but a human reviews before any synthesis happens.
  4. Partnership with SecureDNA: We screen all sequences through their platform before ordering synthesis.
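Stitched together, the gate looks roughly like the sketch below. `screen_sequence` is a placeholder for the external screening step, NOT SecureDNA’s actual API, and the field names are illustrative.

```python
# Hypothetical synthesis gate combining the protocols above.
def approve_for_synthesis(design, reviewers, screen_sequence):
    """A design only reaches a synthesis order after human review and screening.

    design is assumed to carry a dual-use flag and a sequence; reviewers is
    the set of humans who signed off; screen_sequence is a placeholder for
    the external screening call.
    """
    if design["flagged_dual_use"] and len(reviewers) < 2:
        return False            # two-person rule for anything high-risk
    if not screen_sequence(design["sequence"]):
        return False            # failed external sequence screening
    return len(reviewers) >= 1  # no autonomous synthesis: a human always approves
```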

But you’re right - this only works if EVERYONE does it. Open-source models with no guardrails are genuinely concerning.

Red-teaming: We should absolutely be doing this industry-wide. I’d support mandatory red-teaming before any generative biology model is released publicly.

Detection: Your point about environmental monitoring is key. We need a “biosecurity SIEM” - continuous monitoring for novel pathogens. This doesn’t exist yet at scale.


@cto_michelle - Leadership & Strategy

Build vs Buy:
We’re doing hybrid:

  • Built: Custom generative models for our specific therapeutic area (our secret sauce)
  • Bought: AlphaFold as a service, cloud infrastructure, some assay automation

Talent retention is BRUTAL. We lost 2 computational biologists to Isomorphic last year. Our retention strategy:

  • Equity with long vesting (golden handcuffs)
  • Publish-or-perish culture (researchers want papers, not just products)
  • Mission-driven focus (we’re working on rare diseases, which attracts people who care about impact)

Regulatory Strategy:
The FDA is… evolving. Right now:

  • They treat AI-designed drugs like ANY drug - same validation requirements
  • We’re working with them to develop “AI explainability” frameworks
  • Timeline advantage comes from DISCOVERY speed, but approval timeline is unchanged

So we get molecules faster, but still wait years for approval. The real win is in success rate (fewer failed trials).

Long-term vs Short-term:
Both! We have:

  • “Portfolio A”: Incremental improvements on existing drugs (3-5 year timeline, funds the company)
  • “Portfolio B”: Moonshots on undruggable targets (7-10 years, venture upside)

This keeps investors happy AND lets us do ambitious science.

Culture Clash:
Wet lab vs ML engineers is real. We do:

  • Cross-training (ML engineers spend time in the lab, biologists learn Python)
  • Joint standups (yes, standups for 6-month experiments - it’s weird but works)
  • Shared metrics (everyone cares about validated hit rate, not just model accuracy)

Final Reflections

This discussion proves why we need MORE of these conversations, not fewer. The LA Tech Week panel was exciting but surface-level. The real work happens when:

  • Data scientists understand biological constraints
  • Engineers grasp the stakes of life-sciences infrastructure
  • Security experts engage with scientific research EARLY
  • Leadership balances speed with safety

AI-native biotech will transform medicine. But only if we:

  1. Build rigorous validation frameworks
  2. Solve the infrastructure challenges
  3. Take biosecurity seriously from day one
  4. Create organizational structures that bridge disciplines

The fact that this forum discussion is more substantive than the panel itself says something. We need to keep having hard conversations about the HOW, not just celebrating the WHAT.

Thank you all for engaging so thoughtfully. This is exactly what the field needs.

P.S. - If anyone wants to continue this conversation, I’m happy to do a virtual roundtable or contribute to white papers on AI biotech safety/ethics. DM me.