Enterprise Workflow Agents

· 3 min read

Key Themes and Context

Enterprise Workflows

  • Automation levels range from scripted workflows (minimal variation) to agentic workflows (adaptive and dynamic).
  • Enterprise environments, such as those supported by ServiceNow, involve complex, repetitive tasks like IT management, CRM updates, and scheduling.
  • The adoption of LLM-powered agents (e.g., API agents and Web agents) transforms these workflows by leveraging capabilities like multimodal observations and dynamic actions.

LLM Agents for Enterprise Workflows

  • API Agents
    • Utilize structured API calls for efficiency.
    • Pros: Low latency, structured inputs.
    • Cons: Depend on predefined APIs, limited adaptability.
  • Web Agents
    • Simulate human actions on web interfaces.
    • Pros: Greater flexibility; can interact with dynamic UIs.
    • Cons: High latency, error-prone.

WorkArena Framework

  • Benchmarks designed for realistic enterprise workflows.
  • Tasks range from IT inventory management to budget allocation and employee offboarding.
  • Supported by BrowserGym and AgentLab for testing and evaluation in simulated environments.

Technical Frameworks

Agent Architectures

  • TapeAgents Framework

    • Represents agents as resumable modular state machines.
    • Features structured logs (the "tape") for actions, thoughts, and outcomes.
    • Facilitates optimization (e.g., fine-tuning student agents on tapes produced by teacher agents).
  • WorkArena++

    • Extends WorkArena with more compositional and challenging tasks.
    • Evaluates agents on capabilities like long-term planning and multimodal data integration.
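
As a rough illustration of the "tape" idea in the TapeAgents bullets above, the sketch below models a tape as an append-only log of typed steps and an agent as a function that resumes from whatever the tape already contains. This is not the TapeAgents API: `Step`, `Tape`, `next_step`, and `run` are hypothetical names chosen for illustration only.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(frozen=True)
class Step:
    kind: str      # e.g. "thought" or "action"
    content: str


@dataclass
class Tape:
    steps: List[Step] = field(default_factory=list)

    def append(self, step: Step) -> "Tape":
        # Return a new tape so earlier snapshots stay intact for replay.
        return Tape(self.steps + [step])


def next_step(tape: Tape) -> Optional[Step]:
    """Toy state machine: emit a thought, then an action, then stop."""
    kinds = [s.kind for s in tape.steps]
    if "thought" not in kinds:
        return Step("thought", "I should look up the ticket.")
    if "action" not in kinds:
        return Step("action", "open_ticket(id=42)")
    return None  # nothing left to do


def run(tape: Tape) -> Tape:
    # Resume from whatever is already recorded on the tape.
    while (step := next_step(tape)) is not None:
        tape = tape.append(step)
    return tape


fresh = run(Tape())
resumed = run(Tape([Step("thought", "I should look up the ticket.")]))
print([s.kind for s in fresh.steps])
```

Because all state lives on the tape, a run started from scratch and a run resumed from a partial tape end in the same place, which is what makes the log reusable as training data.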

Benchmarks

  • WorkArena: ~20k unique enterprise task instances.
  • WorkArena++: Focused on compositional workflows and data-driven reasoning.
  • Other tools: MiniWoB, WebLINX, VisualWebArena.

Evaluation Metrics

  • GREADTH (Grounded, Responsive, Accurate, Disciplined, Transparent, Helpful):
    • Prioritizes real-world agent performance metrics.
  • Task-Specific Success Rates:
    • For example, a form-filling assistant built on fine-tuned student models was evaluated at 300x lower cost than GPT-4.

Challenges for Agents in Workflows

  • Context Understanding
    • Enterprise tasks require understanding deep hierarchies of information (e.g., dashboards, KBs).
    • Sparse rewards in benchmarks complicate learning.
  • Long-Term Planning
    • Subgoal decomposition and multi-step task execution remain difficult.
  • Safety and Alignment
    • Risks from malicious inputs (e.g., adversarial prompts, hidden text).
  • Cost and Efficiency
    • Shrinking context windows and modular architectures are key to reducing compute costs.

Future Directions

Augmentation Models

  • Centaur Framework:
    • Separates AI tasks from human tasks (e.g., content gathering by AI, final editing by humans).
  • Cyborg Framework:
    • Promotes tight collaboration between AI and humans.

Unified Evaluation

  • Calls for a meta-benchmark to consolidate evaluation protocols across platforms (e.g., WebLINX, WorkArena).

Advancements in Agent Optimization

  • Leveraging RL-inspired techniques for fine-tuning.
  • Modular learning frameworks to improve generalizability.

Opportunities in Knowledge Work

  • Automation of repetitive, low-value tasks (e.g., scheduling, report generation).
  • Integration of multimodal agents into enterprise environments to support decision-making and strategic tasks.
  • Enhanced productivity through human-AI collaboration models.

This synthesis connects the theoretical and practical elements of enterprise workflow agents, showcasing their transformative potential while addressing current limitations.

Agents for Software Development

· 2 min read

Software’s Impact

  • Software is transforming industries, as predicted by Marc Andreessen (2011).
  • Potential impact of enabling everyone to write software to achieve their goals.

Software Development Workflow

  • Time allocation:
    • 17% Coding
    • 36% Bugfixing
    • 10% Testing
    • 8% Documentation/Reviews
    • 14% Communication
    • 15% Other tasks

Development Tools

  • Copilots:
    • Synchronous support for writing code (e.g., GitHub Copilot).
  • Development Agents:
    • Autonomous tools for coding (e.g., SWE-Agent, Aider) and broader tasks (e.g., Devin, OpenHands).

Challenges in Coding Agents

  • Defining the environment.
  • Designing observation/action spaces.
  • File localization and code generation.
  • Planning, error recovery, and ensuring safety.

Software Development Environments

  • Actual Environments:
    • Source repositories, task management software, office tools, communication tools.
  • Testing Environments:
    • Focused on coding, sometimes includes browsing tasks.

Metrics and Datasets

  • Pass@K (Chen et al., 2021): Measures success rates of generated code passing unit tests.
  • Semantic Overlap Metrics:
    • BLEU, CodeBLEU, CodeBERTScore.
  • Key Datasets:
    • HumanEval, ARCADE, SWE-Bench, Design2Code.
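
To make the Pass@K bullet above concrete: Chen et al. (2021) estimate it by drawing n samples per problem, counting the c samples that pass the unit tests, and computing 1 - C(n-c, k) / C(n, k). A minimal sketch of that estimator follows; the function name `pass_at_k` and the example numbers are illustrative.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimate: 1 - C(n - c, k) / C(n, k).

    n: samples generated for the problem
    c: samples that passed the unit tests
    k: number of attempts the metric allows
    """
    if not 0 <= c <= n or not 1 <= k <= n:
        raise ValueError("need 0 <= c <= n and 1 <= k <= n")
    # math.comb returns 0 when k > n - c, so the estimate becomes 1.0.
    return 1.0 - comb(n - c, k) / comb(n, k)


# 10 samples, 3 of which pass the tests.
print(round(pass_at_k(10, 3, 1), 4))
print(round(pass_at_k(10, 3, 5), 4))
```

The benchmark score is the mean of this per-problem value over all problems.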

Solutions for File Localization

  1. User Input: Relies on experienced users to specify files.
  2. Search Tools: Integrated search capabilities (e.g., SWE-Agent).
  3. Repository Mapping: Prebuilt maps (e.g., Aider repomap).
  4. Retrieval-Augmented Generation: Combine retrieved code and LMs.
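
The search-tool and retrieval options above share one core step: scoring files against a description of the task. The sketch below is a deliberately naive keyword counter over an in-memory repository, not how SWE-Agent or Aider implement it; `localize` and the toy `repo` contents are assumptions made for illustration.

```python
from typing import Dict, List, Tuple


def localize(files: Dict[str, str], query: str, top_n: int = 3) -> List[Tuple[str, int]]:
    """Rank files by how many times the query terms appear in their text."""
    terms = [t.lower() for t in query.split()]
    scored = []
    for path, text in files.items():
        lowered = text.lower()
        score = sum(lowered.count(t) for t in terms)
        if score > 0:
            scored.append((path, score))
    # Highest score first; ties broken by path for stable output.
    scored.sort(key=lambda item: (-item[1], item[0]))
    return scored[:top_n]


# Toy repository: path -> file contents.
repo = {
    "auth/login.py": "def login(user): check_password(user)",
    "auth/password.py": "def check_password(user): ... # password hashing",
    "README.md": "Project overview",
}
hits = localize(repo, "password hashing")
print(hits)
```

A real system would replace the substring count with an index, embeddings, or a repository map, but the interface (query in, ranked file paths out) stays the same.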

Planning and Recovery

  • Hard-coded Processes: Predefined steps for file localization, patch generation, etc.
  • LLM-Generated Plans: Use LMs for planning and execution (e.g., CodeR).
  • Revisiting Errors: Automated fixes based on error messages (e.g., InterCode).
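
The "revisiting errors" pattern above can be sketched as a loop that runs the generated code and passes any error message back into the next generation attempt. This is a minimal illustration, not InterCode's implementation: `fake_model` stands in for an LM call, and the bare `exec` used here would need to be replaced by a sandboxed runner in practice.

```python
from typing import Callable, Optional, Tuple


def run_candidate(source: str) -> Optional[str]:
    """Execute candidate code; return an error message, or None on success."""
    try:
        exec(source, {})  # illustration only: real agents should sandbox this
    except Exception as exc:
        return f"{type(exc).__name__}: {exc}"
    return None


def repair_loop(generate: Callable[[Optional[str]], str],
                max_attempts: int = 3) -> Tuple[Optional[str], int]:
    """Ask `generate` for code, retrying with the last error as feedback."""
    error: Optional[str] = None
    for attempt in range(1, max_attempts + 1):
        source = generate(error)
        error = run_candidate(source)
        if error is None:
            return source, attempt
    return None, max_attempts


def fake_model(error: Optional[str]) -> str:
    # Stand-in for an LM: the first attempt has a bug, the retry fixes it.
    if error is None:
        return "result = 1 / 0"
    return "result = 1 / 1"


fixed, attempts = repair_loop(fake_model)
print(fixed, attempts)
```

Capping the number of attempts keeps a model that never converges from looping forever.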

Safety Measures

  1. Sandboxing: Limit execution environments (e.g., Docker).
  2. Credentialing: Principle of least privilege.
  3. Post-hoc Auditing: Security analysis using LMs and other tools.

Future Directions

  • Enhance agentic training methods.
  • Expand human-in-the-loop approaches.
  • Address broader software tasks beyond coding.

Resources

  • OpenHands Repository: GitHub

Agentic AI Frameworks

· 2 min read

Introduction

  • Two kinds of AI applications:

    • Generative AI: Creates content like text and images.
    • Agentic AI: Performs complex tasks autonomously. This is the future.
  • Key Question: How can developers make these systems easier to build?

Agentic AI Frameworks

  • Examples:

    • Applications include personal assistants, autonomous robots, gaming agents, web/software agents, science, healthcare, and supply chains.
  • Core Benefits:

    • User-Friendly: Natural and intuitive interactions with minimal input.
    • High Capability: Handles complex tasks efficiently.
    • Programmability: Modular and maintainable, encouraging experimentation.
  • Design Principles:

    • Unified abstractions integrating models, tools, and human interaction.
    • Support for dynamic workflows, collaboration, and automation.

AutoGen Framework

https://github.com/microsoft/autogen

  • Purpose: A framework for building agentic AI applications.

  • Key Features:

    • Conversable and Customizable Agents: Simplifies building applications with natural language interactions.
    • Nested Chat: Handles complex workflows like content creation and reasoning-intensive tasks.
    • Group Chat: Supports collaborative task-solving with multiple agents.
  • History:

    • Started inside FLAML (2022) and became a standalone project (2023); it now has over 200K monthly downloads and widespread adoption.

Applications and Examples

  • Advanced Reflection:
    • Two-agent systems for collaborative refinement of tasks like blog writing.
  • Gaming and Strategy:
    • Conversational Chess, where agents simulate strategic reasoning.
  • Enterprise and Research:
    • Applications in supply chains, healthcare, and scientific discovery, such as ChemCrow for discovering novel compounds.
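
The two-agent reflection pattern listed above can be sketched without any framework: a writer produces a draft, a critic returns feedback, and the loop repeats until the critic approves. The code below does not use the AutoGen API; `reflect`, `toy_writer`, and `toy_critic` are hypothetical stand-ins for LLM-backed agents.

```python
from typing import Callable, List, Tuple


def reflect(writer: Callable[[str, str], str],
            critic: Callable[[str], str],
            task: str,
            max_rounds: int = 3) -> Tuple[str, List[str]]:
    """Alternate writer and critic until the critic approves or rounds run out."""
    feedback = ""
    history: List[str] = []
    draft = ""
    for _ in range(max_rounds):
        draft = writer(task, feedback)
        feedback = critic(draft)
        history.append(feedback)
        if feedback == "APPROVE":
            break
    return draft, history


def toy_writer(task: str, feedback: str) -> str:
    # Stand-in for an LLM writer that reacts to the critic's last message.
    draft = f"Post about {task}."
    if "example" in feedback:
        draft += " Example: a two-agent review loop."
    return draft


def toy_critic(draft: str) -> str:
    # Stand-in for an LLM critic with a single acceptance rule.
    return "APPROVE" if "Example" in draft else "Please add an example."


final, log = reflect(toy_writer, toy_critic, "agent frameworks")
print(final)
print(log)
```

In a real framework each callable would be a conversable agent, and the round limit plays the same role as a maximum-turns setting.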

Core Components of AutoGen

  • Agentic Programming:
    • Divides tasks into manageable steps for easier scaling and validation.
  • Multi-Agent Orchestration:
    • Supports dynamic workflows with centralized or decentralized setups.
  • Agentic Design Patterns:
    • Covers reasoning, planning, tool integration, and memory management.

Challenges in Agent Design

  • System Design:
    • Optimizing multi-agent systems for reasoning, planning, and diverse applications.
  • Performance:
    • Balancing quality, cost, and scalability while maintaining resilience.
  • Human-AI Collaboration:
    • Designing systems for safe, effective human interaction.

Open Questions and Future Directions

  • Multi-Agent Topologies:
    • Efficiently balancing centralized and decentralized systems.
  • Teaching and Optimization:
    • Enabling agents to learn autonomously using tools like AgentOptimizer.
  • Expanding Applications:
    • Exploring new domains such as software engineering and cross-modal systems.

History and Future of LLM Agents

· 2 min read

Trajectory and potential of LLM agents

Introduction

  • Definition of Agents: Intelligent systems interacting with environments (physical, digital, or human).
  • Evolution: From symbolic AI agents like ELIZA (1966) to modern LLM-based reasoning agents.

Core Concepts

  1. Agent Types:
    • Text Agents: Rule-based systems like ELIZA (1966), limited in scope.
    • LLM Agents: Utilize large language models for versatile text-based interaction.
    • Reasoning Agents: Combine reasoning and acting, enabling decision-making across domains.
  2. Agent Goals:
    • Perform tasks like question answering (QA), game-solving, or real-world automation.
    • Balance reasoning (internal actions) and acting (external feedback).

Key Developments in LLM Agents

  1. Reasoning Approaches:
    • Chain-of-Thought (CoT): Step-by-step reasoning to improve accuracy.
    • ReAct Paradigm: Integrates reasoning with actions for systematic exploration and feedback.
  2. Technological Milestones:
    • Zero-shot and Few-shot Learning: Achieving generality with minimal examples.
    • Memory Integration: Combining short-term (context-based) and long-term memory for persistent learning.
  3. Tools and Applications:
    • Code Augmentation: Enhancing computational reasoning through programmatic methods.
    • Retrieval-Augmented Generation (RAG): Leveraging external knowledge sources like APIs or search engines.
    • Complex Task Automation: Embodied reasoning in robotics and chemistry, exemplified by ChemCrow.
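
The ReAct paradigm above interleaves a thought, an action, and the resulting observation on every turn. The sketch below shows only that control flow: `scripted_policy` is a hard-coded stand-in for an LLM, and the `lookup` tool and its one-entry table are assumptions for illustration.

```python
from typing import Callable, Dict, List, Tuple

# A policy maps the transcript so far to (thought, action_name, action_input).
Policy = Callable[[List[str]], Tuple[str, str, str]]

OBS_PREFIX = "Observation: "


def react(policy: Policy, tools: Dict[str, Callable[[str], str]],
          question: str, max_steps: int = 5) -> Tuple[str, List[str]]:
    """Interleave thoughts, tool calls, and observations until 'finish'."""
    transcript = [f"Question: {question}"]
    for _ in range(max_steps):
        thought, action, arg = policy(transcript)
        transcript.append(f"Thought: {thought}")
        transcript.append(f"Action: {action}[{arg}]")
        if action == "finish":
            return arg, transcript
        # The observation is external feedback the next thought can use.
        transcript.append(OBS_PREFIX + tools[action](arg))
    return "", transcript  # step budget exhausted without an answer


def scripted_policy(transcript: List[str]) -> Tuple[str, str, str]:
    # Stand-in for an LLM: look something up once, then answer from it.
    observations = [t for t in transcript if t.startswith(OBS_PREFIX)]
    if not observations:
        return ("I need the release year.", "lookup", "ELIZA")
    year = observations[-1][len(OBS_PREFIX):]
    return (f"The lookup says {year}.", "finish", year)


tools = {"lookup": lambda key: {"ELIZA": "1966"}.get(key, "unknown")}
answer, trace = react(scripted_policy, tools, "When was ELIZA created?")
print(answer)
```

Plain chain-of-thought would correspond to the same loop with no tools: only thoughts, followed by a final answer.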

Limitations

  • Practical Challenges:
    • Difficulty in handling real-world environments (e.g., decision-making with incomplete data).
    • Vulnerability to irrelevant or adversarial context.
  • Scalability Issues:
    • Real-world robotics vs. digital simulation trade-offs.
    • High costs of fine-tuning and data collection in specific domains.

Research Directions

  • Unified Solutions: Simplifying diverse tasks into generalizable frameworks (e.g., ReAct for exploration and decision-making).
  • Advanced Memory Architectures: Moving from append-only logs to adaptive, writeable long-term memory systems.
  • Collaboration with Humans: Focusing on augmenting human creativity and problem-solving capabilities.

Future Outlook

  • Emerging Benchmarks:
    • SWE-Bench for software engineering tasks.
    • FireAct for fine-tuning LLM agents in dynamic environments.
  • Broader Impacts:
    • Enhanced digital automation.
    • Scalable solutions for complex problem-solving in domains like software engineering, scientific discovery, and web automation.

Building an AI-Native Publishing System: The Evolution of TianPan.co

· 3 min read

The story of TianPan.co mirrors the evolution of web publishing itself - from simple HTML pages to today's AI-augmented content platforms. As we launch version 3, I want to share how we're reimagining what a modern publishing platform can be in the age of AI.

The Journey: From WordPress to AI-Native

Like many technical blogs, TianPan.co started humbly in 2009 as a WordPress site on a free VPS. The early days were simple: write, publish, repeat. But as technology evolved, so did our needs. Version 1 moved to Octopress and GitHub, embracing the developer-friendly approach of treating content as code. Version 2 brought modern web technologies with GraphQL, server-side rendering, and a React Native mobile app.

But the landscape has changed dramatically. AI isn't just a buzzword - it's transforming how we create, organize, and share knowledge. This realization led to Version 3, built around a radical idea: what if we designed a publishing system with AI at its core, not just as an add-on?

The Architecture of an AI-Native Platform

Version 3 breaks from traditional blogging platforms in several fundamental ways:

  1. Content as Data: Every piece of content is stored as markdown, making it instantly processable by AI systems. This isn't just about machine readability - it's about enabling AI to become an active participant in the content lifecycle.

  2. Distributed Publishing, Centralized Management: Content flows automatically from our central repository to multiple channels - Telegram, Discord, Twitter, and more. But unlike traditional multi-channel publishing, AI helps maintain consistency and optimize for each platform.

  3. Infrastructure Evolution: We moved from a basic 1 CPU/1GB RAM setup to a more robust infrastructure, not just for reliability but to support AI-powered features like real-time content analysis and automated editing.

The technical architecture reflects this AI-first approach:

.
├── _inbox # AI-monitored draft space
├── notes # published English notes
├── notes-zh # published Chinese notes
├── crm # personal CRM
├── ledger # my beancount.io ledger
├── packages
│ ├── chat-tianpan # LlamaIndex-powered content interface
│ ├── website # tianpan.co source code
│ ├── prompts # AI system prompts
│ └── scripts # AI processing pipeline

Beyond Publishing: An Integrated Knowledge System

What makes Version 3 unique is how it integrates multiple knowledge streams:

  • Personal CRM: Relationship management through AI-enhanced note-taking
  • Financial Tracking: Integrated ledger system via beancount.io
  • Multilingual Support: Automated translation and localization
  • Interactive Learning: AI-powered chat interface for deep diving into content

The workflow is equally transformative:

  1. Content creation starts in markdown
  2. CI/CD pipelines trigger AI processing
  3. Zapier integrations distribute across platforms
  4. AI editors continuously suggest improvements through GitHub issues

Looking Forward: The Future of Technical Publishing

This isn't just about building a better blog - it's about reimagining how we share technical knowledge in an AI-augmented world. The system is designed to evolve, with each component serving as a playground for experimenting with new AI capabilities.

What excites me most isn't just the technical architecture, but the possibilities it opens up. Could AI help surface connections between seemingly unrelated technical concepts? Could it help make complex technical content more accessible to broader audiences? Will it be possible to easily produce multimedia content in the future?

These are the questions we're exploring with TianPan.co v3. It's an experiment in using AI not just as a tool, but as a collaborative partner in creating and sharing knowledge.

15. Physical and Mental Well-Being Fuel Everything Else

· 5 min read

Your time and energy are your most valuable, self-renewing assets. Protect them to sustain an energetic and fulfilling life.

15.1 Make Personal Well-Being a Checklist Priority

Self-care often gets overlooked amid external demands. Combat this by incorporating health habits into a daily or weekly checklist. A checklist offers:

  • Continuous improvement: Track and adapt as your mental and physical state evolves.
  • Proactive health management: Catch minor issues early to prevent chronic conditions.
  • Cognitive ease: Reduce decision fatigue by automating routine care.

For example, treating a daily walk as a checklist item ensures you move regularly, easing your mind into or out of “work mode.”

15.2 Exercise Intentionally Across Five Key Areas

Not all exercise is created equal. Each type serves specific needs for your body. Below is a breakdown of the five primary categories and their benefits:

Category | Examples | Key Benefits
MIIT (Moderate-Intensity Interval Training) | Jogging, cycling, rowing at moderate paces | Improves cardiovascular health; enhances stamina; joint-friendly.
HIIT (High-Intensity Interval Training) | Sprints, burpees, Tabata workouts | Maximizes calorie burn; boosts metabolism; time-efficient.
Strength Training | Free weights, resistance bands, bodyweight exercises | Builds muscle and bone density; enhances functional fitness.
Balance Training | Single-leg stands, yoga poses, Tai Chi | Improves coordination; prevents falls; strengthens core stability.
Flexibility Exercises | Static/dynamic stretches, yoga, foam rolling | Increases range of motion; reduces tension; aids recovery.

Craft a routine that integrates these elements for comprehensive fitness.

15.3 Prioritize Sleep and Nutrition

Sleep

Quality sleep underpins productivity and health. Protect your circadian rhythm with these strategies:

  • Morning light exposure: Spend 20–30 minutes outdoors or use a light therapy box (10,000 Lux) on cloudy days.
  • Limit blue light at night: Reduce screen time and establish a calming bedtime routine.
  • Stick to a schedule: Align wake-up and sleep times for optimal recovery. A person can maintain about 14–16 hours of “relatively efficient wakefulness,” so if you plan to go to bed at midnight, it’s best to get up before 8 a.m.

Nutrition

Adopt a balanced diet aligned with dietary guidelines, emphasizing:

  1. Diverse vegetables (dark greens, red/orange, starchy, legumes).
  2. Whole fruits.
  3. Whole grains over refined grains.
  4. Lean proteins (poultry, seafood, nuts, legumes).
  5. Healthy fats (e.g., Omega-3s).

Avoid high-glycemic foods and consider supplements for essential vitamins and minerals, which are critical to energy levels and mood. For timing, practices like 16:8 intermittent fasting can enhance energy and focus.

15.4 Practice Mindfulness or Meditation to Manage Stress

Mindfulness is about being fully present in the moment, observing without judgment. It:

  • Heightens awareness of emotions and thoughts.
  • Reduces stress by focusing attention on the now.
  • Sharpens clarity and concentration.
  • Improves overall well-being.

Mindfulness can extend beyond meditation into daily activities—whether walking, eating, or working—by fostering deliberate attention.

15.5 Take Breaks to Recharge

Recovery is not optional—you either plan it deliberately or face burnout. Regular breaks restore energy, improve focus, and sustain high performance.

Recovery Principles:

  • Schedule recovery like work: Plan breaks as intentionally as you plan tasks.
  • Match recovery to stress type: Different stresses require different breaks—physical, emotional, or creative.
  • Use varied recovery methods: Combine short breaks (like a walk or quick stretch) with longer recovery periods.

Implementation:

  • Adopt the 52/17 rhythm: Work for 52 minutes, then rest for 17.
  • Protect weekends: Use weekends to disconnect and rejuvenate.
  • Plan quarterly resets: Schedule deep recovery periods to recharge and reflect.

15.6 Create Spaces People Love

Your environment has a profound impact on your behavior, often outweighing willpower. Optimizing your spaces can make good habits easier and bad habits harder.

Implementation:

  • Optimize workspaces for focus: Ensure good lighting, ergonomic furniture, and minimal distractions.
  • Designate areas for different activities: Create separate zones for focused work, relaxation, and creative thinking.
  • Reduce friction for positive habits: Keep tools for productive tasks accessible (e.g., a journal or fitness gear).
  • Increase friction for negative habits: Add barriers to distractions, like keeping your phone in another room.

15.7 Navigate Brain States Intentionally

Your brain operates in three primary states, each suited for specific tasks. Success depends on recognizing these states and transitioning between them effectively.

The Three States:

  1. Relaxed: Ideal for creativity, reflection, and strategic thinking.
  2. Working: Best for focused execution and problem-solving.
  3. Overheated: A counterproductive state where stress reduces effectiveness.

Implementation:

  • Learn your state indicators: Recognize when you’re entering each state (e.g., mental clarity vs. fatigue).
  • Match tasks to states: Reserve deep focus tasks for the working state and creative tasks for the relaxed state.
  • Develop transition rituals: Use activities like a short walk or a breathing exercise to move between states.
  • Avoid overheating: Take breaks when stress builds to prevent burnout.

The 4 Ps of Marketing: Rewritten for the AI Age

· 4 min read

In 2024, Notion reached a $10B valuation. Their success offers a fresh lens on McCarthy's classic 4 Ps of marketing in the AI age. The 4 Ps—Product, Price, Place, and Promotion—remain as relevant as ever. Originally introduced by E. Jerome McCarthy in the 1960s, this framework distills marketing down to its essentials. But in the fast-paced world of startups, where innovation reigns and traditional playbooks are constantly rewritten, how do these pillars apply? Let’s dive into the 4 Ps and explore their modern applications for founders navigating the frontier of tech.

1. Product: Build Obsession, Not Just Utility

In the 1960s, the product was king: make something people need, and you’ll sell. Today, “need” isn’t enough. The most successful tech products create obsession.

Notion didn’t become a $10B company because people needed another productivity tool. They succeeded because they became the default thought space for millions. Their product blends functionality (databases, templates) with delight (customization, aesthetics). In the AI era, personalization becomes the frontier for innovation.

Founders should ask:

  • Does your product evolve with the user’s behavior?
  • How does your product surprise and delight your audience in ways competitors can’t?

Great products today don’t just solve problems—they build ecosystems that users can’t imagine leaving.

2. Price: The Psychology of Free

Price was once about cost-plus margin. Now, it’s a dance of psychology and scalability. While freemium is common in consumer (B2C) SaaS, Notion perfected the model. By making their core product free, they turned users into evangelists, then charged enterprises for features they couldn’t refuse.

The lesson? Pricing isn’t about dollars; it’s about entry points. Your users need to feel they’re getting immense value before they even think of paying. AI products amplify this dynamic because the amortized cost of adding new users is nearly zero, while perceived value skyrockets with network effects.

Founders should ask:

  • Are you lowering the barrier to entry while raising long-term value?
  • Does your pricing strategy encourage viral growth?

3. Place: Everywhere and Nowhere

In McCarthy’s day, “place” was about physical distribution: getting products into stores. Today, place is digital. It’s about being omnipresent without being intrusive.

Notion didn’t rely much on ads. Instead, they mastered organic discovery. Templates and websites created by power users spread like wildfire across social media. The product itself became its own distribution engine.

AI accelerates this trend. With APIs and integrations, place now includes where your product can live in someone else’s ecosystem. Think Slack bots, Shopify plugins, or Zapier automations.

Founders should ask:

  • Are you meeting users where they are, or forcing them to come to you?
  • How does your product seamlessly integrate into other platforms?

4. Promotion: Community Is the New Advertising

Promotion used to mean ad buys and aggressive marketing campaigns. Today, it means community. Notion built a cult following by empowering creators—YouTubers, educators, and small businesses—to showcase the product in their own ways.

In the AI world, promotion shifts from shouting to listening. Community-building means enabling users to shape the narrative. OpenAI’s success with ChatGPT wasn’t just about building a great product—it was about letting users discover use cases the creators hadn’t even imagined.

Founders should ask:

  • Are your users your best promoters?
  • How does your community contribute to your product’s evolution?

Bringing the 4 Ps Together: The AI Playbook

The 4 Ps aren't obsolete relics, but timeless guideposts: together they still cover marketing in its entirety. Notion's rise demonstrates that while marketing's fundamental principles endure, they can be reinterpreted and reimagined for the AI-driven age.

As AI continues to reshape technology, the 4 Ps will evolve further:

  • Products will self-improve based on usage patterns
  • Pricing will become increasingly dynamic and personalized
  • Place will expand to include AI-native environments
  • Promotion will leverage AI to create personalized community experiences

For startups, the challenge is not just preserving core principles, but evolving them for the modern age. Ultimately, successful marketing isn't merely about attracting users—it's about building an ecosystem that resonates with users and grows sustainably over time. This is the key insight modern tech founders must grasp, and the core message we hope to convey through this piece.

The $100M Telemetry Bug: What OpenAI's Outage Teaches Us About System Design

· 3 min read

On December 11, 2024, OpenAI experienced a catastrophic outage that took down ChatGPT, their API, and Sora for over four hours. While outages happen to every company, this one is particularly fascinating because it reveals a critical lesson about modern system design: sometimes the tools we add to prevent failures become the source of failures themselves.

The Billion-Dollar Irony

Here's the fascinating part: The outage wasn't caused by a hack, a failed deployment, or even a bug in their AI models. Instead, it was caused by a tool meant to improve reliability. OpenAI was adding better monitoring to prevent outages when they accidentally created one of their biggest outages ever.

It's like hiring a security guard who accidentally locks everyone out of the building.

The Cascade of Failures

The incident unfolded like this:

  1. OpenAI deployed a new telemetry service to better monitor their systems
  2. This service overwhelmed their Kubernetes control plane with API requests
  3. When the control plane failed, DNS resolution broke
  4. Without DNS, services couldn't find each other
  5. Engineers couldn't fix the problem because they needed the control plane to remove the problematic service

But the most interesting part isn't the failure itself – it's how multiple safety systems failed simultaneously:

  1. Testing didn't catch the issue because it only appeared at scale
  2. DNS caching masked the problem long enough for it to spread everywhere
  3. The very systems needed to fix the problem were the ones that broke

Three Critical Lessons

1. Scale Changes Everything

The telemetry service worked perfectly in testing. The problem only emerged when deployed to clusters with thousands of nodes. This highlights a fundamental challenge in modern system design: some problems only emerge at scale.
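
A back-of-the-envelope model shows why a test cluster can hide this kind of problem. The sketch assumes, purely for illustration, that each node's telemetry agent sends a fixed rate of API requests to a shared control plane with a fixed capacity. Every number below is hypothetical and not taken from the incident.

```python
def control_plane_load(nodes: int, requests_per_node_per_sec: float) -> float:
    """Total API requests per second when every node runs the same agent."""
    return nodes * requests_per_node_per_sec


def is_overloaded(nodes: int, requests_per_node_per_sec: float,
                  capacity_rps: float) -> bool:
    """True when aggregate load exceeds what the control plane can serve."""
    return control_plane_load(nodes, requests_per_node_per_sec) > capacity_rps


# Illustrative assumptions only, not figures from the outage.
CAPACITY_RPS = 5_000.0
PER_NODE_RPS = 2.0

staging = is_overloaded(50, PER_NODE_RPS, CAPACITY_RPS)        # 100 rps
production = is_overloaded(4_000, PER_NODE_RPS, CAPACITY_RPS)  # 8,000 rps
print(staging, production)
```

The per-node behavior is identical in both environments; only the multiplier changes, which is why a functional test on a small cluster passes while the large cluster tips over.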

2. Safety Systems Can Become Risk Factors

OpenAI's DNS caching, meant to improve reliability, actually made the problem worse by masking the issue until it was too late. Their Kubernetes control plane, designed to manage cluster health, became a single point of failure.

3. Recovery Plans Need Recovery Plans

The most damning part? Engineers couldn't fix the problem because they needed working systems to fix the broken systems. It's like needing a ladder to reach the ladder you need.

The Future of System Design

OpenAI's response plan reveals where system design is headed:

  1. Decoupling Critical Systems: They're separating their data plane from their control plane, reducing interdependencies
  2. Improved Testing: They're adding fault injection testing to simulate failures at scale
  3. Break-Glass Procedures: They're building emergency access systems that work even when everything else fails

What This Means for Your Company

Even if you're not operating at OpenAI's scale, the lessons apply:

  1. Test at scale, not just functionality
  2. Build emergency access systems before you need them
  3. Question your safety systems – they might be hiding risks

The future of reliable systems isn't about preventing all failures – it's about ensuring we can recover from them quickly and gracefully.

Remember: The most dangerous problems aren't the ones we can see coming. They're the ones that emerge from the very systems we build to keep us safe.

The Inadequate Equilibrium: How Systems Fail and Where Opportunity Hides

· 4 min read

In 2018, the FDA finally approved fish oil-based nutrition for infants with short bowel syndrome—a treatment that had been saving lives in Europe for decades. The delay wasn’t the result of inept regulators; it was a textbook case of what Eliezer Yudkowsky calls an “inadequate equilibrium”: a stable but suboptimal state where obvious improvements remain unmade. While American infants faced a harrowing 30% mortality rate using soybean-based formulas, European infants, treated with fish oil-based alternatives, saw mortality rates drop to 9%. This stark disparity reveals how even advanced systems can become trapped by inertia.

An inadequate equilibrium arises when no single actor—be it a company, regulator, or individual—has both the incentive and the means to improve the system. Markets, when efficient, tend to eliminate such inefficiencies. But in some domains, systemic constraints entrench failure, creating opportunities for those willing to challenge the status quo.

The Hidden Cost of Systemic Inertia

Take the U.S. healthcare system, where medical errors remain the third leading cause of death, contributing to over 250,000 fatalities annually. Unlike aviation, which reduced accidents by 65% through rigorous error tracking, hospitals rarely track error rates or publish performance metrics. This failure isn’t due to incompetence among healthcare professionals; it’s the product of structural barriers. Hospitals fear litigation, regulatory penalties, and reputational damage, creating a culture of concealment rather than transparency. Inadequate equilibrium: preserved.

Similarly, in cybersecurity, despite rising threats, many organizations continue to rely on outdated practices. Procurement processes, compliance mandates, and sheer organizational inertia create a system where even superior solutions struggle to gain traction. These systemic blind spots—embedded in policy, habit, and culture—lock organizations into suboptimal outcomes.

Lessons from Tech: Breaking Equilibrium

The tech industry, often lauded for its dynamism, isn’t immune to these traps. For decades, programmers endured clunky version control systems. Tools like CVS and Subversion were incremental improvements at best, leaving fundamental inefficiencies unchallenged. Enter Linus Torvalds, an outsider to the version control tooling space, who created Git—not an incremental improvement but a paradigm shift. Git’s distributed model and performance advantages shattered the inadequate equilibrium, demonstrating how bold, outsider-driven innovation can unstick stagnant systems.

A Framework for Spotting Opportunities

Yudkowsky’s concept of inadequate equilibrium offers a lens to identify when systems are ripe for disruption. It hinges on three questions:

  1. Market Efficiency: Does the domain quickly eliminate inefficiencies?
    • Efficient markets, like high-frequency trading, leave little room for obvious opportunities.
    • Inefficient ones, like healthcare, are plagued by opaque pricing and misaligned incentives.
  2. Systemic Constraints: Are there structural barriers preventing improvement?
    • FDA regulations demand costly large-scale studies, deterring solutions like fish oil-based nutrition, even when benefits are already clear.
    • Academic research prioritizes novelty over replication, leaving critical findings unvalidated.
  3. Information Asymmetry: Do you possess insights others lack?
    • Patients often out-research general practitioners on niche conditions.
    • Startups, unburdened by bureaucracy, can outpace incumbents.

Opportunity Beckons

For entrepreneurs and tech leaders, this framework points to actionable strategies:

  • Target domains where systemic constraints limit incumbents.
  • Focus on “good enough” markets that are far from optimal.
  • Seek high-friction problems with low technical barriers.

For example, consider climate tech. Carbon capture is riddled with inadequate equilibria: funding gaps, policy inertia, and entrenched energy interests slow adoption. Yet, those who can bypass these systemic barriers—via modular solutions or unconventional funding models—can transform the landscape.

Breaking the Myth of Impossibility

“Inadequate equilibria” remind us that the reason no one has solved a problem isn’t always technical impossibility—it’s often systemic misalignment. Asking “Why hasn’t someone done this already?” is the wrong question. The right questions are:

  • What incentives sustain the current state?
  • Which barriers can I bypass that others cannot?
  • How can I deliver value without waiting for the system to change?

Consider OpenAI. While academic AI research languished under the weight of grant cycles and publish-or-perish incentives, OpenAI built a moonshot-focused organization that prioritized deployment over papers. By sidestepping traditional academic constraints, they accelerated progress and captured the frontier.

For the Optimist in You

For optimists, inadequate equilibria are more than problems—they’re maps to hidden opportunities. History shows that systems don’t fix themselves; they are fixed by those who see what others overlook and act when others won’t. Whether it’s transforming infant care, rewriting the rules of cybersecurity, or pioneering new tech, the greatest breakthroughs come from understanding not just what’s broken, but why—and daring to fix it.

So the next time you encounter a broken system, don’t dismiss it as an unsolvable mess. Look closer. Somewhere in its constraints lies an opportunity waiting to be seized.

How to Build and Sell Software Effectively

· 4 min read

Building successful software requires both exceptional product development and strategic distribution. Here's a framework for achieving both.

Strategic Foundation

Vision & Mission

  • Vision: Define the future state you aim to create
  • Mission: Outline core actions driving toward that vision
  • Strategic Master Plan: Map key milestones from small wins to major goals

Build Process

Development Approach

  1. Start with a PRFAQ (Press Release/FAQ) for customer alignment
  2. Leverage community feedback to identify pain points
  3. Set aggressive timeframes:
    • Features: 2 weeks max
    • Projects: 1 quarter max

Product Evaluation Framework

Track what we build and how it serves customers using a product map:

| Solution | Use Case | Jobs to Be Done | Score (Quality × Distribution) |
|---|---|---|---|
| [Feature grouping by audience or purpose] | [Specific feature] | [Scenario-driven task or goal] | [Impact assessment] |

For example,

| Solution | Use Case | Jobs to Be Done | Score (Quality × Distribution) |
|---|---|---|---|
| Asset Mobility for Blockchain Users | Bridge | Facilitate seamless transfer of assets across blockchain networks | ... |
| Transparency for Network Participants | Cuckoo Scan (Mainnet) | Provide users and developers with detailed mainnet transaction and block data | ... |
| Transparency for Network Participants | Cuckoo Sepolia Scan (Testnet) | Help developers explore and test in a sandbox environment | ... |

Quality Assessment (Insanely Great Product Framework)

| Dimension | Core Question | 1 (Inadequate) | 3 (Good) | 5 (Insanely Great) |
|---|---|---|---|---|
| Magical Experience | Creates delight? | Frustrating | Pleasant | Users become evangelists |
| Aesthetic Appeal | Thoughtful design? | Cluttered | Clean | Essential, elegant |
| Technical Excellence | Solves complex problems? | Basic | Solid | Makes impossible effortless |
| Ecosystem Fit | Seamless integration? | High friction | Works well | Opens new possibilities |
| Market Impact | Category transformation? | Me-too product | Incremental | Category-defining |

Distribution Assessment (Go-to-Market Framework)

| Dimension | Core Question | 1 (Inadequate) | 3 (Good) | 5 (Insanely Great) |
|---|---|---|---|---|
| Customer Engagement | Attracts/retains? | Poor retention | Moderate loyalty | Brand evangelists |
| Brand Perception | Brand strength? | Unrecognized | Trusted | Iconic |
| Channel Effectiveness | Distribution performance? | Limited reach | Key segments covered | Wide, seamless reach |
| Marketing Innovation | Strategy uniqueness? | Generic | Some uniqueness | Trendsetter |
| Revenue Growth | Sustainable growth? | Minimal growth | Steady growth | Market leader |
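
The two rubrics feed the Score (Quality × Distribution) column of the product map. The sketch below shows one way that arithmetic could work. The post does not say how the five dimension ratings roll up into a single Quality or Distribution number, so averaging them is an assumption here, and the function names and example ratings are hypothetical.

```python
from statistics import mean

# Rubric dimensions copied from the quality and distribution tables.
QUALITY_DIMENSIONS = (
    "Magical Experience",
    "Aesthetic Appeal",
    "Technical Excellence",
    "Ecosystem Fit",
    "Market Impact",
)
DISTRIBUTION_DIMENSIONS = (
    "Customer Engagement",
    "Brand Perception",
    "Channel Effectiveness",
    "Marketing Innovation",
    "Revenue Growth",
)


def rubric_average(ratings: dict[str, int], dimensions: tuple[str, ...]) -> float:
    """Average the 1-5 ratings over all dimensions (averaging is an assumption)."""
    missing = [d for d in dimensions if d not in ratings]
    if missing:
        raise ValueError(f"missing ratings for: {missing}")
    for d in dimensions:
        if not 1 <= ratings[d] <= 5:
            raise ValueError(f"{d} must be rated 1-5, got {ratings[d]}")
    return mean(ratings[d] for d in dimensions)


def product_score(quality: dict[str, int], distribution: dict[str, int]) -> float:
    """Score = Quality x Distribution; with 1-5 averages the result falls in 1-25."""
    return rubric_average(quality, QUALITY_DIMENSIONS) * rubric_average(
        distribution, DISTRIBUTION_DIMENSIONS
    )


# Hypothetical ratings for a single use case, for illustration only.
example_quality = dict(zip(QUALITY_DIMENSIONS, (3, 3, 5, 3, 1)))
example_distribution = dict(zip(DISTRIBUTION_DIMENSIONS, (3, 1, 3, 1, 2)))
example_score = product_score(example_quality, example_distribution)
```

With these hypothetical ratings, quality averages 3 and distribution averages 2, so the score is 6 out of a possible 25. Because the two sides are multiplied, a strong product with weak distribution (or the reverse) still scores low, which matches the premise that both are required.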

Distribution Strategy

Early-Stage Tactics

  • Personalized outreach (cold DMs) - use strategically due to platform risks
  • Content-driven SEO (blogs + tools)
  • Carefully managed affiliate programs
  • Targeted lifecycle emails
  • Supplementary paid advertising

Best Practices

  1. Retention and Referral: Prioritize making the product sticky and easy to recommend
  2. Continuous Feedback: Actively gather and incorporate user input
  3. Platform Selection: Use appropriate tools for each function

Measurement & Iteration

Continuously evaluate and adjust using the tools above:

  1. Evaluate and score initiatives in the product master map
  2. Identify misalignments with vision/mission and user feedback
  3. Prioritize adjustments
  4. Update strategic plan as needed
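
As a rough illustration of steps 1 and 3, the sketch below ranks product map entries by their Quality × Distribution score and notes which side is weaker for each. The `MapEntry` class, the `prioritize` helper, and all entries and numbers are hypothetical. Sorting lowest score first is one possible prioritization rule, not something the framework prescribes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MapEntry:
    """One row of the product map; the values used below are hypothetical."""
    use_case: str
    quality: float       # 1-5, from the quality rubric
    distribution: float  # 1-5, from the distribution rubric

    @property
    def score(self) -> float:
        return self.quality * self.distribution

    @property
    def weaker_side(self) -> str:
        """Name whichever side scores lower, or 'balanced' when they tie."""
        if self.quality < self.distribution:
            return "quality"
        if self.distribution < self.quality:
            return "distribution"
        return "balanced"


def prioritize(entries: list[MapEntry]) -> list[MapEntry]:
    """Order entries from lowest to highest score so the weakest come first."""
    return sorted(entries, key=lambda entry: entry.score)


product_map = [
    MapEntry("Use case A", quality=4, distribution=2),
    MapEntry("Use case B", quality=2, distribution=2),
    MapEntry("Use case C", quality=3, distribution=4),
]

for entry in prioritize(product_map):
    print(f"{entry.use_case}: score={entry.score}, focus={entry.weaker_side}")
```

Here the entries come out in the order B, A, C. Use case A is held back by distribution and use case C by quality, which suggests different kinds of adjustment for each.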

With a centralized map detailing what, where, and how we serve customers—combined with metrics and market feedback—we can navigate iterations more confidently, ensuring every solution is well-managed and improvements are driven by clear, fact-backed insights.