Blog

Page 135

12 articles

Database-Native AI: When Your Postgres Learns to Embed
PostgreSQL extensions like pgvector and pgai now handle embedding generation, vector search, and LLM calls inside the database — eliminating the sync pipeline most RAG architectures carry and keeping vectors transactionally consistent with source data.
postgres, vector-search
Apr 12 · 7 min
The Death of the Glue Engineer: AI Is Absorbing the Work That Holds Systems Together
AI agents are rapidly automating the integration work — ETL pipelines, API adapters, webhook handlers — that glue engineers built careers on. Here's what falls first, what remains human-essential, and how to move up the stack before the implementation layer disappears.
ai-engineering, career
Apr 12 · 11 min
Debug Your AI Agent Like a Distributed System, Not a Program
Print statements and flat logs fail for multi-step AI agents. Structured tracing, deterministic replay, and the replay-diverge-compare methodology bring distributed systems debugging to agent workflows.
ai-agents, debugging
Apr 12 · 9 min
Edge LLM Inference: When Latency, Privacy, or Cost Force You Off the Cloud
A fine-tuned 7B model on one GPU can beat GPT-4 in narrow domains at zero marginal token cost. A practical guide to hardware sizing, quantization formats, hybrid local-cloud routing, and the deployment frameworks that make edge LLM inference production-ready.
edge-ai, llm-inference
Apr 12 · 9 min
The Inference Gateway Pattern: Why Every Production AI Team Builds the Same Middleware
The inference gateway is an emergent architectural pattern — a middleware layer between applications and LLM providers that consolidates rate limiting, failover, cost tracking, and routing. A practical guide to why every production AI team converges on this pattern and how to build or buy one.
inference-gateway, llm
Apr 12 · 8 min
Internal AI Tools vs. External AI Products: Why Most Teams Get the Safety Bar Backwards
Internal AI tools often need more safety engineering than customer-facing products — but a completely different kind. How ambient authority, silent failures, and data synthesis across classification boundaries make internal deployments the higher-risk bet.
ai-safety, enterprise-ai
Apr 12 · 8 min
Knowledge Graphs Are Back: Why RAG Teams Are Adding Structure to Their Retrieval
Baseline RAG captures only 22-32% of multi-hop answers while GraphRAG achieves 72-83%. A practical guide to adding knowledge graph structure to your retrieval pipeline — construction patterns, routing strategies, and when the schema overhead isn't worth it.
insider, knowledge-graphs
Apr 12 · 8 min
LLM Provider Lock-in: The Portability Patterns That Actually Work
Most LLM lock-in advice stops at API wrappers, but the real lock-in hides in prompts, tool-calling assumptions, and behavioral quirks. Portability patterns that address what abstraction layers cannot.
llm, vendor-lock-in
Apr 12 · 8 min
The MCP Composability Trap: When 'Just Add Another Server' Becomes Dependency Hell
The MCP ecosystem hit 10,000+ servers and 30 CVEs in sixty days. How dependency sprawl, supply chain attacks, and tool conflicts turn composability into a liability — and the operational patterns that prevent it.
insider, mcp
Apr 12 · 9 min
Open-Weight Models in Production: When Self-Hosting Actually Beats the API
A practical decision framework for self-hosting open-weight models like Llama, Mistral, and Qwen versus using frontier APIs — covering real cost breakdowns, compliance triggers, operational burdens, and the hybrid architecture most production teams actually need.
llm, self-hosting
Apr 12 · 8 min
The Post-Framework Era: Build Agents with an API Client and a While Loop
Why 80% of production AI agents need nothing more than a prompt, a tool list, and a while loop — and how framework complexity becomes the bottleneck it promised to eliminate.
ai-agents, llm
Apr 12 · 8 min
The 10x Prompt Engineer Myth: Why System Design Beats Prompt Wordsmithing
Production data shows the first 5 hours of prompt work yield 35% improvement while the next 40 hours add just 1%. The real leverage in LLM applications lies in retrieval quality, task decomposition, output validation, and evaluation infrastructure — not prompt wordsmithing.
insider, prompt-engineering
Apr 12 · 8 min

About Tian Pan

I'm Tian Pan, an engineer-founder focused on agentic engineering — building autonomous AI systems and scaling engineering teams. I write practical guides on system design, technical leadership, and shipping with AI agents. Previously an early engineer at Uber, Brex, and IoTeX.
