
7 posts tagged with "validation"


LLM-Powered Data Migrations: What Actually Works at Scale

Tian Pan · Software Engineer · 10 min read

The pitch is compelling: feed your legacy records into an LLM, describe the target schema, and let the model figure out the mapping. No hand-written parsers, no months of transformation logic, no domain expert bottlenecks. Teams that have tried this report 70–97% accuracy in a fraction of the time traditional ETL would take. The problem is that the remaining 3–30% of failures don't look like failures. They look like correct data.

That asymmetry—where wrong outputs are structurally valid and plausible—is what makes LLM-powered data migrations genuinely dangerous without the right validation architecture. This post covers what the teams that have done this successfully actually built: when LLMs earn their place in the pipeline, where they silently break, and the validation layer that catches errors traditional tools cannot.
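As a taste of what that validation layer involves, here is a minimal cross-record invariant check in Python. The field names (`total_cents`, `amount_cents`, `source_id`, and so on) are hypothetical; the point is that each invariant compares a migrated record back against its legacy source instead of trusting the model's mapping:

```python
from datetime import date

def validate_migrated_record(legacy: dict, migrated: dict) -> list[str]:
    """Return a list of semantic-validation errors (empty means pass).

    A structural check would accept any well-formed dict; these
    cross-record invariants catch 'plausible but wrong' mappings.
    """
    errors = []

    # Invariant 1: monetary totals must survive the mapping exactly.
    if legacy.get("total_cents") != migrated.get("amount_cents"):
        errors.append(
            f"amount mismatch: {legacy.get('total_cents')} != "
            f"{migrated.get('amount_cents')}"
        )

    # Invariant 2: dates may be reformatted, but never changed.
    if legacy.get("created") and migrated.get("created_at"):
        if date.fromisoformat(migrated["created_at"]) != date.fromisoformat(legacy["created"]):
            errors.append("created date drifted during migration")

    # Invariant 3: record identity must trace back to the source.
    if str(legacy.get("id")) != str(migrated.get("source_id")):
        errors.append("source_id does not trace back to legacy id")

    return errors
```

Each invariant is cheap to evaluate and, unlike an accuracy spot-check, runs on every record.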

Structured Output Reliability in Production: Why JSON Mode Is Not a Contract

Tian Pan · Software Engineer · 8 min read

A team ships a document extraction pipeline. It uses JSON mode. QA passes. Monitoring shows near-zero parse errors. Six weeks later, a silent failure surfaces: every risk assessment in the corpus has been marked "low" — valid JSON, correct field names, wrong answers. The pipeline has been confidently lying in a schema-compliant format for weeks.

This is the core problem with treating JSON mode as a reliability guarantee. Structural conformance and semantic correctness are different properties of a system, and confusing them is one of the most expensive mistakes in production AI engineering.
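One cheap defense against the "every risk is low" failure is distribution monitoring on the semantic content of outputs, not just their parseability. A minimal sketch with made-up thresholds, assuming you can sample recent values of a categorical field:

```python
from collections import Counter

def distribution_collapsed(values: list[str],
                           dominant_threshold: float = 0.95,
                           min_samples: int = 50) -> bool:
    """Flag when a single category dominates recent outputs.

    A parse-error dashboard stays green through this failure mode;
    a collapse in output diversity is what actually signals it.
    """
    if len(values) < min_samples:
        return False  # not enough data to judge
    top_count = Counter(values).most_common(1)[0][1]
    return top_count / len(values) >= dominant_threshold
```

Wired into the pipeline above, an alert on `distribution_collapsed` would have fired within the first day instead of after six weeks.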

Structured Output Is Not Structured Thinking: The Semantic Validation Layer Most Teams Skip

Tian Pan · Software Engineer · 11 min read

A medical scheduling system receives a valid JSON object from its LLM extraction layer. The schema passes. The types check out. The required fields are present. Then a downstream job tries to book an appointment and finds that the end_time is three hours before the start_time. Both fields are correctly formatted ISO timestamps. Neither violates the schema. The booking silently fails, and the patient gets no appointment — no error surfaced, no alert fired.

This is what it looks like when schema validation is mistaken for correctness validation. The model followed the format. It did not follow the logic.
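The semantic check for this case is small. A hedged sketch, assuming ISO-8601 timestamps in hypothetical `start_time`/`end_time` fields:

```python
from datetime import datetime

def validate_appointment(payload: dict) -> None:
    """Reject schema-valid but logically impossible appointments."""
    start = datetime.fromisoformat(payload["start_time"])
    end = datetime.fromisoformat(payload["end_time"])
    # Both fields can be perfectly formatted and still be nonsense
    # together: ordering is a cross-field property no schema expresses.
    if end <= start:
        raise ValueError(
            f"end_time {end.isoformat()} is not after start_time {start.isoformat()}"
        )
```

The check raises instead of returning a flag, so a bad extraction fails the booking loudly rather than letting the appointment vanish.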

The Schema Problem: Taming LLM Output in Production

Tian Pan · Software Engineer · 9 min read

You ship a feature that extracts structured data from user text using an LLM. You test it thoroughly. It works. Three months later, a model provider quietly updates their weights, and without changing a single line of your code, your downstream pipeline starts silently dropping records. No exceptions thrown. No alerts fired. Just wrong data flowing through your system.

This is the schema problem. And despite years of improvements to structured output APIs, it remains one of the least-discussed failure modes in LLM-powered systems.
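The baseline mitigation is to validate at the boundary and fail loudly, so a silent provider-side change becomes a visible exception instead of dropped records. An illustrative parser with hypothetical field names and status values:

```python
import json

REQUIRED_FIELDS = {"doc_id", "status"}          # hypothetical contract
ALLOWED_STATUS = {"ok", "needs_review"}          # hypothetical vocabulary

def parse_extraction(raw: str) -> dict:
    """Parse an LLM response at the system boundary, raising on any
    deviation from the expected contract instead of passing it through."""
    record = json.loads(raw)  # raises ValueError on malformed JSON
    missing = REQUIRED_FIELDS - record.keys()
    if missing:
        raise ValueError(f"missing required fields: {sorted(missing)}")
    if record["status"] not in ALLOWED_STATUS:
        raise ValueError(f"unexpected status value: {record['status']!r}")
    return record
```

When the provider's weights shift and the model starts emitting a new status string, this version pages someone; the permissive version quietly drops the record.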

The Semantic Validation Layer: Why JSON Schema Isn't Enough for Production LLM Outputs

Tian Pan · Software Engineer · 10 min read

By 2025, every major LLM provider had shipped constrained decoding for structured outputs. OpenAI, Anthropic, Gemini, Mistral — they all let you hand the model a JSON schema and guarantee it comes back structurally intact. Teams adopted this and breathed a collective sigh of relief. Parsing errors disappeared. Retry loops shrank. Dashboards turned green.

Then the subtle failures started.

A sentiment classifier locked in at 0.99 confidence on every input — gibberish included — for two weeks before anyone noticed. A credit risk agent returned valid JSON approving a loan application that should have been declined, with a risk score fifty points too high. A financial pipeline coerced "$500,000" (a string, technically schema-valid) down to zero in an integer field, corrupting six weeks of risk calculations. Every one of these failures passed schema validation cleanly.

The lesson: structural validity is necessary, not sufficient. You need a semantic validation layer, and most teams don't have one.
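For the coercion example, a semantic layer refuses anything that is not already the right type and range rather than coercing it. Illustrative only; the field name and bounds are assumptions:

```python
def validate_risk_score(record: dict) -> list[str]:
    """Strict type and range check: no coercion, no silent zeroing."""
    errors = []
    score = record.get("risk_score")
    # Reject a string like "$500,000" outright instead of coercing it
    # to 0 in an integer field. (bool is excluded because it subclasses int.)
    if not isinstance(score, int) or isinstance(score, bool):
        errors.append(f"risk_score must be an int, got {type(score).__name__}")
    elif not 0 <= score <= 100:
        errors.append(f"risk_score {score} outside [0, 100]")
    return errors
```

Every failure in the list above would have been caught by a check of this shape, because each one was a value problem, not a structure problem.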

JSON Mode Won't Save You: Structured Output Failures in Production LLM Systems

Tian Pan · Software Engineer · 9 min read

When developers first wire up JSON mode, it feels like a solved problem. The LLM stops returning markdown fences, prose apologies, and curly-brace-adjacent gibberish. The output parses. The tests pass. Production ships.

Then, three weeks later, a background job silently fails because the model returned {"status": "complete"} when the schema expected {"status": "completed"}. A data pipeline crashes because a required field came back as null instead of being omitted. An agent tool-call loop terminates early because the model embedded a stray newline inside a string value and the downstream parser choked on it.

JSON mode guarantees syntactically valid JSON. It does not guarantee that the JSON means what you think it means, contains the fields your application expects, or maintains semantic consistency across requests. These are different problems, and they require different solutions.
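Closing that gap means validating meaning on top of syntax. A small sketch of the enum and null checks, using a hypothetical status vocabulary:

```python
ALLOWED_STATUS = {"pending", "completed", "failed"}  # assumed vocabulary

def check_status_field(record: dict) -> None:
    """Distinguish the three cases JSON mode conflates:
    omitted field, explicit null, and an off-by-one enum value."""
    if "status" not in record:
        raise ValueError("status field omitted")
    value = record["status"]
    if value is None:
        # null and omitted are different contracts; reject explicitly.
        raise ValueError("status is null, not one of the allowed values")
    if value not in ALLOWED_STATUS:
        # Catches "complete" vs "completed" before a job consumes it.
        raise ValueError(f"status {value!r} not in {sorted(ALLOWED_STATUS)}")
```

This is the kind of check a JSON Schema `enum` plus `required` constraint can also express; the point is that someone has to declare and enforce it, because JSON mode alone does not.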

Structured Output in Production: Getting LLMs to Return Reliable JSON

Tian Pan · Software Engineer · 8 min read

At some point in production, every LLM-powered application needs to stop treating model output as prose and start treating it as data. The moment you try to reliably extract a JSON object from a language model — and feed it downstream into a database, API call, or UI — you discover just how many ways this can go wrong. The model wraps JSON in markdown fences. It generates a valid object but omits required fields. It formats dates inconsistently across calls. It hallucinates enum values. Any one of these failures silently corrupts downstream state.

Structured output has evolved from an afterthought into a first-class concern for production LLM systems. This post covers the three main mechanisms for enforcing it, where each breaks down, and how to design schemas that keep quality high under constraint.
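As a taste of the defensive parsing involved, here is a minimal fence-stripping extractor. It is a sketch of one layer only, not a substitute for schema enforcement:

```python
import json
import re

# Matches an optional ```json ... ``` fence around the payload.
FENCE_RE = re.compile(r"```(?:json)?\s*(.*?)\s*```", re.DOTALL)

def extract_json(text: str) -> dict:
    """Strip a markdown fence if present, then parse.

    Raises json.JSONDecodeError if no parseable object remains,
    which is the correct behavior: better a loud parse failure
    than a silently corrupted downstream record.
    """
    match = FENCE_RE.search(text)
    candidate = match.group(1) if match else text.strip()
    return json.loads(candidate)
```

Fence-stripping handles only the first failure mode in the list above; missing fields, inconsistent dates, and hallucinated enum values still need schema and semantic validation behind it.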