
3 posts tagged with "generative-ai"


Diffusion Models in Production: The Engineering Stack Nobody Discusses After the Demo

· 10 min read
Tian Pan
Software Engineer

Your image generation feature just went viral. 100,000 requests are coming in daily. The API provider's rate limit technically accommodates it. Latency crawls to 12 seconds at p95. Your NSFW classifier is flagging legitimate medical illustrations. A compliance audit surfaces California's AI Transparency Act, signed in September 2024, which mandates watermarking for AI-generated content. Support has 50 open tickets from users whose content was silently blocked. By the time you realize you need a real production stack, you've already burned two weeks in crisis mode.

This is the moment "just call the API" fails—not because the API is bad, but because the demo's success exposes every assumption you made about inference latency, content policy, moderation fairness, and regulatory compliance. The engineering work nobody shows you in tutorials lives here.
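
To make that gap concrete, here is a minimal sketch of the pipeline shape such a feature tends to need: moderate the prompt, moderate the generated image, watermark everything, and route borderline cases to human review rather than silently dropping them. All the helpers (`moderation_score`, `generate_image`, `embed_watermark`) are hypothetical stand-ins for whatever provider and classifier you actually use.

```python
from dataclasses import dataclass

REVIEW_THRESHOLD = 0.5   # borderline output: route to a human reviewer
BLOCK_THRESHOLD = 0.9    # clear violation: reject with an explicit reason

def moderation_score(content) -> float:
    """Stand-in for a real NSFW/policy classifier; returns risk in [0, 1]."""
    return 0.0

def generate_image(prompt: str) -> bytes:
    """Stand-in for the actual diffusion-provider call."""
    return b"\x89PNG..."

def embed_watermark(image: bytes) -> bytes:
    """Stand-in for a provenance watermarking step (e.g. C2PA-style metadata)."""
    return image

@dataclass
class GenerationResult:
    status: str                  # "ok" | "needs_review" | "blocked"
    image: bytes | None = None
    reason: str | None = None

def generate_with_guardrails(prompt: str) -> GenerationResult:
    # 1. Moderate the prompt before spending GPU time on it.
    prompt_risk = moderation_score(prompt)
    if prompt_risk >= BLOCK_THRESHOLD:
        return GenerationResult("blocked", reason="prompt policy violation")

    # 2. Generate, then moderate the output too: prompts and images fail differently.
    image = generate_image(prompt)
    image_risk = moderation_score(image)
    if image_risk >= BLOCK_THRESHOLD:
        return GenerationResult("blocked", reason="output policy violation")

    # 3. Watermark before anything leaves the service.
    image = embed_watermark(image)

    # 4. Borderline cases get a visible "needs_review" status instead of a
    #    silent block that turns into a support ticket.
    if max(prompt_risk, image_risk) >= REVIEW_THRESHOLD:
        return GenerationResult("needs_review", image=image)
    return GenerationResult("ok", image=image)
```

The design point is the explicit `needs_review` status: a user whose content is held can be told so, which is what keeps those 50 silent-block tickets from piling up.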

When RAG Makes Your AI Worse: The Creativity-Grounding Tradeoff

· 8 min read
Tian Pan
Software Engineer

A team at a product company built a brainstorming assistant for their marketing department. They added RAG over their document corpus — campaign briefs, brand guidelines, competitor analyses — figuring the richer context would produce better ideas. Usage dropped within three weeks. The qualitative feedback: outputs felt "too safe," "too predictable," "like it just remixed our existing stuff." They removed retrieval from the brainstorming feature. Ideas improved. Engagement recovered.

This pattern repeats more often than practitioners admit. Retrieval-augmented generation has become the default architecture for grounding LLM outputs in facts, and for factual tasks it earns that default. But for generative tasks — ideation, creative writing, novel solution generation — adding a retrieval layer can silently cap the ceiling of what your model produces. Not because retrieval is broken, but because it's working exactly as designed.
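
One lightweight mitigation this suggests is making retrieval conditional on task type instead of always-on. Below is a minimal sketch under assumed interfaces: `classify_task` is a crude keyword router, and `retriever` and `llm` stand in for whatever search index and model client you use.

```python
def classify_task(user_request: str) -> str:
    """Crude keyword router; a real system might use a small classifier
    or a cheap LLM call to decide factual vs. generative intent."""
    creative_cues = ("brainstorm", "ideas", "imagine", "slogan", "what if")
    is_creative = any(cue in user_request.lower() for cue in creative_cues)
    return "creative" if is_creative else "factual"

def answer(user_request: str, retriever, llm) -> str:
    if classify_task(user_request) == "factual":
        # Grounded path: retrieval earns its keep on factual questions.
        docs = retriever.search(user_request, k=5)
        context = "\n".join(doc.text for doc in docs)
        prompt = (f"Answer using only this context:\n{context}\n\n"
                  f"Question: {user_request}")
        return llm.complete(prompt, temperature=0.2)
    # Generative path: no retrieved context and a higher temperature, so
    # existing documents don't anchor the output toward safe remixes.
    return llm.complete(user_request, temperature=0.9)
```

The routing rule matters more than the implementation: grounded, low-temperature generation for factual questions; unconstrained, higher-temperature generation for ideation.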

OpenAI: 7 Lessons for Enterprise Adoption of Generative AI

· 7 min read

While many companies are still exploring the potential of generative AI, some trailblazers have already woven it into their core operations, achieving impressive results. OpenAI's latest report, "AI in the Enterprise," distills seven universal principles for successful AI adoption in businesses, drawing from in-depth research into industry leaders like Morgan Stanley, Indeed, and Klarna. This isn't just a technological achievement—it's a shift in mindset, collaboration, and business value.

Seven Insights: From Exploration to Scalable Implementation

1. Start with Rigorous Evaluation (Evals): Prioritize "Control" Before "Growth"

Adopting AI isn't an overnight process. Before rolling it out widely, establishing a thorough, measurable evaluation system is crucial for success.

Take financial giant Morgan Stanley as an example. With sensitive client operations at stake, they didn't just follow trends blindly. Instead, they developed a multi-dimensional evaluation system focusing on three core areas—accuracy in language translation, quality of information summarization, and comparison with human expert answers. Only when the model was deemed "controllable, safe, and beneficial" did they gradually introduce it to frontline operations.

This cautious approach has paid off: now, 98% of Morgan Stanley's financial advisors use AI daily; the document hit rate in their internal knowledge base has soared from 20% to 80%; and client follow-ups that once took days are now completed in hours.
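
The report describes the pattern rather than the code, but the shape of such an eval gate is easy to sketch: score candidate outputs against a golden set of expert answers and block rollout below a threshold. Everything below is an assumption for illustration: the word-overlap scorer, the threshold, and the `model.complete` client are stand-ins, not Morgan Stanley's actual system.

```python
GOLDEN_SET = [
    # Curated question / expert-answer pairs, reviewed by domain experts.
    {"question": "Summarize the key risks in the Q3 outlook memo.",
     "expert_answer": "Rising rates, slowing consumer spending, FX exposure."},
]
ROLLOUT_THRESHOLD = 0.85  # assumed gate; tune per task and risk level

def judge_similarity(candidate: str, reference: str) -> float:
    """Toy word-overlap scorer in [0, 1]; production evals typically use an
    LLM judge or task-specific metrics for translation and summarization."""
    ref_words = set(reference.lower().split())
    cand_words = set(candidate.lower().split())
    return len(ref_words & cand_words) / max(len(ref_words), 1)

def passes_eval(model) -> bool:
    scores = [judge_similarity(model.complete(case["question"]),
                               case["expert_answer"])
              for case in GOLDEN_SET]
    mean_score = sum(scores) / len(scores)
    print(f"mean score vs. expert answers: {mean_score:.2f}")
    return mean_score >= ROLLOUT_THRESHOLD  # "control" before "growth"
```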

2. Embed AI Deeply into the Product Experience Rather Than Bolting On a Chatbot

The most successful AI applications integrate seamlessly into existing products and enhance the core user experience, so that the AI feels as natural as running water or electricity.

Indeed, the world's largest job site, exemplifies this approach. Instead of merely adding a job-search chatbot, it used GPT-4o mini to automatically generate a personalized "recommendation reason" for each system-matched job. This seemingly small tweak directly answers job seekers' "why me?" question and measurably improved matching: job seekers started 20% more applications, and employers' successful hiring rate rose by 13%.
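
The report names the model but not the implementation; a minimal sketch of the idea using the OpenAI Python SDK might look like the following, where the prompt shape and the `job`/`candidate` fields are illustrative assumptions:

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

def recommendation_reason(job: dict, candidate: dict) -> str:
    """Generate a one-sentence 'why this job matches you' blurb.
    The prompt and fields are illustrative, not Indeed's actual system."""
    prompt = (
        f"Job: {job['title']} at {job['company']}. "
        f"Requirements: {job['requirements']}.\n"
        f"Candidate background: {candidate['summary']}.\n"
        "In one sentence, tell the candidate why this job fits them."
    )
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": prompt}],
        max_tokens=60,    # keep each reason short, and cheap at matching scale
        temperature=0.4,
    )
    return response.choices[0].message.content.strip()
```

Capping `max_tokens` is the design choice that makes this viable at scale: one short, cheap sentence per matched job rather than an open-ended chat.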

3. Act Early to Capture the Compounding Returns of Knowledge and Experience

AI's value grows through continuous iteration and learning. The earlier you start, the more your organization can benefit from this "compounding" effect.

Swedish fintech company Klarna's AI customer service system vividly illustrates this principle. Within a few months, AI was handling two-thirds of customer chat sessions, effectively taking on the workload of hundreds of human agents. More impressively, the average resolution time for customer issues dropped from 11 minutes to 2 minutes. The initiative is expected to generate $40 million in annual profit growth for the company. Today, 90% of Klarna employees use AI in their daily work, enabling faster innovation and continuous optimization across the organization.
