Skip to main content

3 posts tagged with "mobile"

View all tags

The On-Device LLM Problem Nobody Talks About: Model Update Propagation

· 12 min read
Tian Pan
Software Engineer

Most engineers who build on-device LLM features spend their time solving the problems that are easy to see: quantization, latency, memory limits. The model fits on the phone, inference is fast enough, and the demo looks great. Then they ship to millions of devices and discover a harder problem that nobody warned them about: you now have millions of independent compute nodes running different versions of your AI model, and you have no reliable way to know which one any given user is running.

Cloud inference is boring in the best way. You update the model, redeploy the server, and within minutes the entire user base is running the new version. On-device inference breaks this assumption entirely. A user who last opened your app three months ago is still running the model that was current then — and there's no clean way to force an update, no server-side rollback, and no simple way to detect the mismatch without adding instrumentation you probably didn't build from the start.

This version fragmentation is the central operational challenge of on-device AI, and it has consequences that reach far beyond a slow rollout. It creates silent capability drift, complicates incident response, and turns your "AI feature" into a heterogeneous fleet of independently-behaving systems that you're responsible for but can't directly control.

On-Device LLM Inference in Production: When Edge Models Are Right and What They Actually Cost

· 10 min read
Tian Pan
Software Engineer

Most teams decide to use on-device LLM inference the same way they decide to rewrite their database: impulsively, in response to a problem that a cheaper solution could have solved. The pitch is always compelling—no network round-trips, full privacy, zero inference costs—and the initial prototype validates it. Then six months post-ship, the model silently starts returning worse outputs, a new OS update breaks quantization compatibility, and your users on budget Android phones are running a version you can't push an update to.

This guide is about making that decision with eyes open. On-device inference is genuinely the right call in specific situations, but the cost structure is different from what teams expect, and the production failure modes are almost entirely unlike cloud LLM deployment.

A Closer Look at iOS Architecture Patterns

· 3 min read

Why Should We Care About Architecture?

The answer is: to reduce the human resources spent on each feature.

Mobile developers evaluate the quality of an architecture on three levels:

  1. Whether the responsibilities of different features are evenly distributed
  2. Whether it is easy to test
  3. Whether it is easy to use and maintain
Responsibility DistributionTestabilityUsability
Tight Coupling MVC
Cocoa MVC❌ V and C are coupled✅⭐
MVP✅ Independent view lifecycleAverage: more code
MVVMAverage: View has a dependency on UIKitAverage
VIPER✅⭐️✅⭐️

Tight Coupling MVC

Traditional MVC

For example, in a multi-page web application, when you click a link to navigate to another page, the entire page reloads. The problem with this architecture is that the View is tightly coupled with the Controller and Model.

Cocoa MVC

Cocoa MVC is the architecture recommended by Apple for iOS developers. Theoretically, this architecture allows the Controller to decouple the Model from the View.

Cocoa MVC

However, in practice, Cocoa MVC encourages the use of massive view controllers, ultimately leading to the view controller handling all operations.

Realistic Cocoa MVC

Although testing such tightly coupled massive view controllers is quite difficult, Cocoa MVC performs the best in terms of development speed among the existing options.

MVP

In MVP, the Presenter has no relationship with the lifecycle of the view controller, allowing the view to be easily replaced. We can think of UIViewController as the View.

Variant of MVC

There is another type of MVP: MVP with data binding. As shown in the figure, the View is tightly coupled with the Model and Controller.

MVP

MVVM

MVVM is similar to MVP, but MVVM binds the View to the View Model.

MVVM

VIPER

Unlike the three-layer structure of MV(X), VIPER has a five-layer structure (VIPER View, Interactor, Presenter, Entity, and Routing). This structure allows for good responsibility distribution but has poorer maintainability.

VIPER

Compared to MV(X), VIPER has the following differences:

  1. The logic processing of the Model is transferred to the Interactor, so Entities have no logic and are purely data storage structures.
  2. ==UI-related business logic is handled in the Presenter, while data modification functions are handled in the Interactor==.
  3. VIPER introduces a routing module, Router, to implement inter-module navigation.