The NPU (Neural Processing Unit) marketing at CES 2026 was overwhelming. Every laptop now has AI performance specs, and I wanted to cut through the noise for people who actually do ML work.
## The TOPS Scorecard
Here’s what the major players announced:
| Vendor | Chip | NPU TOPS | Notes |
|---|---|---|---|
| Intel | Core Ultra Series 3 | 50 | First Intel 18A, certified for edge |
| AMD | Ryzen AI Max+ | 60 | Unified memory, integrated graphics |
| Qualcomm | Snapdragon X Plus 2 | ~45 | ARM architecture |
| Qualcomm | Snapdragon X2 Elite | 85 | In HP’s EliteBook, billed as the “world’s first business notebook” at this level |
## What TOPS Actually Means
TOPS = tera operations per second, i.e. trillions of operations per second. It’s a measure of raw, peak NPU compute throughput.
What it tells you:
- Peak theoretical performance for specific operations (usually INT8)
- General NPU capability ballpark
What it doesn’t tell you:
- Real-world inference performance
- Which models are supported
- Memory bandwidth limitations
- Software stack maturity
A 60 TOPS chip with poor software support will underperform a 45 TOPS chip with mature tooling.
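To make that concrete, here’s a back-of-the-envelope sketch of where a TOPS figure comes from and why utilization dominates it. Every number below (MAC array sizes, clocks, utilization rates) is an illustrative assumption, not a vendor spec:

```python
def theoretical_tops(mac_units: int, clock_ghz: float) -> float:
    """Peak INT8 TOPS: each MAC counts as two ops (multiply + add)."""
    return 2 * mac_units * clock_ghz * 1e9 / 1e12

# Hypothetical "big but immature" chip: large MAC array, weak software stack.
peak_big = theoretical_tops(16_384, 1.8)      # ~59 TOPS on paper
effective_big = peak_big * 0.35               # assume stack sustains only ~35%

# Hypothetical "smaller but mature" chip: fewer MACs, well-tuned kernels.
peak_small = theoretical_tops(12_800, 1.8)    # ~46 TOPS on paper
effective_small = peak_small * 0.70           # assume ~70% sustained utilization

print(f"big chip effective:   {effective_big:.1f} TOPS")
print(f"small chip effective: {effective_small:.1f} TOPS")
```

Under these assumed utilization rates, the 46 TOPS part delivers roughly 32 effective TOPS against the 59 TOPS part’s 21, which is the whole point: the spec-sheet number is an upper bound the software stack has to earn.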
## What Actually Matters for ML Work
If you’re doing ML work on a laptop, here’s my prioritized checklist:
1. Memory bandwidth and capacity
LLM token generation is memory-bandwidth-bound, not compute-bound: producing each token means streaming essentially the full weight set from memory. AMD’s unified memory architecture matters more than raw TOPS for running large models locally.
2. Software stack support
Can you run PyTorch/TensorFlow/ONNX without conversion headaches? Intel has OpenVINO, AMD has ROCm (improving), Qualcomm has its own stack. None are as mature as CUDA.
3. Thermal performance
Sustained performance matters more than peak. A chip that throttles after 30 seconds of inference is useless for real work.
4. Actual model benchmarks
I want to see LLaMA-2 7B inference latency, not image classification TOPS. These are different workloads.
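The memory-bound point in item 1 is easy to quantify. If decoding reads every weight byte once per token, memory bandwidth sets a hard ceiling on single-stream tokens per second, regardless of TOPS. The weight size and bandwidth figures below are rough illustrative assumptions:

```python
def decode_tokens_per_sec_ceiling(weight_gb: float, bandwidth_gb_s: float) -> float:
    """Bandwidth ceiling for single-stream decoding:
    each token requires streaming the full weight set once."""
    return bandwidth_gb_s / weight_gb

llama7b_q4 = 3.8  # ~7B params at 4-bit, roughly 3.8 GB of weights (assumption)

print(decode_tokens_per_sec_ceiling(llama7b_q4, 120))  # ~31 tok/s at 120 GB/s (unified memory)
print(decode_tokens_per_sec_ceiling(llama7b_q4, 60))   # ~16 tok/s at 60 GB/s (dual-channel DDR5)
```

Notice the NPU’s TOPS rating never appears in that formula, which is why a bandwidth spec tells you more about local LLM performance than the headline AI number.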
## The Copilot+ Question
Microsoft’s Copilot+ PC requirements (40+ TOPS NPU) are driving the spec race. But Copilot+ features are still limited:
- Recall (screenshot search), delayed over privacy concerns
- Live Captions with translation
- Windows Studio Effects for video calls
- Creative AI features in apps
None of these are compelling enough to drive laptop purchases on their own. The question is whether third-party apps will leverage NPUs effectively.
## My Recommendations
For ML engineers/researchers:
- Wait for real benchmarks on LLM inference, not marketing TOPS
- Prioritize memory (32GB+) and memory bandwidth
- Consider AMD Ryzen AI Max+ for the unified memory architecture
- Don’t abandon your cloud GPU instances yet
For developers experimenting with AI:
- Any Copilot+ PC will be fine for local experimentation
- The software ecosystem matters more than hardware specs
- Focus on models that fit in your memory budget
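“Fits in your memory budget” is a quick calculation. This sketch estimates resident size from parameter count and quantization level; the ~20% overhead factor for KV cache and activations is my rough assumption, not a measured figure:

```python
def model_memory_gb(params_billions: float, bits_per_weight: int,
                    overhead: float = 1.2) -> float:
    """Rough resident size: weight bytes plus ~20% headroom
    for KV cache and activations (assumed overhead factor)."""
    return params_billions * bits_per_weight / 8 * overhead

for bits in (16, 8, 4):
    print(f"7B model @ {bits}-bit: {model_memory_gb(7, bits):.1f} GB")
```

By this estimate a 7B model needs ~17 GB at FP16 but only ~4 GB at 4-bit, which is why quantized models are the practical starting point on a 16–32 GB laptop.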
For business users:
- Honestly? Dell is right that consumers aren’t buying for AI features
- Buy for traditional laptop qualities (display, keyboard, battery)
- AI features are nice-to-have, not must-have
## The Uncomfortable Reality
Local AI on laptops is still early. The hardware is getting better, but:
- Software stacks are fragmented and immature
- Serious ML work still needs cloud GPUs or dedicated hardware
- Consumer AI features don’t justify NPU hardware… yet
We’re in the “installing plumbing” phase. The applications that make this worthwhile are still being built.
What’s your experience running ML models locally? Anyone tried the new NPU-accelerated inference?