Local Qwen models serve different purposes than Claude Opus
Original: Local Qwen isn't a worse Opus, it's a different tool
Why This Matters
Clarifies realistic capabilities of local LLMs versus cloud models, informing business decisions on AI infrastructure and deployment strategies.
Alex Ellis, founder of OpenFaaS and of a bootstrapped infrastructure software company, argues that local Qwen 27B/35B models are not inferior versions of Claude Opus but specialized tools. His small software company uses local models cost-effectively for specific business cases, while he acknowledges limitations such as hallucinations and infinite loops when the models are quantized to fit consumer GPUs.
Alex Ellis, an open-source maintainer and founder of a bootstrapped infrastructure software company, published a detailed analysis comparing local Qwen models to Anthropic's Claude Opus. Ellis maintains OpenFaaS, SlicerVM, Actuated.com, and Inlets.com, all infrastructure products built on Linux primitives, Go, and Kubernetes. He argues that local Qwen 27B and 35B models should not be evaluated as cheaper alternatives to Opus, but as different tools suited to specific use cases. Ellis reports that local models paid for themselves within two to three months at his company and continue to serve particular business needs. However, he emphasizes significant limitations: local models cannot be trusted to run unsupervised, and quantizing them to fit consumer GPUs introduces problematic behaviors, including infinite loops and hallucinations. Ellis notes that frontier models like Claude Opus and Codex have progressed from reducing boilerplate to enabling full end-to-end design, architecture, and testing. He places a turning point between November 2025 and January 2026, when developers widely adopted Claude Opus and top-end coding plans settled at around $200 per month. Ellis says he uses AI tools extensively and that most of his daily work now relies on Claude or Codex, though he insists on writing his own code and analysis.