It's just not there yet. I have tried all the models from April, including the Gemma 4 variants.
These are so far from Opus it's not even funny. They are not close to being in the same league. Gemma might be like a frontier model from a couple years ago, but with much worse performance in long context chats.
Correct, they aren't Opus. They are Sonnet with a little hand-holding. They also run on a single GPU at 40 tps.
No one is saying a local model will give you Anthropic's business in a 5-minute download. People are saying, "hmm, maybe I should do this one locally". People are also saying, "this is surprisingly good enough for me given the trade-offs."
Only if your time is worth nothing, even just to triage that question.
Unless you have fanatical data privacy needs or really don't have internet access, running local models almost certainly results in negative ROI overall.
Not to mention that you need to have decent hardware (that is getting expensive by the day) to even have this conversation in the first place.
People in this post talk as if everyone has a Mac with 24GB or 32GB of RAM, when the reality is that most people use a Windows laptop with a crappy integrated GPU.
Hm. I think there is a bit of a shifting-goalposts dynamic at play here. Those April releases, even the fast MoE versions, are better than big cloud models from 18 months ago. I remember when everyone was gushing about Sonnet 3.7 and how transformative it made development. So was it useful or wasn’t it? A tool doesn’t lose its usability just because a better one comes along.
To me, these small local LLMs are highly useful (and thus “usable”) even though they don’t match the output of today’s frontier models.
When were you trying local models? The model releases from April 2026 are a serious change in performance.