I think the real crux of the moat is model intelligence. I'd bet that most of the money being spent on inference is on the top few models (today Opus-4.7 and GPT-5.5) from people and companies that benefit from using the best models.
Truly the main moat that OAI/Anthropic have is being 6 months ahead of the competition in performance, a lead that might last indefinitely if the competition is just distilling their models (China) or takes many months between releases (Google).
Once you look past the frontier of performance, it's just a race to the bottom on inference costs because there are at least 5 companies with equivalent open models at that level.