
The interesting part is that you can use the same API with Workers AI models (hosted at the edge) and proxied models (OpenRouter-style).

Disclaimer: I work at Cloudflare, but not on this.



It's the same problem as with Fireworks: the only models that support LoRA are year-old dense models that perform horribly on most tasks. If you want to do anything close to relevant, you still need to rent or own dedicated GPUs, which seems insane to me when vLLM fully supports dynamic LoRA loading.
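
For context, here is a rough sketch of what dynamic LoRA loading looks like against a self-hosted vLLM OpenAI-compatible server. It assumes the server was started with LoRA enabled and with runtime adapter updates allowed (the `VLLM_ALLOW_RUNTIME_LORA_UPDATING` environment variable), and that your vLLM version exposes the `/v1/load_lora_adapter` endpoint. The base URL, adapter name, and adapter path are placeholders, and the helper `build_load_lora_request` is a made-up name for illustration. The sketch only builds the request; sending it is left commented out.

```python
import json
import urllib.request


def build_load_lora_request(base_url, lora_name, lora_path):
    """Build (but do not send) a POST request asking a vLLM server to load a LoRA adapter.

    Hypothetical helper: the endpoint path and the JSON field names follow
    vLLM's documented runtime adapter loading, but check them against the
    vLLM version you actually run.
    """
    body = json.dumps({"lora_name": lora_name, "lora_path": lora_path}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/v1/load_lora_adapter",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    # Placeholder values; replace with your own server and adapter.
    req = build_load_lora_request(
        "http://localhost:8000", "my_adapter", "/path/to/adapter"
    )
    print(req.get_method(), req.full_url)
    print(req.data.decode("utf-8"))
    # With a running server, send it with:
    # urllib.request.urlopen(req)
```

Once an adapter is loaded this way, requests select it by passing the adapter name as the `model` field, so many adapters can share one base model on the same GPU.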



