The interesting part is that you can use the same API with Workers AI models (ho... | Hacker News


ascorbic 40 days ago | on: Cloudflare's AI Platform: an inference layer desig...

The interesting part is that you can use the same API with Workers AI models (hosted at the edge) and proxied models (OpenRouter-style). Disclaimer: I work at Cloudflare, but not on this.

mips_avatar 40 days ago

It's the same problem as Fireworks: the only models that support LoRA are dense models about a year old that perform horribly on most tasks. If you want to do anything close to relevant, you still need to rent or own dedicated GPUs, which seems insane to me when vLLM fully supports dynamic LoRA loading.
