Given how much vendor harnesses now actively steer the agent with their own instructions on top of user prompts, I think it’d be super interesting to see a comparison of one of the already-tested models (so Opus 4.7 or GPT-5.5) across a range of harnesses that aren’t its native one: OpenCode, Pi, Hermes, Kilo Code. The most popular coding-focused harnesses, basically.


Agreed. The harness is really important, especially since many labs are now post-training agents directly in their native harness.

(Which is why my prior is that third-party harnesses would not perform as well. But I haven't actually measured this.)


OpenCode seems to give me better results than codex-cli, so I’d be interested in seeing this too!



