
This is very cool, though I don't understand exactly what they've done here. Is it some kind of LLM with convolutional layers added?

The graph doesn't make it clear, but it describes a pipeline that goes beyond the LLM, so the CNN could be a separate model there.



Here’s the academic paper behind it: https://arxiv.org/abs/2602.04101


Thanks. Well this is fascinating.

>Instead of a single transformer, we combine (i) a stack of heterogeneous DNNs paired with small language models as perception modules

It seems that we're reinventing the brain's organs one by one from first principles. (Though Transformer + Common Crawl unintentionally builds a whole bunch of them we don't even understand yet.)

I found some broader context and the whole thing is indeed very harness-shaped:

>Using Interfaze as a Tool Inside Your Agent

https://interfaze.ai/blog/using-interfaze-as-a-tool-inside-y...

Well, "harness" is the wrong word here... "environment/tools the LLM interacts with" definitely fits though. Or "other organoid", to use the previous metaphor.


Yup, it really does depend on the use case.

We see two types: workflows & agents.

Workflows are the most common: there's a fixed pipeline, like processing loan documents before the data is passed to the next step, or translating user comments before they are stored in the database.

Agents are where you have a chat-based system, or a brain of sorts, that calls many tools to achieve a user goal. The model doing this is a lot better at non-deterministic tasks; it delegates to Interfaze for specific deterministic actions like OCR or web extraction, then consumes that data. That's the article you referenced :)
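
To make the workflow vs. agent distinction concrete, here's a rough Python sketch. Everything in it is made up for illustration: `ocr`, `web_extract`, `plan`, `workflow`, and `agent` are hypothetical stand-ins, not Interfaze's actual API, and `plan` uses keyword rules where a real agent would ask an LLM to pick the tools.

```python
from typing import Callable, Dict, List, Tuple


# Hypothetical deterministic tools. These are stubs, not a real API.
def ocr(document: str) -> str:
    return f"text extracted from {document}"


def web_extract(url: str) -> str:
    return f"content extracted from {url}"


TOOLS: Dict[str, Callable[[str], str]] = {"ocr": ocr, "web_extract": web_extract}


def workflow(document: str) -> Dict[str, str]:
    """Workflow: the steps are fixed ahead of time, with no decisions at runtime."""
    text = ocr(document)
    return {"source": document, "text": text}


def plan(goal: str) -> List[Tuple[str, str]]:
    """Stand-in for the agent's model: decide which tools to call for a goal.

    Assumption: a real agent would have an LLM make this choice; simple
    keyword rules are used here so the sketch runs on its own.
    """
    calls: List[Tuple[str, str]] = []
    for word in goal.split():
        if word.startswith("http"):
            calls.append(("web_extract", word))
        elif word.endswith(".pdf"):
            calls.append(("ocr", word))
    return calls


def agent(goal: str) -> List[str]:
    """Agent: the tool calls are chosen at runtime, then their results are consumed."""
    return [TOOLS[name](arg) for name, arg in plan(goal)]


if __name__ == "__main__":
    print(workflow("loan.pdf"))
    print(agent("summarize https://example.com and scan.pdf"))
```

The point is only the shape: in the workflow the call sequence is hard-coded, while in the agent the planner decides which deterministic tools to invoke.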



