Sorry noob question - where can I read more about this "agents" paradigm? Is one...

zby · on June 21, 2024

In practice this means function calling - the LLM chooses the function to call (and its parameters). Usually in a loop with a 'finish' function that returns the control to the outside code.

You can do that without function calling - as did the original ReAct paper - but then you have to write your own grammar for the communication with the LLM, a parser for it, and also you need to teach the LLM to use that grammar. This is very time consuming.

zEddSH · on June 21, 2024

> Also, how much success people have or had with automating the E2E tests for their various apps by stringing such agents themselves together?

There’s a few startups in the space doing this like QA Tech in Stockholm, and others even in YC (but I forgot the name). I’m skeptical of how successful they’ll be, not just from complex test cases but things like data management and mistakingly affecting other tests. Interesting to follow just in case though, E2E is a pain!

CGamesPlay · on June 21, 2024

Fundamentally, "Agent" refers to anything that operates in an "observe-act" loop. So in the context of LLMs, an agent sees an observation (like the code base and test output) and produces an action (like a patch), and repeats.

pavi2410 · on June 21, 2024

I want to learn about agents too!

hcks · on June 21, 2024

Don’t waste your time, it’s been around since GPT3, and had no results so far. Also notice how no frontier lab is working on it.

zby · on June 21, 2024

Letting the LLM to decide what to do is a powerful technique. For example one pass RAG is very limited: https://zzbbyy.substack.com/p/why-iterative-thinking-is-cruc... To make it iterative you need the cede the control to the LLM.