I'd be interested in knowing if anyone is seriously using the assistants API, it feels like such a lock in to OpenAIs platform when your can alternatively just use completions that are much more easily interchanged.
Assistants API use in prod used to suck because it would send full convo on each message. But last month they added an option to send truncted history so its no longer 2$ a pop thankfully. Also Grok, Haiku and Mistral is cheap
Yeah, I do both in prod and in the lib. In the lib I even ported Anthropics streaming API to be OpenAI compatible. Will write the docs over the coming days if interested.
I've indeed refused to work with some providers giving only a chat interface and not a completion interface because it made the communication "less natural" to the model (like adding new system messages in between for function calling on models which don't officially does it, or adding other categories than system/user/assistant)
Great points. Dont even get me started about how function calling in other LLMs costs me tokens. Something OpenAI provides OOTB. I'm also not a big fan of OpenAI's lock in. Right now I'm on a huge Claude 3 Haiku kick. That said, OpenAI does seem to get the APIs right and my hunch is the new Assistants API is going to potentially disrupt things again. Time will tell.
I would love to be using Claude, but you can't get API access (beyond an initial trial period) in the EU without providing a European VAT number. They don't want personal users or people to even learn and experiment I guess.
Opus is really cool. I’ve found it to have a few persistent bugs in what I initially assumed is tokenization but now wonder if might be more fundamental, but modulo a few typographical-level errors, I personally think it’s the most useful of the API-mediated models for involved sessions.
And there are some serious people at Anthropic, they’ll get the typo thing if they haven’t already (been a busy week and change, they easily could have shipped a fix and I overlooked it).
I have some assumptions/guesses on how billing works. Gonna do a post on this on my unremarkable.ai blog, please do signup for posts there, no spam. I could be right or wrong but need to do some experiments and publish later.
I'm not sure you're talking about the same thing: OpenAI specifically has a "Assistants API" that manages long term memory and tool usage for the consumer: https://platform.openai.com/assistants
I'd guestimate 99% of people using LLMs are using instruct-based message interfaces that have a variation of system/user/assistant. The top models mostly only come as a completion models, and even Anthropic has switched to a message based API
I've used it and in some cases it's taking days and weeks of development away to get to testing the market.
In some cases the lock in is what it is for now because a particular model in reality is so far ahead, or staying ahead.
It doesn't mean other options won't become available, but it does matter to relate your need to your actions.
Getting something working consistently for example might be the first goal, and then learning to implement it with multiple models might be secondary. The chances of that increase the later other models are explored in some cases.
It should be possible to tell pretty quickly if something works in a particular model that's the leader, how others compare to it and how to track the rate of change between them.
I know at least one team is at work is using the Assistants API, and I'm talking with another team that is leaning pretty heavily towards using it over building a custom RAG solution themselves, or even over other in-house frameworks.
I use it mostly exclusively (I've even developed a Python library for it, https://github.com/skorokithakis/ez-openai), because it does RAG and function calling out of the box. It's pretty convenient, even if OpenAI's APIs are generally a trash fire.