I'd be interested in knowing if anyone is seriously using the assistants API, it...

Nedomas · on May 18, 2024

I do, and I built an Assistants API compat layer for Groq and Anthropic (https://github.com/supercorp-ai/supercompat). I’d argue that the Assistants API DX > the manual completions API.

tomrod · on May 18, 2024

Aye, but your FinOps will be complaining even with simple use.

Nedomas · on May 18, 2024

Assistants API use in prod used to suck because it would send the full convo on each message. But last month they added an option to send truncated history, so it's no longer $2 a pop, thankfully. Also, Groq, Haiku and Mistral are cheap.
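To make the cost point concrete, here is a rough back-of-the-envelope sketch of why resending the full history adds up and what capping it to the last few messages changes. Everything in it is an assumption for illustration: `estimate_tokens` is a crude 4-characters-per-token heuristic rather than a real tokenizer, `input_tokens_per_turn` is a made-up helper, and this is not how OpenAI computes billing. As far as I know, the option in question is the run-level `truncation_strategy` setting with a `last_messages` count, but check the current docs before relying on that.

```python
def estimate_tokens(text: str) -> int:
    # Crude heuristic (assumption): roughly 4 characters per token.
    return max(1, len(text) // 4)


def input_tokens_per_turn(messages, last_messages=None):
    """Estimate input tokens sent on each turn.

    With last_messages=None the whole history so far is resent every turn;
    otherwise only the most recent `last_messages` messages are sent.
    """
    costs = []
    for turn in range(1, len(messages) + 1):
        history = messages[:turn]
        if last_messages is not None:
            history = history[-last_messages:]
        costs.append(sum(estimate_tokens(m) for m in history))
    return costs


# Ten messages of about 100 estimated tokens each.
messages = ["x" * 400] * 10

full = input_tokens_per_turn(messages)
truncated = input_tokens_per_turn(messages, last_messages=3)

print("full history, total input tokens:", sum(full))
print("last 3 messages, total input tokens:", sum(truncated))
```

The full-history total grows quadratically with conversation length, while the truncated total grows linearly once the cap is reached, which is the whole difference in the bill.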

brianjking · on May 18, 2024

Are you using Assistants API v2 with streaming?

Nedomas · on May 19, 2024

Yeah, I do both in prod and in the lib. In the lib I even ported Anthropic's streaming API to be OpenAI compatible. Will write the docs over the coming days if interested.

phh · on May 18, 2024

I've indeed refused to work with some providers that offer only a chat interface and not a completion interface, because it made the communication "less natural" to the model (like adding new system messages in between for function calling on models that don't officially support it, or adding categories other than system/user/assistant).
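For anyone wondering what a raw completion interface allows here, a minimal sketch: with plain text in and text out, you control the whole prompt, so you can introduce any role label you like. The `### role` template, the `function_result` category, and `render_prompt` are all invented for illustration; real models expect their own specific prompt formats.

```python
def render_prompt(turns):
    """Flatten (role, text) pairs into one completion-style prompt string."""
    blocks = [f"### {role}\n{text}" for role, text in turns]
    # End with an open assistant header so the model continues from there.
    blocks.append("### assistant\n")
    return "\n\n".join(blocks)


turns = [
    ("system", "You can call get_weather(city)."),
    ("user", "Weather in Vilnius?"),
    ("assistant", 'CALL get_weather("Vilnius")'),
    # A category outside system/user/assistant, which a strict chat API may reject.
    ("function_result", "12C, cloudy"),
]

prompt = render_prompt(turns)
print(prompt)
```

A chat-only endpoint typically validates the role field, so the `function_result` turn would have to be smuggled in as a user or system message instead.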

metaskills · on May 18, 2024

Great points. Don't even get me started about how function calling in other LLMs costs me tokens. Something OpenAI provides OOTB. I'm also not a big fan of OpenAI's lock-in. Right now I'm on a huge Claude 3 Haiku kick. That said, OpenAI does seem to get the APIs right, and my hunch is the new Assistants API is going to potentially disrupt things again. Time will tell.

heggy · on May 18, 2024

I would love to be using Claude, but you can't get API access (beyond an initial trial period) in the EU without providing a European VAT number. They don't want personal users or people to even learn and experiment I guess.

bjterry · on May 18, 2024

You can use the Claude APIs via OpenRouter with a pre-paid account.

heggy · on May 19, 2024

Thanks, this did the job!

metaskills · on May 18, 2024

Interesting, would Amazon Bedrock be an alternative? That's how I use Claude.

Jimmc414 · on May 18, 2024

I'd guess it's more likely about the additional programming needed to meet GDPR compliance requirements.

benreesman · on May 18, 2024

Opus is really cool. I’ve found it to have a few persistent bugs in what I initially assumed was tokenization but now wonder might be something more fundamental. Modulo a few typographical-level errors, though, I personally think it’s the most useful of the API-mediated models for involved sessions.

And there are some serious people at Anthropic, they’ll get the typo thing if they haven’t already (been a busy week and change, they easily could have shipped a fix and I overlooked it).

msp26 · on May 18, 2024

> Don't even get me started about how function calling in other LLMs costs me tokens. Something OpenAI provides out of the box.

Not sure what you mean by this.

metaskills · on May 18, 2024

I have some assumptions/guesses on how billing works. Gonna do a post on this on my unremarkable.ai blog; please do sign up for posts there, no spam. I could be right or wrong, but I need to run some experiments and publish later.

BoorishBears · on May 18, 2024

I'm not sure you're talking about the same thing: OpenAI specifically has an "Assistants API" that manages long-term memory and tool usage for the consumer: https://platform.openai.com/assistants

I'd guesstimate 99% of people using LLMs are using instruct-based message interfaces that have a variation of system/user/assistant. The top models mostly only come as completion models, and even Anthropic has switched to a message-based API.

j45 · on May 18, 2024

I've used it, and in some cases it cuts days or weeks of development off the path to testing the market.

In some cases the lock-in is what it is for now, because a particular model really is that far ahead, or keeps staying ahead.

It doesn't mean other options won't become available, but it does matter to match your actions to your needs.

Getting something working consistently, for example, might be the first goal, and learning to implement it with multiple models might be secondary. In some cases, the chances of that go up the later other models are explored.

It should be possible to tell pretty quickly if something works in a particular model that's the leader, how others compare to it and how to track the rate of change between them.

oddthink · on May 18, 2024

I know at least one team at work is using the Assistants API, and I'm talking with another team that is leaning pretty heavily towards using it over building a custom RAG solution themselves, or even over other in-house frameworks.

stavros · on May 18, 2024

I use it almost exclusively (I've even developed a Python library for it, https://github.com/skorokithakis/ez-openai), because it does RAG and function calling out of the box. It's pretty convenient, even if OpenAI's APIs are generally a trash fire.