Assistants API Overview with AgentOps
This notebook has been adapted from this OpenAI Cookbook example. The new Assistants API is a stateful evolution of our Chat Completions API meant to simplify the creation of assistant-like experiences, and enable developer access to powerful tools like Code Interpreter and Retrieval.
Chat Completions API vs Assistants API
The primitives of the Chat Completions API areMessages
, on which you perform a Completion
with a Model
(gpt-3.5-turbo
, gpt-4
, etc). It is lightweight and powerful, but inherently stateless, which means you have to manage conversation state, tool definitions, retrieval documents, and code execution manually.
The primitives of the Assistants API are
Assistants
, which encapsulate a base model, instructions, tools, and (context) documents,Threads
, which represent the state of a conversation, andRuns
, which power the execution of anAssistant
on aThread
, including textual responses and multi-step tool use.
Setup
Note
The Assistants API is currently in beta so the latest Python SDK is needed (1.58.1
at time of writing) for this example.
Pretty Printing Helper
Complete Example with Assistants API
Assistants
The easiest way to get started with the Assistants API is through the Assistants Playground.


-
Create an environment variable in a .env file or other method. By default, the AgentOps
init()
function will look for an environment variable namedAGENTOPS_API_KEY
. Or… -
Replace
<your_agentops_key>
below and pass in the optionalapi_key
parameter to the AgentOpsinit(api_key=...)
function. Remember not to commit your API key to a public repo!
Threads
Create a new thread:Note Even though you’re no longer sending the entire history each time, you will still be charged for the tokens of the entire conversation history with each Run.
Runs
Notice how the Thread we created is not associated with the Assistant we created earlier! Threads exist independently from Assistants, which may be different from what you’d expect if you’ve used ChatGPT (where a thread is tied to a model/GPT). To get a completion from an Assistant for a given Thread, we must create a Run. Creating a Run will indicate to an Assistant it should look at the messages in the Thread and take action: either by adding a single response, or using tools.Note Runs are a key difference between the Assistants API and Chat Completions API. While in Chat Completions the model will only ever respond with a single message, in the Assistants API a Run may result in an Assistant using one or multiple tools, and potentially adding multiple messages to the Thread.To get our Assistant to respond to the user, let’s create the Run. As mentioned earlier, you must specify both the Assistant and the Thread.
status
that will initially be set to queued
. The status
will be updated as the Assistant performs operations (like using tools and adding messages).
To know when the Assistant has completed processing, we can poll the Run in a loop. (Support for streaming is coming soon!) While here we are only checking for a queued
or in_progress
status, in practice a Run may undergo a variety of status changes which you can choose to surface to the user. (These are called Steps, and will be covered later.)
Messages
Now that the Run has completed, we can list the Messages in the Thread to see what got added by the Assistant.page
(since results can be paginated). Do keep a look out for this, since this is the opposite order to messages in the Chat Completions API.
Let’s ask our Assistant to explain the result a bit further!
Example
Let’s take a look at how we could potentially put all of this together. Below is all the code you need to use an Assistant you’ve created. Since we’ve already created our Math Assistant, I’ve saved its ID inMATH_ASSISTANT_ID
. I then defined two functions:
submit_message
: create a Message on a Thread, then start (and return) a new Runget_response
: returns the list of Messages in a Thread
create_thread_and_run
function that I can re-use (which is actually almost identical to the client.beta.threads.create_and_run
compound function in our API ;) ). Finally, we can submit our mock user requests each to a new Thread.
Notice how all of these API calls are asynchronous operations; this means we actually get async behavior in our code without the use of async libraries! (e.g. asyncio
)
Tools
A key feature of the Assistants API is the ability to equip our Assistants with Tools, like Code Interpreter, Retrieval, and custom Functions. Let’s take a look at each.Code Interpreter
Let’s equip our Math Tutor with the Code Interpreter tool, which we can do from the Dashboard…
Steps
A Run is composed of one or more Steps. Like a Run, each Step has astatus
that you can query. This is useful for surfacing the progress of a Step to a user (e.g. a spinner while the Assistant is writing code or performing retrieval).
step_details
.
step_details
for two Steps:
tool_calls
(plural, since it could be more than one in a single Step)message_creation
tool_calls
, specifically using the code_interpreter
which contains:
input
, which was the Python code generated before the tool was called, andoutput
, which was the result of running the Code Interpreter.
message_creation
, which contains the message
that was added to the Thread to communicate the results to the user.
Retrieval
Another powerful tool in the Assistants API is Retrieval: the ability to upload files that the Assistant will use as a knowledge base when answering questions. This can also be enabled from the Dashboard or the API, where we can upload files we want to be used.
Note There are more intricacies in Retrieval, like Annotations, which may be covered in another cookbook.
Functions
As a final powerful tool for your Assistant, you can specify custom Functions (much like the Function Calling in the Chat Completions API). During a Run, the Assistant can then indicate it wants to call one or more functions you specified. You are then responsible for calling the Function, and providing the output back to the Assistant. Let’s take a look at an example by defining adisplay_quiz()
Function for our Math Tutor.
This function will take a title
and an array of question
s, display the quiz, and get input from the user for each:
title
questions
question_text
question_type
: [MULTIPLE_CHOICE
,FREE_RESPONSE
]choices
: [“choice 1”, “choice 2”, …]
get_mock_response...
. This is where you’d get the user’s actual input.

Note Pasting the function JSON into the Dashboard was a bit finicky due to indentation, etc. I just asked ChatGPT to format my function the same as one of the examples on the Dashboard :).
status
we see requires_action
! Let’s take a closer look.
required_action
field indicates a Tool is waiting for us to run it and submit its output back to the Assistant. Specifically, the display_quiz
function! Let’s start by parsing the name
and arguments
.
Note While in this case we know there is only one Tool call, in practice the Assistant may choose to call multiple tools.
display_quiz
function with the arguments provided by the Assistant:
tool_call
ID, found in the tool_call
we parsed out earlier. We’ll also need to encode our list
of responses into a str
.