Function Calling - OpenAI Responses API vs OpenAI Assistant API
Authors: J-AI
When OpenAI released the Responses API, it said the API would help developers create more capable agents. This post explains how the Responses API does that by comparing its approach to function calling and tool use with the Assistants API's.
The Responses API and the Assistants API both enable AI models to use external tools and functions, but they differ in integration style. The Responses API gives developers fine-grained control. The Assistants API, by contrast, provides a structured framework that handles much of the orchestration for you. Building an agent with the Responses API is like working with raw ingredients, whereas using the Assistants API is more like using a premade framework.
Let's dig deeper and see how the Responses API provides that fine-grained control in comparison to the Assistants API.
Support for Function Calling and Tool Use
OpenAI Responses API: The Responses API supports tool use both via built-in tools and custom functions. Built-in tools (such as web search or file search) can be invoked simply by including them in the request, and the model will autonomously use these tools to gather information and produce an answer. For example, adding the web search tool in a single API call allows the model to fetch live data and return a final answer with citations, without the developer manually handling search results. The Responses API also supports custom function calling: the developer provides function definitions (name, parameters, etc.) as part of the request. The model will decide if/when to call those functions based on the user query. When the model opts to call a custom function, it outputs a function call payload (with the function name and arguments) that the client application must catch and execute. In summary, the Responses API enables both autonomous use of hosted tools and a mechanism for the model to request custom tool usage via function calls.
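As a minimal sketch of this flow with the openai Python SDK: the get_weather function and its schema below are hypothetical, purely for illustration, and the flat function-tool format reflects the Responses API as I understand it.

```python
from openai import OpenAI

client = OpenAI()

# Hypothetical custom function exposed to the model (illustrative schema).
tools = [{
    "type": "function",
    "name": "get_weather",
    "description": "Get the current weather for a city.",
    "parameters": {
        "type": "object",
        "properties": {"city": {"type": "string"}},
        "required": ["city"],
    },
}]

response = client.responses.create(
    model="gpt-4o",
    input="What's the weather in Paris right now?",
    tools=tools,
)

# The model may answer directly, or emit a function_call item that the
# application is expected to execute itself.
for item in response.output:
    if item.type == "function_call":
        print(item.name, item.arguments)  # e.g. get_weather {"city": "Paris"}
```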
OpenAI Assistants API: The Assistants API is likewise designed to let the AI use tools and call functions, but it packages this capability within an assistant construct. You first create an Assistant with a specific model, instructions, and a set of tools (functions) it can use. These tools can include custom functions. Just like the Responses API, the model behind an Assistant can decide to invoke a function when needed to fulfill the user’s request. The key difference is that in the Assistants API this behavior is managed through a structured workflow: when a function needs to be called, the assistant’s run enters a requires_action
state indicating a tool invocation is required. The developer then performs the actual function call outside the model and feeds the result back for the assistant to continue.
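For comparison, here is a rough sketch of the same hypothetical get_weather tool registered on an Assistant. The Assistants API lives under the beta namespace in the Python SDK and uses a nested function-tool format; create_and_poll is the SDK's polling convenience.

```python
from openai import OpenAI

client = OpenAI()

# Register the tool once, on the assistant itself (nested "function" format).
assistant = client.beta.assistants.create(
    model="gpt-4o",
    instructions="You are a helpful weather assistant.",
    tools=[{
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get the current weather for a city.",
            "parameters": {
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"],
            },
        },
    }],
)

thread = client.beta.threads.create()
client.beta.threads.messages.create(
    thread_id=thread.id, role="user", content="What's the weather in Paris right now?"
)

# create_and_poll blocks until the run finishes or needs input from us.
run = client.beta.threads.runs.create_and_poll(
    thread_id=thread.id, assistant_id=assistant.id
)
if run.status == "requires_action":
    # The assistant is asking the application to execute a tool.
    call = run.required_action.submit_tool_outputs.tool_calls[0]
    print(call.function.name, call.function.arguments)
```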
Overall, both APIs allow tool integration, but the Assistants API does it via a more managed, stateful conversation session built around the run abstraction, whereas the Responses API tends to handle tool use in a stateless way.
Tool Definition: Per Request vs. Registered Tools
Responses API Tool Definitions: In the Responses API, tool and function definitions are embedded in each request. The developer provides a tools
list with each API call that specifies what tools/functions the model has at its disposal. For built-in tools this might just be an identifier (e.g. {"type": "web_search_preview"}
), and for custom functions it includes the full schema (name, description, parameters) for that function. This means every time you call client.responses.create(...)
, you include the functions you want the model to be able to use. It’s a bit like sending the “toolbox” along with each prompt. The upside is simplicity for single-turn interactions – you just declare the function schema on the fly. The downside is potential repetition - you must ensure the function definitions are included every time.
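As a sketch, a single Responses API call might carry both a built-in tool and a custom function in its tools list; the convert_currency schema is made up for illustration, and a later call would have to repeat the same list.

```python
# Each request carries its own "toolbox": a hosted tool plus a custom function.
response = client.responses.create(
    model="gpt-4o",
    input="Find today's EUR/USD rate and convert 250 EUR to USD.",
    tools=[
        {"type": "web_search_preview"},  # built-in tool, used autonomously
        {
            "type": "function",
            "name": "convert_currency",
            "description": "Convert an amount between two currencies.",
            "parameters": {
                "type": "object",
                "properties": {
                    "amount": {"type": "number"},
                    "from_currency": {"type": "string"},
                    "to_currency": {"type": "string"},
                },
                "required": ["amount", "from_currency", "to_currency"],
            },
        },
    ],
)
# Any follow-up call needs the same tools list sent again.
```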
Assistants API Tool Definitions: The Assistants API handles tool definitions through an upfront registration step when setting up the assistant. When you create a new assistant, you pass in the list of tools (functions) once, and they get registered with that assistant profile. The assistant retains knowledge of these tools for all future interactions in its threads. In practice, this means you define the function schema one time with client.beta.assistants.create(...), and the assistant will remember those capabilities. You do not have to resend the function specification on each user query; the assistant's context already includes the tool definitions. This approach can reduce flexibility in an agentic system, since the "toolbox" is fixed to the assistant. In summary, Responses API = embed tools per call, whereas Assistants API = register tools once for reuse. The Assistants approach is more like configuring an agent upfront with its available tools, which can make the system more organized when building complex applications, but OpenAI likely intends this extra setup step to be managed by the API consumers.
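Continuing the earlier Assistants sketch, a follow-up turn only adds a message and starts a new run; no tool schemas are re-sent because they were registered on the assistant.

```python
# Tools were registered at assistant creation time, so later turns just add
# messages to the thread and start runs against the same assistant.
client.beta.threads.messages.create(
    thread_id=thread.id, role="user", content="And what about Berlin?"
)
run = client.beta.threads.runs.create_and_poll(
    thread_id=thread.id, assistant_id=assistant.id
)
```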
Output Parsing and Execution Flow
Responses API (Manual Parsing): With the Responses API (akin to the classic Chat Completions flow), handling a custom function call involves manual parsing and multiple calls. If the model decides to call a function, the client.responses.create
call will return a special assistant message indicating the function name and a JSON blob of arguments. The developer must intercept this and parse it – for example, by extracting response.output[0].arguments
and running json.loads
on that string to get a usable arguments object. The developer’s code then executes the actual function and prepares the result. To complete the cycle, the developer must send a follow-up request: they append a message with the function’s result (often using a designated type like "function_call_output"
with the function call ID and output data) and call the API again to get the model's final answer incorporating that result. This means the developer has to interpret the model's function call output and feed the data back in the correct format. The important distinction here is that the follow-up request is independent of the initial request, whereas in the Assistants API the two are tied together through the run object, as explained below.
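Continuing the earlier Responses sketch (get_weather is still a hypothetical local function), the parse-execute-follow-up cycle might look like this:

```python
import json

# 1. Parse the function call the model emitted.
call = response.output[0]          # assumes the first output item is a function_call
args = json.loads(call.arguments)  # arguments arrive as a JSON string

# 2. Execute the function in our own code.
result = get_weather(**args)       # hypothetical local implementation

# 3. Send a follow-up request carrying the conversation plus the tool result.
follow_up = client.responses.create(
    model="gpt-4o",
    input=[
        {"role": "user", "content": "What's the weather in Paris right now?"},
        call,  # echo the model's function_call item back into the history
        {
            "type": "function_call_output",
            "call_id": call.call_id,
            "output": json.dumps(result),
        },
    ],
    tools=tools,
)
print(follow_up.output_text)
```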
Assistants API (Structured Handling): The Assistants API streamlines the above process with a more structured interface. When the assistant needs to use a tool, the run
status changes to indicate a required action. The OpenAI library (or API response) provides a structured object representing the function call request: for instance, you can directly access the function.name
and function.arguments
in the run.required_action
data structure. This eliminates the need to scrape a raw JSON string out of the model's message; the arguments are readily available as data fields. The developer then executes the function and calls an API method (e.g. client.beta.threads.runs.submit_tool_outputs(...)) to submit the result back to the assistant. At this point, the assistant continues the conversation and incorporates the function output into its answer, without the developer having to manually craft new message payloads; the linkage is handled via the tool call ID. In short, the Assistants API handles a lot of the "bookkeeping": you don't manually append messages or parse JSON blobs by hand; you work with higher-level objects and API calls that abstract those details. This significantly reduces the complexity of output parsing and makes the code less error-prone when integrating tools. But it also makes things more opinionated, since everything is tied to the same run, which has to be managed. The Responses API, on the other hand, has no run abstraction, so that orchestration is left to the API consumer.
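Continuing the Assistants sketch, the structured hand-off might look like this; submit_tool_outputs_and_poll is the SDK's polling convenience, and get_weather remains a hypothetical local function.

```python
import json

# Execute each requested tool call and hand the results back to the same run.
tool_outputs = []
for call in run.required_action.submit_tool_outputs.tool_calls:
    args = json.loads(call.function.arguments)  # structured fields, no message scraping
    result = get_weather(**args)                # hypothetical local implementation
    tool_outputs.append({"tool_call_id": call.id, "output": json.dumps(result)})

run = client.beta.threads.runs.submit_tool_outputs_and_poll(
    thread_id=thread.id, run_id=run.id, tool_outputs=tool_outputs
)

# Once the run completes, the assistant's final answer is the newest thread message.
messages = client.beta.threads.messages.list(thread_id=thread.id)
print(messages.data[0].content[0].text.value)
```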
Chaining Multiple Tool Invocations
Responses API: In the basic Responses approach, each API call can result in a function call. Chaining multiple tools therefore requires an iterative loop managed by the developer. For example, if a task needs Tool A then Tool B then an answer, the developer would call the model -> get a function call for A -> execute A -> call model again with A’s result -> maybe get a function call for B -> execute B -> call model again, and so on, until a final answer is produced. The state (conversation history) must be maintained by the developer across these calls (by accumulating messages) so the model remembers context. This is certainly doable, but requires custom workflow code.
For built-in tools, it is a different story: OpenAI models can autonomously string together multiple hosted tool uses (web search, file search, etc.) within one request. In such cases, the Responses API handles the state and sequencing internally; the client makes a single call, and the API carries out all the actions autonomously, orchestrating them in the right order. However, for custom functions, the chain still involves pausing and handing control back to the developer. In essence, multi-step reasoning with custom tools under the Responses API means writing a mini-orchestrator around the API: each tool invocation is a round trip where you parse and feed results back before continuing, managing the state in your own code, unlike with the Assistants API.
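A minimal sketch of such a hand-rolled orchestration loop, reusing the hypothetical get_weather and convert_currency functions from the earlier sketches and assuming tools now lists both schemas:

```python
import json

# Dispatch table mapping tool names to local implementations (both hypothetical).
FUNCTIONS = {"get_weather": get_weather, "convert_currency": convert_currency}

# tools is assumed to include both function schemas from the earlier sketches.
input_items = [{"role": "user", "content": "Get the weather in Paris, then convert 20 EUR to USD."}]
while True:
    response = client.responses.create(model="gpt-4o", input=input_items, tools=tools)
    calls = [item for item in response.output if item.type == "function_call"]
    if not calls:
        print(response.output_text)  # no more tool requests: this is the final answer
        break
    for call in calls:
        result = FUNCTIONS[call.name](**json.loads(call.arguments))
        input_items.append(call)  # keep the model's call in the running history
        input_items.append({
            "type": "function_call_output",
            "call_id": call.call_id,
            "output": json.dumps(result),
        })
```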
Assistants API: The Assistants API was built with compound tool use in mind and wraps the multi-step process into a cohesive workflow. Once an assistant and thread are created, a single run can encompass multiple tool calls in sequence if the query requires it. The model may request a function; after you provide the output, the same run continues and the model can immediately decide whether another function is needed, and so forth, before finally producing the answer. All these steps happen within the context of one thread run, so the state and intermediate results are preserved by the API automatically between steps.
The Assistants platform autonomously takes the output from each tool call and uses it to prepare the next step, meaning the model's tool results are carried through without explicit developer-provided workflow logic. The developer's job is simply to execute each required function and return the result, and the assistant orchestrates the sequence of calls until completion. This makes handling tools easier and requires less code. The developer doesn't have to manually loop and accumulate a conversation; the Assistants API's run guides the model through all necessary calls and then returns a final response.
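Continuing the Assistants sketch, the same multi-step pattern collapses to answering requires_action until the run finishes; the thread holds the state, and FUNCTIONS is the hypothetical dispatch table from above.

```python
import json

run = client.beta.threads.runs.create_and_poll(thread_id=thread.id, assistant_id=assistant.id)
while run.status == "requires_action":
    tool_outputs = [
        {
            "tool_call_id": call.id,
            "output": json.dumps(FUNCTIONS[call.function.name](**json.loads(call.function.arguments))),
        }
        for call in run.required_action.submit_tool_outputs.tool_calls
    ]
    run = client.beta.threads.runs.submit_tool_outputs_and_poll(
        thread_id=thread.id, run_id=run.id, tool_outputs=tool_outputs
    )
# When the loop exits, the run is complete and the final message is on the thread.
```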
Implementation Complexity and Developer Ergonomics
From a developer’s perspective, these two APIs offer different trade-offs in building tool-using AI systems:
Responses API: Straightforward for single-turn interactions or simple scenarios. You can fire off a prompt with function definitions and get a result, which is great for simple tasks. However, as soon as you introduce multi-step tool use with function calling, complexity rises: the developer must implement the controller logic and handle each step.
Assistants API: Provides a higher-level, structured approach. The API manages conversation state, and the SDK gives convenient methods to fetch required tool calls and supply outputs. You spend less time parsing JSON and concatenating messages, and more time simply implementing the actual tool logic and letting the assistant handle the reasoning.
In conclusion, the OpenAI Responses API and the Assistants API both enable AI models to use external tools and functions, but they differ in integration style. The Responses API gives developers fine-grained control at the cost of more manual work: you directly orchestrate each function call and parse outputs yourself. The Assistants API, by contrast, provides a structured framework that handles much of that orchestration for you.