OpenAI Agents Patterns Demonstration
View Notebook on GitHub
This notebook demonstrates various common agentic patterns using the Agents SDK and how one can observe them using the AgentOps platform.
Note: This notebook was edited using the Claude MCP NotebookEdit tool!
This notebook will walk you through several key agent patterns:

1. Agents as tools
2. Deterministic flows
3. Forcing tool use
4. Input guardrails
5. LLM as a judge
6. Output guardrails
7. Parallelization
8. Routing and handoffs
9. Streaming guardrails
Each pattern demonstrates how AgentOps automatically tracks and monitors your agent interactions, providing valuable insights into performance, costs, and behavior.
Before running this notebook, you'll need:

- An OpenAI API key
- An AgentOps API key
Make sure to set these as environment variables or create a .env file in your project root with:
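(The variable names below are the ones the OpenAI and AgentOps clients conventionally look for; adjust them if your code reads different names.)

```
OPENAI_API_KEY=<your-openai-api-key>
AGENTOPS_API_KEY=<your-agentops-api-key>
```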
The mental model for handoffs is that the new agent “takes over”. It sees the previous conversation history, and owns the conversation from that point onwards. However, this is not the only way to use agents. You can also use agents as a tool - the tool agent goes off and runs on its own, and then returns the result to the original agent.
For example, you could model the translation task above as tool calls instead: rather than handing over to the language-specific agent, you could call the agent as a tool, and then use the result in the next step. This enables things like translating multiple languages at once.
This pattern demonstrates using agents as callable tools within other agents. The orchestrator agent receives a user message and then picks which specialized agents to call as tools.
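As a rough sketch of this pattern (assuming the Agents SDK's Agent.as_tool helper and the Runner entry point; the agent names and prompts here are illustrative), the orchestrator might look like:

```python
import asyncio
from agents import Agent, Runner

# Specialized agents that will be exposed to the orchestrator as tools.
spanish_agent = Agent(
    name="spanish_agent",
    instructions="You translate the user's message to Spanish.",
)
french_agent = Agent(
    name="french_agent",
    instructions="You translate the user's message to French.",
)

# The orchestrator never hands off; it calls the translators as tools and
# keeps ownership of the conversation.
orchestrator = Agent(
    name="orchestrator",
    instructions=(
        "Use the translation tools to translate the user's message. "
        "Call multiple tools if multiple languages are requested."
    ),
    tools=[
        spanish_agent.as_tool(
            tool_name="translate_to_spanish",
            tool_description="Translate the message to Spanish",
        ),
        french_agent.as_tool(
            tool_name="translate_to_french",
            tool_description="Translate the message to French",
        ),
    ],
)

async def main():
    result = await Runner.run(orchestrator, "Say 'hello' in Spanish and French.")
    print(result.final_output)

asyncio.run(main())  # in a notebook cell, use `await main()` instead
```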
A common tactic is to break down a task into a series of smaller steps. Each task can be performed by an agent, and the output of one agent is used as input to the next. For example, if your task was to generate a story, you could break it down into the following steps:
Each of these steps can be performed by an agent. The output of one agent is used as input to the next.
This pattern demonstrates breaking down a complex task into a series of smaller, sequential steps. Each step is performed by an agent, and the output of one agent is used as input to the next.
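A minimal sketch of a deterministic two-step flow, assuming Runner.run returns a result whose final_output can be passed to the next agent (the agents and prompts are illustrative):

```python
import asyncio
from agents import Agent, Runner

outline_agent = Agent(
    name="outline_agent",
    instructions="Generate a very short outline for a story on the given topic.",
)
story_agent = Agent(
    name="story_agent",
    instructions="Write a short story that follows the given outline.",
)

async def main():
    # Step 1: generate the outline.
    outline = await Runner.run(outline_agent, "A robot learning to paint")
    # Step 2: the output of the first agent becomes the input to the next.
    story = await Runner.run(story_agent, outline.final_output)
    print(story.final_output)

asyncio.run(main())
```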
This pattern shows how to force an agent to use a tool using ModelSettings(tool_choice="required"). This is useful when you want to ensure the agent always uses a specific tool rather than generating a response directly.
You can run it with 3 options:

1. default: The default behavior, which is to send the tool output to the LLM. In this case, tool_choice is not set, because otherwise it would result in an infinite loop - the LLM would call the tool, the tool would run and send the results to the LLM, and that would repeat (because the model is forced to use a tool every time).
2. first_tool_result: The first tool result is used as the final output.
3. custom: A custom tool use behavior function is used. The custom function receives all the tool results, and chooses to use the first tool result to generate the final output.

For this demo, we'll allow the user to choose which tool use behavior to test:
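The sketch below illustrates the "custom" option, assuming ModelSettings(tool_choice="required") and the SDK's tool_use_behavior hook (the exact custom-behavior types such as ToolsToFinalOutputResult may differ between SDK versions, and the weather tool is illustrative):

```python
import asyncio
from agents import (
    Agent,
    FunctionToolResult,
    ModelSettings,
    RunContextWrapper,
    Runner,
    ToolsToFinalOutputResult,
    function_tool,
)

@function_tool
def get_weather(city: str) -> str:
    """Return a fake weather report for the given city."""
    return f"The weather in {city} is sunny."

async def custom_tool_use_behavior(
    context: RunContextWrapper, results: list[FunctionToolResult]
) -> ToolsToFinalOutputResult:
    # Use the first tool result directly as the final output,
    # instead of sending it back to the LLM.
    return ToolsToFinalOutputResult(is_final_output=True, final_output=results[0].output)

agent = Agent(
    name="weather_agent",
    instructions="Always use the get_weather tool to answer.",
    tools=[get_weather],
    # Force the model to call a tool on every turn.
    model_settings=ModelSettings(tool_choice="required"),
    # A non-default behavior avoids the infinite loop described above.
    tool_use_behavior=custom_tool_use_behavior,
)

async def main():
    result = await Runner.run(agent, "What's the weather in Tokyo?")
    print(result.final_output)

asyncio.run(main())
```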
Related to parallelization, you often want to run input guardrails to make sure the inputs to your agents are valid. For example, if you have a customer support agent, you might want to make sure that the user isn’t trying to ask for help with a math problem.
You can definitely do this without any special Agents SDK features by using parallelization, but we support a special guardrail primitive. Guardrails can have a "tripwire" - if the tripwire is triggered, the agent execution will immediately stop and a GuardrailTripwireTriggered exception will be raised.
This is really useful for latency: for example, you might have a very fast model that runs the guardrail and a slow model that runs the actual agent. You wouldn’t want to wait for the slow model to finish, so guardrails let you quickly reject invalid inputs.
This pattern demonstrates how to use input guardrails to validate user inputs before they reach the main agent. Guardrails can prevent inappropriate or off-topic requests from being processed.
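A minimal sketch of an input guardrail, assuming the SDK's input_guardrail decorator, GuardrailFunctionOutput, and an InputGuardrailTripwireTriggered exception (the keyword check is a stand-in for a fast guardrail model):

```python
import asyncio
from agents import (
    Agent,
    GuardrailFunctionOutput,
    InputGuardrailTripwireTriggered,
    Runner,
    input_guardrail,
)

@input_guardrail
async def math_homework_guardrail(ctx, agent, user_input) -> GuardrailFunctionOutput:
    # A cheap heuristic stands in for a fast guardrail model here.
    text = str(user_input).lower()
    is_math = "solve" in text and "x" in text
    return GuardrailFunctionOutput(
        output_info={"is_math_homework": is_math},
        tripwire_triggered=is_math,
    )

support_agent = Agent(
    name="customer_support",
    instructions="Help customers with questions about their account.",
    input_guardrails=[math_homework_guardrail],
)

async def main():
    try:
        result = await Runner.run(support_agent, "Can you solve 3x + 5 = 11 for x?")
        print(result.final_output)
    except InputGuardrailTripwireTriggered:
        print("Guardrail tripped: math homework is off-topic for this agent.")

asyncio.run(main())
```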
LLMs can often improve the quality of their output if given feedback. A common pattern is to generate a response using a model, and then use a second model to provide feedback. You can even use a small model for the initial generation and a larger model for the feedback, to optimize cost.
For example, you could use an LLM to generate an outline for a story, and then use a second LLM to evaluate the outline and provide feedback. You can then use the feedback to improve the outline, and repeat until the LLM is satisfied with the outline.
This pattern shows how to use one LLM to evaluate and improve the output of another. The first agent generates content, and the second agent judges the quality and provides feedback for improvement.
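An illustrative sketch of the generate-then-judge loop, assuming the same Agent/Runner API (the PASS convention and the three-iteration cap are arbitrary choices for this example):

```python
import asyncio
from agents import Agent, Runner

generator = Agent(
    name="outline_generator",
    instructions="Produce a short story outline based on the user's input.",
)
judge = Agent(
    name="outline_judge",
    instructions=(
        "Evaluate the outline. Reply with 'PASS' if it is good, otherwise give "
        "one sentence of concrete feedback."
    ),
)

async def main():
    prompt = "An outline for a story about a lighthouse keeper"
    outline = (await Runner.run(generator, prompt)).final_output
    for _ in range(3):  # bounded feedback loop
        verdict = (await Runner.run(judge, outline)).final_output
        if verdict.strip().upper().startswith("PASS"):
            break
        # Feed the judge's feedback back into the generator and try again.
        revision_prompt = (
            f"{prompt}\n\nPrevious outline:\n{outline}\n\nFeedback:\n{verdict}"
        )
        outline = (await Runner.run(generator, revision_prompt)).final_output
    print(outline)

asyncio.run(main())
```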
Related to parallelization, you often want to run output guardrails to make sure the outputs from your agents are valid. Guardrails can have a "tripwire" - if the tripwire is triggered, the agent execution will immediately stop and a GuardrailTripwireTriggered exception will be raised.
This is really useful for latency: for example, you might have a very fast model that runs the guardrail and a slow model that runs the actual agent. You wouldn’t want to wait for the slow model to finish, so guardrails let you quickly reject invalid outputs.
This pattern demonstrates how to use output guardrails to validate agent outputs after they are generated. This can help prevent sensitive information from being shared or ensure outputs meet quality standards.
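A minimal sketch of an output guardrail, assuming the SDK's output_guardrail decorator and an OutputGuardrailTripwireTriggered exception (the phone-number regex is an illustrative stand-in for a real policy check):

```python
import asyncio
import re
from agents import (
    Agent,
    GuardrailFunctionOutput,
    OutputGuardrailTripwireTriggered,
    Runner,
    output_guardrail,
)

@output_guardrail
async def no_phone_numbers(ctx, agent, output) -> GuardrailFunctionOutput:
    # Trip if the agent's final output appears to contain a phone number.
    leaked = bool(re.search(r"\b\d{3}[-.\s]?\d{3}[-.\s]?\d{4}\b", str(output)))
    return GuardrailFunctionOutput(output_info={"leaked": leaked}, tripwire_triggered=leaked)

assistant = Agent(
    name="assistant",
    instructions="Answer the user's question.",
    output_guardrails=[no_phone_numbers],
)

async def main():
    try:
        result = await Runner.run(assistant, "Give me an example US phone number.")
        print(result.final_output)
    except OutputGuardrailTripwireTriggered:
        print("Output guardrail tripped: response withheld.")

asyncio.run(main())
```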
Running multiple agents in parallel is a common pattern. This can be useful both for latency (e.g. if you have multiple steps that don't depend on each other) and for other reasons, e.g. generating multiple responses and picking the best one.
This example runs a translation agent multiple times in parallel, and then picks the best translation.
This pattern shows how to run multiple agents in parallel to improve latency or generate multiple options to choose from. In this example, we run translation agents multiple times and pick the best result.
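A rough sketch of the parallel pattern using asyncio.gather, assuming Runner.run calls can be awaited concurrently (the agent names and prompts are illustrative):

```python
import asyncio
from agents import Agent, Runner

translator = Agent(
    name="spanish_translator",
    instructions="Translate the user's message to Spanish.",
)
picker = Agent(
    name="translation_picker",
    instructions="Given several candidate translations, return the best one verbatim.",
)

async def main():
    message = "I love learning new things."
    # Run three independent translation attempts concurrently.
    results = await asyncio.gather(
        *(Runner.run(translator, message) for _ in range(3))
    )
    candidates = "\n\n".join(r.final_output for r in results)
    best = await Runner.run(
        picker, f"Original: {message}\n\nCandidates:\n{candidates}"
    )
    print(best.final_output)

asyncio.run(main())
```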
In many situations, you have specialized sub-agents that handle specific tasks. You can use handoffs to route the task to the right agent.
For example, you might have a frontline agent that receives a request, and then hands off to a specialized agent based on the language of the request.
This pattern demonstrates handoffs and routing between specialized agents. The triage agent receives the first message and hands off to the appropriate agent based on the language of the request.
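A minimal sketch of triage-based routing, assuming agents accept a handoffs list (the language-specific agents are illustrative):

```python
import asyncio
from agents import Agent, Runner

spanish_agent = Agent(
    name="spanish_agent",
    instructions="You only respond in Spanish.",
)
english_agent = Agent(
    name="english_agent",
    instructions="You only respond in English.",
)

# The triage agent owns the first turn and hands off based on language.
triage_agent = Agent(
    name="triage_agent",
    instructions="Hand off to the agent that matches the language of the request.",
    handoffs=[spanish_agent, english_agent],
)

async def main():
    result = await Runner.run(triage_agent, "Hola, ¿cómo estás?")
    print(result.final_output)  # answered by spanish_agent after the handoff

asyncio.run(main())
```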
This example shows how to use guardrails as the model is streaming. Output guardrails run after the final output has been generated; this example runs guardrails every N tokens, allowing for early termination if bad output is detected.
The expected output is that you’ll see a bunch of tokens stream in, then the guardrail will trigger and stop the streaming.
This pattern shows how to use guardrails during streaming to provide real-time validation. Unlike output guardrails that run after completion, streaming guardrails can interrupt the generation process early.
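An illustrative sketch, assuming Runner.run_streamed exposes a stream_events() iterator carrying raw text deltas; the character-count threshold stands in for "every N tokens", and the guardrail agent is a toy checker:

```python
import asyncio
from agents import Agent, Runner

guardrail_agent = Agent(
    name="guardrail_checker",
    instructions="Reply 'BAD' if the text reveals a password, otherwise reply 'OK'.",
)
main_agent = Agent(
    name="storyteller",
    instructions="Answer the user's request in detail.",
)

async def main():
    result = Runner.run_streamed(main_agent, "Write a long note about account security.")
    buffer, checked_len = "", 0
    async for event in result.stream_events():
        # Raw response events carry incremental text deltas.
        if event.type == "raw_response_event" and hasattr(event.data, "delta"):
            buffer += event.data.delta or ""
            print(event.data.delta, end="", flush=True)
            # Re-check roughly every 200 new characters.
            if len(buffer) - checked_len >= 200:
                checked_len = len(buffer)
                verdict = await Runner.run(guardrail_agent, buffer)
                if "BAD" in verdict.final_output.upper():
                    print("\n[guardrail triggered - stopping stream]")
                    break

asyncio.run(main())
```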
This notebook has demonstrated 9 key agent patterns that are commonly used in production AI applications. Each pattern showcases how agents can be orchestrated to perform complex tasks, validate inputs and outputs, and improve overall application performance.
AgentOps provides comprehensive observability for AI agents, automatically tracking all these interactions and providing valuable insights into:

- Performance and latency of each agent and LLM call
- Token usage and costs
- Agent behavior, including tool calls, handoffs, and guardrail events
Visit app.agentops.ai to explore your agent sessions and gain deeper insights into your AI application’s behavior.