Merge branch 'main' of github.com:openai/openai-agents-python into alex/inline-snapshot

Alex Hall 2025-03-12 11:01:00 +02:00
commit 5aba0b5b19
24 changed files with 135 additions and 49 deletions

.github/ISSUE_TEMPLATE/bug_report.md vendored Normal file

@ -0,0 +1,28 @@
---
name: Bug report
about: Report a bug
title: ''
labels: bug
assignees: ''
---
### Please read this first
- **Have you read the docs?**[Agents SDK docs](https://openai.github.io/openai-agents-python/)
- **Have you searched for related issues?** Others may have faced similar issues.
### Describe the bug
A clear and concise description of what the bug is.
### Debug information
- Agents SDK version: (e.g. `v0.0.3`)
- Python version (e.g. Python 3.10)
### Repro steps
Ideally provide a minimal python script that can be run to reproduce the bug.
### Expected behavior
A clear and concise description of what you expected to happen.

View file

@ -0,0 +1,16 @@
---
name: Feature request
about: Suggest an idea for this project
title: ''
labels: enhancement
assignees: ''
---
### Please read this first
- **Have you read the docs?**[Agents SDK docs](https://openai.github.io/openai-agents-python/)
- **Have you searched for related issues?** Others may have had similar requests.
### Describe the feature
What is the feature you're requesting? How would it work? Please provide examples and details if possible.

.github/ISSUE_TEMPLATE/question.md vendored Normal file

@ -0,0 +1,16 @@
---
name: Question
about: Questions about the SDK
title: ''
labels: question
assignees: ''
---
### Please read this first
- **Have you read the docs?**[Agents SDK docs](https://openai.github.io/openai-agents-python/)
- **Have you searched for related issues?** Others may have had similar requests.
### Question
Describe your question. Provide details if available.

.github/workflows/issues.yml vendored Normal file

@ -0,0 +1,23 @@
name: Close inactive issues
on:
  schedule:
    - cron: "30 1 * * *"
jobs:
  close-issues:
    runs-on: ubuntu-latest
    permissions:
      issues: write
      pull-requests: write
    steps:
      - uses: actions/stale@v9
        with:
          days-before-issue-stale: 7
          days-before-issue-close: 3
          stale-issue-label: "stale"
          stale-issue-message: "This issue is stale because it has been open for 7 days with no activity."
          close-issue-message: "This issue was closed because it has been inactive for 3 days since being marked as stale."
          days-before-pr-stale: -1
          days-before-pr-close: -1
          any-of-labels: 'question,needs-more-info'
          repo-token: ${{ secrets.GITHUB_TOKEN }}
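To make the timing concrete, here is a small sketch of the schedule these settings imply. It assumes the usual `actions/stale` behaviour: an issue with one of the listed labels and no activity is marked stale after `days-before-issue-stale` days and closed after a further `days-before-issue-close` days. The helper name `stale_timeline` is hypothetical and not part of the workflow; the action only runs once a day on the cron schedule, so the real timestamps land on the next run after these thresholds.

```python
from datetime import datetime, timedelta

DAYS_BEFORE_STALE = 7  # mirrors days-before-issue-stale
DAYS_BEFORE_CLOSE = 3  # mirrors days-before-issue-close


def stale_timeline(last_activity: datetime) -> tuple[datetime, datetime]:
    """Return (earliest stale time, earliest close time) for an inactive issue."""
    stale_at = last_activity + timedelta(days=DAYS_BEFORE_STALE)
    close_at = stale_at + timedelta(days=DAYS_BEFORE_CLOSE)
    return stale_at, close_at


stale_at, close_at = stale_timeline(datetime(2025, 3, 1))
print(stale_at.date(), close_at.date())  # 2025-03-08 2025-03-11
```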

View file

@ -116,9 +116,9 @@ When you call `Runner.run()`, we run a loop until we get a final output.
1. We call the LLM, using the model and settings on the agent, and the message history.
2. The LLM returns a response, which may include tool calls.
-3. If the response has a final output (see below for the more on this), we return it and end the loop.
+3. If the response has a final output (see below for more on this), we return it and end the loop.
4. If the response has a handoff, we set the agent to the new agent and go back to step 1.
-5. We process the tool calls (if any) and append the tool responses messsages. Then we go to step 1.
+5. We process the tool calls (if any) and append the tool responses messages. Then we go to step 1.
There is a `max_turns` parameter that you can use to limit the number of times the loop executes.
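The loop described in this hunk can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the SDK's implementation: `Response`, `run_loop`, the local `MaxTurnsExceeded` class, and the two toy agents are all hypothetical stand-ins, and each "agent" is just a function from message history to a response.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class Response:
    """Hypothetical stand-in for a model response."""
    final_output: Optional[str] = None
    handoff_to: Optional[str] = None
    tool_calls: list[str] = field(default_factory=list)


class MaxTurnsExceeded(Exception):
    """Local stand-in raised when the loop runs out of turns."""


def run_loop(
    agents: dict[str, Callable[[list[str]], Response]],
    start: str,
    user_input: str,
    max_turns: int = 10,
) -> str:
    current = start
    history = [user_input]
    for _ in range(max_turns):
        response = agents[current](history)    # steps 1-2: call the LLM
        if response.final_output is not None:  # step 3: final output ends the loop
            return response.final_output
        if response.handoff_to is not None:    # step 4: switch agent, loop again
            current = response.handoff_to
            continue
        for call in response.tool_calls:       # step 5: append tool results
            history.append(f"tool_result:{call}")
    raise MaxTurnsExceeded(f"no final output after {max_turns} turns")


def triage(history: list[str]) -> Response:
    return Response(handoff_to="math")


def math(history: list[str]) -> Response:
    if any(item.startswith("tool_result:") for item in history):
        return Response(final_output="4")
    return Response(tool_calls=["add(2, 2)"])


print(run_loop({"triage": triage, "math": math}, "triage", "what is 2 + 2?"))  # 4
```

With `max_turns=2` the same call would raise, because the handoff and the tool call each consume one turn before a final output is produced.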

View file

@ -10,14 +10,14 @@ from agents import set_default_openai_key
set_default_openai_key("sk-...")
```
-Alternatively, you can also configure an OpenAI client to be used. By default, the SDK creates an `AsyncOpenAI` instance, using the API key from the environment variable or the default key set above. You can chnage this by using the [set_default_openai_client()][agents.set_default_openai_client] function.
+Alternatively, you can also configure an OpenAI client to be used. By default, the SDK creates an `AsyncOpenAI` instance, using the API key from the environment variable or the default key set above. You can change this by using the [set_default_openai_client()][agents.set_default_openai_client] function.
```python
from openai import AsyncOpenAI
from agents import set_default_openai_client
custom_client = AsyncOpenAI(base_url="...", api_key="...")
-set_default_openai_client(client)
+set_default_openai_client(custom_client)
```
Finally, you can also customize the OpenAI API that is used. By default, we use the OpenAI Responses API. You can override this to use the Chat Completions API by using the [set_default_openai_api()][agents.set_default_openai_api] function.

View file

@ -21,7 +21,7 @@ Input guardrails run in 3 steps:
## Output guardrails
-Output guardrailas run in 3 steps:
+Output guardrails run in 3 steps:
1. First, the guardrail receives the same input passed to the agent.
2. Next, the guardrail function runs to produce a [`GuardrailFunctionOutput`][agents.guardrail.GuardrailFunctionOutput], which is then wrapped in an [`OutputGuardrailResult`][agents.guardrail.OutputGuardrailResult]
@ -33,7 +33,7 @@ Output guardrailas run in 3 steps:
## Tripwires
-If the input or output fails the guardrail, the Guardrail can signal this with a tripwire. As soon as we see a guardail that has triggered the tripwires, we immediately raise a `{Input,Output}GuardrailTripwireTriggered` exception and halt the Agent execution.
+If the input or output fails the guardrail, the Guardrail can signal this with a tripwire. As soon as we see a guardrail that has triggered the tripwires, we immediately raise a `{Input,Output}GuardrailTripwireTriggered` exception and halt the Agent execution.
## Implementing a guardrail
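The tripwire behaviour described above can be illustrated without the SDK. The following is a rough sketch only: `GuardrailOutput`, `TripwireTriggered`, `run_with_input_guardrails`, and the toy guardrail and agent are hypothetical names invented for this example, standing in for the real guardrail output type and the `{Input,Output}GuardrailTripwireTriggered` exceptions.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class GuardrailOutput:
    """Hypothetical stand-in for a guardrail function's output."""
    tripwire_triggered: bool
    info: str = ""


class TripwireTriggered(Exception):
    """Stand-in for the tripwire exceptions named in the docs."""


def run_with_input_guardrails(
    user_input: str,
    guardrails: list[Callable[[str], GuardrailOutput]],
    agent: Callable[[str], str],
) -> str:
    for guardrail in guardrails:
        result = guardrail(user_input)
        if result.tripwire_triggered:
            # Halt immediately: the agent is never called.
            raise TripwireTriggered(result.info)
    return agent(user_input)


def no_homework(text: str) -> GuardrailOutput:
    return GuardrailOutput("homework" in text.lower(), "homework request")


def echo_agent(text: str) -> str:
    return f"answer to: {text}"


print(run_with_input_guardrails("what is life", [no_homework], echo_agent))
try:
    run_with_input_guardrails("do my homework", [no_homework], echo_agent)
except TripwireTriggered as exc:
    print("blocked:", exc)
```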

View file

@ -1,12 +1,12 @@
# OpenAI Agents SDK
-The [OpenAI Agents SDK](https://github.com/openai/openai-agents-python) enables you to build agentic AI apps in a lightweight, easy to use package with very few abstractions. It's a production-ready upgrade of our previous experimentation for agents, [Swarm](https://github.com/openai/swarm/tree/main). The Agents SDK has a very small set of primitives:
+The [OpenAI Agents SDK](https://github.com/openai/openai-agents-python) enables you to build agentic AI apps in a lightweight, easy-to-use package with very few abstractions. It's a production-ready upgrade of our previous experimentation for agents, [Swarm](https://github.com/openai/swarm/tree/main). The Agents SDK has a very small set of primitives:
- **Agents**, which are LLMs equipped with instructions and tools
- **Handoffs**, which allow agents to delegate to other agents for specific tasks
- **Guardrails**, which enable the inputs to agents to be validated
-In combination with Python, these primitives are powerful enough to express complex relationships between tools and agents, and allow you to build real world applications without a steep learning curve. In addition, the SDK comes with built-in **tracing** that lets you visualize and debug your agentic flows, as well as evaluate them and even fine-tune models for your application.
+In combination with Python, these primitives are powerful enough to express complex relationships between tools and agents, and allow you to build real-world applications without a steep learning curve. In addition, the SDK comes with built-in **tracing** that lets you visualize and debug your agentic flows, as well as evaluate them and even fine-tune models for your application.
## Why use the Agents SDK

View file

@ -1,6 +1,6 @@
# Models
-The Agents SDK comes with out of the box support for OpenAI models in two flavors:
+The Agents SDK comes with out-of-the-box support for OpenAI models in two flavors:
- **Recommended**: the [`OpenAIResponsesModel`][agents.models.openai_responses.OpenAIResponsesModel], which calls OpenAI APIs using the new [Responses API](https://platform.openai.com/docs/api-reference/responses).
- The [`OpenAIChatCompletionsModel`][agents.models.openai_chatcompletions.OpenAIChatCompletionsModel], which calls OpenAI APIs using the [Chat Completions API](https://platform.openai.com/docs/api-reference/chat).
@ -15,7 +15,7 @@ Within a single workflow, you may want to use different models for each agent. F
!!!note
-While our SDK supports both the [`OpenAIResponsesModel`][agents.models.openai_responses.OpenAIResponsesModel] and the[`OpenAIChatCompletionsModel`][agents.models.openai_chatcompletions.OpenAIChatCompletionsModel] shapes, we recommend using a single model shape for each workflow because the two shapes support a different set of features and tools. If your workflow requires mixing and matching model shapes, make sure that all the features you're using are available on both.
+While our SDK supports both the [`OpenAIResponsesModel`][agents.models.openai_responses.OpenAIResponsesModel] and the [`OpenAIChatCompletionsModel`][agents.models.openai_chatcompletions.OpenAIChatCompletionsModel] shapes, we recommend using a single model shape for each workflow because the two shapes support a different set of features and tools. If your workflow requires mixing and matching model shapes, make sure that all the features you're using are available on both.
```python
from agents import Agent, Runner, AsyncOpenAI, OpenAIChatCompletionsModel
@ -48,7 +48,7 @@ async def main():
print(result.final_output)
```
-1. Sets the the name of an OpenAI model directly.
+1. Sets the name of an OpenAI model directly.
2. Provides a [`Model`][agents.models.interface.Model] implementation.
## Using other LLM providers

View file

@ -27,11 +27,11 @@ This pattern is great when the task is open-ended and you want to rely on the in
## Orchestrating via code
-While orchestrating via LLM is powerful, orchestrating via LLM makes tasks more deterministic and predictable, in terms of speed, cost and performance. Common patterns here are:
+While orchestrating via LLM is powerful, orchestrating via code makes tasks more deterministic and predictable, in terms of speed, cost and performance. Common patterns here are:
- Using [structured outputs](https://platform.openai.com/docs/guides/structured-outputs) to generate well formed data that you can inspect with your code. For example, you might ask an agent to classify the task into a few categories, and then pick the next agent based on the category.
- Chaining multiple agents by transforming the output of one into the input of the next. You can decompose a task like writing a blog post into a series of steps - do research, write an outline, write the blog post, critique it, and then improve it.
- Running the agent that performs the task in a `while` loop with an agent that evaluates and provides feedback, until the evaluator says the output passes certain criteria.
- Running multiple agents in parallel, e.g. via Python primitives like `asyncio.gather`. This is useful for speed when you have multiple tasks that don't depend on each other.
-We have a number of examples in [`examples/agent_patterns`](https://github.com/openai/openai-agents-python/examples/agent_patterns).
+We have a number of examples in [`examples/agent_patterns`](https://github.com/openai/openai-agents-python/tree/main/examples/agent_patterns).
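Two of the patterns listed in this hunk, running independent work in parallel with `asyncio.gather` and chaining one output into the next input, can be sketched with the standard library alone. `fake_agent` is a hypothetical placeholder that sleeps instead of calling an LLM; in real code each call would be an agent run.

```python
import asyncio


async def fake_agent(name: str, prompt: str) -> str:
    """Hypothetical stand-in for an agent run; sleeps instead of calling an LLM."""
    await asyncio.sleep(0.01)
    return f"{name}({prompt})"


async def main() -> str:
    # Independent tasks run concurrently; results come back in argument order.
    research, outline = await asyncio.gather(
        fake_agent("research", "topic"),
        fake_agent("outline", "topic"),
    )
    # Chaining: the outputs above become the input of the next step.
    return await fake_agent("write", f"{research} + {outline}")


result = asyncio.run(main())
print(result)  # write(research(topic) + outline(topic))
```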

View file

@ -166,6 +166,9 @@ triage_agent = Agent(
)
 async def main():
     result = await Runner.run(triage_agent, "who was the first president of the united states?")
     print(result.final_output)
+    result = await Runner.run(triage_agent, "what is life")
+    print(result.final_output)

View file

@ -32,7 +32,7 @@ The [`new_items`][agents.result.RunResultBase.new_items] property contains the n
- [`MessageOutputItem`][agents.items.MessageOutputItem] indicates a message from the LLM. The raw item is the message generated.
- [`HandoffCallItem`][agents.items.HandoffCallItem] indicates that the LLM called the handoff tool. The raw item is the tool call item from the LLM.
-- [`HandoffOutputItem`][agents.items.HandoffOutputItem] indicates that a handoff occured. The raw item is the tool response to the handoff tool call. You can also access the source/target agents from the item.
+- [`HandoffOutputItem`][agents.items.HandoffOutputItem] indicates that a handoff occurred. The raw item is the tool response to the handoff tool call. You can also access the source/target agents from the item.
- [`ToolCallItem`][agents.items.ToolCallItem] indicates that the LLM invoked a tool.
- [`ToolCallOutputItem`][agents.items.ToolCallOutputItem] indicates that a tool was called. The raw item is the tool response. You can also access the tool output from the item.
- [`ReasoningItem`][agents.items.ReasoningItem] indicates a reasoning item from the LLM. The raw item is the reasoning generated.

View file

@ -16,7 +16,7 @@ The Agents SDK includes built-in tracing, collecting a comprehensive record of e
- `trace_id`: A unique ID for the trace. Automatically generated if you don't pass one. Must have the format `trace_<32_alphanumeric>`.
- `group_id`: Optional group ID, to link multiple traces from the same conversation. For example, you might use a chat thread ID.
- `disabled`: If True, the trace will not be recorded.
-- `metadata`: Optiona metadata for the trace.
+- `metadata`: Optional metadata for the trace.
- **Spans** represent operations that have a start and end time. Spans have:
- `started_at` and `ended_at` timestamps.
- `trace_id`, to represent the trace they belong to
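The `trace_<32_alphanumeric>` format mentioned for `trace_id` is easy to generate and check. The helpers below are a hedged sketch with hypothetical names (`make_trace_id`, `is_valid_trace_id`); they only assume that 32 hexadecimal characters from a random UUID satisfy the "32 alphanumeric" requirement, and they are not the SDK's own functions.

```python
import re
import uuid

# "trace_" followed by exactly 32 letters or digits.
TRACE_ID_PATTERN = re.compile(r"trace_[A-Za-z0-9]{32}")


def make_trace_id() -> str:
    """Build an ID in the `trace_<32_alphanumeric>` shape from a random UUID."""
    return f"trace_{uuid.uuid4().hex}"


def is_valid_trace_id(value: str) -> bool:
    """Check that the whole string matches the expected shape."""
    return TRACE_ID_PATTERN.fullmatch(value) is not None


print(is_valid_trace_id(make_trace_id()))  # True
print(is_valid_trace_id("trace_123"))      # False
```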

View file

@ -21,5 +21,5 @@ If you're building your own research bot, some ideas to add to this are:
1. Retrieval: Add support for fetching relevant information from a vector store. You could use the File Search tool for this.
2. Image and file upload: Allow users to attach PDFs or other files, as baseline context for the research.
-3. More planning and thinking: Models often produce better results given more time to think. Improve the planning process to come up with a better plan, and add an evaluation step so that the model can choose to improve it's results, search for more stuff, etc.
+3. More planning and thinking: Models often produce better results given more time to think. Improve the planning process to come up with a better plan, and add an evaluation step so that the model can choose to improve its results, search for more stuff, etc.
4. Code execution: Allow running code, which is useful for data analysis.

View file

@ -1,6 +1,5 @@
import asyncio
import base64
-import logging
from typing import Literal, Union
from playwright.async_api import Browser, Page, Playwright, async_playwright
@ -16,8 +15,10 @@ from agents import (
trace,
)
-logging.getLogger("openai.agents").setLevel(logging.DEBUG)
-logging.getLogger("openai.agents").addHandler(logging.StreamHandler())
+# Uncomment to see very verbose logs
+# import logging
+# logging.getLogger("openai.agents").setLevel(logging.DEBUG)
+# logging.getLogger("openai.agents").addHandler(logging.StreamHandler())
async def main():

View file

@ -1,6 +1,6 @@
[project]
name = "openai-agents"
-version = "0.0.2"
+version = "0.0.3"
description = "OpenAI Agents SDK"
readme = "README.md"
requires-python = ">=3.9"
@ -9,7 +9,7 @@ authors = [
{ name = "OpenAI", email = "support@openai.com" },
]
dependencies = [
-"openai>=1.66.0",
+"openai>=1.66.2",
"pydantic>=2.10, <3",
"griffe>=1.5.6, <2",
"typing-extensions>=4.12.2, <5",

View file

@ -23,7 +23,7 @@ from openai.types.responses.response_computer_tool_call import (
ActionWait,
)
from openai.types.responses.response_input_param import ComputerCallOutput
-from openai.types.responses.response_output_item import Reasoning
+from openai.types.responses.response_reasoning_item import ResponseReasoningItem
from . import _utils
from .agent import Agent
@ -167,7 +167,7 @@ class RunImpl:
         agent: Agent[TContext],
         # The original input to the Runner
         original_input: str | list[TResponseInputItem],
-        # Eveything generated by Runner since the original input, but before the current step
+        # Everything generated by Runner since the original input, but before the current step
         pre_step_items: list[RunItem],
         new_response: ModelResponse,
         processed_response: ProcessedResponse,
@ -288,7 +288,7 @@ class RunImpl:
                 items.append(ToolCallItem(raw_item=output, agent=agent))
             elif isinstance(output, ResponseFunctionWebSearch):
                 items.append(ToolCallItem(raw_item=output, agent=agent))
-            elif isinstance(output, Reasoning):
+            elif isinstance(output, ResponseReasoningItem):
                 items.append(ReasoningItem(raw_item=output, agent=agent))
             elif isinstance(output, ResponseComputerToolCall):
                 items.append(ToolCallItem(raw_item=output, agent=agent))

View file

@ -19,7 +19,7 @@ from openai.types.responses import (
ResponseStreamEvent,
)
from openai.types.responses.response_input_item_param import ComputerCallOutput, FunctionCallOutput
-from openai.types.responses.response_output_item import Reasoning
+from openai.types.responses.response_reasoning_item import ResponseReasoningItem
from pydantic import BaseModel
from typing_extensions import TypeAlias
@ -136,10 +136,10 @@ class ToolCallOutputItem(RunItemBase[Union[FunctionCallOutput, ComputerCallOutpu
 @dataclass
-class ReasoningItem(RunItemBase[Reasoning]):
+class ReasoningItem(RunItemBase[ResponseReasoningItem]):
     """Represents a reasoning item."""
-    raw_item: Reasoning
+    raw_item: ResponseReasoningItem
     """The raw reasoning item."""
     type: Literal["reasoning_item"] = "reasoning_item"

View file

@ -361,7 +361,7 @@ class Converter:
             includes = "file_search_call.results" if tool.include_search_results else None
         elif isinstance(tool, ComputerTool):
             converted_tool = {
-                "type": "computer-preview",
+                "type": "computer_use_preview",
                 "environment": tool.computer.environment,
                 "display_width": tool.computer.dimensions[0],
                 "display_height": tool.computer.dimensions[1],

View file

@ -78,9 +78,6 @@ class BackendSpanExporter(TracingExporter):
             logger.warning("OPENAI_API_KEY is not set, skipping trace export")
             return
-        traces: list[dict[str, Any]] = []
-        spans: list[dict[str, Any]] = []
         data = [item.export() for item in items if item.export()]
         payload = {"data": data}
@ -100,7 +97,7 @@ class BackendSpanExporter(TracingExporter):
                 # If the response is successful, break out of the loop
                 if response.status_code < 300:
-                    logger.debug(f"Exported {len(traces)} traces, {len(spans)} spans")
+                    logger.debug(f"Exported {len(items)} items")
                     return
                 # If the response is a client error (4xx), we wont retry
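The comments in this hunk describe a status-code policy: stop on success, do not retry on a 4xx client error. A minimal sketch of that policy follows. The function name `export_with_retry`, the `send` callable, the retry limit, and the doubling backoff for other failures are assumptions made for this illustration and are not taken from the exporter's code.

```python
import time
from typing import Callable


def export_with_retry(
    send: Callable[[], int],
    max_retries: int = 3,
    base_delay: float = 1.0,
) -> bool:
    """Call `send` (returns an HTTP status code) until it succeeds or we give up."""
    delay = base_delay
    for attempt in range(1, max_retries + 1):
        status = send()
        if status < 300:
            return True   # success: stop
        if 400 <= status < 500:
            return False  # client error: retrying will not help
        if attempt < max_retries:
            time.sleep(delay)  # other failure: back off, then retry
            delay *= 2
    return False


statuses = iter([503, 500, 200])
print(export_with_retry(lambda: next(statuses), base_delay=0))  # True
print(export_with_retry(lambda: 401, base_delay=0))             # False
```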

View file

@ -13,12 +13,12 @@ from openai.types.responses.response_function_tool_call import ResponseFunctionT
from openai.types.responses.response_function_tool_call_param import ResponseFunctionToolCallParam
from openai.types.responses.response_function_web_search import ResponseFunctionWebSearch
from openai.types.responses.response_function_web_search_param import ResponseFunctionWebSearchParam
-from openai.types.responses.response_input_item_param import Reasoning as ReasoningInputParam
-from openai.types.responses.response_output_item import Reasoning, ReasoningContent
from openai.types.responses.response_output_message import ResponseOutputMessage
from openai.types.responses.response_output_message_param import ResponseOutputMessageParam
from openai.types.responses.response_output_refusal import ResponseOutputRefusal
from openai.types.responses.response_output_text import ResponseOutputText
+from openai.types.responses.response_reasoning_item import ResponseReasoningItem, Summary
+from openai.types.responses.response_reasoning_item_param import ResponseReasoningItemParam
from agents import (
Agent,
@ -129,7 +129,7 @@ def test_text_message_outputs_across_list_of_runitems() -> None:
     item1: RunItem = MessageOutputItem(agent=Agent(name="test"), raw_item=message1)
     item2: RunItem = MessageOutputItem(agent=Agent(name="test"), raw_item=message2)
     # Create a non-message run item of a different type, e.g., a reasoning trace.
-    reasoning = Reasoning(id="rid", content=[], type="reasoning")
+    reasoning = ResponseReasoningItem(id="rid", summary=[], type="reasoning")
     non_message_item: RunItem = ReasoningItem(agent=Agent(name="test"), raw_item=reasoning)
     # Confirm only the message outputs are concatenated.
     assert ItemHelpers.text_message_outputs([item1, non_message_item, item2]) == "foobar"
@ -266,16 +266,18 @@ def test_to_input_items_for_computer_call_click() -> None:
 def test_to_input_items_for_reasoning() -> None:
     """A reasoning output should produce the same dict as a reasoning input item."""
-    rc = ReasoningContent(text="why", type="reasoning_summary")
-    reasoning = Reasoning(id="rid1", content=[rc], type="reasoning")
+    rc = Summary(text="why", type="summary_text")
+    reasoning = ResponseReasoningItem(id="rid1", summary=[rc], type="reasoning")
     resp = ModelResponse(output=[reasoning], usage=Usage(), referenceable_id=None)
     input_items = resp.to_input_items()
     assert isinstance(input_items, list) and len(input_items) == 1
     converted_dict = input_items[0]
-    expected: ReasoningInputParam = {
+    expected: ResponseReasoningItemParam = {
         "id": "rid1",
-        "content": [{"text": "why", "type": "reasoning_summary"}],
+        "summary": [{"text": "why", "type": "summary_text"}],
         "type": "reasoning",
     }
+    print(converted_dict)
+    print(expected)
     assert converted_dict == expected

View file

@ -163,7 +163,7 @@ def test_convert_tools_basic_types_and_includes():
     assert "function" in types
     assert "file_search" in types
     assert "web_search_preview" in types
-    assert "computer-preview" in types
+    assert "computer_use_preview" in types
     # Verify file search tool contains max_num_results and vector_store_ids
     file_params = next(ct for ct in converted.tools if ct["type"] == "file_search")
     assert file_params.get("max_num_results") == file_tool.max_num_results
@ -173,7 +173,7 @@ def test_convert_tools_basic_types_and_includes():
     assert web_params.get("user_location") == web_tool.user_location
     assert web_params.get("search_context_size") == web_tool.search_context_size
     # Verify computer tool contains environment and computed dimensions
-    comp_params = next(ct for ct in converted.tools if ct["type"] == "computer-preview")
+    comp_params = next(ct for ct in converted.tools if ct["type"] == "computer_use_preview")
     assert comp_params.get("environment") == "mac"
     assert comp_params.get("display_width") == 800
     assert comp_params.get("display_height") == 600

View file

@ -7,7 +7,7 @@ from openai.types.responses import (
ResponseFunctionWebSearch,
)
from openai.types.responses.response_computer_tool_call import ActionClick
-from openai.types.responses.response_output_item import Reasoning, ReasoningContent
+from openai.types.responses.response_reasoning_item import ResponseReasoningItem, Summary
from pydantic import BaseModel
from agents import (
@ -287,8 +287,8 @@ def test_function_web_search_tool_call_parsed_correctly():
 def test_reasoning_item_parsed_correctly():
     # Verify that a Reasoning output item is converted into a ReasoningItem.
-    reasoning = Reasoning(
-        id="r1", type="reasoning", content=[ReasoningContent(text="why", type="reasoning_summary")]
+    reasoning = ResponseReasoningItem(
+        id="r1", type="reasoning", summary=[Summary(text="why", type="summary_text")]
     )
     response = ModelResponse(
         output=[reasoning],

uv.lock

@ -797,7 +797,7 @@ wheels = [
[[package]]
name = "openai"
-version = "1.66.0"
+version = "1.66.2"
source = { registry = "https://pypi.org/simple" }
dependencies = [
{ name = "anyio" },
@ -809,14 +809,14 @@ dependencies = [
{ name = "tqdm" },
{ name = "typing-extensions" },
]
-sdist = { url = "https://files.pythonhosted.org/packages/84/c5/3c422ca3ccc81c063955e7c20739d7f8f37fea0af865c4a60c81e6225e14/openai-1.66.0.tar.gz", hash = "sha256:8a9e672bc6eadec60a962f0b40d7d1c09050010179c919ed65322e433e2d1025", size = 396819 }
+sdist = { url = "https://files.pythonhosted.org/packages/d8/e1/b3e1fda1aa32d4f40d4de744e91de4de65c854c3e53c63342e4b5f9c5995/openai-1.66.2.tar.gz", hash = "sha256:9b3a843c25f81ee09b6469d483d9fba779d5c6ea41861180772f043481b0598d", size = 397041 }
wheels = [
-{ url = "https://files.pythonhosted.org/packages/d7/f1/d52960dac9519c9de64593460826a0fe2e19159389ec97ecf3e931d2e6a3/openai-1.66.0-py3-none-any.whl", hash = "sha256:43e4a3c0c066cc5809be4e6aac456a3ebc4ec1848226ef9d1340859ac130d45a", size = 566389 },
+{ url = "https://files.pythonhosted.org/packages/2c/6f/3315b3583ffe3e31c55b446cb22d2a7c235e65ca191674fffae62deb3c11/openai-1.66.2-py3-none-any.whl", hash = "sha256:75194057ee6bb8b732526387b6041327a05656d976fc21c064e21c8ac6b07999", size = 567268 },
]
[[package]]
name = "openai-agents"
-version = "0.0.2"
+version = "0.0.3"
source = { editable = "." }
dependencies = [
{ name = "griffe" },
@ -846,7 +846,7 @@ dev = [
[package.metadata]
requires-dist = [
{ name = "griffe", specifier = ">=1.5.6,<2" },
-{ name = "openai", specifier = ">=1.66.0" },
+{ name = "openai", specifier = ">=1.66.2" },
{ name = "pydantic", specifier = ">=2.10,<3" },
{ name = "requests", specifier = ">=2.0,<3" },
{ name = "types-requests", specifier = ">=2.0,<3" },