diff --git a/.github/ISSUE_TEMPLATE/model_provider.md b/.github/ISSUE_TEMPLATE/model_provider.md
new file mode 100644
index 0000000..b56cb24
--- /dev/null
+++ b/.github/ISSUE_TEMPLATE/model_provider.md
@@ -0,0 +1,26 @@
+---
+name: Custom model providers
+about: Questions or bugs about using non-OpenAI models
+title: ''
+labels: bug
+assignees: ''
+
+---
+
+### Please read this first
+
+- **Have you read the custom model provider docs, including the 'Common issues' section?** [Model provider docs](https://openai.github.io/openai-agents-python/models/#using-other-llm-providers)
+- **Have you searched for related issues?** Others may have faced similar issues.
+
+### Describe the question
+A clear and concise description of what the question or bug is.
+
+### Debug information
+- Agents SDK version: (e.g. `v0.0.3`)
+- Python version: (e.g. Python 3.10)
+
+### Repro steps
+Ideally provide a minimal Python script that can be run to reproduce the issue.
+
+### Expected behavior
+A clear and concise description of what you expected to happen.
diff --git a/.github/PULL_REQUEST_TEMPLATE/pull_request_template.md b/.github/PULL_REQUEST_TEMPLATE/pull_request_template.md
new file mode 100644
index 0000000..0fdeab1
--- /dev/null
+++ b/.github/PULL_REQUEST_TEMPLATE/pull_request_template.md
@@ -0,0 +1,18 @@
+### Summary
+
+<!-- Please give a short summary of the change and the problem this solves. -->
+
+### Test plan
+
+<!-- Please explain how this was tested -->
+
+### Issue number
+
+<!-- For example: "Closes #1234" -->
+
+### Checks
+
+- [ ] I've added new tests (if relevant)
+- [ ] I've added/updated the relevant documentation
+- [ ] I've run `make lint` and `make format`
+- [ ] I've made sure tests pass
diff --git a/README.md b/README.md
index 90fea50..210f6f4 100644
--- a/README.md
+++ b/README.md
@@ -47,9 +47,11 @@ print(result.final_output)
 
 (_If running this, ensure you set the `OPENAI_API_KEY` environment variable_)
 
+(_For Jupyter notebook users, see [hello_world_jupyter.py](examples/basic/hello_world_jupyter.py)_)
+
 ## Handoffs example
 
-```py
+```python
 from agents import Agent, Runner
 import asyncio
 
@@ -140,7 +142,7 @@ The Agents SDK is designed to be highly flexible, allowing you to model a wide r
 
 ## Tracing
 
-The Agents SDK automatically traces your agent runs, making it easy to track and debug the behavior of your agents. Tracing is extensible by design, supporting custom spans and a wide variety of external destinations, including [Logfire](https://logfire.pydantic.dev/docs/integrations/llms/openai/#openai-agents), [AgentOps](https://docs.agentops.ai/v1/integrations/agentssdk), and [Braintrust](https://braintrust.dev/docs/guides/traces/integrations#openai-agents-sdk). For more details about how to customize or disable tracing, see [Tracing](http://openai.github.io/openai-agents-python/tracing).
+The Agents SDK automatically traces your agent runs, making it easy to track and debug the behavior of your agents. Tracing is extensible by design, supporting custom spans and a wide variety of external destinations, including [Logfire](https://logfire.pydantic.dev/docs/integrations/llms/openai/#openai-agents), [AgentOps](https://docs.agentops.ai/v1/integrations/agentssdk), [Braintrust](https://braintrust.dev/docs/guides/traces/integrations#openai-agents-sdk), [Scorecard](https://docs.scorecard.io/docs/documentation/features/tracing#openai-agents-sdk-integration), and [Keywords AI](https://docs.keywordsai.co/integration/development-frameworks/openai-agent). For more details about how to customize or disable tracing, see [Tracing](http://openai.github.io/openai-agents-python/tracing).
 
 ## Development (only needed if you need to edit the SDK/examples)
 
diff --git a/docs/agents.md b/docs/agents.md
index 9b6264b..17589b3 100644
--- a/docs/agents.md
+++ b/docs/agents.md
@@ -13,6 +13,7 @@ The most common properties of an agent you'll configure are:
 ```python
 from agents import Agent, ModelSettings, function_tool
 
+@function_tool
 def get_weather(city: str) -> str:
     return f"The weather in {city} is sunny"
 
@@ -20,7 +21,7 @@ agent = Agent(
     name="Haiku agent",
     instructions="Always respond in haiku form",
     model="o3-mini",
-    tools=[function_tool(get_weather)],
+    tools=[get_weather],
 )
 ```
 
diff --git a/docs/context.md b/docs/context.md
index 5dcaceb..69c43fb 100644
--- a/docs/context.md
+++ b/docs/context.md
@@ -36,6 +36,7 @@ class UserInfo:  # (1)!
     name: str
     uid: int
 
+@function_tool
 async def fetch_user_age(wrapper: RunContextWrapper[UserInfo]) -> str:  # (2)!
     return f"User {wrapper.context.name} is 47 years old"
 
@@ -44,7 +45,7 @@ async def main():
 
     agent = Agent[UserInfo](  # (4)!
         name="Assistant",
-        tools=[function_tool(fetch_user_age)],
+        tools=[fetch_user_age],
     )
 
     result = await Runner.run(
diff --git a/docs/models.md b/docs/models.md
index 7ad515b..ab4cefb 100644
--- a/docs/models.md
+++ b/docs/models.md
@@ -53,21 +53,41 @@ async def main():
 
 ## Using other LLM providers
 
-Many providers also support the OpenAI API format, which means you can pass a `base_url` to the existing OpenAI model implementations and use them easily. `ModelSettings` is used to configure tuning parameters (e.g., temperature, top_p) for the model you select.
+You can use other LLM providers in 3 ways (examples [here](https://github.com/openai/openai-agents-python/tree/main/examples/model_providers/)):
 
-```python
-external_client = AsyncOpenAI(
-    api_key="EXTERNAL_API_KEY",
-    base_url="https://api.external.com/v1/",
-)
+1. [`set_default_openai_client`][agents.set_default_openai_client] is useful in cases where you want to globally use an instance of `AsyncOpenAI` as the LLM client. This is for cases where the LLM provider has an OpenAI compatible API endpoint, and you can set the `base_url` and `api_key`. See a configurable example in [examples/model_providers/custom_example_global.py](https://github.com/openai/openai-agents-python/tree/main/examples/model_providers/custom_example_global.py).
+2. [`ModelProvider`][agents.models.interface.ModelProvider] is at the `Runner.run` level. This lets you say "use a custom model provider for all agents in this run". See a configurable example in [examples/model_providers/custom_example_provider.py](https://github.com/openai/openai-agents-python/tree/main/examples/model_providers/custom_example_provider.py).
+3. [`Agent.model`][agents.agent.Agent.model] lets you specify the model on a specific Agent instance. This enables you to mix and match different providers for different agents. See a configurable example in [examples/model_providers/custom_example_agent.py](https://github.com/openai/openai-agents-python/tree/main/examples/model_providers/custom_example_agent.py).
+
+In cases where you do not have an API key from `platform.openai.com`, we recommend disabling tracing via `set_tracing_disabled()`, or setting up a [different tracing processor](tracing.md).
+
+!!! note
+
+    In these examples, we use the Chat Completions API/model, because most LLM providers don't yet support the Responses API. If your LLM provider does support it, we recommend using Responses.
+
+## Common issues with using other LLM providers
+
+### Tracing client error 401
+
+If you get errors related to tracing, this is because traces are uploaded to OpenAI servers, and you don't have an OpenAI API key. You have three options to resolve this:
+
+1. Disable tracing entirely: [`set_tracing_disabled(True)`][agents.set_tracing_disabled].
+2. Set an OpenAI key for tracing: [`set_tracing_export_api_key(...)`][agents.set_tracing_export_api_key]. This API key will only be used for uploading traces, and must be from [platform.openai.com](https://platform.openai.com/).
+3. Use a non-OpenAI trace processor. See the [tracing docs](tracing.md#custom-tracing-processors).
+
+### Responses API support
+
+The SDK uses the Responses API by default, but most other LLM providers don't yet support it. You may see 404s or similar issues as a result. To resolve, you have two options:
+
+1. Call [`set_default_openai_api("chat_completions")`][agents.set_default_openai_api]. This works if you are setting `OPENAI_API_KEY` and `OPENAI_BASE_URL` via environment vars.
+2. Use [`OpenAIChatCompletionsModel`][agents.models.openai_chatcompletions.OpenAIChatCompletionsModel]. There are examples [here](https://github.com/openai/openai-agents-python/tree/main/examples/model_providers/).
+
+### Structured outputs support
+
+Some model providers don't have support for [structured outputs](https://platform.openai.com/docs/guides/structured-outputs). This sometimes results in an error that looks something like this:
 
-spanish_agent = Agent(
-    name="Spanish agent",
-    instructions="You only speak Spanish.",
-    model=OpenAIChatCompletionsModel(
-        model="EXTERNAL_MODEL_NAME",
-        openai_client=external_client,
-    ),
-    model_settings=ModelSettings(temperature=0.5),
-)
 ```
+BadRequestError: Error code: 400 - {'error': {'message': "'response_format.type' : value is not one of the allowed values ['text','json_object']", 'type': 'invalid_request_error'}}
+```
+
+This is a shortcoming of some model providers - they support JSON outputs, but don't allow you to specify the `json_schema` to use for the output. We are working on a fix for this, but we suggest relying on providers that do have support for JSON schema output, because otherwise your app will often break because of malformed JSON.
diff --git a/docs/running_agents.md b/docs/running_agents.md
index a2f137c..32abf9d 100644
--- a/docs/running_agents.md
+++ b/docs/running_agents.md
@@ -78,7 +78,7 @@ async def main():
         # San Francisco
 
         # Second turn
-        new_input = output.to_input_list() + [{"role": "user", "content": "What state is it in?"}]
+        new_input = result.to_input_list() + [{"role": "user", "content": "What state is it in?"}]
         result = await Runner.run(agent, new_input)
         print(result.final_output)
         # California
diff --git a/docs/tracing.md b/docs/tracing.md
index da0d536..d7d0a65 100644
--- a/docs/tracing.md
+++ b/docs/tracing.md
@@ -50,7 +50,7 @@ async def main():
 
     with trace("Joke workflow"): # (1)!
         first_result = await Runner.run(agent, "Tell me a joke")
-        second_result = await Runner.run(agent, f"Rate this joke: {first_output.final_output}")
+        second_result = await Runner.run(agent, f"Rate this joke: {first_result.final_output}")
         print(f"Joke: {first_result.final_output}")
         print(f"Rating: {second_result.final_output}")
 ```
@@ -93,3 +93,5 @@ External trace processors include:
 -   [Braintrust](https://braintrust.dev/docs/guides/traces/integrations#openai-agents-sdk)
 -   [Pydantic Logfire](https://logfire.pydantic.dev/docs/integrations/llms/openai/#openai-agents)
 -   [AgentOps](https://docs.agentops.ai/v1/integrations/agentssdk)
+-   [Scorecard](https://docs.scorecard.io/docs/documentation/features/tracing#openai-agents-sdk-integration)
+-   [Keywords AI](https://docs.keywordsai.co/integration/development-frameworks/openai-agent)
diff --git a/examples/agent_patterns/input_guardrails.py b/examples/agent_patterns/input_guardrails.py
index 6259188..8c8e182 100644
--- a/examples/agent_patterns/input_guardrails.py
+++ b/examples/agent_patterns/input_guardrails.py
@@ -53,7 +53,7 @@ async def math_guardrail(
 
     return GuardrailFunctionOutput(
         output_info=final_output,
-        tripwire_triggered=not final_output.is_math_homework,
+        tripwire_triggered=final_output.is_math_homework,
     )
 
 
diff --git a/examples/basic/hello_world_jupyter.py b/examples/basic/hello_world_jupyter.py
new file mode 100644
index 0000000..bb8f14c
--- /dev/null
+++ b/examples/basic/hello_world_jupyter.py
@@ -0,0 +1,11 @@
+from agents import Agent, Runner
+
+agent = Agent(name="Assistant", instructions="You are a helpful assistant")
+
+# Intended for Jupyter notebooks where there's an existing event loop
+result = await Runner.run(agent, "Write a haiku about recursion in programming.") # type: ignore[top-level-await]  # noqa: F704
+print(result.final_output)
+
+# Code within code loops,
+# Infinite mirrors reflect—
+# Logic folds on self.
diff --git a/examples/model_providers/README.md b/examples/model_providers/README.md
new file mode 100644
index 0000000..f9330c2
--- /dev/null
+++ b/examples/model_providers/README.md
@@ -0,0 +1,19 @@
+# Custom LLM providers
+
+The examples in this directory demonstrate how you might use a non-OpenAI LLM provider. To run them, first set a base URL, API key and model.
+
+```bash
+export EXAMPLE_BASE_URL="..."
+export EXAMPLE_API_KEY="..."
+export EXAMPLE_MODEL_NAME="..."
+```
+
+Then run the examples, e.g.:
+
+```
+python examples/model_providers/custom_example_provider.py
+
+Loops within themselves,
+Function calls its own being,
+Depth without ending.
+```
diff --git a/examples/model_providers/custom_example_agent.py b/examples/model_providers/custom_example_agent.py
new file mode 100644
index 0000000..f10865c
--- /dev/null
+++ b/examples/model_providers/custom_example_agent.py
@@ -0,0 +1,55 @@
+import asyncio
+import os
+
+from openai import AsyncOpenAI
+
+from agents import Agent, OpenAIChatCompletionsModel, Runner, function_tool, set_tracing_disabled
+
+BASE_URL = os.getenv("EXAMPLE_BASE_URL") or ""
+API_KEY = os.getenv("EXAMPLE_API_KEY") or ""
+MODEL_NAME = os.getenv("EXAMPLE_MODEL_NAME") or ""
+
+if not BASE_URL or not API_KEY or not MODEL_NAME:
+    raise ValueError(
+        "Please set EXAMPLE_BASE_URL, EXAMPLE_API_KEY, EXAMPLE_MODEL_NAME via env var or code."
+    )
+
+"""This example uses a custom provider for a specific agent. Steps:
+1. Create a custom OpenAI client.
+2. Create a `Model` that uses the custom client.
+3. Set the `model` on the Agent.
+
+Note that in this example, we disable tracing under the assumption that you don't have an API key
+from platform.openai.com. If you do have one, you can either set the `OPENAI_API_KEY` env var
+or call set_tracing_export_api_key() to set a tracing specific key.
+"""
+client = AsyncOpenAI(base_url=BASE_URL, api_key=API_KEY)
+set_tracing_disabled(disabled=True)
+
+# An alternate approach that would also work:
+# PROVIDER = OpenAIProvider(openai_client=client)
+# agent = Agent(..., model="some-custom-model")
+# Runner.run(agent, ..., run_config=RunConfig(model_provider=PROVIDER))
+
+
+@function_tool
+def get_weather(city: str):
+    print(f"[debug] getting weather for {city}")
+    return f"The weather in {city} is sunny."
+
+
+async def main():
+    # This agent will use the custom LLM provider
+    agent = Agent(
+        name="Assistant",
+        instructions="You only respond in haikus.",
+        model=OpenAIChatCompletionsModel(model=MODEL_NAME, openai_client=client),
+        tools=[get_weather],
+    )
+
+    result = await Runner.run(agent, "What's the weather in Tokyo?")
+    print(result.final_output)
+
+
+if __name__ == "__main__":
+    asyncio.run(main())
diff --git a/examples/model_providers/custom_example_global.py b/examples/model_providers/custom_example_global.py
new file mode 100644
index 0000000..ae9756d
--- /dev/null
+++ b/examples/model_providers/custom_example_global.py
@@ -0,0 +1,63 @@
+import asyncio
+import os
+
+from openai import AsyncOpenAI
+
+from agents import (
+    Agent,
+    Runner,
+    function_tool,
+    set_default_openai_api,
+    set_default_openai_client,
+    set_tracing_disabled,
+)
+
+BASE_URL = os.getenv("EXAMPLE_BASE_URL") or ""
+API_KEY = os.getenv("EXAMPLE_API_KEY") or ""
+MODEL_NAME = os.getenv("EXAMPLE_MODEL_NAME") or ""
+
+if not BASE_URL or not API_KEY or not MODEL_NAME:
+    raise ValueError(
+        "Please set EXAMPLE_BASE_URL, EXAMPLE_API_KEY, EXAMPLE_MODEL_NAME via env var or code."
+    )
+
+
+"""This example uses a custom provider for all requests by default. We do three things:
+1. Create a custom client.
+2. Set it as the default OpenAI client, and don't use it for tracing.
+3. Set the default API as Chat Completions, as most LLM providers don't yet support Responses API.
+
+Note that in this example, we disable tracing under the assumption that you don't have an API key
+from platform.openai.com. If you do have one, you can either set the `OPENAI_API_KEY` env var
+or call set_tracing_export_api_key() to set a tracing specific key.
+"""
+
+client = AsyncOpenAI(
+    base_url=BASE_URL,
+    api_key=API_KEY,
+)
+set_default_openai_client(client=client, use_for_tracing=False)
+set_default_openai_api("chat_completions")
+set_tracing_disabled(disabled=True)
+
+
+@function_tool
+def get_weather(city: str):
+    print(f"[debug] getting weather for {city}")
+    return f"The weather in {city} is sunny."
+
+
+async def main():
+    agent = Agent(
+        name="Assistant",
+        instructions="You only respond in haikus.",
+        model=MODEL_NAME,
+        tools=[get_weather],
+    )
+
+    result = await Runner.run(agent, "What's the weather in Tokyo?")
+    print(result.final_output)
+
+
+if __name__ == "__main__":
+    asyncio.run(main())
diff --git a/examples/model_providers/custom_example_provider.py b/examples/model_providers/custom_example_provider.py
new file mode 100644
index 0000000..4e59019
--- /dev/null
+++ b/examples/model_providers/custom_example_provider.py
@@ -0,0 +1,77 @@
+from __future__ import annotations
+
+import asyncio
+import os
+
+from openai import AsyncOpenAI
+
+from agents import (
+    Agent,
+    Model,
+    ModelProvider,
+    OpenAIChatCompletionsModel,
+    RunConfig,
+    Runner,
+    function_tool,
+    set_tracing_disabled,
+)
+
+BASE_URL = os.getenv("EXAMPLE_BASE_URL") or ""
+API_KEY = os.getenv("EXAMPLE_API_KEY") or ""
+MODEL_NAME = os.getenv("EXAMPLE_MODEL_NAME") or ""
+
+if not BASE_URL or not API_KEY or not MODEL_NAME:
+    raise ValueError(
+        "Please set EXAMPLE_BASE_URL, EXAMPLE_API_KEY, EXAMPLE_MODEL_NAME via env var or code."
+    )
+
+
+"""This example uses a custom provider for some calls to Runner.run(), and direct calls to OpenAI for
+others. Steps:
+1. Create a custom OpenAI client.
+2. Create a ModelProvider that uses the custom client.
+3. Use the ModelProvider in calls to Runner.run(), only when we want to use the custom LLM provider.
+
+Note that in this example, we disable tracing under the assumption that you don't have an API key
+from platform.openai.com. If you do have one, you can either set the `OPENAI_API_KEY` env var
+or call set_tracing_export_api_key() to set a tracing specific key.
+"""
+client = AsyncOpenAI(base_url=BASE_URL, api_key=API_KEY)
+set_tracing_disabled(disabled=True)
+
+
+class CustomModelProvider(ModelProvider):
+    def get_model(self, model_name: str | None) -> Model:
+        return OpenAIChatCompletionsModel(model=model_name or MODEL_NAME, openai_client=client)
+
+
+CUSTOM_MODEL_PROVIDER = CustomModelProvider()
+
+
+@function_tool
+def get_weather(city: str):
+    print(f"[debug] getting weather for {city}")
+    return f"The weather in {city} is sunny."
+
+
+async def main():
+    agent = Agent(name="Assistant", instructions="You only respond in haikus.", tools=[get_weather])
+
+    # This will use the custom model provider
+    result = await Runner.run(
+        agent,
+        "What's the weather in Tokyo?",
+        run_config=RunConfig(model_provider=CUSTOM_MODEL_PROVIDER),
+    )
+    print(result.final_output)
+
+    # If you uncomment this, it will use OpenAI directly, not the custom provider
+    # result = await Runner.run(
+    #     agent,
+    #     "What's the weather in Tokyo?",
+    # )
+    # print(result.final_output)
+
+
+if __name__ == "__main__":
+    asyncio.run(main())
diff --git a/pyproject.toml b/pyproject.toml
index 0dec7a5..ff3d01f 100644
--- a/pyproject.toml
+++ b/pyproject.toml
@@ -1,6 +1,6 @@
 [project]
 name = "openai-agents"
-version = "0.0.3"
+version = "0.0.4"
 description = "OpenAI Agents SDK"
 readme = "README.md"
 requires-python = ">=3.9"
diff --git a/src/agents/__init__.py b/src/agents/__init__.py
index 69c500a..a2d7f24 100644
--- a/src/agents/__init__.py
+++ b/src/agents/__init__.py
@@ -92,13 +92,19 @@ from .tracing import (
 from .usage import Usage
 
 
-def set_default_openai_key(key: str) -> None:
-    """Set the default OpenAI API key to use for LLM requests and tracing. This is only necessary if
-    the OPENAI_API_KEY environment variable is not already set.
+def set_default_openai_key(key: str, use_for_tracing: bool = True) -> None:
+    """Set the default OpenAI API key to use for LLM requests (and optionally tracing). This is
+    only necessary if the OPENAI_API_KEY environment variable is not already set.
 
     If provided, this key will be used instead of the OPENAI_API_KEY environment variable.
+
+    Args:
+        key: The OpenAI key to use.
+        use_for_tracing: Whether to also use this key to send traces to OpenAI. Defaults to True.
+            If False, you'll either need to set the OPENAI_API_KEY environment variable or call
+            set_tracing_export_api_key() with the API key you want to use for tracing.
     """
-    _config.set_default_openai_key(key)
+    _config.set_default_openai_key(key, use_for_tracing)
 
 
 def set_default_openai_client(client: AsyncOpenAI, use_for_tracing: bool = True) -> None:
@@ -123,10 +129,9 @@ def set_default_openai_api(api: Literal["chat_completions", "responses"]) -> Non
 
 def enable_verbose_stdout_logging():
     """Enables verbose logging to stdout. This is useful for debugging."""
-    for name in ["openai.agents", "openai.agents.tracing"]:
-        logger = logging.getLogger(name)
-        logger.setLevel(logging.DEBUG)
-        logger.addHandler(logging.StreamHandler(sys.stdout))
+    logger = logging.getLogger("openai.agents")
+    logger.setLevel(logging.DEBUG)
+    logger.addHandler(logging.StreamHandler(sys.stdout))
 
 
 __all__ = [
diff --git a/src/agents/_config.py b/src/agents/_config.py
index 55ded64..304cfb8 100644
--- a/src/agents/_config.py
+++ b/src/agents/_config.py
@@ -5,15 +5,18 @@ from .models import _openai_shared
 from .tracing import set_tracing_export_api_key
 
 
-def set_default_openai_key(key: str) -> None:
-    set_tracing_export_api_key(key)
+def set_default_openai_key(key: str, use_for_tracing: bool) -> None:
     _openai_shared.set_default_openai_key(key)
 
+    if use_for_tracing:
+        set_tracing_export_api_key(key)
+
 
 def set_default_openai_client(client: AsyncOpenAI, use_for_tracing: bool) -> None:
+    _openai_shared.set_default_openai_client(client)
+
     if use_for_tracing:
         set_tracing_export_api_key(client.api_key)
-    _openai_shared.set_default_openai_client(client)
 
 
 def set_default_openai_api(api: Literal["chat_completions", "responses"]) -> None:
diff --git a/src/agents/guardrail.py b/src/agents/guardrail.py
index fcae0b8..5bebcd6 100644
--- a/src/agents/guardrail.py
+++ b/src/agents/guardrail.py
@@ -86,7 +86,7 @@ class InputGuardrail(Generic[TContext]):
         [RunContextWrapper[TContext], Agent[Any], str | list[TResponseInputItem]],
         MaybeAwaitable[GuardrailFunctionOutput],
     ]
-    """A function that receives the the agent input and the context, and returns a
+    """A function that receives the agent input and the context, and returns a
      `GuardrailResult`. The result marks whether the tripwire was triggered, and can optionally
      include information about the guardrail's output.
     """
diff --git a/src/agents/model_settings.py b/src/agents/model_settings.py
index d8178ae..cc4b6cb 100644
--- a/src/agents/model_settings.py
+++ b/src/agents/model_settings.py
@@ -10,15 +10,34 @@ class ModelSettings:
 
     This class holds optional model configuration parameters (e.g. temperature,
     top_p, penalties, truncation, etc.).
+
+    Not all models/providers support all of these parameters, so please check the API documentation
+    for the specific model and provider you are using.
     """
 
     temperature: float | None = None
+    """The temperature to use when calling the model."""
+
     top_p: float | None = None
+    """The top_p to use when calling the model."""
+
     frequency_penalty: float | None = None
+    """The frequency penalty to use when calling the model."""
+
     presence_penalty: float | None = None
+    """The presence penalty to use when calling the model."""
+
     tool_choice: Literal["auto", "required", "none"] | str | None = None
+    """The tool choice to use when calling the model."""
+
     parallel_tool_calls: bool | None = False
+    """Whether to use parallel tool calls when calling the model."""
+
     truncation: Literal["auto", "disabled"] | None = None
+    """The truncation strategy to use when calling the model."""
+
+    max_tokens: int | None = None
+    """The maximum number of output tokens to generate."""
 
     def resolve(self, override: ModelSettings | None) -> ModelSettings:
         """Produce a new ModelSettings by overlaying any non-None values from the
@@ -33,4 +52,5 @@ class ModelSettings:
             tool_choice=override.tool_choice or self.tool_choice,
             parallel_tool_calls=override.parallel_tool_calls or self.parallel_tool_calls,
             truncation=override.truncation or self.truncation,
+            max_tokens=override.max_tokens or self.max_tokens,
         )
diff --git a/src/agents/models/openai_chatcompletions.py b/src/agents/models/openai_chatcompletions.py
index a7340d0..3543225 100644
--- a/src/agents/models/openai_chatcompletions.py
+++ b/src/agents/models/openai_chatcompletions.py
@@ -51,8 +51,10 @@ from openai.types.responses import (
     ResponseOutputText,
     ResponseRefusalDeltaEvent,
     ResponseTextDeltaEvent,
+    ResponseUsage,
 )
 from openai.types.responses.response_input_param import FunctionCallOutput, ItemReference, Message
+from openai.types.responses.response_usage import OutputTokensDetails
 
 from .. import _debug
 from ..agent_output import AgentOutputSchema
@@ -405,7 +407,23 @@ class OpenAIChatCompletionsModel(Model):
             for function_call in state.function_calls.values():
                 outputs.append(function_call)
 
-            final_response = response.model_copy(update={"output": outputs, "usage": usage})
+            final_response = response.model_copy()
+            final_response.output = outputs
+            final_response.usage = (
+                ResponseUsage(
+                    input_tokens=usage.prompt_tokens,
+                    output_tokens=usage.completion_tokens,
+                    total_tokens=usage.total_tokens,
+                    output_tokens_details=OutputTokensDetails(
+                        reasoning_tokens=usage.completion_tokens_details.reasoning_tokens
+                        if usage.completion_tokens_details
+                        and usage.completion_tokens_details.reasoning_tokens
+                        else 0
+                    ),
+                )
+                if usage
+                else None
+            )
 
             yield ResponseCompletedEvent(
                 response=final_response,
@@ -503,6 +521,7 @@ class OpenAIChatCompletionsModel(Model):
             top_p=self._non_null_or_not_given(model_settings.top_p),
             frequency_penalty=self._non_null_or_not_given(model_settings.frequency_penalty),
             presence_penalty=self._non_null_or_not_given(model_settings.presence_penalty),
+            max_tokens=self._non_null_or_not_given(model_settings.max_tokens),
             tool_choice=tool_choice,
             response_format=response_format,
             parallel_tool_calls=parallel_tool_calls,
@@ -808,6 +827,13 @@ class _Converter:
                         "content": cls.extract_text_content(content),
                     }
                     result.append(msg_developer)
+                elif role == "assistant":
+                    flush_assistant_message()
+                    msg_assistant: ChatCompletionAssistantMessageParam = {
+                        "role": "assistant",
+                        "content": cls.extract_text_content(content),
+                    }
+                    result.append(msg_assistant)
                 else:
                     raise UserError(f"Unexpected role in easy_input_message: {role}")
 
diff --git a/src/agents/models/openai_provider.py b/src/agents/models/openai_provider.py
index 5194663..e6a859f 100644
--- a/src/agents/models/openai_provider.py
+++ b/src/agents/models/openai_provider.py
@@ -38,28 +38,41 @@ class OpenAIProvider(ModelProvider):
             assert api_key is None and base_url is None, (
                 "Don't provide api_key or base_url if you provide openai_client"
             )
-            self._client = openai_client
+            self._client: AsyncOpenAI | None = openai_client
         else:
-            self._client = _openai_shared.get_default_openai_client() or AsyncOpenAI(
-                api_key=api_key or _openai_shared.get_default_openai_key(),
-                base_url=base_url,
-                organization=organization,
-                project=project,
-                http_client=shared_http_client(),
-            )
+            self._client = None
+            self._stored_api_key = api_key
+            self._stored_base_url = base_url
+            self._stored_organization = organization
+            self._stored_project = project
 
-        self._is_openai_model = self._client.base_url.host.startswith("api.openai.com")
         if use_responses is not None:
             self._use_responses = use_responses
         else:
             self._use_responses = _openai_shared.get_use_responses_by_default()
 
+    # We lazy load the client in case you never actually use OpenAIProvider(). Otherwise
+    # AsyncOpenAI() raises an error if you don't have an API key set.
+    def _get_client(self) -> AsyncOpenAI:
+        if self._client is None:
+            self._client = _openai_shared.get_default_openai_client() or AsyncOpenAI(
+                api_key=self._stored_api_key or _openai_shared.get_default_openai_key(),
+                base_url=self._stored_base_url,
+                organization=self._stored_organization,
+                project=self._stored_project,
+                http_client=shared_http_client(),
+            )
+
+        return self._client
+
     def get_model(self, model_name: str | None) -> Model:
         if model_name is None:
             model_name = DEFAULT_MODEL
 
+        client = self._get_client()
+
         return (
-            OpenAIResponsesModel(model=model_name, openai_client=self._client)
+            OpenAIResponsesModel(model=model_name, openai_client=client)
             if self._use_responses
-            else OpenAIChatCompletionsModel(model=model_name, openai_client=self._client)
+            else OpenAIChatCompletionsModel(model=model_name, openai_client=client)
         )
diff --git a/src/agents/models/openai_responses.py b/src/agents/models/openai_responses.py
index e060fb8..78765ec 100644
--- a/src/agents/models/openai_responses.py
+++ b/src/agents/models/openai_responses.py
@@ -5,7 +5,7 @@ from collections.abc import AsyncIterator
 from dataclasses import dataclass
 from typing import TYPE_CHECKING, Any, Literal, overload
 
-from openai import NOT_GIVEN, AsyncOpenAI, AsyncStream, NotGiven
+from openai import NOT_GIVEN, APIStatusError, AsyncOpenAI, AsyncStream, NotGiven
 from openai.types import ChatModel
 from openai.types.responses import (
     Response,
@@ -113,7 +113,8 @@ class OpenAIResponsesModel(Model):
                         },
                     )
                 )
-                logger.error(f"Error getting response: {e}")
+                request_id = e.request_id if isinstance(e, APIStatusError) else None
+                logger.error(f"Error getting response: {e}. (request_id: {request_id})")
                 raise
 
         return ModelResponse(
@@ -235,6 +236,7 @@ class OpenAIResponsesModel(Model):
             temperature=self._non_null_or_not_given(model_settings.temperature),
             top_p=self._non_null_or_not_given(model_settings.top_p),
             truncation=self._non_null_or_not_given(model_settings.truncation),
+            max_output_tokens=self._non_null_or_not_given(model_settings.max_tokens),
             tool_choice=tool_choice,
             parallel_tool_calls=parallel_tool_calls,
             stream=stream,
diff --git a/src/agents/result.py b/src/agents/result.py
index 5683827..6e806b7 100644
--- a/src/agents/result.py
+++ b/src/agents/result.py
@@ -216,5 +216,3 @@ class RunResultStreaming(RunResultBase):
 
         if self._output_guardrails_task and not self._output_guardrails_task.done():
             self._output_guardrails_task.cancel()
-            self._output_guardrails_task.cancel()
-            self._output_guardrails_task.cancel()
diff --git a/src/agents/tracing/create.py b/src/agents/tracing/create.py
index 8d7fc49..78a064b 100644
--- a/src/agents/tracing/create.py
+++ b/src/agents/tracing/create.py
@@ -3,7 +3,7 @@ from __future__ import annotations
 from collections.abc import Mapping, Sequence
 from typing import TYPE_CHECKING, Any
 
-from .logger import logger
+from ..logger import logger
 from .setup import GLOBAL_TRACE_PROVIDER
 from .span_data import (
     AgentSpanData,
diff --git a/src/agents/tracing/processors.py b/src/agents/tracing/processors.py
index 308adf2..1b39ded 100644
--- a/src/agents/tracing/processors.py
+++ b/src/agents/tracing/processors.py
@@ -9,7 +9,7 @@ from typing import Any
 
 import httpx
 
-from .logger import logger
+from ..logger import logger
 from .processor_interface import TracingExporter, TracingProcessor
 from .spans import Span
 from .traces import Trace
@@ -40,7 +40,7 @@ class BackendSpanExporter(TracingExporter):
         """
         Args:
             api_key: The API key for the "Authorization" header. Defaults to
-                `os.environ["OPENAI_TRACE_API_KEY"]` if not provided.
+                `os.environ["OPENAI_API_KEY"]` if not provided.
             organization: The OpenAI organization to use. Defaults to
                 `os.environ["OPENAI_ORG_ID"]` if not provided.
             project: The OpenAI project to use. Defaults to
diff --git a/src/agents/tracing/scope.py b/src/agents/tracing/scope.py
index 9ccd9f8..513ca8c 100644
--- a/src/agents/tracing/scope.py
+++ b/src/agents/tracing/scope.py
@@ -2,7 +2,7 @@
 import contextvars
 from typing import TYPE_CHECKING, Any
 
-from .logger import logger
+from ..logger import logger
 
 if TYPE_CHECKING:
     from .spans import Span
diff --git a/src/agents/tracing/setup.py b/src/agents/tracing/setup.py
index bc340c9..3a7c6ad 100644
--- a/src/agents/tracing/setup.py
+++ b/src/agents/tracing/setup.py
@@ -4,8 +4,8 @@ import os
 import threading
 from typing import Any
 
+from ..logger import logger
 from . import util
-from .logger import logger
 from .processor_interface import TracingProcessor
 from .scope import Scope
 from .spans import NoOpSpan, Span, SpanImpl, TSpanData
diff --git a/src/agents/tracing/spans.py b/src/agents/tracing/spans.py
index d682a9a..ee933e7 100644
--- a/src/agents/tracing/spans.py
+++ b/src/agents/tracing/spans.py
@@ -6,8 +6,8 @@ from typing import Any, Generic, TypeVar
 
 from typing_extensions import TypedDict
 
+from ..logger import logger
 from . import util
-from .logger import logger
 from .processor_interface import TracingProcessor
 from .scope import Scope
 from .span_data import SpanData
diff --git a/src/agents/tracing/traces.py b/src/agents/tracing/traces.py
index bf3b43d..53d0628 100644
--- a/src/agents/tracing/traces.py
+++ b/src/agents/tracing/traces.py
@@ -4,8 +4,8 @@ import abc
 import contextvars
 from typing import Any
 
+from ..logger import logger
 from . import util
-from .logger import logger
 from .processor_interface import TracingProcessor
 from .scope import Scope
 
diff --git a/tests/test_openai_chatcompletions_converter.py b/tests/test_openai_chatcompletions_converter.py
index 8cf07d7..73acb8a 100644
--- a/tests/test_openai_chatcompletions_converter.py
+++ b/tests/test_openai_chatcompletions_converter.py
@@ -393,3 +393,38 @@ def test_unknown_object_errors():
     with pytest.raises(UserError, match="Unhandled item type or structure"):
         # Purposely ignore the type error
         _Converter.items_to_messages([TestObject()])  # type: ignore
+
+
+def test_assistant_messages_in_history():
+    """
+    Test that assistant messages are added to the history.
+    """
+    messages = _Converter.items_to_messages(
+        [
+            {
+                "role": "user",
+                "content": "Hello",
+            },
+            {
+                "role": "assistant",
+                "content": "Hello?",
+            },
+            {
+                "role": "user",
+                "content": "What was my Name?",
+            },
+        ]
+    )
+
+    assert messages == [
+        {"role": "user", "content": "Hello"},
+        {"role": "assistant", "content": "Hello?"},
+        {"role": "user", "content": "What was my Name?"},
+    ]
+    assert len(messages) == 3
+    assert messages[0]["role"] == "user"
+    assert messages[0]["content"] == "Hello"
+    assert messages[1]["role"] == "assistant"
+    assert messages[1]["content"] == "Hello?"
+    assert messages[2]["role"] == "user"
+    assert messages[2]["content"] == "What was my Name?"
diff --git a/tests/test_openai_chatcompletions_stream.py b/tests/test_openai_chatcompletions_stream.py
index 2a15f7f..7add92a 100644
--- a/tests/test_openai_chatcompletions_stream.py
+++ b/tests/test_openai_chatcompletions_stream.py
@@ -107,6 +107,11 @@ async def test_stream_response_yields_events_for_text_content(monkeypatch) -> No
     assert isinstance(completed_resp.output[0].content[0], ResponseOutputText)
     assert completed_resp.output[0].content[0].text == "Hello"
 
+    assert completed_resp.usage, "usage should not be None"
+    assert completed_resp.usage.input_tokens == 7
+    assert completed_resp.usage.output_tokens == 5
+    assert completed_resp.usage.total_tokens == 12
+
 
 @pytest.mark.allow_call_model_methods
 @pytest.mark.asyncio
diff --git a/uv.lock b/uv.lock
index c828fa3..40f0553 100644
--- a/uv.lock
+++ b/uv.lock
@@ -1,5 +1,4 @@
 version = 1
-revision = 1
 requires-python = ">=3.9"
 
 [[package]]
@@ -816,7 +815,7 @@ wheels = [
 
 [[package]]
 name = "openai-agents"
-version = "0.0.3"
+version = "0.0.4"
 source = { editable = "." }
 dependencies = [
     { name = "griffe" },