feat(pydantic-ai): migrate onto unified harness surface (PR4) #415
Open
declan-scale wants to merge 14 commits into
b4c53ca to cae14d4
724120b to 6d0f0e8
@greptile review
8cd851c to 2e820c7
…or usage capture Adds an `on_result: Callable[[AgentRunResultEvent], Any] | None = None` parameter to `convert_pydantic_ai_to_agentex_events`. When set, the callback is invoked (sync or async) with the terminal `AgentRunResultEvent` that carries the run result and usage. Streaming output is unchanged. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…tests Strengthen backward-compat guarantees for the on_result callback: - test_streaming_output_unchanged_with_callback now asserts model_dump() equality per yielded pair, not just type, proving the callback does not alter streamed message content. - test_async_callback_is_awaited adds a real suspension point (await asyncio.sleep(0)) before its side effect, so the assertion only passes if the converter actually awaits the returned coroutine. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
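The callback contract described in these two commits (invoke `on_result` with the terminal event, await it if it returns a coroutine, leave the streamed output untouched) can be sketched as follows. This is a minimal illustration, not the converter's actual code: `ResultEventSketch`, `convert_sketch`, `fake_stream`, and `collect` are hypothetical names, and the choice not to forward the terminal event downstream is an assumption.

```python
import asyncio
import inspect
from dataclasses import dataclass
from typing import Any, AsyncIterator, Callable, Optional


@dataclass
class ResultEventSketch:
    """Hypothetical stand-in for the terminal AgentRunResultEvent."""
    usage: dict


async def convert_sketch(
    stream: AsyncIterator[Any],
    on_result: Optional[Callable[[ResultEventSketch], Any]] = None,
) -> AsyncIterator[Any]:
    """Forward ordinary events; hand the terminal event to on_result."""
    async for event in stream:
        if isinstance(event, ResultEventSketch):
            if on_result is not None:
                outcome = on_result(event)
                # Support both sync and async callbacks: only await
                # when the callback actually returned an awaitable.
                if inspect.isawaitable(outcome):
                    await outcome
            continue  # assumption: the terminal event is not streamed
        yield event


async def fake_stream() -> AsyncIterator[Any]:
    """Hypothetical event source: one text chunk, then the result."""
    yield "hello"
    yield ResultEventSketch(usage={"input_tokens": 3})


async def collect(on_result: Optional[Callable[..., Any]] = None) -> list:
    """Drain the converter into a list for inspection."""
    return [event async for event in convert_sketch(fake_stream(), on_result)]
```

Passing a callback that suspends (for example with `await asyncio.sleep(0)`) before recording its side effect is what makes the "is awaited" test meaningful: the side effect is only visible if the converter awaits the coroutine.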
Adds PydanticAITurn, a HarnessTurn wrapping a pydantic-ai event stream, with pydantic_ai_usage_to_turn_usage mapping verified RunUsage fields (requests, input_tokens, output_tokens, cache_read_tokens, total_tokens) onto TurnUsage via defensive getattr; usage() populates after events exhaustion. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…ion + cover real usage accessor Pass getattr results straight through so a real 0 (e.g. a cache-hit with 0 output tokens) stays 0 while a MISSING attribute still degrades to None. Previously `x if x else None` coerced legitimate zeros to None. Adds tests for the 0->0 mapping, the missing-field->None defensive guarantee, and the real result.usage property accessor path the converter uses. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
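The zero-versus-missing distinction this commit fixes can be illustrated with a small sketch. `TurnUsageSketch`, `map_usage`, and `map_usage_buggy` are hypothetical names for illustration; only the field names and the `x if x else None` pattern come from the commit messages.

```python
from dataclasses import dataclass
from types import SimpleNamespace
from typing import Any, Optional


@dataclass
class TurnUsageSketch:
    """Hypothetical stand-in for TurnUsage (subset of fields)."""
    input_tokens: Optional[int] = None
    output_tokens: Optional[int] = None
    cache_read_tokens: Optional[int] = None


def map_usage(run_usage: Any) -> TurnUsageSketch:
    """Pass getattr results straight through: 0 stays 0, missing is None."""
    return TurnUsageSketch(
        input_tokens=getattr(run_usage, "input_tokens", None),
        output_tokens=getattr(run_usage, "output_tokens", None),
        cache_read_tokens=getattr(run_usage, "cache_read_tokens", None),
    )


def map_usage_buggy(run_usage: Any) -> TurnUsageSketch:
    """The earlier pattern: truthiness check coerces a real 0 to None."""
    def pick(name: str) -> Optional[int]:
        x = getattr(run_usage, name, None)
        return x if x else None

    return TurnUsageSketch(
        input_tokens=pick("input_tokens"),
        output_tokens=pick("output_tokens"),
        cache_read_tokens=pick("cache_read_tokens"),
    )


# A run with a legitimate 0 and one attribute that does not exist at all.
sample = SimpleNamespace(input_tokens=10, output_tokens=0)
```

With `sample`, `map_usage` keeps `output_tokens` at 0 and reports the absent `cache_read_tokens` as `None`, while `map_usage_buggy` reports both as `None`.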
Adds TestCharacterizeWireShapeCurrent to lock the current wire-level delivery shape: text via streaming_task_message_context, tool messages via adk.messages.create. Serves as the before-snapshot for the UnifiedEmitter reimplementation that follows. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…edEmitter (default tracing)

Replaces the hand-rolled event loop in _pydantic_ai_async.py with a three-line delegation to UnifiedEmitter.auto_send_turn:

    turn = PydanticAITurn(stream, model=None, tracing_handler=tracing_handler)
    emitter = UnifiedEmitter(task_id=task_id, trace_id=None, parent_span_id=None)
    return (await emitter.auto_send_turn(turn)).final_text

Public signature unchanged: stream_pydantic_ai_events(stream, task_id, tracing_handler=None) -> str.

Supporting changes:
- _pydantic_ai_turn.py: add optional tracing_handler arg (threaded to convert_pydantic_ai_to_agentex_events); add _coalesce_tool_requests() which converts Start(tool_request)+deltas+Done into Full(tool_request) so auto_send receives tool messages in the shape it expects (Option A: no streaming of argument tokens in the async/temporal path).
- auto_send.py: reset final_text_parts on Start(text) so multi-step runs return only the last text segment, matching stream_langgraph_events and the existing stream_pydantic_ai_events convention.

Wire shape change (AGX1-373 accepted envelope change):
  Before: tool messages via adk.messages.create
  After: tool messages via streaming_task_message_context open+close pairs
Logical content (tool_call_id, name, arguments, result) is identical; only the delivery channel changed.

Test updates: all test assertions updated to the new delivery channel. Two tool_call_with_*_args tests updated to include PartDeltaEvent (the realistic pydantic-ai event sequence for streamed JSON args).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
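The "last text segment" rule applied in auto_send.py (reset the accumulated parts whenever a new text segment starts) can be sketched as below. `SegmentStart`, `SegmentDelta`, and `final_text` are hypothetical names; the real event types and accumulator live in the harness and are not reproduced here.

```python
from dataclasses import dataclass
from typing import Iterable, List, Union


@dataclass
class SegmentStart:
    """Hypothetical marker: a new text message begins."""


@dataclass
class SegmentDelta:
    """Hypothetical text chunk belonging to the current segment."""
    text: str


def final_text(events: Iterable[Union[SegmentStart, SegmentDelta]]) -> str:
    """Return only the text of the last segment in a multi-step run."""
    parts: List[str] = []
    for event in events:
        if isinstance(event, SegmentStart):
            parts = []  # a new segment discards earlier text
        elif isinstance(event, SegmentDelta):
            parts.append(event.text)
    return "".join(parts)


# Two text segments, as in a run that speaks, calls a tool, then answers.
multi_step = [
    SegmentStart(),
    SegmentDelta("Let me check."),
    SegmentStart(),
    SegmentDelta("It is "),
    SegmentDelta("sunny."),
]
```

Without the reset, the returned string would concatenate the pre-tool-call text with the final answer.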
…sync arg streaming); ref AGX1-377

PydanticAITurn.events feeds BOTH delivery channels (yield_turn for sync, auto_send_turn for async). Applying _coalesce_tool_requests unconditionally would deliver tool requests as a single Full with no ToolRequestDelta tokens, losing the sync converter's documented tool-call-argument token streaming (Task 4 routes the sync/HTTP path through emitter.yield_turn(PydanticAITurn(...))).

- Add constructor param coalesce_tool_requests: bool = False. Default OFF means PydanticAITurn(...).events == bare convert_pydantic_ai_to_agentex_events output (Start+Delta+Done for tool calls, arg streaming preserved on yield/sync).
- stream_pydantic_ai_events builds the Turn with coalesce_tool_requests=True, because the foundation auto_send currently DROPS tool requests delivered as Start+Delta+Done (AGX1-377). Comment cites AGX1-377 as a temporary workaround to be removed once auto_send handles the streamed tool-request shape natively.
- Tests: default-off Turn yields a ToolRequestDelta for streamed args (matches bare converter); coalesce-on Turn yields a single Full(tool_request) with fully-accumulated args and no ToolRequestDelta. Async characterization test still passes (goes through coalesce=True).
- Add parts-manager invariant comment to the two corrected async tests.

auto_send.py is unchanged (final_text last-segment fix stays; AGX1-377 covers the Start+Delta+Done handling).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
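The coalescing step described above (Start(tool_request) + deltas + Done collapsed into a single Full with accumulated arguments, everything else passed through) can be sketched as follows. The event classes here are hypothetical simplifications, and the sketch is a synchronous generator for brevity; the real `_coalesce_tool_requests` presumably operates on the async Agentex event stream.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Iterator, List, Tuple, Union


@dataclass
class Start:
    index: int
    kind: str  # "text" or "tool_request" (hypothetical discriminator)
    tool_call_id: str = ""
    name: str = ""


@dataclass
class Delta:
    index: int
    arguments_delta: str


@dataclass
class Done:
    index: int


@dataclass
class Full:
    index: int
    tool_call_id: str
    name: str
    arguments: str


Event = Union[Start, Delta, Done, Full]


def coalesce_tool_requests(events: Iterable[Event]) -> Iterator[Event]:
    """Buffer streamed tool requests and emit one Full per request."""
    pending: Dict[int, Tuple[Start, List[str]]] = {}
    for ev in events:
        if isinstance(ev, Start) and ev.kind == "tool_request":
            pending[ev.index] = (ev, [])  # hold back until Done arrives
        elif isinstance(ev, Delta) and ev.index in pending:
            pending[ev.index][1].append(ev.arguments_delta)
        elif isinstance(ev, Done) and ev.index in pending:
            start, chunks = pending.pop(ev.index)
            yield Full(ev.index, start.tool_call_id, start.name, "".join(chunks))
        else:
            yield ev  # text and other events pass through unchanged


streamed = [
    Start(0, "text"),
    Start(1, "tool_request", "call_1", "get_weather"),
    Delta(1, '{"city":'),
    Delta(1, ' "Paris"}'),
    Done(1),
    Done(0),
]
```

The trade-off the commit describes is visible here: the consumer never sees a `Delta` for the tool request, which is why the flag defaults to off for the sync path.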
…ing handler (docstring)

- _pydantic_ai_sync.py: add "Recommended: unified surface" section to module docstring showing PydanticAITurn + UnifiedEmitter usage with automatic span derivation; bare converter docstring/code unchanged.
- _pydantic_ai_tracing.py: deprecation notes (docstring-only) on module, AgentexPydanticAITracingHandler, and create_pydantic_ai_tracing_handler; no runtime warnings.warn so warnings-as-errors does not break callers; NOTE: comment explains the deferral rationale.
- tests/lib/adk/test_pydantic_ai_sync_unified.py: 6 new tests covering the unified sync path: passthrough equality + tool/reasoning span derivation via _FakeTracing injection, no-trace-id no-op, tracer=False suppression.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Register 4 pydantic-ai conformance fixtures (text-only, single tool call, reasoning block, multi-step) that drive both yield_events and auto_send channels and assert logical-delivery + span-signal equivalence. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
… live-matrix rows

Add 3 offline integration tests (TestModel + fake streaming/tracing, no API keys or live infra needed) that prove the unified harness surface is correctly wired for each delivery channel:

- test_harness_pydantic_ai_sync.py: yield_turn path (12 tests): event ordering (tool_request before tool_response before text), accumulated text, Start/Done pairing, SpanDeriver wiring (OpenSpan/CloseSpan for tool calls on sync path).
- test_harness_pydantic_ai_async.py: auto_send_turn path (13 tests): message ordering, ToolRequestContent/ToolResponseContent content verification, matching tool_call_ids, final_text, context open/close lifecycle; documents that span derivation is suppressed when coalesce_tool_requests=True (AGX1-377 note).
- test_harness_pydantic_ai_temporal.py: TemporalAgent event_stream_handler path (12 tests + 1 intentional skip): drives TemporalAgent.run_stream_events offline, feeds into _fake_stream_pydantic_ai_events (PydanticAITurn + UnifiedEmitter with injected FakeStreaming), asserts same canonical message order; skip placeholder documents what requires live Temporal+Redis infra.

Enable harness-integration.yml live-matrix job (was `if: false`) with a 3-way matrix over [sync, async, temporal], each running its test file via ./scripts/test. Add test file glob to PR path trigger so the workflow re-runs when tests change.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…ing the unified surface

Add 3 minimal, deployable tutorial agent projects, each a tiny pydantic-ai agent with one get_weather(city) tool whose message handler goes through the unified harness surface (UnifiedEmitter + PydanticAITurn) EXPLICITLY:

- examples/tutorials/00_sync/harness_pydantic_ai (s-harness-pydantic-ai)
  sync ACP: `async for ev in emitter.yield_turn(PydanticAITurn(stream, model=...))`. Unlike 040_pydantic_ai (bare converter), this gives the sync channel real unified-yield coverage (coalesce off → tool-call arg-token streaming + auto span derivation under the per-turn span).
- examples/tutorials/10_async/00_base/harness_pydantic_ai (ab-harness-pydantic-ai)
  async ACP: `await emitter.auto_send_turn(PydanticAITurn(..., coalesce_tool_requests=True))` called directly (not via stream_pydantic_ai_events). Persists pydantic-ai message history via adk.state.
- examples/tutorials/10_async/10_temporal/harness_pydantic_ai (at-harness-pydantic-ai)
  temporal: TemporalAgent event_stream_handler builds a UnifiedEmitter from RunContext.deps and calls auto_send_turn inside the model activity. Durable workflow + run_worker structured like the temporal-pydantic-ai template.

Each UnifiedEmitter is constructed from the ACP/streaming context (task_id + trace_id + parent_span_id) so tracing is automatic.

CI discovery: both agentex-tutorials-test.yml and build-and-push-tutorial-agent.yml discover agents dynamically via `find examples/tutorials -name manifest.yaml`, so the 3 agents are picked up with no workflow edits. Directory placement keeps the build-and-push ACP-type inference (`*10_async*` → async) correct: sync under 00_sync, async/temporal under 10_async. Each ships tests/test_agent.py (required by the build validator) as the live integration test.
Verified structurally: all 3 manifests parse; `from project.acp import acp` imports cleanly for all 3 under CI-style env; temporal agent/workflow/run_worker import; the sync handler driven offline with TestModel emits the expected tool_request → tool_response → text sequence through yield_turn. Keeps the 3 offline integration tests and the harness-integration.yml live-matrix from the previous commit. tests/lib/core/harness + tests/lib/adk: 230 passed, 1 skipped. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Fix 22 pyright errors introduced in PR 4's new test files:

- isinstance narrowing before union-member attribute access (ToolRequestDelta.arguments_delta, TextDelta.text_delta, ToolResponseContent.content, FunctionToolResultEvent.part.content)
- reportReturnType in _run_yield_turn: hoist result variable out of async-with
- reportImplicitOverride on _RecordingTracer.handle: add @override
- reportMissingImports in conformance tests: switch absolute tests.lib... imports to relative .runner imports so pyright's executionEnvironments root matches

All 230 tests pass on 3.12 and 3.13. Ruff: clean.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Folds the plan doc (previously the separate #413) into this PR so plan + implementation land together. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…ation auto_send delivers streamed tool requests natively (AGX1-377/378) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2c4fc88 to a0f4fd2
@greptile review
Summary
- Reimplements `stream_pydantic_ai_events` on top of `UnifiedEmitter` (default tracing, no bespoke handler required) via the async path
- Adds `PydanticAITurn` implementing `HarnessTurn`, wiring the sync yield path through `UnifiedEmitter.yield_turn`
- Adds an optional `on_result` callback to `convert_pydantic_ai_to_agentex_events` for usage capture (additive only, no breaking change)
- Makes tool-request coalescing opt-in (`coalesce_tool_requests=False` by default) to preserve streaming arg delta delivery on the sync path (AGX1-377)
- Deprecates `AgentexPydanticAITracingHandler` / `create_pydantic_ai_tracing_handler` via docstring only (no runtime warning, preserving warnings-as-errors safety)

Test plan
- `./scripts/lint` clean (ruff: 0 errors; pyright: 0 errors on PR4 files; 2 pre-existing errors in pre-existing test files unchanged from base)
- `convert_pydantic_ai_to_agentex_events` additive-only (new optional `on_result` kwarg, default None)
- `stream_pydantic_ai_events` signature identical to base
- No `DeprecationWarning` on tracing handler (docstring-only)
- `PydanticAITurn`, `UnifiedEmitter` importable

🤖 Generated with Claude Code
Greptile Summary
This PR migrates `stream_pydantic_ai_events` and introduces `PydanticAITurn` to wire pydantic-ai onto the `UnifiedEmitter` surface, replacing ~200 lines of bespoke async event handling and adding three new tutorial agents (sync, async base, async temporal).

- `PydanticAITurn` adapts a pydantic-ai stream to the `HarnessTurn` protocol, capturing run-level usage via the new `on_result` callback added to `convert_pydantic_ai_to_agentex_events`.
- `stream_pydantic_ai_events` is reduced to a thin `UnifiedEmitter.auto_send_turn(PydanticAITurn(...))` wrapper; the deprecated `AgentexPydanticAITracingHandler` remains functional but is docstring-deprecated.
- Both async tutorials (`00_base/acp.py` and `10_temporal/agent.py`) pass `coalesce_tool_requests=True` to `PydanticAITurn`, but that parameter is absent from the constructor, so these tutorials will raise `TypeError` at runtime.

Confidence Score: 4/5
Core library code is correct and well-tested, but both new async tutorials will fail immediately at runtime due to a missing constructor parameter.

The `PydanticAITurn` constructor does not accept `coalesce_tool_requests`, yet both async tutorial agents pass it, so any developer running these tutorials hits an immediate TypeError. The production library path is sound and well-covered by the 230 tests, so the bug is isolated to the tutorial examples rather than the shipped API surface.

Both async tutorial `acp.py`/`agent.py` files and the `PydanticAITurn.__init__` definition in `_pydantic_ai_turn.py` need to be aligned: either add the parameter to the constructor or remove it from the tutorial call sites.

Important Files Changed
`coalesce_tool_requests` parameter that both async tutorials pass, causing TypeError at runtime; docstring incorrectly claims auto_send delivers streamed tool-request shapes natively.

Sequence Diagram
%%{init: {'theme': 'neutral'}}%%
sequenceDiagram
    participant Agent as Agent Author
    participant PAI as PydanticAITurn
    participant Conv as convert_pydantic_ai_to_agentex_events
    participant UE as UnifiedEmitter
    participant AS as auto_send / yield_events
    Note over Agent,AS: Sync (HTTP ACP) path
    Agent->>PAI: "PydanticAITurn(stream, model=...)"
    Agent->>UE: UnifiedEmitter(task_id, trace_id, parent_span_id)
    Agent->>UE: yield_turn(turn)
    UE->>AS: yield_events(turn.events, tracer)
    AS->>PAI: iterate turn.events
    PAI->>Conv: "convert_pydantic_ai_to_agentex_events(stream, on_result=_capture)"
    Conv-->>AS: "StreamTaskMessage* events"
    AS-->>Agent: forwarded events (HTTP stream)
    Conv->>PAI: _capture sets _usage
    Note over Agent,AS: Async / Temporal path
    Agent->>PAI: "PydanticAITurn(stream, model=...)"
    Agent->>UE: UnifiedEmitter(task_id, trace_id, parent_span_id)
    Agent->>UE: auto_send_turn(turn)
    UE->>AS: "auto_send(turn.events, usage=turn.usage() read early)"
    AS->>PAI: iterate turn.events
    PAI->>Conv: "convert + on_result=_capture"
    Conv-->>AS: "StreamTaskMessage* events pushed to Redis"
    Conv->>PAI: _capture sets _usage after iteration

Comments Outside Diff (6)
src/agentex/lib/adk/_modules/_pydantic_ai_sync.py, line 257-261
This early `continue` makes the fallback in the `ToolCallPartDelta` branch unreachable. If a provider emits a tool-call delta before a `PartStartEvent`, `message_index` is missing, so the converter skips the event before it can synthesize the tool request from the delta. That drops the tool call and its argument stream for the provider edge this code is trying to handle. Please allocate an Agentex message index and emit a synthetic tool-request start when the first event for an index is a tool-call delta.
src/agentex/lib/adk/_modules/_pydantic_ai_sync.py, line 280-299
Once `tool_call_meta` is initialized from `PartStartEvent`, later deltas cannot fill in missing metadata. If the start event has an empty `tool_call_id` or tool name and a later `ToolCallPartDelta` supplies `tool_call_id` or `tool_name_delta`, this branch keeps using the stale empty values. The emitted `ToolRequestDelta` then has blank identifiers, and the later tool response or span close cannot match the request. Please merge non-empty delta metadata into `tool_call_meta` before constructing the delta.
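One possible shape for the requested merge is sketched below. It is only an illustration of the reviewer's suggestion, not the converter's code: `ToolCallMetaSketch` and `merge_delta_metadata` are hypothetical names, and treating `tool_name_delta` as text to append to the name so far is an assumption about how the delta should be applied.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ToolCallMetaSketch:
    """Hypothetical per-part metadata kept while a tool call streams."""
    tool_call_id: str = ""
    tool_name: str = ""


def merge_delta_metadata(
    meta: ToolCallMetaSketch,
    tool_call_id: Optional[str],
    tool_name_delta: Optional[str],
) -> ToolCallMetaSketch:
    """Fold non-empty delta fields into the stored metadata."""
    if tool_call_id:
        # A later delta may supply the id the start event lacked.
        meta.tool_call_id = tool_call_id
    if tool_name_delta:
        # Assumption: name deltas are incremental and get appended.
        meta.tool_name += tool_name_delta
    return meta
```

Calling this before building each outgoing delta would mean the emitted identifiers reflect everything seen so far, rather than whatever the start event happened to carry.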
src/agentex/lib/core/harness/emitter.py, line 66
`usage=turn.usage()` is evaluated eagerly as a keyword argument before `auto_send` iterates `turn.events`. `PydanticAITurn` only populates usage via the `on_result` callback during stream iteration.
Artifact from the T-Rex run: repro output showing stale usage in TurnResult
src/agentex/lib/core/harness/emitter.py, line 66
- `auto_send_turn` reads `turn.usage()` before the event stream is consumed.
- `PydanticAITurn` updates usage only after the terminal run-result event is consumed.
- `TurnResult.usage` keeps the initial empty usage instead of the captured token counts.
- `result.usage` had empty token fields while `turn.usage()` after iteration contained the expected values.
Artifact from the T-Rex run: verbose output showing stale vs real usage values
src/agentex/lib/core/harness/emitter.py, line 66
`auto_send_turn` calls `turn.usage()` on line 66 before passing it to `auto_send`, but `PydanticAITurn` only populates usage when the terminal `AgentRunResultEvent` is consumed during event iteration. The returned `TurnResult.usage` always has None tokens and 0 LLM calls.
- `auto_send_turn` evaluates `usage=turn.usage()` eagerly before `auto_send` consumes `turn.events`. Since `PydanticAITurn._capture` only fires when `AgentRunResultEvent` is yielded during iteration, the usage snapshot is taken too early.
- Read `turn.usage()` after event consumption: have `auto_send` call a usage callback after exhausting the event iterator, or change `auto_send_turn` to call `auto_send` without usage, then patch `result.usage = turn.usage()` after awaiting.
Artifact from the T-Rex run: verbose output showing stale vs real usage values
src/agentex/lib/core/harness/emitter.py, line 61-67
`PydanticAITurn` fills its usage only after `turn.events` is fully consumed. This call reads `turn.usage()` before `auto_send` starts iterating the stream, so async Pydantic AI runs return a `TurnResult` with empty token fields even when the terminal result event contains real usage.
Artifacts from the T-Rex run: repro output confirming the usage timing bug; repro output confirming the bug
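The timing problem these comments describe, and the second fix option mentioned above (call `auto_send` without usage, then read `turn.usage()` after awaiting), can be reproduced in isolation. Everything below is a hypothetical stand-in: `FakeTurn`, `auto_send_sketch`, and the two `auto_send_turn_*` functions only mimic the shape of the real emitter, and the result is a plain dict rather than a `TurnResult`.

```python
import asyncio
from typing import Any, AsyncIterator, Dict, Optional


class FakeTurn:
    """Hypothetical turn whose usage is only known after iteration."""

    def __init__(self) -> None:
        self._usage: Optional[Dict[str, int]] = None

    @property
    def events(self) -> AsyncIterator[str]:
        return self._iterate()

    async def _iterate(self) -> AsyncIterator[str]:
        yield "text chunk"
        # Mimics the on_result capture firing on the terminal event.
        self._usage = {"input_tokens": 12, "output_tokens": 5}

    def usage(self) -> Optional[Dict[str, int]]:
        return self._usage


async def auto_send_sketch(
    events: AsyncIterator[str], usage: Any = None
) -> Dict[str, Any]:
    """Drain the stream and return a result carrying the given usage."""
    async for _ in events:
        pass
    return {"usage": usage}


async def auto_send_turn_eager(turn: FakeTurn) -> Dict[str, Any]:
    # turn.usage() is evaluated here, before any event is consumed.
    return await auto_send_sketch(turn.events, usage=turn.usage())


async def auto_send_turn_fixed(turn: FakeTurn) -> Dict[str, Any]:
    result = await auto_send_sketch(turn.events)
    result["usage"] = turn.usage()  # read after the stream is exhausted
    return result
```

Running both variants against a fresh `FakeTurn` shows the eager version returning the initial empty usage while the fixed version returns the captured counts.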
Reviews (5): Last reviewed commit: "refactor(pydantic-ai): drop coalesce_too..."