

AgentTrace

The top-level container for a complete agent execution.
```python
from aevyra_witness import AgentTrace, TraceNode

trace = AgentTrace(
    nodes=[...],        # list[TraceNode], required
    ideal=None,         # str | None — expected/reference output
    metadata={},        # dict — arbitrary trace-level metadata
)
```
| Field | Type | Description |
| --- | --- | --- |
| `nodes` | `list[TraceNode]` | Ordered list of spans in execution order. DAG structure is encoded via `parent_id`. |
| `ideal` | `str \| None` | Expected or reference output for the run. Optional — used by ablation’s `placeholder="ideal"` strategy and by judges that compare against a known-good answer. |
| `metadata` | `dict` | Arbitrary key/value metadata for the trace (e.g. `session_id`, `model_name`, `pipeline_version`). |

Methods

```python
trace.to_dict()                     # dict — JSON-serializable
trace.to_json(indent=2)             # str — JSON string
AgentTrace.from_dict(d)             # classmethod — reconstruct from dict
trace.to_trace_text()               # str — human-readable rendering for LLMs
trace.by_id(span_id)                # TraceNode | None — look up a node by id
```
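The `by_id` lookup can be pictured as a linear scan over the node list. A minimal sketch in plain-dict form, where `by_id` is a hypothetical stand-in for the method and the dicts stand in for `TraceNode` objects:

```python
def by_id(nodes, span_id):
    """Return the first node whose id matches, else None (sketch of trace.by_id)."""
    return next((n for n in nodes if n.get("id") == span_id), None)

nodes = [{"id": "n0", "name": "classify"}, {"id": "n1", "name": "answer"}]
by_id(nodes, "n1")       # {"id": "n1", "name": "answer"}
by_id(nodes, "missing")  # None
```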

Serialisation

```python
import json
from pathlib import Path

# Save
Path("trace.json").write_text(trace.to_json(indent=2))

# Load
trace2 = AgentTrace.from_dict(json.loads(Path("trace.json").read_text()))
```

TraceNode

One span in the execution trace.
```python
TraceNode(
    name="classify",           # str, required
    input=ticket,              # any JSON-serializable value
    output="billing/refund",   # any JSON-serializable value
    id="n0",                   # str — unique within this trace
    parent_id=None,            # str | None — id of parent span
    kind="reason",             # str — KIND_REASON, KIND_TOOL, etc.
    prompt_id="classifier_v1", # str | None — prompt identity for Reflex
    step=1,                    # int | None — step index in a plan-act loop
    optimize=True,             # bool — mark this prompt for Reflex
    tokens=312,                # int — LLM tokens for this span
    started_at=1714000000.0,   # float | None — Unix timestamp
    ended_at=1714000000.4,     # float | None — Unix timestamp
    error=None,                # str | None — error message on failure
    metadata={},               # dict — arbitrary per-span metadata
)
```
| Field | Type | Default | Description |
| --- | --- | --- | --- |
| `name` | `str` | required | Human-readable span name. Not required to be unique — use `id` for stable identity. |
| `input` | `Any` | `None` | The span’s input. Any JSON-serializable value. |
| `output` | `Any` | `None` | The span’s output. Any JSON-serializable value. |
| `id` | `str` | `""` | Unique identifier within this trace. Auto-assigned as `n0`, `n1`, … if left empty. Required when using `parent_id` wiring. |
| `parent_id` | `str \| None` | `None` | `id` of the parent span. `None` means this is a root span. Parallel siblings share the same `parent_id`. |
| `kind` | `str` | `"other"` | Span kind — see Span kinds below. |
| `prompt_id` | `str \| None` | `None` | Identity of the underlying prompt. Multiple spans may share a `prompt_id` (e.g. the planner prompt at each reasoning step). Reflex uses this to optimize the prompt once and have the update apply to every call site. |
| `step` | `int \| None` | `None` | Logical step index in a plan-act loop. `None` for simple linear traces. |
| `optimize` | `bool` | `False` | Mark this span’s prompt as a Reflex optimization target. When multiple spans share a `prompt_id`, set `optimize=True` on all of them. |
| `tokens` | `int` | `0` | LLM tokens consumed (prompt + completion combined). 0 for non-LLM spans. |
| `started_at` | `float \| None` | `None` | Unix timestamp (seconds) when the span began. |
| `ended_at` | `float \| None` | `None` | Unix timestamp (seconds) when the span ended. |
| `error` | `str \| None` | `None` | Short error message if the span failed. `None` on success. |
| `metadata` | `dict` | `{}` | Arbitrary per-span key/value metadata. See Well-known metadata keys. |
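The auto-assignment rule for empty ids can be sketched with plain dicts. The helper below is hypothetical (not the library's actual code) and assumes the documented behaviour: empty ids become `n0`, `n1`, … by list position, while explicit ids are left alone:

```python
def assign_ids(nodes):
    """Fill empty ids with n0, n1, ... by list position (sketch of the documented rule)."""
    for i, node in enumerate(nodes):
        if not node.get("id"):
            node["id"] = f"n{i}"
    return nodes

nodes = [{"name": "classify", "id": ""}, {"name": "answer", "id": "custom"}]
assign_ids(nodes)
# nodes[0]["id"] == "n0"; nodes[1]["id"] stays "custom"
```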

Span kinds

| Constant | String value | When to use |
| --- | --- | --- |
| `KIND_REASON` | `"reason"` | An LLM reasoning or planning step |
| `KIND_TOOL` | `"tool"` | A tool or function call (native or MCP) |
| `KIND_RETRIEVE` | `"retrieve"` | A retrieval or memory lookup |
| `KIND_AGENT` | `"agent"` | A nested sub-agent invocation |
| `KIND_OTHER` | `"other"` | Anything else / unspecified |

Custom kind strings are allowed — downstream tools render them generically.
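Kinds are handy for simple aggregations, for instance totalling LLM tokens per kind. A plain-dict sketch using the string values from the table above (the helper name is illustrative):

```python
from collections import defaultdict

def tokens_by_kind(nodes):
    """Sum the tokens field per span kind, defaulting missing fields sensibly."""
    totals = defaultdict(int)
    for node in nodes:
        totals[node.get("kind", "other")] += node.get("tokens", 0)
    return dict(totals)

nodes = [
    {"name": "plan", "kind": "reason", "tokens": 312},
    {"name": "stripe_lookup", "kind": "tool", "tokens": 0},
    {"name": "respond", "kind": "reason", "tokens": 128},
]
tokens_by_kind(nodes)  # {"reason": 440, "tool": 0}
```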

Well-known metadata keys

| Key | Constant | Description |
| --- | --- | --- |
| `"mcp_server"` | `META_MCP_SERVER` | Name of the MCP server that exposed this tool (e.g. `"github"`, `"slack"`). Signals “this is an MCP tool call”. |
| `"tool_call_id"` | `META_TOOL_CALL_ID` | The LLM’s `tool_use` id, linking this tool span to the reasoning turn that dispatched it. |
| `"error_code"` | `META_ERROR_CODE` | Machine-readable error code from a failed tool call. |
| `"latency_ms"` | `META_LATENCY_MS` | Wall-clock duration in milliseconds, when `started_at`/`ended_at` aren’t available. |
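The `"latency_ms"` fallback can be applied uniformly when reading spans back: prefer the timestamps, fall back to the metadata key. A hypothetical helper over a plain-dict span (the helper and dict shape are illustrative, not library API):

```python
def span_latency_ms(node):
    """Prefer started_at/ended_at; fall back to the "latency_ms" metadata key."""
    started, ended = node.get("started_at"), node.get("ended_at")
    if started is not None and ended is not None:
        return (ended - started) * 1000.0
    return node.get("metadata", {}).get("latency_ms")

span_latency_ms({"started_at": 1714000000.0, "ended_at": 1714000000.4})  # ~400.0
span_latency_ms({"metadata": {"latency_ms": 420}})                       # 420
span_latency_ms({})                                                      # None
```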

Factory: TraceNode.mcp_tool()

Convenience constructor for MCP tool spans — pins the metadata conventions so Origin and dashboards render them consistently:
```python
node = TraceNode.mcp_tool(
    "GMAIL_SEND_EMAIL",
    arguments={"to": "alice@example.com", "subject": "Hi"},
    result={"message_id": "msg_abc"},
    server="gmail",
    tool_call_id="toolu_01ABC",
    parent_id="plan_step_1",
    latency_ms=420,
)
```
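In plain-dict terms, the factory presumably pins the well-known keys along these lines. This is a sketch of the convention, not the library's implementation, and `mcp_tool_dict` is a hypothetical name:

```python
def mcp_tool_dict(name, arguments=None, result=None, server=None,
                  tool_call_id=None, parent_id=None, latency_ms=None):
    """Build a tool-span dict following the MCP metadata conventions above (sketch)."""
    metadata = {}
    if server is not None:
        metadata["mcp_server"] = server        # META_MCP_SERVER
    if tool_call_id is not None:
        metadata["tool_call_id"] = tool_call_id  # META_TOOL_CALL_ID
    if latency_ms is not None:
        metadata["latency_ms"] = latency_ms    # META_LATENCY_MS
    return {
        "name": name,
        "kind": "tool",
        "input": arguments,
        "output": result,
        "parent_id": parent_id,
        "metadata": metadata,
    }

node = mcp_tool_dict("GMAIL_SEND_EMAIL",
                     arguments={"to": "alice@example.com"},
                     server="gmail", tool_call_id="toolu_01ABC",
                     latency_ms=420)
# node["kind"] == "tool"; node["metadata"]["mcp_server"] == "gmail"
```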

DAG wiring examples

Linear pipeline — no parent_id needed:
```python
AgentTrace(nodes=[
    TraceNode("classify", input=ticket,    output="billing"),
    TraceNode("retrieve", input="billing", output=policy_docs),
    TraceNode("answer",   input=ticket,    output=reply, optimize=True),
])
```
Plan-act with parallel tool calls:
```python
AgentTrace(nodes=[
    TraceNode("plan", id="p1", kind=KIND_REASON, prompt_id="planner",
              step=1, input=query, output=plan1, optimize=True),
    TraceNode("stripe_lookup", id="t1a", kind=KIND_TOOL, parent_id="p1",
              input={"charge_id": "ch_123"}, output={...}),
    TraceNode("kb_search",     id="t1b", kind=KIND_TOOL, parent_id="p1",
              input={"query": "refund policy"}, output=[...]),

    TraceNode("plan", id="p2", kind=KIND_REASON, prompt_id="planner",
              step=2, input=step1_context, output=plan2, optimize=True),
    TraceNode("respond", id="r", kind=KIND_REASON, prompt_id="responder",
              step=3, input=final_context, output=final_reply),
])
```
Both p1 and p2 carry prompt_id="planner" and optimize=True. Reflex will optimize the single planner prompt and the update applies to every step.
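The `parent_id` wiring above can be turned into an explicit adjacency map to recover the parallel siblings. A plain-dict sketch, with dicts standing in for `TraceNode` objects and `children_of` as a hypothetical helper:

```python
from collections import defaultdict

def children_of(nodes):
    """Group span ids by parent_id; parallel siblings share one parent."""
    children = defaultdict(list)
    for node in nodes:
        children[node.get("parent_id")].append(node["id"])
    return dict(children)

nodes = [
    {"id": "p1", "parent_id": None},
    {"id": "t1a", "parent_id": "p1"},
    {"id": "t1b", "parent_id": "p1"},
    {"id": "p2", "parent_id": None},
]
children_of(nodes)  # {None: ["p1", "p2"], "p1": ["t1a", "t1b"]}
```

Root spans land under the `None` key; `t1a` and `t1b` show up as parallel children of `p1`, matching the plan-act example.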