Skip to main content

Documentation Index

Fetch the complete documentation index at: https://docs.aevyra.ai/llms.txt

Use this file to discover all available pages before exploring further.

The playbook is a markdown file that tells the agent what to try and why. It is the primary lever for domain expertise: when you know something about your hardware or workload that the agent doesn’t, you add a heuristic to the playbook. The default playbook lives at src/aevyra_forge/playbook.md and is bundled into the package so it works after pip install.

Structure

A playbook is a markdown file with ## section headings. Sections Forge always includes in the agent prompt:
SectionPurpose
## Search spaceLegal values for each config field
## HeuristicsRules of thumb for what to try first
## ForbiddenConfig combinations that crash or regress
## TerminationWhen to stop searching
Hardware-specific sections are included only for the matching vendor:
## Hardware: nvidia

On T4 (16 GB), set gpu_memory_utilization ≤ 0.85.
Prefer enable_chunked_prefill=true for long-context workloads.

## Hardware: amd

On MI300X, use attention_backend=ROCM_AITER_FA for best TTFT.

YAML front-matter

---
version: 1
layers: [config, quant]
---
version is used by Reflex (the prompt optimizer) to track playbook revisions.

Custom playbook

Pass --playbook path/to/playbook.md to override the built-in:
aevyra-forge tune \
  --model mymodel \
  --device cuda \
  --workload workload.jsonl \
  --playbook my_playbook.md

Validate a playbook

aevyra-forge playbook validate my_playbook.md
# ✓ Playbook loaded OK (6 sections)

aevyra-forge playbook show my_playbook.md
# prints all sections

Programmatic access

from aevyra_forge.playbook import load_playbook, format_for_agent
from aevyra_forge.recipe import HardwareSpec

pb = load_playbook("playbook.md")
hw = HardwareSpec(vendor="nvidia", gpu_type="A100", count=4, memory_gb_per_gpu=80)
text = format_for_agent(pb, hw, layer="config")
print(text)
format_for_agent filters to only the sections relevant for the current hardware and layer, keeping the agent prompt focused.

Reflex integration

In v0.2+, Reflex will read the experiment history and propose playbook edits — turning empirical findings (e.g. “prefix caching always helped on this hardware”) into permanent heuristics. For now, edits are manual.