## Prerequisites

- An NVIDIA or AMD GPU (or use `--dry-run` on CPU)
- Python 3.10+
- An Anthropic API key (or any supported LLM provider)
## Install
## Prepare a workload

Forge optimizes against your real traffic. The workload is a JSONL file — one request per line.

## Run
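The workload file example did not survive extraction. As a sketch, here is one way to build such a JSONL file in Python; the request fields (`prompt`, `max_tokens`) are assumptions for illustration, not taken from these docs:

```python
import json

# Hypothetical request fields -- the real workload schema is not shown on this page.
requests = [
    {"prompt": "Summarize the plot of Hamlet in one sentence.", "max_tokens": 128},
    {"prompt": "Write a haiku about GPUs.", "max_tokens": 64},
]

# One JSON object per line, as the docs describe.
with open("workload.jsonl", "w") as f:
    for req in requests:
        f.write(json.dumps(req) + "\n")

# Sanity check: every line parses back as JSON.
with open("workload.jsonl") as f:
    parsed = [json.loads(line) for line in f]
print(len(parsed))  # → 2
```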
Forge will:

- Auto-detect your GPU via `nvidia-smi`/`rocm-smi`
- Boot vLLM with the baseline config
- Benchmark your workload at up to 8 concurrent requests
- Ask the agent LLM to propose a mutation
- Boot → bench → keep or revert, repeat
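The boot → bench → keep-or-revert loop above can be sketched as follows. Everything here is a stand-in: the mutation, the benchmark, and the config knob (`max_num_seqs`, a vLLM setting used purely as an example) are placeholders, not Forge's real internals:

```python
def propose_mutation(config):
    # In Forge this comes from the agent LLM; here we just bump one knob.
    mutated = dict(config)
    mutated["max_num_seqs"] = mutated["max_num_seqs"] * 2
    return mutated

def bench(config):
    # Placeholder benchmark: pretend throughput peaks at max_num_seqs == 32.
    return -abs(config["max_num_seqs"] - 32)

def optimize(baseline, steps=4):
    best, best_score = baseline, bench(baseline)
    for _ in range(steps):
        candidate = propose_mutation(best)   # agent proposes a mutation
        score = bench(candidate)             # boot + benchmark it
        if score > best_score:               # keep if it improves...
            best, best_score = candidate, score
        # ...otherwise revert (leave `best` unchanged) and try again
    return best

print(optimize({"max_num_seqs": 4}))  # → {'max_num_seqs': 32}
```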
Everything is saved under `.forge/runs/<run-id>/`.
## Resume an interrupted run

If a run is interrupted (Ctrl-C, timeout, OOM), resume with no arguments — Forge reads everything from the saved config.

## View results
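The results-viewing example was lost here. Assuming only what this page states (runs are saved under `.forge/runs/<run-id>/`), a minimal sketch for listing saved runs; the demo directory and everything inside a run directory are fabricated for illustration:

```python
from pathlib import Path

# Fabricate one run directory so the listing has something to show;
# real run ids and directory contents are not documented on this page.
Path(".forge/runs/demo-run").mkdir(parents=True, exist_ok=True)

# Each subdirectory under .forge/runs/ is one run.
for run in sorted(Path(".forge/runs").iterdir()):
    if run.is_dir():
        print(run.name)  # → demo-run
```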
## Device options

| Flag | Hardware | Requirement |
|---|---|---|
| `--device cuda` | NVIDIA GPU | `nvidia-smi` on PATH |
| `--device rocm` | AMD GPU | `rocm-smi` on PATH |
| `--device cpu` | No GPU | `--dry-run` recommended |
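Auto-detection presumably checks for the same tools listed in the table; a sketch of that logic using `shutil.which` (Forge's real detection code is not shown here):

```python
import shutil

def detect_device():
    # Mirrors the table above: prefer CUDA, then ROCm, else fall back to CPU.
    if shutil.which("nvidia-smi"):
        return "cuda"
    if shutil.which("rocm-smi"):
        return "rocm"
    return "cpu"

print(detect_device())
```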
## LLM providers

The `--llm` flag selects the agent model. Format: `provider/model`.
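Splitting such a spec is straightforward; this sketch splits on the first slash so model ids containing `/` still work (the model id below is illustrative, not a documented value):

```python
def parse_llm_spec(spec):
    # "provider/model" -- split on the first "/" only.
    provider, model = spec.split("/", 1)
    return provider, model

print(parse_llm_spec("anthropic/some-model"))  # → ('anthropic', 'some-model')
```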
## Next steps

- Tutorial: Colab quickstart — run on a free T4
- Concepts: Recipe — what’s in `recipe.yaml`
- Concepts: Playbook — how the agent decides what to try
- API reference: Orchestrator — programmatic use