## Installation

The CLI is included with the package:

```bash
pip install aevyra-verdict
aevyra-verdict --help
```
## run

Run evals on a dataset and print a comparison table.

```bash
aevyra-verdict run <dataset> [options]
```
### Models

| Flag | Short | Description |
|---|---|---|
| `--model` | `-m` | Model in `provider/model` format. Repeat for multiple. |
| `--config` | `-c` | Path to a models config file (`.yaml`, `.json`, `.toml`). |

`--model` and `--config` are mutually exclusive.
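The config file schema is not documented here, so the sketch below is an assumption: a minimal `models.yaml` listing model specs in the same `provider/model` format the `--model` flag accepts. The `models:` key name is hypothetical.

```yaml
# Hypothetical models.yaml -- the "models" key is an assumption, not a
# documented schema. Each entry uses the provider/model spec format.
models:
  - openai/gpt-5.4-nano
  - qwen/qwen3.5-9b
```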
### Metrics

| Flag | Description |
|---|---|
| `--metric` | Built-in metric: `rouge`, `bleu`, or `exact`. Repeat for multiple. Default: `rouge`. |
| `--judge` | Add an LLM-as-judge using this model spec. |
| `--judge-prompt` | Path to a custom judge prompt template (`.md` or `.txt`). |
| `--custom-metric` | Custom scoring function in `file.py:function_name` format. Repeat for multiple. |
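Only the `file.py:function_name` convention is documented; the function signature below is an assumption (a `(prediction, reference) -> float` scorer), shown as a minimal sketch of what a custom metric module might look like.

```python
# my_metrics.py -- hypothetical module. The (prediction, reference) -> float
# signature is an assumption, not a documented contract of aevyra-verdict.

def brevity_score(prediction: str, reference: str) -> float:
    """Return 1.0 when the prediction is no longer than the reference,
    decaying toward 0.0 as the prediction grows longer (assumed convention
    that higher scores are better)."""
    pred_len = len(prediction.split())
    ref_len = max(len(reference.split()), 1)
    if pred_len <= ref_len:
        return 1.0
    return ref_len / pred_len
```

It would then be wired in with `--custom-metric my_metrics.py:brevity_score`, as in the examples below.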
### Output

| Flag | Short | Description |
|---|---|---|
| `--output` | `-o` | Save results as JSON to this path. |
### Tuning

| Flag | Default | Description |
|---|---|---|
| `--max-workers` | `10` | Concurrent requests per model. Lower this if you hit rate limits. |
| `--temperature` | `0.0` | Sampling temperature. |
| `--max-tokens` | `1024` | Max tokens per completion. |
### Examples

```bash
# Compare two models
aevyra-verdict run data.jsonl -m openai/gpt-5.4-nano -m qwen/qwen3.5-9b

# Use a config file
aevyra-verdict run data.jsonl --config models.yaml

# ROUGE + LLM judge, save results
aevyra-verdict run data.jsonl -m openai/gpt-5.4-nano \
  --metric rouge \
  --judge openai/gpt-5.4 \
  -o results.json

# Custom judge prompt and scoring function
aevyra-verdict run data.jsonl -m openai/gpt-5.4-nano \
  --judge openai/gpt-5.4 \
  --judge-prompt prompt.md \
  --custom-metric my_metrics.py:brevity_score

# Reduce concurrency if hitting rate limits
aevyra-verdict run data.jsonl --config models.yaml --max-workers 3
```
## inspect

Preview a dataset without running any models.

```bash
aevyra-verdict inspect <dataset>
```
Shows sample count, whether reference answers are present, metadata keys, and the first sample.
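The dataset field names are not documented on this page; the sketch below assumes JSON Lines rows with hypothetical `input`, `reference`, and `metadata` keys, chosen to match what `inspect` reports (reference answers present, metadata keys). Treat the schema as an assumption.

```python
import json

# Hypothetical dataset rows -- the "input", "reference", and "metadata"
# field names are assumptions, not a documented schema.
samples = [
    {"input": "What is the capital of France?",
     "reference": "Paris",
     "metadata": {"topic": "geography"}},
    {"input": "Name the largest planet in the solar system.",
     "reference": "Jupiter",
     "metadata": {"topic": "astronomy"}},
]

# Write one JSON object per line (JSON Lines), matching the .jsonl
# extension used in the run examples above.
with open("data.jsonl", "w") as f:
    for sample in samples:
        f.write(json.dumps(sample) + "\n")
```

A file like this could then be previewed with `aevyra-verdict inspect data.jsonl`.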
## providers

List all available providers and whether their API keys are configured.

```bash
aevyra-verdict providers
```
```
Available providers:

  openai       OPENAI_API_KEY       ✓ set
  anthropic    ANTHROPIC_API_KEY    ✗ not set
  google       GOOGLE_API_KEY       ✗ not set
  mistral      MISTRAL_API_KEY      ✗ not set
  cohere       COHERE_API_KEY       ✗ not set
  openrouter   OPENROUTER_API_KEY   ✗ not set
```