Laravel AI Evaluation

Make sure your agents respond how you want them to.

Real model calls

Evaluate actual AI behavior, not mocked responses.

Standalone Artisan runner

Run eval files via `php artisan ai-evals:run` without Pest or PHPUnit.

Pest native

Run directly inside Pest from `tests/AgentEvals` with a fluent API.

CI ready

Evals hard-fail when expectations are not met.

Quick Start

For a guided walkthrough, start with the "First eval in 5 minutes" page.

1) Install

bash

composer require --dev larswiegers/laravel-ai-evaluation

2) Configure your run mode

Pest (add to `tests/Pest.php`)

php

pest()->extend(Tests\TestCase::class)->in('Feature', 'AgentEvals');

Standalone

No additional setup is required.

3) Generate an eval file

Pest

bash

php artisan make:ai-evals refund-policy --type=pest

Standalone

bash

php artisan make:ai-evals refund-policy --type=standalone

The command scaffolds a starter file you can edit for your agent and expectations.

4) Run

Pest

bash

vendor/bin/pest tests/AgentEvals

Standalone

bash

php artisan ai-evals:run

5) Configure summary output

Enable summaries and choose the format in your .env (or CI environment):

Text

dotenv

AI_EVAL_SUMMARY=true
AI_EVAL_SUMMARY_FORMAT=text
AI_EVAL_SUMMARY_CURRENCY=USD

JSON

dotenv

AI_EVAL_SUMMARY=true
AI_EVAL_SUMMARY_FORMAT=json
AI_EVAL_SUMMARY_CURRENCY=USD
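Since these settings can also come from the CI environment, one way to apply them to a single job is to export them in the shell before running the evals. This is a sketch that assumes real environment variables take precedence over `.env` values (Laravel's default behavior when configuration is not cached); the run command is left commented out so the snippet stays side-effect free:

```bash
# Export the summary settings for this shell session or CI job.
export AI_EVAL_SUMMARY=true
export AI_EVAL_SUMMARY_FORMAT=json
export AI_EVAL_SUMMARY_CURRENCY=USD

# Then run the evals as in step 4, for example:
# php artisan ai-evals:run
```

If your pipeline caches configuration, the exported values may not be picked up until the cache is rebuilt.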

6) Get the summary output

For standalone JSON, JUnit, and GitHub annotation reports, see Output formats.

Run your evals and check the end of the output:

Text

text

$ vendor/bin/pest tests/AgentEvals

AI Eval Summary
Total: 13
Passed: 12
Failed: 1
Prompt tokens: 7842
Completion tokens: 1966
Total tokens: 9808
Estimated cost: USD 0.070000

JSON

json

$ php artisan ai-evals:run

{"type":"ai_eval_summary","total":13,"passed":12,"failed":1,"prompt_tokens":7842,"completion_tokens":1966,"total_tokens":9808,"estimated_cost":0.07,"currency":"USD"}
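In CI it can be useful to act on the JSON summary, for example to enforce a token budget on top of the pass/fail result. The following is a minimal sketch, assuming the summary is the flat, single-line object shown above. `summary_field` and `TOKEN_BUDGET` are hypothetical names introduced for illustration and are not part of the package:

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of the package): read one integer field
# from the flat, single-line JSON summary. For anything nested, prefer a
# real JSON parser such as jq.
summary_field() {
  local line="$1" key="$2"
  printf '%s\n' "$line" | sed -n "s/.*\"${key}\":\([0-9][0-9]*\).*/\1/p"
}

# Sample line copied from the output above; in CI you would capture the
# summary line from the runner's output instead.
summary='{"type":"ai_eval_summary","total":13,"passed":12,"failed":1,"prompt_tokens":7842,"completion_tokens":1966,"total_tokens":9808,"estimated_cost":0.07,"currency":"USD"}'

TOKEN_BUDGET=20000  # assumed budget for illustration; pick your own
total_tokens="$(summary_field "$summary" total_tokens)"

if [ "$total_tokens" -gt "$TOKEN_BUDGET" ]; then
  echo "Token budget exceeded: $total_tokens > $TOKEN_BUDGET"
else
  echo "Within token budget: $total_tokens <= $TOKEN_BUDGET"
fi
```

The helper only handles integer fields such as `total`, `failed`, and `total_tokens`; a decimal value like `estimated_cost` would need a real JSON parser.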
