# Evaluations

> Test and validate your AI agent's performance with custom evaluation suites

Create custom evaluation suites to batch test your agent's performance and ensure consistent, high-quality responses across different scenarios.

<Frame>
  <img src="https://mintcdn.com/lightdash-mintlify-5a87a3b2/lEsRVU38SWQHJGxN/images/guides/ai-agents/evaluations-start-page.png?fit=max&auto=format&n=lEsRVU38SWQHJGxN&q=85&s=b77374cfafd7dc26d37ab7fb5005ffb4" alt="AI Agent evaluation details interface" width="1180" height="491" data-path="images/guides/ai-agents/evaluations-start-page.png" />
</Frame>

## How evaluations work

1. **Define evaluation questions** - Build a set of test questions for each agent. You can either:
   * Manually create questions that represent common use cases
   * Select responses from existing agent conversations in the admin page to add to your evaluation set

     <Frame>
       <img src="https://mintcdn.com/lightdash-mintlify-5a87a3b2/lEsRVU38SWQHJGxN/images/guides/ai-agents/evals-add-to-evals-1.png?fit=max&auto=format&n=lEsRVU38SWQHJGxN&q=85&s=57daab350563302a162c4db98f595f0b" alt="Adding an agent response to an evaluation set" width="2730" height="2054" data-path="images/guides/ai-agents/evals-add-to-evals-1.png" />
     </Frame>

2. **Run batch tests** - Execute all prompts in your evaluation set against the agent to see how it responds

   <Frame>
     <img src="https://mintcdn.com/lightdash-mintlify-5a87a3b2/lEsRVU38SWQHJGxN/images/guides/ai-agents/run-batch-tests.png?fit=max&auto=format&n=lEsRVU38SWQHJGxN&q=85&s=d62894ee925cac30661904a57b7f94b7" alt="Running batch tests" width="1493" height="536" data-path="images/guides/ai-agents/run-batch-tests.png" />
   </Frame>

3. **Review results** - Manually review each of the agent's responses to confirm it meets your quality standards

   <Frame>
     <img src="https://mintcdn.com/lightdash-mintlify-5a87a3b2/lEsRVU38SWQHJGxN/images/guides/ai-agents/review-evaluation-results.png?fit=max&auto=format&n=lEsRVU38SWQHJGxN&q=85&s=d356e1784a56d935f3ca97333168f9ec" alt="Reviewing evaluation results" width="1904" height="958" data-path="images/guides/ai-agents/review-evaluation-results.png" />
   </Frame>
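
The three steps above can be pictured as a small data model. The following is a conceptual sketch only, not Lightdash's API: `EvaluationSet`, `EvalResult`, and `stub_agent` are hypothetical names invented for illustration, and in the product these steps are carried out through the UI shown in the screenshots.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class EvalResult:
    prompt: str
    response: str
    # Set by a human reviewer after the run; None means "not reviewed yet".
    passed: Optional[bool] = None


@dataclass
class EvaluationSet:
    # Hypothetical model of an evaluation set: a named list of prompts.
    name: str
    prompts: List[str] = field(default_factory=list)

    def add_prompt(self, prompt: str) -> None:
        # Step 1: skip exact duplicates so each question runs once per batch.
        if prompt not in self.prompts:
            self.prompts.append(prompt)

    def run(self, agent: Callable[[str], str]) -> List[EvalResult]:
        # Step 2: execute every prompt in the set against the agent.
        return [EvalResult(prompt=p, response=agent(p)) for p in self.prompts]


def stub_agent(prompt: str) -> str:
    # Hypothetical stand-in for a real agent call.
    return f"(stub answer to: {prompt})"


evals = EvaluationSet(name="revenue-questions")
evals.add_prompt("What was total revenue last month?")
evals.add_prompt("Which product category grew fastest this quarter?")
evals.add_prompt("What was total revenue last month?")  # duplicate, ignored

results = evals.run(stub_agent)

# Step 3: a reviewer marks each response by hand.
results[0].passed = True
results[1].passed = False
needs_work = [r.prompt for r in results if r.passed is False]
```

Keeping the review verdict separate from the run itself mirrors the workflow described here: the batch run only collects responses, and a person decides whether each one is acceptable.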

## Using feedback to improve evaluations

Encourage your team to actively use the thumbs-up/thumbs-down feature when interacting with AI agents. This feedback helps admins in two key ways:

* **Identify improvement areas** - Thumbs-down responses highlight where the agent needs work
* **Build better evaluation sets** - Filter for thumbs-down responses and add them to your evaluation suite to test fixes and prevent regressions

<Frame>
  <img src="https://mintcdn.com/lightdash-mintlify-5a87a3b2/lEsRVU38SWQHJGxN/images/guides/ai-agents/build-better-evaluation-sets.png?fit=max&auto=format&n=lEsRVU38SWQHJGxN&q=85&s=cdf7a66bbd1c3029900b6b2d310425d4" alt="Building better evaluation sets" width="2660" height="1178" data-path="images/guides/ai-agents/build-better-evaluation-sets.png" />
</Frame>
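
As a rough illustration of the idea behind this filter, the sketch below collects the prompts of thumbs-down responses and appends any new ones to an evaluation set. It is a conceptual sketch under assumptions: `AgentResponse`, its `feedback` values, and `thumbs_down_prompts` are hypothetical and do not reflect how Lightdash stores feedback.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class AgentResponse:
    # Hypothetical record of one agent answer and the user's rating.
    prompt: str
    response: str
    # "up", "down", or None when the user left no feedback.
    feedback: Optional[str] = None


def thumbs_down_prompts(history: List[AgentResponse]) -> List[str]:
    """Return unique prompts whose responses were rated thumbs-down, in first-seen order."""
    seen = set()
    prompts = []
    for item in history:
        if item.feedback == "down" and item.prompt not in seen:
            seen.add(item.prompt)
            prompts.append(item.prompt)
    return prompts


history = [
    AgentResponse("What was total revenue last month?", "answer A", "up"),
    AgentResponse("How many active users do we have?", "answer B", "down"),
    AgentResponse("Show churn by region", "answer C", None),
    AgentResponse("How many active users do we have?", "answer D", "down"),
]

# Existing evaluation prompts; thumbs-down prompts are appended if missing.
evaluation_prompts = ["What was total revenue last month?"]
for prompt in thumbs_down_prompts(history):
    if prompt not in evaluation_prompts:
        evaluation_prompts.append(prompt)
```

Once a thumbs-down prompt is in the evaluation set, every later batch run re-tests it, which is how a fixed problem is kept from quietly coming back.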

This systematic testing approach helps you:

* Verify agent performance before deploying changes
* Ensure consistency across common queries
