Preview vs. Test: when to use each

Modified on: Wed, 13 May, 2026 at 11:59 PM

AI Agent Studio gives you two ways to evaluate your agent before it reaches employees: the Preview panel for interactive, conversation-by-conversation testing, and the Test section for running many queries at once. Each serves a different purpose. This article explains what each approach does, where it fits in the build cycle, and how to use them together.


Key concepts to know before you start

The following terms come up throughout this article.


Preview panel: A slide-in chat interface that opens within AI Agent Studio. You send messages to the agent and see its responses in real time, one conversation at a time.

Test section: A dedicated page in AI Agent Studio where you submit a list of queries to the agent all at once and review the results in a table.

Interactive testing: Testing by having a live conversation with the agent; this is the approach used in the Preview panel.

Batch testing: Testing by submitting many queries simultaneously and reviewing the results in aggregate; this is the approach used in the Test section.

Answered: A status label in the Test results table indicating the agent produced a response to a query.

Unanswered: A status label in the Test results table indicating the agent could not find relevant content for a query.

Generate sample queries: A Freddy-powered feature in the Test section that automatically creates up to 50 test questions drawn from your agent's knowledge sources.


What each approach does

Preview: test one conversation at a time

The Preview panel opens as a slide-in panel on the right side of AI Agent Studio. You type a message, the agent responds, and you can keep the conversation going across multiple turns — just as an employee would. Because the panel stays open while you work, you can make a change in Build and immediately send a follow-up message to see whether it took effect.

Use Preview when you want to:

  • Test how the agent handles a multi-turn conversation, where earlier messages affect later ones.

  • Check that a specific workflow triggers correctly when an employee asks in a natural way.

  • Verify that a knowledge article produces a clear, accurate answer before you publish it.

  • Confirm that the human handover path works — for example, that the agent escalates when it should.

  • Explore an edge case or unusual phrasing that you want to investigate interactively.


Each new conversation in the Preview panel starts with a clean context. The agent does not carry information from a previous conversation into a new one. Use the New conversation option to reset context when you want to start a fresh test.


Test: evaluate many queries at once

The Test section lets you build a list of queries and run them all against the agent in a single pass. Results appear in a table showing whether each query was Answered or Unanswered, a preview of the agent's response, the knowledge source it referenced, and controls to rate each response.

Use the Test section when you want to:

  • Check broad coverage — for example, whether the agent can handle all the common IT or HR questions in your knowledge base.

  • Identify which topics the agent cannot answer before going live.

  • Compare results before and after adding a knowledge source or changing agent instructions.

  • Generate a set of test queries using Freddy when you do not have an existing list.

  • Export results to share with your team or keep as a deployment record.


Batch tests are single-turn. Each query is treated independently — the Test section does not simulate multi-turn conversations or carry context between queries.



How they compare


  • How you interact: In Preview, you send messages one at a time in a live chat. In Test, you submit a list of queries all at once.

  • Conversation context: Preview supports multi-turn conversations. Test is single-turn only, with no context carried between queries.

  • Speed: Preview covers one scenario at a time. Test runs many queries in a single pass.

  • What you see: Preview shows the full response in the chat interface. Test shows a response summary, status, and knowledge source in a table.

  • Best for: Preview suits specific scenarios, workflow testing, and UX validation. Test suits coverage checks, identifying gaps, and pre-deployment review.

  • Query source: In Preview, you type each message. In Test, you enter queries manually or generate them with Freddy.

  • Result format: Preview results are conversational, with no export. Test results appear in a table with an Export option.

When to use each approach

While building the agent

Use the Preview panel as your primary testing tool during the build phase. It gives you immediate feedback as you add knowledge, adjust instructions, and configure workflows. After each meaningful change, open a conversation in Preview and ask the kinds of questions your employees are likely to ask.

Run a batch test in the Test section when you want to check the overall state of the agent after a significant change — for example, after adding a large knowledge source or enabling a new workflow. This tells you whether the change improved coverage, introduced gaps, or had no effect on unrelated areas.


Before going live

Before deploying the agent, run a comprehensive batch test to confirm that the agent can handle the full range of expected questions. Review Unanswered results and address any gaps in your knowledge base. Then use Preview to walk through the most important end-to-end scenarios — particularly any multi-step workflows or handover paths — to confirm the experience is right.


After making changes

  • Small changes — updating a single knowledge article or adjusting a fallback message: use Preview to verify the specific area affected.
  • Larger changes — adding a new knowledge source, modifying a workflow, or changing agent instructions: run a batch test to confirm the change works and has not broken anything else. Follow up with Preview for any failed or borderline results.


During ongoing quality checks

After the agent is live, use the Test section periodically to run the same set of queries and compare results over time. If you notice a drop in the Answered rate, switch to Preview to investigate the affected topics interactively and trace the cause.
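
The Export option in the Test section makes this comparison easy to script. The sketch below is illustrative rather than official: it assumes each export can be read as a CSV and that it contains columns named Query and Status, which are hypothetical names you should adjust to match your actual export file.

    import csv

    def load_statuses(path):
        # Map each query in an exported test run to its status.
        # The "Query" and "Status" column names are assumptions; change
        # them to match the headers in your actual export.
        with open(path, newline="", encoding="utf-8") as f:
            return {row["Query"]: row["Status"].strip().lower()
                    for row in csv.DictReader(f)}

    baseline = load_statuses("test_run_baseline.csv")  # earlier run
    latest = load_statuses("test_run_latest.csv")      # most recent run

    for name, run in (("Baseline", baseline), ("Latest", latest)):
        answered = sum(1 for status in run.values() if status == "answered")
        print(f"{name}: {answered}/{len(run)} answered")

    # Queries that were Answered in the baseline but Unanswered now are
    # the ones to investigate interactively in the Preview panel.
    for query, status in baseline.items():
        if status == "answered" and latest.get(query) == "unanswered":
            print("Regressed:", query)

Running a script like this after each periodic test run gives you a simple trend of the Answered rate and a short list of queries to retry in Preview.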


How they work together

Preview and Test are designed to complement each other. A typical workflow looks like this:

  1. Build and refine with Preview. Use the Preview panel while you add knowledge, configure workflows, and adjust instructions. Test specific scenarios interactively as you go.

  2. Run a batch test to check coverage. Once the agent feels solid, use the Test section to run a broader set of queries. Review which topics come back as Unanswered.

  3. Investigate gaps with Preview. For any Unanswered results or unexpected responses in the batch test, switch to Preview and explore those queries conversationally. This often reveals whether the gap is a missing knowledge article, a phrasing mismatch, or something in the agent's instructions.

  4. Fix, then retest. Make the necessary changes in Build, then run the affected queries again in the Test section to confirm the improvement.

  5. Validate end-to-end with Preview. Before deploying, walk through the most critical user journeys in Preview to confirm the full experience — not just individual answers — is ready.


Best practices

  • Do not skip batch testing before deployment. Preview is fast and flexible, but it only tests what you think to test. A batch test surfaces gaps you might not think to look for.

  • Do not skip Preview for multi-step scenarios. Batch tests are single-turn. Anything that requires a back-and-forth — a workflow that collects information across several messages, or a conversation that leads to handover — can only be validated in Preview.

  • Use Generate sample queries to build your first test list. If you are new to the Test section, let Freddy create an initial set drawn from your knowledge base. Then add manual queries for edge cases and workflows.

  • Rate batch test results. Use the thumbs-up and thumbs-down controls on each result. Over time, ratings help you spot patterns in where the agent performs well and where it needs improvement.

  • Keep a consistent query set. Running the same queries across multiple test runs lets you measure whether changes improve or degrade the agent's performance over time.