What is the most important tip for working with Utterance?

Thirty to fifty diverse utterances per intent is the sweet spot. Below fifteen the bot misroutes, above two hundred it overfits. Diversity matters more than the raw count.

AI · Beginner

Utterance

Q: What is an Utterance?

Utterances teach bots to recognize user intent.

Q: Is it the same as Training Phrase?

Utterance and Training Phrase are the same concept.

Q: How do you improve recognition?

Diversity and real-world updates improve bot accuracy.

An utterance is a single user input to an AI conversational system.

§ 01

Definition

An utterance is a single user input to an AI conversational system. In a chatbot context, an utterance is everything the user types or speaks in one turn before the bot replies. In NLP training, an utterance is one example sentence the model uses to learn what intent or topic it represents. In Salesforce, the term is most common in Einstein Bots and Agentforce, where the utterance is the customer message that the bot's NLP layer must map to a known intent (or, in Agentforce, to an agent topic) before the conversation can advance.

Utterances are the unit of measurement for how well a conversational AI understands its users. A bot trained on 50 carefully curated utterances per intent will recognize a customer's "I want to cancel my subscription" across most of the ways it can be phrased. A bot trained on five utterances per intent will misroute most variations and dump the customer to a human. Utterance design is the most important and most under-invested part of any bot project. The platform handles the language model. The team owns the utterance set.

§ 02

Why utterances make or break a conversational AI deployment

The intent-utterance pairing

Every intent in an Einstein Bot is defined by its utterances. The intent Cancel Subscription is not just a label; it is the set of all utterances tagged with that label: "I want to cancel", "please cancel my account", "how do I unsubscribe", "can you end my plan", "I am done". The NLP model trains on these examples and learns to predict the intent label for any new utterance, including phrasings nobody wrote down. If two intents share too many utterances, the model gets confused at runtime. Intent boundaries should be sharp at the utterance level, not just in the intent name.
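As a rough illustration of checking intent boundaries at the utterance level, the sketch below flags utterances that appear under more than one intent once case and punctuation are ignored. It assumes the training set has been exported into a plain Python dictionary; the `Close Account` intent, the function names, and the data layout are hypothetical and are not part of any Einstein Bots API.

```python
import re
from collections import defaultdict


def normalize(utterance):
    """Lowercase, strip punctuation, and collapse whitespace."""
    cleaned = re.sub(r"[^\w\s]", "", utterance.lower())
    return " ".join(cleaned.split())


def find_shared_utterances(training_set):
    """Return normalized utterances that are tagged with more than one intent."""
    intents_by_utterance = defaultdict(set)
    for intent, utterances in training_set.items():
        for utterance in utterances:
            intents_by_utterance[normalize(utterance)].add(intent)
    return {
        text: intents
        for text, intents in intents_by_utterance.items()
        if len(intents) > 1
    }


# Hypothetical export: intent name -> list of training utterances.
training_set = {
    "Cancel Subscription": ["I want to cancel", "Please cancel my account", "I am done."],
    "Close Account": ["please cancel my account!", "Delete my profile"],
}

for text, intents in find_shared_utterances(training_set).items():
    print(f"{text!r} is shared by: {sorted(intents)}")
```

Exact-match overlap is only the most obvious case. Two intents can also collide through near-identical phrasings, which this sketch does not try to catch.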

How many utterances per intent

The empirically reasonable range is 30 to 50 utterances per intent. Below 15 the model has not seen enough variation and misroutes on common phrasings. Above 200 the model starts overfitting on idiosyncrasies of the training examples and ignoring real signal. Salesforce recommends a minimum of 20 per intent in Einstein Bots documentation, but 20 is the floor, not the target. The sweet spot is enough variation to cover the phrasings real users produce, without so much that the team cannot maintain it.
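The ranges above can be turned into a quick audit. The following is a minimal sketch that assumes the same kind of exported dictionary (intent name to list of utterances); the status labels and the `audit_counts` helper are illustrative choices, and the default thresholds simply restate the numbers in this entry.

```python
def audit_counts(training_set, floor=15, target_low=30, target_high=50, ceiling=200):
    """Classify each intent by its number of distinct utterances."""
    report = {}
    for intent, utterances in training_set.items():
        count = len(set(utterances))  # exact duplicates are counted once
        if count < floor:
            status = "too few"
        elif count < target_low:
            status = "below target"
        elif count <= target_high:
            status = "in range"
        elif count <= ceiling:
            status = "above target"
        else:
            status = "too many"
        report[intent] = (count, status)
    return report


# Synthetic placeholder data, only to show the shape of the report.
example = {
    "Cancel Subscription": [f"utterance {i}" for i in range(40)],
    "Reschedule Appointment": [f"utterance {i}" for i in range(8)],
}
for intent, (count, status) in audit_counts(example).items():
    print(f"{intent}: {count} utterances ({status})")
```

A count in range says nothing about diversity, so this check is a starting point rather than a quality measure.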

Diversity is more important than count

Forty utterances that all say the same thing in slightly different word order are worse than 20 utterances that cover formal, informal, frustrated, brief, and verbose phrasings. Diversity dimensions to cover deliberately: sentence length (one word to one paragraph), tone (polite, frustrated, neutral), specificity ("canceling" vs "canceling my Pro plan"), typos, contractions, and language register. If every utterance is a complete sentence in business English, the bot will misroute the customer who types "cancel pls".
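One way to spot paraphrases that differ only in word order is to compare utterances as sets of words. The sketch below uses Jaccard similarity for that, plus a simple word-length spread; the 0.8 threshold and the helper names are assumptions made for illustration, not settings from Bot Builder.

```python
from itertools import combinations


def tokens(utterance):
    """Represent an utterance as the set of its lowercase words."""
    return frozenset(utterance.lower().split())


def jaccard(a, b):
    """Share of words the two sets have in common (1.0 means identical sets)."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def near_duplicates(utterances, threshold=0.8):
    """Return pairs of utterances whose word sets overlap at or above the threshold."""
    return [
        (first, second)
        for first, second in combinations(utterances, 2)
        if jaccard(tokens(first), tokens(second)) >= threshold
    ]


def length_spread(utterances):
    """Return the shortest and longest utterance length, in words."""
    lengths = [len(u.split()) for u in utterances]
    return min(lengths), max(lengths)


utterances = [
    "i want to cancel my subscription",
    "i want to cancel my subscription now",
    "my subscription i want to cancel",
    "cancel pls",
]
print(near_duplicates(utterances))
print(length_spread(utterances))
```

Word-set overlap ignores tone and register, so a low duplicate count does not prove the set is diverse. It only catches the cheapest kind of padding.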

Negative examples and intent disambiguation

Some utterance sets benefit from explicit out-of-scope examples: utterances tagged with a generic Fallback intent to teach the model what the bot does not handle. Einstein Bots supports this directly. The fallback intent absorbs utterances that should not match any of the real intents, which prevents the model from forcing a poor match into the closest real intent. Without a fallback, requests like "what time is it" or "tell me a joke" will match the closest intent in the bot and confuse the customer with a wrong answer.

Utterance management in Einstein Bots

Inside the Bot Builder, the Dialogs tab lists each intent and its training utterances, which can be added, edited, and removed in the panel. After changes, the bot needs to be retrained: the Build tab runs the training pipeline against the new utterance set. Performance metrics appear after build. The Confidence Threshold setting controls how strong a match must be before the bot acts on it. A higher threshold reduces misroutes but increases the rate of falling back to a human, which is sometimes the right trade.
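The threshold trade-off can be estimated offline before changing the setting. The sketch below assumes a hand-labeled sample of predictions as `(true_intent, predicted_intent, confidence)` tuples; that format, the sample values, and the `sweep_thresholds` helper are hypothetical and do not come from a Bot Builder export.

```python
def sweep_thresholds(predictions, thresholds):
    """For each threshold, return (threshold, misroute_rate, fallback_rate)."""
    if not predictions:
        raise ValueError("predictions must not be empty")
    total = len(predictions)
    rows = []
    for threshold in thresholds:
        # The bot only acts on predictions that meet the threshold.
        acted = [(true, pred) for true, pred, conf in predictions if conf >= threshold]
        misroutes = sum(1 for true, pred in acted if true != pred)
        fallbacks = total - len(acted)
        rows.append((threshold, misroutes / total, fallbacks / total))
    return rows


# Hypothetical labeled sample.
predictions = [
    ("Cancel Subscription", "Cancel Subscription", 0.92),
    ("Cancel Subscription", "Reschedule Appointment", 0.55),
    ("Reschedule Appointment", "Reschedule Appointment", 0.81),
    ("Reschedule Appointment", "Cancel Subscription", 0.64),
    ("Cancel Subscription", "Cancel Subscription", 0.48),
]

for threshold, misroute_rate, fallback_rate in sweep_thresholds(predictions, [0.4, 0.6, 0.7]):
    print(f"threshold={threshold}: misroutes={misroute_rate:.0%}, fallbacks={fallback_rate:.0%}")
```

In this made-up sample, raising the threshold removes misroutes but also sends correct low-confidence matches to fallback, which is the trade described above.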

Agentforce: utterances meet topics

Agentforce changes the model. Instead of a fixed intent set, the Atlas Reasoning Engine matches utterances to agent topics, where a topic is a collection of related actions and instructions. The utterance still matters, but the team writes Topic Triggers and example utterances rather than a flat intent-to-utterance map. The training shift means smaller utterance counts per topic (often 5 to 15) work well because the LLM does more of the generalization. Pure NLU bots still need the bigger numbers.

Detecting misroutes in production

The most common silent failure of a bot project is the misroute that nobody catches. The customer sends an utterance, the bot answers a different intent, the customer rephrases or gives up. Build a weekly review of conversations where the bot replied but the customer disconnected within two turns, or where the customer typed "agent" or "human" as the next utterance. Each of these is a candidate misroute. Add the original utterance to the right intent's training set and retrain. Misroute rate should trend down month over month.
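The review criteria above can be approximated with a small filter over conversation logs. This is a sketch under several assumptions: the log format (`id`, `turns`, `resolved`) is hypothetical, only the first customer/bot exchange is inspected, and "disconnected within two turns" is read as an unresolved conversation with at most two turns after the bot's first reply.

```python
import re

# Matches the words "agent" or "human" anywhere in the customer's next message.
ESCALATION = re.compile(r"\b(agent|human)\b", re.IGNORECASE)


def misroute_candidates(conversations, max_turns_after_reply=2):
    """Return (conversation id, original utterance) pairs worth a manual review."""
    candidates = []
    for convo in conversations:
        turns = convo["turns"]
        for i in range(len(turns) - 1):
            speaker, text = turns[i]
            next_speaker, _ = turns[i + 1]
            if speaker != "customer" or next_speaker != "bot":
                continue
            after_reply = turns[i + 2:]
            next_customer = next((t for s, t in after_reply if s == "customer"), None)
            asked_for_human = bool(next_customer and ESCALATION.search(next_customer))
            dropped = not convo["resolved"] and len(after_reply) <= max_turns_after_reply
            if asked_for_human or dropped:
                candidates.append((convo["id"], text))
            break  # simplification: only the first customer/bot exchange is checked
    return candidates


# Hypothetical log records.
conversations = [
    {"id": "c1", "resolved": False, "turns": [
        ("customer", "cancel pls"),
        ("bot", "Sure, let's reschedule your appointment."),
        ("customer", "human"),
    ]},
    {"id": "c2", "resolved": True, "turns": [
        ("customer", "I want to cancel"),
        ("bot", "I can help with that. Which plan?"),
        ("customer", "Pro"),
        ("bot", "Done."),
        ("customer", "thanks"),
    ]},
    {"id": "c3", "resolved": False, "turns": [
        ("customer", "how do I unsubscribe"),
        ("bot", "Here are our opening hours."),
    ]},
]

print(misroute_candidates(conversations))
```

The output is a list of candidates, not confirmed misroutes. A person still has to read each one and decide which intent the utterance belongs to.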

§ 03

How to design and maintain an utterance set

Utterance design is the ongoing work of any bot. The steps below cover the initial build and the maintenance cycle that keeps a bot from degrading over time.

1. Define the intent set first
Before writing utterances, agree on the intents the bot handles. Each intent should be a single, sharp customer goal. Cancel Subscription is one intent. Reschedule Appointment is another. Both can live in the same bot but they should not share utterances.

2. Mine real transcripts for seed utterances
Pull two to four weeks of human-handled chat or email transcripts. Read the customer's opening message. Tag each with the closest intent from the set. The seed list almost always reveals intents the team forgot to plan.

3. Add deliberate diversity
For each intent, generate variants along the diversity dimensions: short, long, polite, frustrated, formal, casual, typo-laden. Target the 30 to 50 range. Avoid filling the slots with paraphrases that all sound the same.

4. Train, test, measure
Build the model. Run the test set provided by Bot Builder. Inspect any utterance the model misroutes and decide whether the utterance is mislabeled (fix it) or the intent boundary is unclear (rewrite the intent).

5. Schedule weekly misroute review
Review failed conversations weekly. Add corrected utterances to the right intent. Retrain. The bot improves continuously, or it does not, depending on this loop.
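For the "train, test, measure" step, it can help to tally which intents get mistaken for which. The sketch below assumes test results are available as `(utterance, expected_intent, predicted_intent)` tuples; that layout and the `confusion_pairs` helper are illustrative and do not reflect the actual format of a Bot Builder test report.

```python
from collections import Counter


def confusion_pairs(results):
    """Count (expected, predicted) intent pairs for every misrouted test utterance."""
    pairs = Counter()
    for _utterance, expected, predicted in results:
        if expected != predicted:
            pairs[(expected, predicted)] += 1
    return pairs.most_common()  # most frequent confusion first


# Hypothetical test results.
results = [
    ("cancel pls", "Cancel Subscription", "Cancel Subscription"),
    ("can you end my plan", "Cancel Subscription", "Reschedule Appointment"),
    ("I am done", "Cancel Subscription", "Reschedule Appointment"),
    ("move my booking", "Reschedule Appointment", "Cancel Subscription"),
]

for (expected, predicted), count in confusion_pairs(results):
    print(f"{expected} -> {predicted}: {count}")
```

A pair that keeps appearing at the top of this list usually points to an unclear intent boundary rather than a single mislabeled utterance.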

Key options

Confidence threshold

The minimum confidence score required for the bot to act on a predicted intent. Higher threshold means more fallbacks, fewer misroutes.

Fallback intent

Captures utterances that match no real intent. Routes to a human or a clarifying question rather than forcing a wrong match.

Variations field

Per-utterance synonyms and slot variations Bot Builder uses to generate additional training samples without manual entry.

Language model

Pick the NLU model behind the bot. The default Einstein NLU works for most use cases. Multi-language bots need a model per language.

Topic triggers (Agentforce)

The Agentforce equivalent of utterances. Smaller per-topic counts because the LLM generalizes more than a fixed NLU.

Gotchas

Overlapping utterances across two intents confuse the model and lower both intents' accuracy. Inspect the test report for intent confusion pairs and disambiguate at the utterance level.
Forty paraphrases of the same sentence are not 40 utterances. Diversity matters more than count. Add short, long, frustrated, and casual variants deliberately.
Bots trained only on staff-written utterances misroute on real customer phrasings. Always mine real transcripts before launch.
Skipping a fallback intent forces every out-of-scope utterance into the closest real intent. The customer gets a wrong answer instead of a clean handoff.
Misroute rate is not in any standard report. Build the weekly review process or the bot will degrade silently as customer language evolves.

Trust & references

Sources

Cross-checked against the following references.

Intents and Utterances in Einstein Bots (Salesforce Help)
Build an Einstein Bot (Salesforce Help)

Official documentation

Straight from the source - Salesforce's reference material on Utterance.

Einstein Bots overview (Salesforce Help)
Agentforce overview (Salesforce Help)

About the Author

Dipojjal Chakrabarti is a B2C Solution Architect with 29 Salesforce certifications and over 13 years in the Salesforce ecosystem. He runs salesforcedictionary.com to help admins, developers, architects, and cert/interview candidates sharpen their fundamentals.
