Einstein Reply Recommendations
Definition
Einstein Reply Recommendations is the Service Cloud feature that suggests pre-written reply text to agents handling chat, messaging, and case email conversations. The model trains on the org's historical conversations that led to resolution, learns which reply patterns work for which inbound message types, and surfaces three suggested replies in the agent's sidebar at runtime. The agent can click to insert a reply (with or without editing) or write their own response from scratch. The feature does not auto-send; the agent always reviews before the message goes to the customer.
Reply Recommendations is the conversation-time sibling of Article Recommendations and Case Wrap-Up in the Einstein for Service stack. It pays back fastest on high-volume channels (Messaging, Live Chat) where shaving 30 seconds off each reply compounds across thousands of conversations per day. Its quality depends on training data; teams that train on cleanly resolved conversations with good agent reply patterns get sharp suggestions, while teams that train on the full history of every conversation (resolved or not, well-handled or not) get noisy suggestions agents quickly learn to ignore.
Why the training filter is more important than the model
Where Reply Recommendations lives in setup
Reply Recommendations is configured under Setup, Einstein Service, Reply Recommendations. Pick the channels (Chat, Messaging, Case Email) the feature applies to. Pick the conversation filter that defines training data (resolved within X days, CSAT above Y, agent in tenured group). Add the Reply Recommendations component to the agent console layout for each channel. The component renders as a sidebar panel showing three suggested replies on every inbound message. Clicking an entry inserts it into the reply pane, where the agent edits or sends it as needed.
Training data and the filter that matters most
The training filter decides which historical conversations the model learns from. The default (all resolved conversations) is rarely the right choice. A better filter is conversations resolved within 7 days, with CSAT 4 or 5, handled by tenured agents. That filter screens out the long-tail messy conversations and trains the model on the patterns that actually worked. Most teams underspecify this filter and produce a model that learns from too much noise. The first month of suggestions tells you whether the filter is right; if the insertion rate is below 20 percent, tighten the filter and retrain.
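The filter described above can be pictured as a predicate over historical conversation records. The following is a minimal Python sketch for reasoning about the criteria, not Salesforce code: the `Conversation` record, its field names, and the one-year tenure threshold are assumptions made for illustration, and the real filter is configured in Setup rather than written by hand.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Conversation:
    # Hypothetical record shape for illustration; not a Salesforce object.
    resolved_days: Optional[float]  # days from open to resolution, None if unresolved
    csat: Optional[int]             # 1-5 survey score, None if no survey response
    agent_tenure_years: float


def passes_training_filter(c, max_days=7, min_csat=4, min_tenure_years=1.0):
    """Return True if the conversation matches the tight filter described above."""
    if c.resolved_days is None or c.resolved_days > max_days:
        return False
    if c.csat is None or c.csat < min_csat:
        return False
    return c.agent_tenure_years >= min_tenure_years


history = [
    Conversation(2.0, 5, 3.0),      # fast, high CSAT, tenured agent
    Conversation(12.0, 5, 3.0),     # took longer than 7 days
    Conversation(1.0, 2, 4.0),      # low CSAT
    Conversation(None, None, 2.0),  # never resolved
    Conversation(3.0, 4, 0.3),      # agent not yet tenured
]
training_set = [c for c in history if passes_training_filter(c)]
```

Counting how many conversations survive each criterion is a quick way to check that a tight filter still leaves enough training data.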
How suggestions are matched to incoming messages
The model embeds each historical reply and each incoming message. At runtime, it finds the three historical replies whose context (the message they replied to plus the conversation up to that point) most closely matches the current incoming message and conversation. The top three are surfaced. The match is contextual, not just keyword-based; the same incoming message can produce different suggestions depending on what was already said in the conversation. This is one of the larger differences from older quick-text or canned-response systems that matched only on keywords.
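The retrieval step can be sketched as a nearest-neighbor search over embedded contexts. This Python sketch is an illustration only and does not describe Salesforce's implementation: `embed` is a toy word-count stand-in for a learned embedding (so it is far cruder than the contextual matching described above), and `top_replies` and the sample data are hypothetical. What it does show is that the query is built from the conversation so far plus the incoming message, so the same message can rank different replies first.

```python
import math
from collections import Counter


def embed(text):
    # Toy stand-in: word counts. A real system would use a learned embedding.
    return Counter(text.lower().split())


def cosine(a, b):
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def top_replies(history, conversation_so_far, incoming, k=3):
    """history is a list of (context_text, reply_text) pairs."""
    query = embed(conversation_so_far + " " + incoming)
    ranked = sorted(
        history, key=lambda pair: cosine(embed(pair[0]), query), reverse=True
    )
    return [reply for _, reply in ranked[:k]]


history = [
    ("my order is late where is it", "Sorry for the delay. Let me check shipping."),
    ("i want a refund for my order where is it", "I can check on that refund."),
    ("how do i reset my password", "You can reset it from the login page."),
    ("change my email address", "I can update the email on your account."),
]

late_order = top_replies(history, "my order is late", "where is it")
refund = top_replies(history, "i want a refund for my order", "where is it")
```

Here the incoming message is identical in both calls, but the first suggestion differs because the earlier turns differ.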
Insertion modes and edit-before-send
The agent clicks a suggestion to insert it. Three behaviors are available: insert as-is, insert and edit (the suggestion appears in the reply pane for the agent to modify before send), and insert and review (the suggestion is pre-filled but flagged for explicit acceptance before send). Most teams settle on insert-and-edit because raw insertion produces messages that feel templated and insert-and-review adds friction. The edit step is also the data signal: a suggestion that gets edited heavily is one the model should learn from less; a suggestion inserted as-is is one the model should learn from more.
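One way to make "edited heavily" concrete is a similarity score between the suggested text and what the agent actually sent. The sketch below uses Python's standard `difflib.SequenceMatcher`; how the product itself weighs edits is not described here, so `edit_similarity` and the sample strings are purely illustrative assumptions.

```python
from difflib import SequenceMatcher


def edit_similarity(suggested, sent):
    """Return a 0.0-1.0 similarity between the suggestion and the sent text."""
    return SequenceMatcher(None, suggested, sent).ratio()


suggested = "Thanks for reaching out. Your order ships tomorrow."
light_edit = "Thanks for reaching out! Your order ships tomorrow."
heavy_edit = "Hi Sam, I checked with the warehouse and it leaves on Friday."

light_score = edit_similarity(suggested, light_edit)
heavy_score = edit_similarity(suggested, heavy_edit)
```

A score near 1.0 means the suggestion went out almost untouched; a low score means the agent effectively rewrote it.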
Insertion rate, edit rate, and the diagnostic loop
Three metrics matter. Insertion rate is the percentage of inbound messages where the agent inserts a suggested reply. Edit rate is the percentage of inserted suggestions the agent modifies before sending. Send rate is the percentage of inserted suggestions that ultimately get sent (rather than deleted and replaced). Production deployments target insertion rate above 40 percent, edit rate around 30 percent, and send rate above 90 percent. Numbers significantly off any of those targets usually point to training data quality issues; the usual fix is to tighten the filter and retrain.
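The three definitions above translate directly into arithmetic over per-message events. The following Python sketch assumes a hypothetical event log with one dict per inbound message and boolean `inserted`, `edited`, and `sent` keys; those names are illustrative and not fields from any Salesforce report.

```python
def reply_metrics(events):
    """Compute insertion, edit, and send rates as fractions between 0 and 1."""
    inserted = [e for e in events if e["inserted"]]
    n_inserted = len(inserted)
    return {
        # Share of inbound messages where a suggestion was inserted.
        "insertion_rate": n_inserted / len(events) if events else 0.0,
        # Share of inserted suggestions that were modified before sending.
        "edit_rate": sum(e["edited"] for e in inserted) / n_inserted if n_inserted else 0.0,
        # Share of inserted suggestions that were ultimately sent.
        "send_rate": sum(e["sent"] for e in inserted) / n_inserted if n_inserted else 0.0,
    }


events = [
    {"inserted": True, "edited": False, "sent": True},
    {"inserted": True, "edited": True, "sent": True},
    {"inserted": True, "edited": False, "sent": True},
    {"inserted": True, "edited": False, "sent": False},  # inserted, then deleted
    {"inserted": False, "edited": False, "sent": False},
    {"inserted": False, "edited": False, "sent": False},
    {"inserted": False, "edited": False, "sent": False},
    {"inserted": False, "edited": False, "sent": False},
]
metrics = reply_metrics(events)
```

Note that edit rate and send rate use inserted suggestions as the denominator, not all inbound messages.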
Brand voice, tone, and the policy review
Suggestions come from historical agent replies, so they inherit the brand voice of the agents who wrote them. This is mostly good (real voice beats template voice) and occasionally bad (one agent's idiosyncratic phrasing can dominate suggestions for a topic if their replies are over-represented in training data). The fix is filtering training data by agent group or by reply pattern. Tier 1 reply patterns belong in the training set for Tier 1 agents; senior escalation patterns belong in their own training segment. Suggestions that consistently include phrasings the brand has retired (or that legal flagged) need a policy review pass before the model retrains.
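Over-representation is easy to check before retraining if the training replies can be exported with a topic and an agent identifier. This Python sketch assumes such an export exists; the `(topic, agent_id)` pairs, the `dominant_agents` helper, and the 50 percent threshold are hypothetical choices for illustration.

```python
from collections import Counter, defaultdict


def dominant_agents(replies, max_share=0.5):
    """Return {topic: (agent_id, share)} where one agent exceeds max_share."""
    per_topic = defaultdict(Counter)
    for topic, agent in replies:
        per_topic[topic][agent] += 1

    flagged = {}
    for topic, counts in per_topic.items():
        agent, n = counts.most_common(1)[0]
        share = n / sum(counts.values())
        if share > max_share:
            flagged[topic] = (agent, share)
    return flagged


replies = (
    [("refund", "a1")] * 3
    + [("refund", "a2")]
    + [("shipping", "a1"), ("shipping", "a2")]
)
flagged = dominant_agents(replies)
```

A flagged topic is a candidate for filtering by agent group so one person's phrasing does not dominate its suggestions.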
Relationship to Article Recommendations and Agentforce
Reply Recommendations suggests text the agent sends; Article Recommendations suggests Knowledge to attach or reference; Agentforce for Service can run as a full conversational agent on its own. The three features stack rather than compete. A common pattern is: deflect simple inquiries through Agentforce, surface Knowledge through Article Recommendations once the agent is in the case, and accelerate reply composition through Reply Recommendations. The compounding speed makes a measurable difference in average handle time without changing the team's core workflow.
How to roll out Reply Recommendations with a training filter that actually works
The default training filter underperforms on most orgs. The work that matters: specify a tight filter that selects historical conversations representative of what good agent replies look like, then monitor insertion and edit rates to iterate.
- Pick the channels for the pilot
Chat and Messaging are usually the right pilots because reply volume is high and reply length is short enough that suggestions match well. Case Email is a good second wave once the chat models are tuned.
- Define a tight training filter
Resolved within 7 days, CSAT 4 or 5, handled by tenured agents (one year plus tenure or top quartile by CSAT). The default all-resolved filter usually trains on too much noise.
- Enable Reply Recommendations and let the model train
Go to Setup, Einstein Service, Reply Recommendations and apply the filter. Initial training takes 1 to 4 hours depending on volume.
- Add the sidebar component to the agent console for the pilot channels
Without the component on the page, agents see no suggestions. Edit each channel's layout in App Builder to add it.
- Pilot for two weeks and brief agents on insertion modes
Pick 10 to 20 agents. Brief them on the insert-and-edit pattern. Their reactions in week one tell you whether the suggestions are good enough to keep.
- Measure insertion, edit, and send rates
Target insertion above 40 percent, edit around 30 percent, and send above 90 percent. If any metric is significantly off, tighten the training filter and retrain.
- Expand by channel after the metrics stabilize
Move from Chat to Messaging to Case Email as each channel hits target metrics. Each channel needs its own training filter; do not assume one filter works across channels.
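The expand-or-retrain decision in the steps above can be written as a simple check of each channel's rates against the stated targets. This Python sketch is one possible reading: the targets come from the text, but the plus-or-minus 10 point tolerance for "around 30 percent", the `off_target` helper, and the sample numbers are assumptions.

```python
# Floors taken from the targets in the text, expressed as fractions.
FLOORS = {"insertion_rate": 0.40, "send_rate": 0.90}
EDIT_TARGET = 0.30
EDIT_TOLERANCE = 0.10  # assumed meaning of "around 30 percent"


def off_target(metrics):
    """Return the names of metrics that miss their target; empty means on target."""
    misses = [name for name, floor in FLOORS.items() if metrics[name] <= floor]
    if abs(metrics["edit_rate"] - EDIT_TARGET) > EDIT_TOLERANCE:
        misses.append("edit_rate")
    return misses


# Hypothetical pilot results per channel.
chat = {"insertion_rate": 0.45, "edit_rate": 0.28, "send_rate": 0.93}
case_email = {"insertion_rate": 0.18, "edit_rate": 0.55, "send_rate": 0.80}

chat_misses = off_target(chat)
email_misses = off_target(case_email)
```

An empty list suggests the channel is ready for the next wave; any entries point back to tightening that channel's own training filter and retraining.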
Channels: Chat, Messaging, Case Email. Pilot one at a time; do not enable all simultaneously.
Training filter: selects which historical conversations the model learns from. The most consequential setting in the feature.
Insertion modes: insert as-is, insert and edit, insert and review. Most teams settle on insert and edit.
Retraining: weekly by default; can be triggered after a policy change or a major filter adjustment.
Sidebar component: must be on the agent console layout per channel. Missing it is the most common reason suggestions appear to be off.
- The default all-resolved training filter usually trains on too much noise. Tighten with CSAT and tenure filters before judging suggestion quality.
- Insertion rate below 20 percent in the first month is a training data problem, not a model problem. Retrain with a tighter filter rather than disabling the feature.
- Brand voice in suggestions is inherited from historical agent replies. Filter by agent group if one agent's phrasing should not dominate suggestions for a topic.
- Per-channel layouts matter. Without the sidebar component on the channel's agent console layout, suggestions exist but no one sees them.
- Suggestions can include retired phrasings if those patterns are over-represented in training data. Run a policy review pass after major brand voice or legal changes.
Trust & references
Straight from the source - Salesforce's reference material on Einstein Reply Recommendations.
- Einstein Reply Recommendations (Salesforce Help)
- Set Up Einstein for Service (Salesforce Help)
- Einstein Service Insights (Salesforce Help)
About the Author
Dipojjal Chakrabarti is a B2C Solution Architect with 29 Salesforce certifications and over 13 years in the Salesforce ecosystem. He runs salesforcedictionary.com to help admins, developers, architects, and cert/interview candidates sharpen their fundamentals.