Stemming

§ 01

Definition

Stemming in Salesforce search is the automatic reduction of search terms to their root word form so that queries match across variants. A search for running matches records containing run, runs, and runner. A search for invoices matches records containing invoice. Irregular forms such as ran share no suffix pattern with run, so suffix-based stemming does not reliably connect them. The platform performs this reduction at both index time (when records are added or updated) and query time (when a user submits a search), so the matching logic compares root forms on both sides.

Stemming is one of several linguistic features the Salesforce search engine applies by default, alongside tokenization, lowercasing, stop-word removal, and synonym expansion. The combined effect is that users do not have to type the exact word form that appears in a record to get a match; the platform handles the linguistic variation for them. This matters for user experience in any system where people search for records they did not write themselves.

§ 02

How stemming works inside the Salesforce search engine

What stemming does and what it does not

Stemming reduces inflected and derivationally related words to a common root. The Salesforce search engine uses a stemming algorithm tuned for English (and several other supported languages) that handles common patterns: plural to singular (cats to cat), verb tense (running to run, walked to walk), and adjective forms (faster to fast). The result is that a user typing one form retrieves records containing other forms.

Stemming does not handle every linguistic relationship. Compound words, irregular plurals, and morphologically unrelated synonyms (car and automobile) require synonym configuration, not stemming. Stemming also does not cross language boundaries; the algorithm is language-specific and applies the rules for the indexed language.
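As a rough illustration of these patterns, here is a minimal suffix-stripping sketch in Python. It is not the Salesforce stemmer, whose internal rules are not described here; the function name toy_stem and its suffix list are hypothetical and chosen only to reproduce the examples in this section.

```python
def toy_stem(word: str) -> str:
    """Strip a few common English suffixes. Illustrative only."""
    w = word.lower()
    for suffix in ("ing", "ed", "er", "es", "s"):
        # Only strip when at least three characters would remain.
        if w.endswith(suffix) and len(w) - len(suffix) >= 3:
            w = w[: -len(suffix)]
            # Undo consonant doubling left behind by the suffix: "runn" -> "run".
            if len(w) >= 3 and w[-1] == w[-2] and w[-1] not in "aeiouls":
                w = w[:-1]
            break
    return w


# Variants that share a root collapse to the same stem.
for variant in ("cats", "running", "walked", "faster"):
    print(variant, "->", toy_stem(variant))

# Irregular plurals and unrelated synonyms do not collapse.
print(toy_stem("mice") == toy_stem("mouse"))      # False
print(toy_stem("car") == toy_stem("automobile"))  # False
```

The last two lines show the limits described above: no suffix rule connects mice to mouse or car to automobile, which is why those cases need synonym configuration.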

How stemming interacts with the search index

Salesforce indexes records as they are created or updated, applying stemming to every searchable text field during indexing. The index stores the stemmed root form rather than the original word. At query time, the platform stems the user query and matches against the indexed roots. This means stemming affects what is stored in the index, not just how matching is done. If the stemming algorithm is updated in a platform release, existing indexes carry the older stemming decisions until they are re-indexed. For high-volume orgs, full re-indexing can take hours; the platform handles it asynchronously. Most users never notice stemming changes between releases.
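The interaction between stemming and the index can be sketched with a toy inverted index. This is an assumption-laden model, not the platform's implementation: ToyIndex, stem_v1, and stem_v2 are hypothetical names, and the two stemmers stand in for "before" and "after" a stemmer update.

```python
from collections import defaultdict


def stem_v1(word):
    # Hypothetical "old" stemmer: strips a trailing "s" only.
    w = word.lower()
    return w[:-1] if w.endswith("s") and len(w) > 3 else w


def stem_v2(word):
    # Hypothetical "updated" stemmer: also strips a trailing "ing".
    w = stem_v1(word)
    return w[:-3] if w.endswith("ing") and len(w) > 5 else w


class ToyIndex:
    def __init__(self, stem):
        self.stem = stem
        self.docs = {}
        self.postings = defaultdict(set)  # stem -> ids of docs containing it

    def add(self, doc_id, text):
        # Index time: store the stemmed root, not the original word.
        self.docs[doc_id] = text
        for token in text.split():
            self.postings[self.stem(token)].add(doc_id)

    def search(self, query, stem=None):
        # Query time: stem the query, then match against indexed roots.
        stem = stem or self.stem
        hits = [self.postings.get(stem(t), set()) for t in query.split()]
        return set.intersection(*hits) if hits else set()

    def reindex(self, stem):
        # Rebuild all postings with a new stemmer.
        self.stem = stem
        self.postings = defaultdict(set)
        for doc_id, text in list(self.docs.items()):
            self.add(doc_id, text)


index = ToyIndex(stem_v1)
index.add(1, "Overdue invoices")
index.add(2, "billing report")

print(index.search("invoice"))                # {1}
print(index.search("billing", stem=stem_v2))  # set(): index still holds v1 roots
index.reindex(stem_v2)
print(index.search("billing"))                # {2}
```

The middle query shows why an updated stemmer does not help until the index is rebuilt: the query is reduced to bill, but the stored root is still billing.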

Stop-word removal, tokenization, and the linguistic pipeline

Stemming is one stage of a linguistic pipeline the search engine applies. Tokenization splits the input into individual words (the first stage). Lowercasing normalizes case (the second stage). Stop-word removal drops common words like a, the, and is that add no signal (the third stage). Stemming reduces what remains to root form (the fourth stage). Synonym expansion adds related terms (the fifth stage). Each stage is configurable to varying degrees on the Salesforce platform; some are universal defaults, others are tunable per-org. The pipeline runs identically at index time and at query time; if the two sides were processed differently, a stemmed query term could fail to match the indexed root and produce false negatives.
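The five stages can be sketched as a single analysis function. This is a simplified model under stated assumptions: the stop-word list, the synonym table, and the helper names (analyze, toy_stem) are hypothetical, and the real pipeline is more elaborate than one pass over a token list.

```python
import re

STOP_WORDS = {"a", "the", "is"}      # assumed, minimal list
SYNONYMS = {"car": ["automobile"]}   # assumed, keyed by stemmed form


def toy_stem(token):
    # Stand-in stemmer: strips a trailing "s" only.
    return token[:-1] if token.endswith("s") and len(token) > 3 else token


def analyze(text):
    tokens = re.findall(r"[A-Za-z0-9-]+", text)             # 1. tokenize
    tokens = [t.lower() for t in tokens]                    # 2. lowercase
    tokens = [t for t in tokens if t not in STOP_WORDS]     # 3. drop stop words
    tokens = [toy_stem(t) for t in tokens]                  # 4. stem
    expanded = []
    for t in tokens:                                        # 5. expand synonyms
        expanded.append(t)
        expanded.extend(SYNONYMS.get(t, []))
    return expanded


# The same function serves index time and query time.
print(analyze("The cars"))        # ['car', 'automobile']
print(analyze("The car is red"))  # ['car', 'automobile', 'red']
```

Using one function for both indexing and querying is the point of the sketch: the record text and the query end up in the same normalized form before they are compared.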

Language-specific stemming and multi-lingual orgs

Stemming rules are language-specific. The English stemming algorithm handles English morphology; the German, French, Spanish, Japanese, and other stemmers handle their respective languages. Salesforce supports stemming for several dozen languages, and the platform applies the right stemmer based on the org locale and the record-level language metadata. For multi-lingual orgs (a global company with French records, English records, and Japanese records in the same org), the platform indexes each record with the appropriate stemmer. Search across multiple languages requires querying each indexed language separately or using a federated search pattern. Confirm the language metadata on records is correct; misclassified records get stemmed with the wrong rules and produce noisy search results.
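To make the effect of misclassified language metadata concrete, the sketch below selects a stemmer by a record's language code. The language codes, the two toy stemmers, and the index_terms helper are all hypothetical; how the platform actually picks a stemmer is only summarized in the paragraph above.

```python
def stem_en(word):
    # Toy English rule: strip a trailing "s".
    w = word.lower()
    return w[:-1] if w.endswith("s") and len(w) > 3 else w


def stem_de(word):
    # Toy German rule: strip a plural "en".
    w = word.lower()
    return w[:-2] if w.endswith("en") and len(w) > 4 else w


STEMMERS = {"en": stem_en, "de": stem_de}  # assumed registry


def index_terms(text, language):
    """Return the set of stems stored for a record in the given language."""
    stem = STEMMERS[language]
    return {stem(token) for token in text.split()}


query_stem = stem_de("Rechnung")  # a German user searches for the singular

correct = index_terms("Rechnungen", "de")     # record labeled German
mislabeled = index_terms("Rechnungen", "en")  # same record labeled English

print(query_stem in correct)     # True
print(query_stem in mislabeled)  # False
```

The record text is identical in both cases; only the language label differs, and that alone decides whether the plural collapses to the root the query is looking for.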

Stemming versus synonyms versus search aliases

Stemming, synonyms, and search aliases are three different tools that solve overlapping problems. Stemming handles morphological variation (running and run). Synonym configuration handles semantic relationships (car and automobile, or laptop and notebook). Search aliases handle business-specific terminology (your product code SBQ-2300 and its informal name Big Box Pro). Each tool sits at a different layer of the search pipeline. Stemming is built in and not tunable per word. Synonyms are configurable through a Setup page where admins add term-and-equivalent pairs. Aliases live at the record level as searchable fields. Mature search implementations use all three; relying on stemming alone misses the synonym and alias cases that matter most for business search.
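A small matcher can show how the three layers complement each other. Everything here is illustrative: the record layout, the alias field name, the synonym groups, and the helper names are assumptions, not Salesforce objects or APIs.

```python
SYNONYM_GROUPS = [{"car", "automobile"}, {"laptop", "notebook"}]  # assumed


def stem(word):
    # Toy stemmer: strips a trailing "s".
    w = word.lower()
    return w[:-1] if w.endswith("s") and len(w) > 3 else w


def expand(term):
    """Return the stem of term plus the stems of any synonyms."""
    s = stem(term)
    result = {s}
    for group in SYNONYM_GROUPS:
        stems = {stem(member) for member in group}
        if s in stems:
            result |= stems
    return result


def matches(query, record):
    # Every query term must hit some searchable field, alias included.
    terms = {stem(t) for value in record.values() for t in value.split()}
    return all(expand(q) & terms for q in query.split())


record = {
    "name": "SBQ-2300",
    "alias": "Big Box Pro",  # hypothetical searchable alias field
    "description": "Rugged laptops for field teams",
}

print(matches("laptop", record))       # True, via stemming
print(matches("notebooks", record))    # True, via stemming plus synonyms
print(matches("Big Box Pro", record))  # True, via the alias field
print(matches("automobile", record))   # False, nothing connects it
```

Removing the alias field from the record makes the third query fail, which is the case stemming and synonyms cannot cover on their own.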

When stemming gets in the way

Most of the time, stemming improves search recall. Occasionally it causes false positives. A search for an exact product code (SBQ-2300) should not stem; the platform handles this by keeping non-alphabetic tokens out of the stemming step. But a search for an exact proper noun (the company name Apple, the city Paris) might inadvertently match records containing apples or Parisian. The Salesforce stemmer is designed to be conservative on proper nouns, but the algorithm is heuristic. When stemming causes false positives in a specific business context, the fix is usually adding the exact phrase as a non-stemmed alternate or boosting exact-match results above stemmed-match results in the search ranking. Admins can also exclude specific fields from stemming if a field's content should always be matched literally.
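The two mitigations mentioned here, leaving non-alphabetic tokens unstemmed and ranking exact matches above stemmed matches, can be sketched as follows. The scoring values and function names are hypothetical; this is one simple way to express the idea, not the platform's ranking logic.

```python
def stem(token):
    t = token.lower()
    if not t.isalpha():
        return t  # codes such as SBQ-2300 bypass stemming
    return t[:-1] if t.endswith("s") and len(t) > 3 else t


def rank(query, records):
    """Return matching records, exact matches first, stemmed matches after."""
    q_exact = query.lower()
    q_stem = stem(query)
    scored = []
    for text in records:
        tokens = [t.lower() for t in text.split()]
        if q_exact in tokens:
            scored.append((2, text))  # literal token match
        elif q_stem in {stem(t) for t in tokens}:
            scored.append((1, text))  # match only after stemming
    # sorted() is stable, so ties keep their original order.
    return [text for score, text in sorted(scored, key=lambda pair: -pair[0])]


records = ["Fresh apples delivered", "Apple renewal contract"]
print(rank("Apple", records))
# ['Apple renewal contract', 'Fresh apples delivered']
```

The stemmed match is still returned, so recall is unchanged; only the order changes, which is usually enough to keep the proper-noun record at the top.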

§ 03

Working with stemming and tuning search relevance

Stemming is on by default for every Salesforce search. Users do not configure it; the platform handles it automatically. The configuration work around stemming is mostly indirect: improve search relevance by configuring synonyms, aliases, and search settings that work alongside stemming, and audit search behavior when stemming produces unexpected results. The four-step routine covers: confirm stemming is working as expected for your locale, configure synonyms and aliases for terms stemming alone misses, audit search behavior with sample queries, and tune search settings if specific fields need different treatment.

  1. Confirm stemming is working for your locale

    In Salesforce, run a search for a known word and confirm it matches records containing morphological variants. A search for running should match records containing run, runs, and runner. A search for invoices should match records containing invoice. If the matching does not work, confirm the org locale and language settings are correct (Setup, Company Information). Confirm the records are indexed in the right language by checking the Language field on a sample record. Misconfigured locale or language metadata is the most common reason stemming behaves unexpectedly. Test in a sandbox with a small set of records before rolling out search-driven features that depend on stemming.

  2. Configure synonyms for terms stemming alone misses

    From Setup, search for Synonym Groups. Create a synonym group for each business term that stemming cannot connect. Common examples: car, automobile, and vehicle; laptop and notebook; laptop and PC. Add the equivalent terms inside the group. Save and activate the group. Test by searching for one term and confirming results include records containing the other. Synonym groups are global to the org; consider permission sets if certain synonyms should apply only to specific user groups (a Sales team's synonyms may differ from a Service team's). Document the synonym groups in the search runbook so future admins understand the rationale for each.

  3. Audit search behavior with sample queries

    Build a sample query set that represents what your users actually search for. Include exact words, variant forms (plurals, verb tenses), proper nouns, product codes, and common misspellings. Run each query in production and confirm the results match expectations. Where results are wrong (too few matches, too many false positives), identify the linguistic pattern that caused the issue. Patterns that map to stemming get addressed through synonyms or aliases. Patterns that map to indexing problems get addressed through re-indexing or field configuration. Schedule the audit quarterly so search quality stays high as the underlying record set grows.

  4. Tune search settings for specific fields

    From Setup, configure which fields are searchable on each object. Some fields should always be matched literally (record IDs, product SKUs, customer account numbers) and should not be exposed to stemming; configure them as Text fields with exact-match behavior. Other fields benefit from stemming (Description, Notes, Subject, Long Text Area); keep them as searchable text fields with default stemming behavior. For each searchable field, decide whether the field should participate in global search, in record-type-specific search, or in both. Document the field-level search configuration in the search runbook so auditors and future admins can understand the search behavior.
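The audit in step 3 lends itself to a small, repeatable harness. The sketch below is one possible shape, with several assumptions: Case and audit are hypothetical names, the record IDs are made up, and fake_search is a stub standing in for whatever function actually queries the org (for example, a wrapper around a SOSL search).

```python
from dataclasses import dataclass, field


@dataclass
class Case:
    query: str
    must_find: set                                     # IDs that must appear
    must_not_find: set = field(default_factory=set)    # IDs that must not


def audit(cases, search_fn):
    """Run each sample query and collect (query, missing, unexpected)."""
    failures = []
    for case in cases:
        found = set(search_fn(case.query))
        missing = case.must_find - found           # too few matches
        unexpected = case.must_not_find & found    # false positives
        if missing or unexpected:
            failures.append((case.query, sorted(missing), sorted(unexpected)))
    return failures


# Stub search results; replace with a real search call in practice.
FAKE_RESULTS = {
    "invoices": ["rec-1", "rec-2"],
    "Apple": ["rec-3", "rec-4"],
}


def fake_search(query):
    return FAKE_RESULTS.get(query, [])


cases = [
    Case("invoices", must_find={"rec-1"}),
    Case("Apple", must_find={"rec-3"}, must_not_find={"rec-4"}),
    Case("SBQ-2300", must_find={"rec-5"}),
]

for query, missing, unexpected in audit(cases, fake_search):
    print(query, "missing:", missing, "unexpected:", unexpected)
```

Keeping the sample queries in a list like this makes the quarterly audit a rerun rather than a manual exercise, and the two failure columns map directly onto the two problem types named in step 3: too few matches and too many false positives.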

Gotchas
  • Misconfigured org locale or record language causes stemming to apply the wrong language rules, producing unexpected matches or missed matches. Audit language metadata before troubleshooting search behavior.
  • Stemming runs at both index time and query time. Updating the stemmer in a platform release does not immediately apply to existing indexes; the platform re-indexes asynchronously over hours or days for large orgs.
  • Synonyms are global to the org by default. Permission sets can scope synonyms to specific user populations, but configuring scope wrong creates confusing user experiences.
  • Stemming does not handle compound words, irregular plurals, or business-specific terminology. Synonyms and search aliases are the right tools for those cases; do not expect stemming alone to solve them.
  • Proper nouns occasionally stem in surprising ways. The stemmer is heuristic and language-specific; if a proper noun causes false positives, add it as an exact-match alternate or boost exact-match results in search ranking.
§

About the Author

Dipojjal Chakrabarti is a B2C Solution Architect with 29 Salesforce certifications and over 13 years in the Salesforce ecosystem. He runs salesforcedictionary.com to help admins, developers, architects, and cert/interview candidates sharpen their fundamentals.
