Salesforce Dictionary - Free Salesforce Glossary
How-to guide

How to build a dataset with Dataset Builder

Dataset Builder is the fastest way to produce a new CRM Analytics dataset from Salesforce objects. The visual canvas turns relationships into clickable nodes; you pick objects and fields, save, and the underlying dataflow handles the rest.

By Dipojjal Chakrabarti · Founder & Editor, Salesforce Dictionary · Last updated May 20, 2026

  1. Open Data Manager

    In CRM Analytics, open Data Manager, then Datasets, then Create Dataset, then Salesforce Object. This launches Dataset Builder for the new dataset.

  2. Pick the root object

    Search for and select the primary object (Opportunity for a sales dataset, Case for service, Account for account-centric reporting). The builder loads the object's relationships into the canvas.

  3. Add related objects

    Click any related object's plus icon to include it. The canvas grows as you add Opportunities, Accounts, Owners, and so on. You can traverse multiple hops by clicking through each related box.

  4. Pick fields per object

    For each object on the canvas, check the fields you want in the dataset. Default selections include the obvious identifiers; add measure fields (Amount, Quantity), date fields (CloseDate), and segmentation fields (StageName, Industry).

  5. Name the dataset and save

    Click Create Dataset, provide a name (Opportunities_With_Account is better than New_Dataset_1), and confirm. CRM Analytics generates the dataflow and runs the first extract immediately.

  6. Verify the dataset and schedule the dataflow

    Open the new dataset to confirm row counts and fields. Then open the generated dataflow in the Dataflow Editor and set the schedule (hourly, every 3 hours, daily) for ongoing refreshes.
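The steps above end in a generated dataflow. As a rough sketch of what such a definition can look like for an Opportunity root with Account added, the Python below assembles one as a dictionary and prints it as JSON. The node names (Extract_Opportunity, Augment_Opportunity_Account, and so on) and the helper build_dataflow are hypothetical; Dataset Builder picks its own node names. The action and parameter names follow the dataflow JSON format as commonly documented, so check them against the dataflow your org actually generates.

```python
import json


def build_dataflow(root, root_fields, related, related_fields,
                   lookup_field, dataset_alias):
    """Sketch a dataflow definition: two extracts, one augment, one register.

    Hypothetical helper for illustration; node names are invented.
    """
    extract_root = f"Extract_{root}"
    extract_related = f"Extract_{related}"
    augment = f"Augment_{root}_{related}"
    return {
        # Extract the root object (step 2) with its selected fields (step 4).
        extract_root: {
            "action": "sfdcDigest",
            "parameters": {
                "object": root,
                "fields": [{"name": name} for name in root_fields],
            },
        },
        # Extract the related object added on the canvas (step 3).
        extract_related: {
            "action": "sfdcDigest",
            "parameters": {
                "object": related,
                "fields": [{"name": name} for name in related_fields],
            },
        },
        # Look up related fields onto each root row via the lookup field.
        augment: {
            "action": "augment",
            "parameters": {
                "left": extract_root,
                "left_key": [lookup_field],
                "right": extract_related,
                "right_key": ["Id"],
                "relationship": related,
                "right_select": [n for n in related_fields if n != "Id"],
            },
        },
        # Register the result as the named dataset (step 5).
        f"Register_{dataset_alias}": {
            "action": "sfdcRegister",
            "parameters": {
                "alias": dataset_alias,
                "name": dataset_alias,
                "source": augment,
            },
        },
    }


dataflow = build_dataflow(
    root="Opportunity",
    root_fields=["Id", "Name", "Amount", "CloseDate", "StageName", "AccountId"],
    related="Account",
    related_fields=["Id", "Name", "Industry"],
    lookup_field="AccountId",
    dataset_alias="Opportunities_With_Account",
)
print(json.dumps(dataflow, indent=2))
```

Reading the generated JSON with this shape in mind makes it easier to find the node to edit later, since each canvas choice maps to one extract or augment node.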

Root object

The primary object the dataset is built around. Each row in the final dataset corresponds to one root record, with related fields denormalized in.
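To make "one row per root record, with related fields denormalized in" concrete, here is a small plain-Python illustration. It does not call CRM Analytics; the denormalize function, the sample records, and the Account.Name style of column naming are assumptions chosen to mirror how related fields are usually prefixed with the relationship name.

```python
def denormalize(root_rows, related_rows, lookup_field, relationship):
    """Return one flat row per root record with related fields copied in.

    Illustrative only: root rows without a matching related record are kept,
    and their related columns are set to None.
    """
    related_by_id = {row["Id"]: row for row in related_rows}
    related_columns = sorted(
        {key for row in related_rows for key in row if key != "Id"}
    )
    flat_rows = []
    for row in root_rows:
        match = related_by_id.get(row.get(lookup_field))
        flat = dict(row)
        for column in related_columns:
            flat[f"{relationship}.{column}"] = match.get(column) if match else None
        flat_rows.append(flat)
    return flat_rows


# Hypothetical sample data: two opportunities, one without an account.
opportunities = [
    {"Id": "006A", "Amount": 5000, "AccountId": "001X"},
    {"Id": "006B", "Amount": 1200, "AccountId": None},
]
accounts = [{"Id": "001X", "Name": "Acme", "Industry": "Retail"}]

rows = denormalize(opportunities, accounts, "AccountId", "Account")
for row in rows:
    print(row)
```

The output has exactly as many rows as there are opportunities; the account columns ride along on each row rather than adding rows of their own.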

Related objects

Up to 10 levels of relationships are supported. The practical limit is 4 to 5 hops; beyond that, performance and complexity make a dataflow rewrite worthwhile.

Field selection

Standard fields, custom fields, formula fields, and picklists are supported. Long Text Areas over 32 KB and Encrypted fields are excluded automatically.
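The exclusion rule can be sketched as a simple filter over field metadata. This is a hedged illustration, not the builder's own logic: the metadata shape (name, type, length), the type strings, the selectable_fields helper, and the reading of "32 KB" as 32,768 characters are all assumptions made for the example.

```python
# Assumption: "32 KB" is interpreted here as 32,768 characters.
MAX_LONG_TEXT_LENGTH = 32 * 1024


def selectable_fields(fields):
    """Return the names of fields that pass the exclusion rule.

    Hypothetical helper: skips encrypted fields and long text areas whose
    declared length exceeds the assumed limit.
    """
    keep = []
    for field in fields:
        if field["type"] == "encryptedstring":
            continue
        if field["type"] == "textarea" and field.get("length", 0) > MAX_LONG_TEXT_LENGTH:
            continue
        keep.append(field["name"])
    return keep


# Hypothetical field metadata for one object.
fields = [
    {"name": "Name", "type": "string", "length": 80},
    {"name": "StageName", "type": "picklist"},
    {"name": "Notes__c", "type": "textarea", "length": 131072},
    {"name": "Summary__c", "type": "textarea", "length": 1000},
    {"name": "SSN__c", "type": "encryptedstring", "length": 11},
]
print(selectable_fields(fields))
```

Running a check like this against your own field list before opening the builder can explain why an expected field never appears in the picker.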

Save behavior

Save creates the dataflow and runs it once. Edits to the dataset definition require rebuilding through Dataset Builder or editing the generated dataflow JSON directly.
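For the second route, editing the generated dataflow JSON directly, a small script can make the change repeatable. The sketch below adds a field to an extract node in a dataflow definition held as a Python dictionary. The function add_digest_field and the node name Extract_Opportunity are hypothetical, and the sfdcDigest parameter layout is assumed to match the JSON you download from the Dataflow Editor.

```python
import copy


def add_digest_field(dataflow, node_name, field_name):
    """Return a copy of the dataflow with field_name added to an sfdcDigest node.

    Hypothetical helper: leaves the input untouched and does nothing if the
    field is already listed.
    """
    updated = copy.deepcopy(dataflow)
    node = updated[node_name]
    if node["action"] != "sfdcDigest":
        raise ValueError(f"{node_name} is not an sfdcDigest node")
    fields = node["parameters"]["fields"]
    if not any(field["name"] == field_name for field in fields):
        fields.append({"name": field_name})
    return updated


# Hypothetical fragment of a generated dataflow.
dataflow = {
    "Extract_Opportunity": {
        "action": "sfdcDigest",
        "parameters": {
            "object": "Opportunity",
            "fields": [{"name": "Id"}, {"name": "Amount"}],
        },
    }
}

edited = add_digest_field(dataflow, "Extract_Opportunity", "CloseDate")
print([f["name"] for f in edited["Extract_Opportunity"]["parameters"]["fields"]])
```

A field added to an extract node this way only reaches the dataset if every downstream node passes it through, so review any augment or field-selection nodes between the extract and the register step.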

Gotchas
  • Dataset Builder generates left-outer augments, which means rows on the root object without a related record still appear with nulls. If you need inner-join behavior, edit the dataflow and add a filter step after the augment.
  • Many-to-many junctions are not auto-traversed. Add the junction as an intermediate object, then pick the far side manually.
  • Subsequent edits to the generated dataflow (added fields, changed filters) do not surface back in Dataset Builder. The builder is one-way: visual creation, then text-only editing.
  • Dataset Builder creates a dataflow even in orgs where Data Prep is the preferred path. For warehouse push-down or non-Salesforce sources, build a recipe directly instead.
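The first gotcha, getting inner-join behavior by filtering after the augment, can be sketched as an edit to the dataflow definition. The code below inserts a filter node after an augment node and repoints any node that read from the augment through its source parameter. The node names, the insert_filter_after helper, and the SAQL expression are assumptions for illustration; confirm the filter node's parameter names and the null-check syntax against the dataflow reference for your release.

```python
import copy


def insert_filter_after(dataflow, augment_node, filter_node, saql_filter):
    """Return a copy of the dataflow with a filter node placed after an augment.

    Hypothetical helper: only consumers that reference the augment through
    a "source" parameter are repointed. Consumers that use it as "left" or
    "right" in another augment are left alone and need manual review.
    """
    updated = copy.deepcopy(dataflow)
    # Repoint existing consumers before adding the filter node itself,
    # so the filter keeps reading from the augment.
    for node in updated.values():
        if node["parameters"].get("source") == augment_node:
            node["parameters"]["source"] = filter_node
    updated[filter_node] = {
        "action": "filter",
        "parameters": {"source": augment_node, "saqlFilter": saql_filter},
    }
    return updated


# Hypothetical fragment of a generated dataflow.
dataflow = {
    "Augment_Opportunity_Account": {
        "action": "augment",
        "parameters": {
            "left": "Extract_Opportunity",
            "left_key": ["AccountId"],
            "right": "Extract_Account",
            "right_key": ["Id"],
            "relationship": "Account",
            "right_select": ["Name", "Industry"],
        },
    },
    "Register_Opportunities_With_Account": {
        "action": "sfdcRegister",
        "parameters": {
            "alias": "Opportunities_With_Account",
            "name": "Opportunities_With_Account",
            "source": "Augment_Opportunity_Account",
        },
    },
}

edited = insert_filter_after(
    dataflow,
    augment_node="Augment_Opportunity_Account",
    filter_node="Filter_Has_Account",
    saql_filter="'Account.Name' is not null",  # assumed SAQL null check
)
print(edited["Register_Opportunities_With_Account"]["parameters"]["source"])
```

Filtering on a related field that is always populated when a match exists (Name on Account, for example) is what turns the left-outer result into an effective inner join.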

See the full Dataset Builder entry

The full Dataset Builder entry includes the definition, a worked example, a deep dive, related terms, and a quiz.