
Data Stream


Part of Data 360
§ 01

Definition

A Data Stream is the Salesforce Data Cloud configuration that brings external data into the platform as a continuously updated feed. Each Data Stream connects to one source system (Sales Cloud, Marketing Cloud, Amazon S3, Snowflake, Google Analytics, a custom REST endpoint), ingests records on a defined cadence (real-time, hourly, daily), maps source fields to a Data Lake Object, and lands the data in Data Cloud's storage layer. Data Streams are the entry point for every piece of data that powers segmentation, identity resolution, and customer profile assembly.

The Data Stream concept replaces the older one-time bulk-load pattern with a managed, ongoing ingestion pipeline. Each stream tracks high-water-mark cursors so it picks up only new or changed records on each run. The stream stores the raw incoming data in a Data Lake Object (DLO), and downstream Data Model Object (DMO) mappings reshape it into the canonical Customer 360 model. The split between DLO (raw, source-shaped) and DMO (canonical, business-shaped) is what makes Data Cloud's data model flexible across dozens of source systems.
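A minimal sketch of the high-water-mark idea, assuming each source record carries a last-modified timestamp. The names `SourceRecord` and `pull_changed` are hypothetical and are not part of any Data Cloud API; the sketch only illustrates how a cursor limits each run to new or changed records.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Optional, Tuple


@dataclass
class SourceRecord:
    record_id: str
    modified_at: datetime  # assumed last-modified timestamp on the source row


def pull_changed(
    source: List[SourceRecord], cursor: Optional[datetime]
) -> Tuple[List[SourceRecord], Optional[datetime]]:
    """Return records modified after the cursor, plus the advanced cursor."""
    changed = [r for r in source if cursor is None or r.modified_at > cursor]
    # Advance the cursor to the newest timestamp seen; keep it if nothing changed.
    new_cursor = max((r.modified_at for r in changed), default=cursor)
    return changed, new_cursor


source = [
    SourceRecord("001", datetime(2024, 1, 1, 9, 0)),
    SourceRecord("002", datetime(2024, 1, 1, 10, 0)),
]

# First run: no cursor yet, so every record is pulled.
first_batch, cursor = pull_changed(source, None)

# A new record arrives; the next run picks up only that record.
source.append(SourceRecord("003", datetime(2024, 1, 1, 11, 0)))
second_batch, cursor = pull_changed(source, cursor)
```

Passing `None` as the cursor pulls every record again, which is the same effect as a full refresh.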

§ 02

How Data Streams continuously feed Data Cloud from every connected source system

Connector types and supported sources

Data Cloud ships pre-built connectors for the most common sources. Sales Cloud and Service Cloud connectors stream CRM data via Change Data Capture. Marketing Cloud Engagement and Marketing Cloud Account Engagement push campaign and email data. Amazon S3, Azure Blob Storage, and Google Cloud Storage handle file-based imports. Snowflake, BigQuery, and Databricks connect to data warehouses. REST API and custom Apex connectors handle anything not built in. Each connector type has its own ingestion cadence and field mapping behaviour.

Refresh cadence: real-time, hourly, daily

Streams pick a cadence per source. Real-time streams (Sales Cloud CDC, REST push) ingest within seconds of source change. Hourly and daily batch streams run on schedule, pulling records changed since the previous run. The cadence choice trades latency against source-system load; a real-time stream from a high-volume ERP can flood Data Cloud and incur ingestion-credit cost. Most production deployments mix cadences: real-time for high-value events, hourly for ops data, daily for batch warehouse exports.

Data Lake Object: the raw landing pad

Every Data Stream creates a Data Lake Object (DLO) with the source-system field structure preserved. DLO field names usually mirror the source table column names. Multiple Data Streams from the same source can write to the same DLO (daily and hourly streams of the same CRM table, for instance). The DLO is queryable directly via Data Cloud SQL, but most downstream work uses DMOs because they normalize across source systems.

Data Model Object mapping and field harmonization

Data Model Objects (DMOs) are the canonical Customer 360 entities: Individual, Account, Opportunity, Email Engagement. The DLO-to-DMO mapping is where field harmonization happens. Sales Cloud Contact.FirstName, Marketing Cloud Subscriber.first_name, and external Customer.given_name all map to Individual.FirstName in the canonical DMO. Get the mapping right and downstream segmentation works across systems; get it wrong and Customer 360 is broken before you start.
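The harmonization step can be pictured as a lookup table from source fields to canonical fields. The sketch below is illustrative only: the DLO names (`SalesCloud_Contact`, `MarketingCloud_Subscriber`, `External_Customer`) and the `to_individual` helper are hypothetical, and real mappings are configured in the Data Cloud UI rather than in code. The field names come from the example above.

```python
from typing import Dict, Tuple

# (hypothetical DLO name, source field) -> canonical Individual field
FIELD_MAP: Dict[Tuple[str, str], str] = {
    ("SalesCloud_Contact", "FirstName"): "FirstName",
    ("MarketingCloud_Subscriber", "first_name"): "FirstName",
    ("External_Customer", "given_name"): "FirstName",
}


def to_individual(dlo_name: str, row: Dict[str, str]) -> Dict[str, str]:
    """Rename mapped source fields to canonical names; drop unmapped fields."""
    out: Dict[str, str] = {}
    for field, value in row.items():
        target = FIELD_MAP.get((dlo_name, field))
        if target is not None:
            out[target] = value
    return out


rows = [
    ("SalesCloud_Contact", {"FirstName": "Ada", "Id": "003xx"}),
    ("MarketingCloud_Subscriber", {"first_name": "Grace"}),
    ("External_Customer", {"given_name": "Edsger"}),
]

# All three sources end up with the same canonical field name.
individuals = [to_individual(name, row) for name, row in rows]
```

A field missing from the map is silently dropped in this sketch, which mirrors how an unmapped DLO field never reaches the DMO.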

Ingestion credits and licensing

Data Cloud charges by ingestion volume (rows ingested per month) and processing credits (queries, identity resolution, activations). High-volume Data Streams burn credits fast. Optimization patterns: ingest only changed records (avoid full-refresh streams), filter at source to drop irrelevant rows, schedule batch streams during off-peak hours when credits are sometimes priced lower. License math is one of the largest Data Cloud decision points; watch credit consumption weekly.
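A rough way to see why incremental ingestion matters is to compare rows ingested per month under each mode. The sketch below uses made-up volumes (a one-million-row table with 20,000 changed rows per day) and deliberately leaves out credit rates, which depend on the contract. The function name `monthly_rows` is hypothetical.

```python
def monthly_rows(
    total_rows: int,
    changed_rows_per_day: int,
    runs_per_day: int,
    days: int = 30,
    incremental: bool = True,
) -> int:
    """Estimate rows ingested per month.

    Incremental: each changed row is ingested once, however often the stream runs.
    Full refresh: every run re-ingests the whole table.
    """
    if incremental:
        return changed_rows_per_day * days
    return total_rows * runs_per_day * days


# Hypothetical volumes, for illustration only.
full = monthly_rows(1_000_000, 20_000, runs_per_day=1, incremental=False)
incr = monthly_rows(1_000_000, 20_000, runs_per_day=1)
ratio = full // incr
```

With these assumed numbers the daily full refresh ingests 30,000,000 rows a month against 600,000 for the incremental stream, a 50x difference before any credit pricing is applied.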

Stream history and replay

Each Data Stream tracks ingestion history: how many records each run pulled, success/failure counts, processing time, error details. Failed runs can be replayed manually or skipped. The stream metadata stores high-water cursors per source; resetting a cursor causes the next run to do a full refresh. Use carefully; full refreshes are expensive for large sources and can cause downstream segments to recompute everything.
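As a sketch of how run history might be reviewed, the snippet below models each run as a small record and picks out the failed runs that are candidates for replay. The `RunRecord` shape and both helper functions are assumptions made for illustration; they do not reflect the actual fields Data Cloud exposes.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class RunRecord:
    run_id: int
    rows_pulled: int   # rows the run attempted to ingest
    rows_failed: int   # subset of rows_pulled that errored
    succeeded: bool


def runs_to_replay(history: List[RunRecord]) -> List[int]:
    """Return the ids of failed runs, in history order."""
    return [r.run_id for r in history if not r.succeeded]


def failure_rate(history: List[RunRecord]) -> float:
    """Fraction of pulled rows that failed across all runs."""
    total = sum(r.rows_pulled for r in history)
    failed = sum(r.rows_failed for r in history)
    return failed / total if total else 0.0


history = [
    RunRecord(1, 1000, 0, True),
    RunRecord(2, 500, 500, False),
    RunRecord(3, 500, 0, True),
]

replay = runs_to_replay(history)
rate = failure_rate(history)
```

Replaying only the failed runs keeps the cursor intact, which avoids the cost of the full refresh described above.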

Data Stream versus legacy Marketing Cloud Data Extension

Customers migrating from Marketing Cloud often confuse Data Streams with Data Extensions. Data Extensions are Marketing Cloud Engagement's native data storage (single-platform). Data Streams in Data Cloud are the cross-platform ingestion mechanism, not storage. The two coexist: a Data Stream can ingest from a Marketing Cloud Data Extension into a Data Cloud DLO, but they are not the same thing. Get the vocabulary right when discussing Customer 360 architecture across teams.

§ 03

Setting up a Data Stream to ingest data into Data Cloud

Setup runs in three phases, covered by the five steps below: configure the source connection (step 1), define the Data Stream and its mappings (steps 2 to 4), then verify ingestion and downstream DMO population (step 5).

  1. Configure the source connection

    In Data Cloud, open Data Sources. Add the connector type (Sales Cloud, S3, Snowflake). Provide credentials, source URL, and any authentication tokens. Test the connection. Many connectors require admin privileges on the source side; coordinate with the source-system team.

  2. Create the Data Stream

    In Data Cloud, open Data Streams and click New. Pick the source connection. Select the source object or file. Configure the refresh cadence (real-time, hourly, daily). Set incremental-only or full-refresh behaviour.

  3. Map fields to a Data Lake Object

    The Data Stream wizard creates a DLO with auto-mapped fields. Review the field mapping; rename DLO fields if the source naming is confusing. Set the primary key field so deduplication works correctly.

  4. Map the DLO to a Data Model Object

    In Data Cloud, open Data Model. Open the target DMO (Individual, Account, Email Engagement). Add a mapping from the new DLO. Map each DLO field to the canonical DMO field. Save and validate.

  5. Run and verify

    Trigger the Data Stream manually or wait for the scheduled run. Check the run history: row counts, errors, latency. Verify rows appear in the target DMO via Data Cloud Explorer or SQL. Build a downstream segment to confirm data is queryable.

Key options
Connector type

Sales Cloud, Marketing Cloud, S3, Snowflake, BigQuery, REST API, custom Apex. Each has different setup and authentication patterns.

Refresh cadence

Real-time (Change Data Capture), hourly, daily, or on-demand. Pick per source based on latency need and ingestion-credit budget.

Incremental vs full refresh

Incremental ingests only changed records (cheaper, faster). Full refresh re-pulls everything (expensive, useful for source-data corrections).

Primary key

Required field that uniquely identifies each source record. Used by Data Cloud to deduplicate and update existing DLO rows on incremental runs.
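To illustrate why the primary key matters, the sketch below contrasts a keyed upsert with a plain append. It models the DLO as an in-memory dictionary, which is an assumption for illustration; the `upsert` helper and the `Id` and `Email` field names are hypothetical.

```python
from typing import Dict, List

Row = Dict[str, str]


def upsert(dlo: Dict[str, Row], incoming: List[Row], key: str = "Id") -> Dict[str, Row]:
    """Insert new rows and overwrite existing ones that share the same key."""
    for row in incoming:
        dlo[row[key]] = row
    return dlo


first_run = [
    {"Id": "001", "Email": "a@example.com"},
    {"Id": "002", "Email": "b@example.com"},
]
second_run = [{"Id": "001", "Email": "a.new@example.com"}]  # changed record

# With a primary key: the changed record replaces the old one.
keyed: Dict[str, Row] = {}
upsert(keyed, first_run)
upsert(keyed, second_run)

# Without a key: rows can only be appended, so the old copy lingers.
appended: List[Row] = []
appended.extend(first_run)
appended.extend(second_run)
```

After both runs the keyed store holds two rows with the updated email, while the appended list holds three rows, including a stale duplicate of record 001.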

Gotchas
  • Real-time streams burn ingestion credits faster than batch. Match cadence to actual latency need.
  • Primary key must be unique per source row. Without it, deduplication breaks and duplicates accumulate.
  • DLO-to-DMO mapping is where field harmonization happens. Wrong mapping breaks every downstream segment.
  • Full refresh after schema changes is sometimes necessary. Plan for the expensive runs and ingestion-credit hit.
  • Marketing Cloud Data Extensions are not Data Streams. Customers migrating from Marketing Cloud often conflate the two.

About the Author

Dipojjal Chakrabarti is a B2C Solution Architect with 29 Salesforce certifications and over 13 years in the Salesforce ecosystem. He runs salesforcedictionary.com to help admins, developers, architects, and cert/interview candidates sharpen their fundamentals.

