Setup runs in three phases: configure the source connection, define the Data Stream and its mapping, then verify ingestion and downstream DMO population.
- Configure the source connection
Data Cloud > Data Sources. Add the connector type (Sales Cloud, S3, Snowflake). Provide credentials, the source URL, and any authentication tokens, then test the connection. Many connectors require admin privileges on the source side; coordinate with the source-system team.
- Create the Data Stream
Data Cloud > Data Streams > New. Pick the source connection, then select the source object or file. Configure the refresh cadence (real-time, hourly, daily) and set incremental-only or full-refresh behaviour.
- Map fields to a Data Lake Object
The Data Stream wizard creates a DLO with auto-mapped fields. Review the field mapping; rename DLO fields if the source naming is confusing. Set the primary key field so deduplication works correctly.
- Map the DLO to a Data Model Object
Data Cloud > Data Model. Open the target DMO (Individual, Account, Email Engagement). Add a mapping from the new DLO, map each DLO field to the canonical DMO field, then save and validate.
- Run and verify
Trigger the Data Stream manually or wait for the scheduled run. Check the run history: row counts, errors, latency. Verify rows appear in the target DMO via Data Cloud Explorer or SQL. Build a downstream segment to confirm data is queryable.
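The mapping and verification steps above can be sketched in code. This is an illustrative model only, not a Data Cloud API: the DLO field names, the mapping table, and the row shapes are all hypothetical.

```python
# Illustrative sketch of DLO-to-DMO field mapping plus a row-count check.
# Field names and the mapping below are hypothetical, not a real Data Cloud API.

# Hypothetical mapping from DLO field names to canonical DMO (Individual) fields.
DLO_TO_DMO = {
    "cust_email": "EmailAddress",
    "cust_first": "FirstName",
    "cust_last": "LastName",
    "cust_id": "IndividualId",  # primary key on the DLO side
}

def map_row(dlo_row: dict) -> dict:
    """Rename DLO fields to their canonical DMO equivalents."""
    return {dmo: dlo_row[dlo] for dlo, dmo in DLO_TO_DMO.items() if dlo in dlo_row}

def verify_counts(source_rows: list, dmo_rows: list) -> bool:
    """After a run, the DMO should hold one row per unique source primary key."""
    unique_keys = {r["cust_id"] for r in source_rows}
    return len(dmo_rows) == len(unique_keys)

source = [
    {"cust_id": "1", "cust_email": "a@example.com", "cust_first": "Ada", "cust_last": "L"},
    {"cust_id": "2", "cust_email": "b@example.com", "cust_first": "Bob", "cust_last": "M"},
]
dmo = [map_row(r) for r in source]
print(verify_counts(source, dmo))  # True when every unique key landed exactly once
```

In the real product the mapping lives in the Data Stream wizard and the Data Model tab; the point of the sketch is that harmonization is a pure field-rename keyed on the primary key, which is why a wrong mapping silently corrupts every downstream segment.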
Connector types: Sales Cloud, Marketing Cloud, S3, Snowflake, BigQuery, REST API, and custom Apex. Each has different setup and authentication patterns.
Refresh cadences: real-time (Change Data Capture), hourly, daily, or on-demand. Pick per source based on latency need and ingestion-credit budget.
Incremental vs. full refresh: incremental ingests only changed records (cheaper, faster); full refresh re-pulls everything (expensive, but useful for source-data corrections).
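The incremental-vs-full distinction can be sketched as a watermark filter: an incremental run pulls only rows modified since the last successful run, while a full refresh ignores the watermark. The row shape and field names here are hypothetical.

```python
# Hypothetical sketch: an incremental run pulls only rows changed since the
# last watermark; a full refresh ignores the watermark and re-pulls everything.
from datetime import datetime

rows = [
    {"id": "1", "modified": datetime(2024, 1, 1)},
    {"id": "2", "modified": datetime(2024, 3, 1)},
    {"id": "3", "modified": datetime(2024, 6, 1)},
]

def select_rows(rows, last_run=None, full_refresh=False):
    """Full refresh returns every row; incremental returns rows modified after last_run."""
    if full_refresh or last_run is None:
        return rows
    return [r for r in rows if r["modified"] > last_run]

incremental = select_rows(rows, last_run=datetime(2024, 2, 1))
full = select_rows(rows, full_refresh=True)
print(len(incremental), len(full))  # 2 3
```

The cost asymmetry follows directly: incremental volume scales with change rate, full-refresh volume scales with total source size.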
Primary key: a required field that uniquely identifies each source record. Data Cloud uses it to deduplicate and update existing DLO rows on incremental runs.
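How the primary key drives deduplication can be sketched as an upsert keyed on that field; the key name and row shape below are hypothetical, not Data Cloud's internal representation.

```python
# Hypothetical sketch of primary-key deduplication: each incoming batch is
# upserted into the DLO keyed on the primary key, so a re-ingested record
# updates in place instead of accumulating as a duplicate.
def upsert(dlo: dict, batch: list, key: str = "cust_id") -> dict:
    """Insert new rows and overwrite existing ones that share the same key."""
    for row in batch:
        dlo[row[key]] = row
    return dlo

dlo = {}
upsert(dlo, [{"cust_id": "1", "email": "old@example.com"}])
upsert(dlo, [{"cust_id": "1", "email": "new@example.com"},
             {"cust_id": "2", "email": "b@example.com"}])
print(len(dlo))           # 2 -- the repeated key updated in place
print(dlo["1"]["email"])  # new@example.com
```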
- Real-time streams burn ingestion credits faster than batch. Match cadence to actual latency need.
- Primary key must be unique per source row. Without it, deduplication breaks and duplicates accumulate.
- DLO-to-DMO mapping is where field harmonization happens. Wrong mapping breaks every downstream segment.
- Full refresh after schema changes is sometimes necessary. Plan for the expensive runs and ingestion-credit hit.
- Marketing Cloud Data Extensions are not Data Streams. Customers migrating from Marketing Cloud often conflate the two.
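The primary-key pitfall above can be caught before ingestion with a quick uniqueness check on the source extract; the field name here is hypothetical.

```python
# Hypothetical pre-ingestion check: flag primary-key values that appear more
# than once in a source extract, before duplicates ever reach the DLO.
from collections import Counter

def duplicate_keys(rows: list, key: str = "cust_id") -> list:
    """Return primary-key values that appear more than once."""
    counts = Counter(r[key] for r in rows)
    return [k for k, n in counts.items() if n > 1]

rows = [{"cust_id": "1"}, {"cust_id": "2"}, {"cust_id": "1"}]
print(duplicate_keys(rows))  # ['1']
```

Running a check like this against the raw export (or an equivalent GROUP BY ... HAVING COUNT(*) > 1 query on the source) is cheaper than discovering accumulated duplicates in a downstream segment.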