Salesforce Dictionary - Free Salesforce GlossarySalesforce Dictionary
Salesforce Architect
hard

How do you architect analytics across many systems (Salesforce + ERP + Marketing + Web)?

Cross-system analytics requires unified data, beyond what any single system can deliver.

Architectural pattern: data warehouse + BI layer.

1. Data ingestion.

  • Salesforce — CDC + Mulesoft / native Snowflake connector / Heroku Connect.
  • ERP — vendor-specific (SAP HANA replication, Oracle GoldenGate, etc.) or custom ETL.
  • Marketing Cloud — APIs + ETL.
  • Web analytics — Google Analytics export, Mixpanel API, Segment.
  • Other — bespoke connectors as needed.

All flow into a central warehouse.

2. Data warehouse.

  • Snowflake — most common modern choice.
  • BigQuery — Google ecosystem.
  • Redshift — AWS.
  • Databricks — for data science workloads.

Warehouse holds ALL the data; processed for analytics.

3. Transformation.

  • dbt (data build tool) — SQL-based transformation; declarative.
  • Custom ETL — Python / Node / Spark.
  • Mulesoft / iPaaS — for orchestration.

Transformation: clean, normalise, join across sources, build summary tables.

4. Storage tiers.

  • Raw layer — exactly as ingested.
  • Cleansed layer — standardised, deduped.
  • Conformed layer — joined across sources.
  • Mart layer — domain-specific (sales mart, marketing mart).

5. BI layer.

  • Tableau — visualisation, exploration.
  • Power BI — Microsoft ecosystem.
  • Looker / Looker Studio — Google ecosystem.
  • CRM Analytics — Salesforce-native (works on warehouse data).
  • Custom dashboards in Salesforce or Heroku.

6. Governance.

  • Data dictionary — every metric defined consistently.
  • Lineage tracking — where does this metric come from?
  • Access control — who sees what data.
  • Quality monitoring — alerts on data drift.

7. Salesforce integration with analytics.

  • CRM Analytics — pull warehouse data into Salesforce dashboards.
  • External Objects — read warehouse from Salesforce records.
  • Embedded dashboards in Lightning pages.

Architectural decisions:

1. Real-time vs batch:

  • Most analytics: hourly / daily batch suffices.
  • Real-time only when business decisions depend (fraud detection, inventory).
  • Real-time costs more; justify the value.

2. Warehouse vs lake:

  • Warehouse — structured, optimized for queries.
  • Lake — raw data, schema-on-read.
  • Modern hybrid: lakehouse (Snowflake, Databricks).

3. Salesforce as source vs sink:

  • Most often Salesforce is source — data flows out for analytics.
  • Sometimes sink — analytics insights flow back to enrich Salesforce records.

4. Reverse ETL.

  • After analytics, push insights back to operational systems.
  • "Customer health score" computed in warehouse, written back to Salesforce.
  • Tools: Hightouch, Census.

5. Master data management.

  • Customer master across systems requires deduplication.
  • MDM tool (Informatica, Reltio) or warehouse-based mastering.

Common pitfalls:

  • Salesforce reports trying to do warehouse work — slow, limited.
  • No data ownership — quality drifts.
  • Silos persist — data in warehouse but reports built per system.
  • Stale data — sync intervals too long.
  • Cost surprises — warehouse compute / storage adds up.

Senior architect insight: analytics is its own architecture domain. Don't conflate with operational systems. Both have their place; design each deliberately.

Salesforce Reports & Dashboards work for in-Salesforce analytics. Cross-system analytics needs a warehouse + BI layer. Don't try to make Salesforce do both.

The most senior framing: operational systems serve transactions; analytics systems serve decisions. Different shapes; different optimizations.

Why this answer works

Senior. The lakehouse / warehouse / mart layered model and the "operational vs analytics" insight are mature.

Follow-ups to expect

Related dictionary terms