How do you architect logging and monitoring for a Salesforce org?

Salesforce has built-in observability that's adequate for small orgs and inadequate for enterprise.

Built-in:

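Debug Logs: Apex execution detail. Enabled per user through trace flags; logs are size-capped and kept for at most a few days, so they suit troubleshooting rather than monitoring.
Login History: every login attempt. 6-month retention.
Setup Audit Trail: metadata and configuration changes. 6-month retention.
Field History Tracking: changes to tracked fields on records. Retained 18 months in the org, up to 24 months through the API.
Apex Exception Email: unhandled Apex exceptions emailed to the developer who last modified the code and to any configured recipients.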

Beyond built-in:

1. Custom error log object.

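An Error_Log__c object records exceptions, async failures, and integration errors. Persistent and reportable. Log from catch blocks: an exception that escapes uncaught rolls back the whole transaction, including the log insert.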

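```apex
public class Logger {
    // Persists one Error_Log__c row for a caught exception.
    // Caveat: the insert is part of the current transaction, so it is
    // rolled back if that transaction fails afterwards.
    public static void error(String className, String method, Exception e) {
        Error_Log__c log = new Error_Log__c(
            Class__c = className,
            Method__c = method,
            Message__c = e.getMessage(),
            Stack__c = e.getStackTraceString(),
            Timestamp__c = DateTime.now(),
            User__c = UserInfo.getUserId()
        );
        insert log;
    }
}
```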
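When a log has to survive a rollback, one common variant routes it through a platform event instead of a direct insert. The following is a minimal sketch, not the original design: it assumes a hypothetical platform event `Error_Log_Event__e` whose publish behavior is set to Publish Immediately and whose custom fields mirror `Error_Log__c`. The event, class, and trigger names are illustrative.

```apex
// EventLogger.cls
// Assumes a hypothetical platform event Error_Log_Event__e with
// Publish Behavior = "Publish Immediately", so the event is sent even if
// the calling transaction rolls back afterwards.
public class EventLogger {
    public static void error(String className, String method, Exception e) {
        Error_Log_Event__e evt = new Error_Log_Event__e(
            Class__c = className,
            Method__c = method,
            Message__c = e.getMessage(),
            Stack__c = e.getStackTraceString()
        );
        Database.SaveResult result = EventBus.publish(evt);
        if (!result.isSuccess()) {
            // Publishing failed; fall back to the debug log.
            System.debug(LoggingLevel.ERROR, 'Log event not published: ' + result.getErrors());
        }
    }
}

// ErrorLogEventTrigger.trigger
// Runs in its own transaction and persists the events as Error_Log__c rows.
trigger ErrorLogEventTrigger on Error_Log_Event__e (after insert) {
    List<Error_Log__c> logs = new List<Error_Log__c>();
    for (Error_Log_Event__e evt : Trigger.new) {
        logs.add(new Error_Log__c(
            Class__c = evt.Class__c,
            Method__c = evt.Method__c,
            Message__c = evt.Message__c,
            Stack__c = evt.Stack__c,
            Timestamp__c = evt.CreatedDate,
            User__c = evt.CreatedById   // the user who published the event
        ));
    }
    insert logs;
}
```

The trade-off is that the row appears asynchronously, a moment after the failure, and published events count against the org's platform event allocations.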

2. Event Monitoring (Shield).

Detailed runtime events: API calls, logins, report exports, file downloads, slow queries.
Delivered as event log files, hourly or daily.
Stream to S3 / SIEM.
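Event log files surface as `EventLogFile` records, which can be listed with SOQL before an export job picks them up. A sketch in anonymous Apex, assuming Event Monitoring is enabled and the running user is allowed to view event log files; the three event types are chosen only for illustration.

```apex
// List yesterday's event log files for a few event types.
List<EventLogFile> files = [
    SELECT Id, EventType, LogDate, LogFileLength, Interval
    FROM EventLogFile
    WHERE LogDate = YESTERDAY
      AND EventType IN ('API', 'Login', 'ReportExport')
    ORDER BY EventType
];
for (EventLogFile f : files) {
    // Interval distinguishes hourly from daily files.
    System.debug(f.EventType + ' ' + f.Interval + ' ' + f.LogFileLength + ' bytes');
}
```

The CSV content itself sits in the `LogFile` field; an external exporter would typically download it through the REST API and forward it to S3 or the SIEM.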

3. External SIEM integration.

Splunk / Datadog / Sumo Logic.
Centralised observability across systems.
Alerting on anomalies.

4. Application-level metrics.

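Custom counters via platform events or a custom object. Custom Metadata fits thresholds and configuration rather than counters, because Apex cannot update it with ordinary DML.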
Track user journeys, feature usage, performance.
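As an illustration of the platform-event route, the sketch below buffers data points and publishes them in one call. It assumes a hypothetical platform event `App_Metric__e` with `Name__c` (Text) and `Value__c` (Number) fields; the class name and metric names are placeholders.

```apex
// Assumes a hypothetical platform event App_Metric__e with
// Name__c (Text) and Value__c (Number) fields.
public class Metrics {
    private static List<App_Metric__e> buffer = new List<App_Metric__e>();

    // Queue one data point in memory.
    public static void track(String name, Decimal value) {
        buffer.add(new App_Metric__e(Name__c = name, Value__c = value));
    }

    // Publish everything queued so far in a single call.
    public static void flush() {
        if (buffer.isEmpty()) {
            return;
        }
        EventBus.publish(buffer);
        buffer.clear();
    }
}
```

A feature could then report usage and timing like this:

```apex
Long started = System.currentTimeMillis();
// ... feature code being measured ...
Metrics.track('quote.generate.ms', System.currentTimeMillis() - started);
Metrics.track('quote.generated', 1);
Metrics.flush();
```

Buffering keeps the number of publish calls per transaction low; a subscriber (a trigger, a flow, or an external consumer) would aggregate the events.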

5. Integration health dashboards.

Per integration: success rate, latency, error rate.
Alert on threshold crossing.
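The per-integration numbers reduce to a small calculation over recent calls. A minimal sketch, assuming the samples have already been loaded from wherever call outcomes are logged; the class, field, and threshold names are hypothetical.

```apex
public class IntegrationHealth {
    // One observed call to an integration.
    public class Sample {
        public Boolean success;
        public Integer latencyMs;
        public Sample(Boolean success, Integer latencyMs) {
            this.success = success;
            this.latencyMs = latencyMs;
        }
    }

    public class Summary {
        public Decimal successRate;   // 0.0 to 1.0
        public Decimal errorRate;     // 1 - successRate
        public Decimal avgLatencyMs;
        public Boolean breached;      // true when a threshold is crossed
    }

    public static Summary summarize(List<Sample> samples,
                                    Decimal minSuccessRate,
                                    Decimal maxAvgLatencyMs) {
        Summary s = new Summary();
        s.breached = false;
        if (samples.isEmpty()) {
            return s; // no traffic: rates stay null and nothing is flagged
        }
        Integer ok = 0;
        Long totalLatency = 0;
        for (Sample x : samples) {
            if (x.success) {
                ok++;
            }
            totalLatency += x.latencyMs;
        }
        Decimal total = samples.size();
        s.successRate = ok / total;
        s.errorRate = 1 - s.successRate;
        s.avgLatencyMs = totalLatency / total;
        s.breached = s.successRate < minSuccessRate
            || s.avgLatencyMs > maxAvgLatencyMs;
        return s;
    }
}
```

Whether an empty window should itself raise an alert (an integration that has gone silent) is a design choice this sketch leaves open.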

6. Synthetic monitoring.

Periodic test transactions exercising critical paths.
Catch outages before users notice.
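A periodic probe can be sketched with scheduled Apex handing off to a Queueable that makes the callout. This assumes a hypothetical Named Credential `Order_API` exposing a `/health` endpoint and reuses the `Logger` class from the custom error log section; all other names are placeholders.

```apex
// Assumes a hypothetical Named Credential "Order_API" with a /health endpoint.
public class SyntheticCheck implements Schedulable {
    public class ProbeException extends Exception {}

    // Scheduled Apex cannot make callouts directly, so hand off to a Queueable.
    public void execute(SchedulableContext ctx) {
        System.enqueueJob(new Probe());
    }

    public class Probe implements Queueable, Database.AllowsCallouts {
        public void execute(QueueableContext ctx) {
            HttpRequest req = new HttpRequest();
            req.setEndpoint('callout:Order_API/health');
            req.setMethod('GET');
            req.setTimeout(10000); // milliseconds
            try {
                HttpResponse res = new Http().send(req);
                if (res.getStatusCode() != 200) {
                    throw new ProbeException('Health check returned ' + res.getStatusCode());
                }
            } catch (Exception e) {
                // Record the failure so it is reportable and can drive alerts.
                Logger.error('SyntheticCheck', 'Probe.execute', e);
            }
        }
    }
}
```

It could be scheduled hourly from anonymous Apex:

```apex
System.schedule('Synthetic check (hourly)', '0 0 * * * ?', new SyntheticCheck());
```

A probe that runs inside Salesforce cannot report that Salesforce itself is unavailable, so checks of the org's own availability belong in an external tool.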

7. Real-user monitoring.

Lightning page render time tracking.
LWC component performance.

Alerting:

Critical alerts -> pager / Slack / SMS.
Important alerts -> email digest.
Informational -> dashboard.

Don't alert on everything; people stop responding.
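The tiers above amount to a routing table keyed by severity. A minimal sketch, assuming critical alerts fan out to all three channels; the channel names are placeholders for whatever notification integrations exist.

```apex
public class AlertRouter {
    public enum Severity { CRITICAL, IMPORTANT, INFORMATIONAL }

    // Returns the channels an alert of the given severity should reach.
    public static List<String> channelsFor(Severity level) {
        List<String> channels;
        switch on level {
            when CRITICAL {
                channels = new List<String>{ 'pager', 'slack', 'sms' };
            }
            when IMPORTANT {
                channels = new List<String>{ 'email_digest' };
            }
            when else {
                // Informational: visible on a dashboard, nobody is interrupted.
                channels = new List<String>{ 'dashboard' };
            }
        }
        return channels;
    }
}
```

Keeping the mapping in one place makes it easy to review how many conditions can actually page someone.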

Architecture pattern:

Salesforce
  -> Event Monitoring stream -> MuleSoft/Kafka -> Splunk/Datadog
  -> Custom Error_Log__c -> dashboard/report
  -> Synthetic checks -> alerting

Pitfalls:

No central observability: issues hide in separate tools and nobody sees the whole picture.
Alert fatigue: too many alerts, so the important ones get ignored.
No retention strategy: logs pile up and storage costs grow.
Ignoring built-in tools: admins reinvent what the platform already provides.
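For the retention pitfall, a scheduled purge of the custom log object is one option. A sketch using Batch Apex against the `Error_Log__c` object and its `Timestamp__c` field; the class name and the 90-day figure in the usage line are assumptions, not recommendations.

```apex
public class ErrorLogPurge implements Database.Batchable<SObject> {
    private Integer retentionDays;

    public ErrorLogPurge(Integer retentionDays) {
        this.retentionDays = retentionDays;
    }

    // Select every log row older than the retention window.
    public Database.QueryLocator start(Database.BatchableContext bc) {
        DateTime cutoff = DateTime.now().addDays(-retentionDays);
        return Database.getQueryLocator(
            [SELECT Id FROM Error_Log__c WHERE Timestamp__c < :cutoff]
        );
    }

    // Delete one chunk and remove it from the Recycle Bin as well.
    public void execute(Database.BatchableContext bc, List<Error_Log__c> scope) {
        delete scope;
        Database.emptyRecycleBin(scope);
    }

    public void finish(Database.BatchableContext bc) {}
}
```

It could be started with `Database.executeBatch(new ErrorLogPurge(90));`. If older logs are still needed for audits, exporting them to the SIEM or cheaper storage before deletion keeps the history without the storage cost in the org.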

Architect role: define observability strategy from day one. Retrofitting is much harder than building in.

The senior insight: you can't fix what you can't see. Investment in observability pays back through faster incident resolution and avoided incidents.
