Sub-second latency means a cross-system operation completes in under one second end to end, fast enough to feel immediate to the user.
Inbound (external -> Salesforce):
- Apex REST endpoint with optimised processing.
- Webhooks with immediate ACK and async processing of heavy work.
- Pub/Sub API client on the external side, publishing events into Salesforce.
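The "immediate ACK, async processing" webhook pattern above can be sketched as follows. This is a minimal illustration in Python rather than Apex, assuming an in-process queue and a single worker thread; the names `make_webhook_handler`, `worker`, and `run_demo` are hypothetical, and a real receiver would likely use a durable queue.

```python
import queue
import threading

def make_webhook_handler(work_queue):
    """Build a handler that only enqueues the payload and acknowledges."""
    def handle_webhook(payload):
        work_queue.put(payload)   # no heavy work on the request path
        return 202                # HTTP 202 Accepted: received, not yet processed
    return handle_webhook

def worker(work_queue, processed):
    """Drain the queue in the background; None is the stop sentinel."""
    while True:
        item = work_queue.get()
        if item is None:
            break
        processed.append(item["id"])  # stand-in for the heavy processing

def run_demo():
    work_queue = queue.Queue()
    processed = []
    handle = make_webhook_handler(work_queue)
    thread = threading.Thread(target=worker, args=(work_queue, processed))
    thread.start()
    statuses = [handle({"id": "evt-%d" % i}) for i in range(3)]
    work_queue.put(None)          # ask the worker to stop after draining
    thread.join()
    return statuses, processed
```

The sender's observed latency is only the cost of the enqueue, independent of how long the heavy work takes.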
Outbound (Salesforce -> external):
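- Apex callout from a record-triggered flow (after-save): the callout has to run on the flow's asynchronous path, because a callout cannot be made in the same transaction as the uncommitted save. It is direct but governor-limited.
- Platform Events: Salesforce publishes an event and the external system subscribes via Pub/Sub API. Delivery to the subscriber is sub-second, but the external system is only eventually consistent with Salesforce.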
- Streaming API for immediate notification.
Architectural patterns:
1. Cache the external data in Salesforce.
If sub-second display matters but the external system is too slow:
- Sync periodically from the external system into Salesforce.
- Display the Salesforce-cached version to the user.
- Refresh in the background.
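A minimal sketch of this cache-then-refresh idea, written in Python for illustration. The class `CachedValue`, its `ttl_seconds` parameter, and the injected `clock` are assumptions of this sketch, not Salesforce APIs; in Salesforce the cached copy would typically live in a custom object or Platform Cache.

```python
import time

class CachedValue:
    """Serve a local copy of slow external data and report whether it is stale."""

    def __init__(self, fetch, ttl_seconds, clock=time.monotonic):
        self._fetch = fetch          # callable that hits the slow external system
        self._ttl = ttl_seconds
        self._clock = clock
        self._value = None
        self._fetched_at = None

    def refresh(self):
        """Periodic or background sync step: pull a fresh copy."""
        self._value = self._fetch()
        self._fetched_at = self._clock()

    def get(self):
        """Return (value, is_stale) without waiting on the external system."""
        if self._fetched_at is None:
            self.refresh()           # nothing cached yet, so the first read must fetch
        is_stale = (self._clock() - self._fetched_at) > self._ttl
        return self._value, is_stale
```

A caller can show the cached value at once and, when `is_stale` is true, schedule `refresh()` in the background instead of making the user wait.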
2. Pre-fetch.
When the user is likely to need external data, fetch it in advance during navigation, so that it is ready by the time they ask for it.
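Pre-fetching can be sketched with a future that is started on navigation and consumed later. This is an illustrative Python sketch; `fetch_credit_score`, the account id, the returned score, and the 0.2 s delay are all made-up stand-ins for a real external call.

```python
from concurrent.futures import ThreadPoolExecutor
import time

_executor = ThreadPoolExecutor(max_workers=2)

def fetch_credit_score(account_id):
    """Hypothetical slow external call, simulated with a sleep."""
    time.sleep(0.2)
    return {"account": account_id, "score": 712}

def on_navigate_to_account(account_id):
    """Start the fetch when the record opens, before the data is requested."""
    return _executor.submit(fetch_credit_score, account_id)

def on_open_credit_panel(prefetched):
    """Usually the future is already done; result() waits only for any remainder."""
    return prefetched.result(timeout=5)
```

The trade-off is wasted calls when the user never opens the panel, so this suits data with a high chance of being needed.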
3. Concurrent / parallel calls.
If a user action needs three external calls, fire all three in parallel. Total latency is then the maximum of the three, not the sum.
Tools: Apex Continuation (limited to three callouts per continuation), or LWC-side Promise.all over several Apex methods called simultaneously.
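The max-versus-sum claim can be checked with a small sketch. It uses Python's `asyncio.gather` as an analogue of `Promise.all`; the call names and the simulated latencies (0.10 s, 0.20 s, 0.30 s) are illustrative assumptions.

```python
import asyncio
import time

async def call_external(name, latency_s):
    """Stand-in for an HTTP call that takes latency_s seconds."""
    await asyncio.sleep(latency_s)
    return name

async def fetch_all():
    start = time.perf_counter()
    # All three calls are in flight at once; gather preserves argument order.
    results = await asyncio.gather(
        call_external("inventory", 0.10),
        call_external("pricing", 0.20),
        call_external("credit", 0.30),
    )
    return results, time.perf_counter() - start

def run_demo():
    return asyncio.run(fetch_all())
```

With these numbers, sequential calls would take about 0.60 s, while the parallel version finishes in roughly 0.30 s, the latency of the slowest call.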
4. Skinny payload.
Minimise data transferred:
- Only fetch fields actually needed.
- Compress where possible.
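- Use binary protocols (gRPC) where supported. Apex callouts are HTTP-based, so this applies mainly to clients outside Apex, such as Pub/Sub API subscribers.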
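The first two points, field selection and compression, can be sketched as follows. This is an illustrative Python sketch; the helper names `skinny` and `encode` and the sample record are assumptions.

```python
import gzip
import json

def skinny(record, fields):
    """Keep only the fields the caller actually needs."""
    return {key: record[key] for key in fields if key in record}

def encode(payload):
    """Serialise as compact JSON, then gzip the bytes."""
    raw = json.dumps(payload, separators=(",", ":")).encode("utf-8")
    return gzip.compress(raw)

def decode(blob):
    """Inverse of encode, for the receiving side."""
    return json.loads(gzip.decompress(blob).decode("utf-8"))

full_record = {
    "sku": "A-100",
    "qty": 42,
    "description": "Long marketing copy that the quote screen never shows. " * 20,
    "audit_trail": ["created", "updated", "reviewed"] * 10,
}
slim_record = skinny(full_record, ["sku", "qty"])
```

Compression pays off on larger, repetitive payloads; for a payload of a few dozen bytes the gzip header can outweigh the savings, so field selection is usually the first lever.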
5. Optimised middleware.
MuleSoft or any other iPaaS layer adds a network hop and processing latency. For sub-second targets, a direct Salesforce-to-system call is sometimes faster than going through middleware.
6. CDN / edge caching.
For static or semi-static external data, cache at edge for fastest response.
Performance considerations:
- Salesforce CPU time limit: 10 s per synchronous transaction, 60 s per asynchronous transaction.
- Callout timeout: 10 s by default, configurable up to 120 s.
- Network round-trip depends on data center proximity.
- External system response time is often the bottleneck.
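One way to reason about these factors is to add them up against a one-second budget. The sketch below is plain arithmetic in Python; the step names and millisecond figures are invented for illustration and are not measurements.

```python
BUDGET_MS = 1000  # sub-second target

def remaining_budget(steps_ms, budget_ms=BUDGET_MS):
    """Milliseconds left after all steps; negative means the budget is blown."""
    return budget_ms - sum(steps_ms.values())

def slowest_step(steps_ms):
    """Name of the step consuming the most time."""
    return max(steps_ms, key=steps_ms.get)

# Hypothetical breakdown of one request.
example_steps = {
    "salesforce_processing": 150,
    "network_round_trip": 120,
    "external_system": 600,
}
```

Here 150 + 120 + 600 = 870 ms, which leaves 130 ms of headroom, and the external system is the step to optimise first.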
When sub-second is required vs nice-to-have:
- Required: real-time inventory check during quote, fraud detection, payment authorisation.
- Nice-to-have: customer 360 panels, related insights.
- Not required: reports, dashboards, analytics.
Don't optimise for sub-second when the use case doesn't demand it. Performance optimisations cost engineering time and add complexity.
Monitoring:
- Track P50, P95, P99 latency for each integration.
- Alert when crossing thresholds.
- Identify which step is slow (Salesforce side, network, external side).
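A minimal sketch of the percentile tracking and threshold alert, in Python. It assumes the nearest-rank percentile definition and an alert on P95 only; the function names and the 500 ms threshold are illustrative choices.

```python
import math

def percentile(samples, p):
    """Nearest-rank percentile: smallest sample with at least p% of samples at or below it."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]

def latency_report(samples_ms, p95_threshold_ms):
    """Summarise P50/P95/P99 and flag when P95 crosses the threshold."""
    report = {"p%d" % p: percentile(samples_ms, p) for p in (50, 95, 99)}
    report["alert"] = report["p95"] > p95_threshold_ms
    return report

# 94 fast requests and 6 slow ones: the median looks healthy, the tail does not.
example_samples_ms = [100] * 94 + [900] * 6
```

For `example_samples_ms` with a 500 ms threshold, P50 is 100 ms while P95 and P99 are 900 ms, so the alert fires even though the typical request is fast. This is why tail percentiles matter more than averages.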
Senior architect insight: sub-second latency is rare and expensive. Most "real-time" requirements turn out to be "few-second" requirements when probed, so probe before committing to an expensive sub-second architecture.
The senior framing: latency is a cost; budget it deliberately. Don't pay for sub-second when a few seconds is fine.
