Large Data Volumes (LDV) refers to objects that hold millions to hundreds of millions of records. At that scale, standard Apex patterns break.
Read patterns:
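- Selective queries are mandatory. SOQL must filter on indexed fields, and the filter must match only a small fraction of the rows. Salesforce automatically indexes Id, Name, OwnerId, CreatedDate, SystemModstamp, RecordTypeId, lookup and master-detail (foreign key) fields, and fields marked External ID or Unique. Custom indexes on other fields can be requested via Salesforce Support.
- Use `Database.QueryLocator` (returned from a Batch Apex `start()` method) for more than 50k rows. A regular SOQL query is capped at 50,000 rows per transaction, whereas a QueryLocator iterates lazily over up to 50 million rows and doesn't materialise the full result set.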
- Pagination via `LIMIT`/`OFFSET` is an anti-pattern past 2,000 rows: Salesforce has a hard `OFFSET` cap of 2,000. Use last-seen-Id pagination instead:

  ```apex
  List<Account> page = [SELECT Id, Name FROM Account WHERE Id > :lastSeenId ORDER BY Id LIMIT 1000];
  ```
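- Avoid GROUP BY on huge data. Grouped results cannot be paged with `queryMore()` and are capped at 2,000 rows, and the aggregation still has to scan every matching record. Use external warehousing (Snowflake/BigQuery) for analytics over millions of rows.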
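- Skinny Tables: for very high-frequency reads, Salesforce can provision a Skinny Table that copies a chosen subset of an object's standard and custom fields into one narrow table, avoiding the internal join between them. They are available for custom objects and for Account, Contact, Opportunity, Lead, and Case. Request via Support.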
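
The last-seen-Id pagination above can be wrapped in a small helper. This is a minimal sketch, not a definitive implementation: the class name `AccountPager` and its method are hypothetical, and it assumes the caller keeps the cursor (the last Id of the previous page) between requests.

```apex
// Hypothetical helper illustrating last-seen-Id (keyset) pagination.
public with sharing class AccountPager {
    // Returns up to pageSize Accounts whose Id sorts after lastSeenId.
    // Pass null as lastSeenId to fetch the first page.
    public static List<Account> nextPage(Id lastSeenId, Integer pageSize) {
        if (lastSeenId == null) {
            return [SELECT Id, Name FROM Account ORDER BY Id LIMIT :pageSize];
        }
        // Id is indexed, so this filter stays selective however deep the cursor is.
        return [
            SELECT Id, Name
            FROM Account
            WHERE Id > :lastSeenId
            ORDER BY Id
            LIMIT :pageSize
        ];
    }
}
```

A caller would take the last row of each page as the next cursor, for example `Id cursor = page.isEmpty() ? null : page[page.size() - 1].Id;`. Every page fetched in the same transaction still counts toward the 50,000-row query limit, so deep traversals belong in separate requests or in Batch Apex.
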
Write patterns:
- Batch Apex for any bulk operation over 10k records. Each `execute()` chunk is its own transaction.
- Bulk-safe DML: bulkify mercilessly. The 200-record-per-trigger pattern that's a guideline at 100k records is a requirement at 100M.
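- Defer Sharing Calculations during massive ownership changes: suspend them under Setup -> Defer Sharing Calculations, run the load, then resume and recalculate once. The feature may need to be enabled by Salesforce Support first.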
- Avoid Roll-Up Summary fields on LDV objects — they recalculate on every child change. Use periodic batch jobs to refresh aggregate fields instead.
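
The Batch Apex and periodic-refresh advice above can be sketched as a single job. This is an illustrative sketch under stated assumptions: the class name `OpenCaseCountBatch` and the custom field `Open_Case_Count__c` on Account are hypothetical, standing in for whatever aggregate a Roll-Up Summary field would otherwise maintain.

```apex
// Hypothetical batch job that refreshes an aggregate field on Account
// instead of relying on a Roll-Up Summary field.
public class OpenCaseCountBatch implements Database.Batchable<SObject> {

    // QueryLocator streams the parent records lazily across chunks.
    public Database.QueryLocator start(Database.BatchableContext bc) {
        return Database.getQueryLocator('SELECT Id FROM Account');
    }

    // Each call receives one chunk and runs in its own transaction.
    public void execute(Database.BatchableContext bc, List<Account> scope) {
        Map<Id, Integer> openCounts = new Map<Id, Integer>();
        for (AggregateResult ar : [
            SELECT AccountId acctId, COUNT(Id) total
            FROM Case
            WHERE AccountId IN :scope AND IsClosed = false
            GROUP BY AccountId
        ]) {
            openCounts.put((Id) ar.get('acctId'), (Integer) ar.get('total'));
        }

        List<Account> updates = new List<Account>();
        for (Account a : scope) {
            Integer total = openCounts.containsKey(a.Id) ? openCounts.get(a.Id) : 0;
            // Open_Case_Count__c is an assumed custom Number field.
            updates.add(new Account(Id = a.Id, Open_Case_Count__c = total));
        }
        update updates; // one bulk DML statement per chunk
    }

    public void finish(Database.BatchableContext bc) {}
}
```

It could be started with `Database.executeBatch(new OpenCaseCountBatch(), 200);` or put on a schedule. The GROUP BY here is scoped to one chunk of parents, so it returns at most one row per Account in the chunk.
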
Integration patterns:
- Bulk API 2.0 for inbound loads — never single-record REST calls for high volume.
- Change Data Capture for outbound replication — push to Snowflake/BigQuery via middleware.
- External objects via Salesforce Connect for "we need to see it but not store it" cases.
- Big Objects for archive — billions of rows of historical data, queryable by indexed key.
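
The Big Object archive pattern might look like the sketch below. Everything named here is an assumption for illustration: a big object `Case_Archive__b` with fields `Account__c`, `Closed_Date__c`, `Case_Number__c`, and `Subject__c`, and an index defined on `Account__c`, `Closed_Date__c`, `Case_Number__c` in that order.

```apex
// Hypothetical archive service; Case_Archive__b and its fields are assumed.
public with sharing class CaseArchiveService {

    // Copies closed Cases into the big object.
    public static void archive(List<Case> closedCases) {
        List<Case_Archive__b> rows = new List<Case_Archive__b>();
        for (Case c : closedCases) {
            rows.add(new Case_Archive__b(
                Account__c = c.AccountId,
                Closed_Date__c = c.ClosedDate,
                Case_Number__c = c.CaseNumber,
                Subject__c = c.Subject
            ));
        }
        // Big objects are written with insertImmediate rather than standard insert DML.
        Database.insertImmediate(rows);
    }

    // Filters follow the index order: equality on the first index field,
    // then a range on the next one, which is the last field filtered.
    public static List<Case_Archive__b> forAccount(Id accountId, Datetime since) {
        return [
            SELECT Account__c, Closed_Date__c, Case_Number__c, Subject__c
            FROM Case_Archive__b
            WHERE Account__c = :accountId AND Closed_Date__c >= :since
        ];
    }
}
```

Rows that share the same values for every index field overwrite each other, which is why the assumed index ends with the case number.
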
Sharing model:
- Avoid Private OWD on LDV objects if business allows — sharing recalc takes hours.
- Apex Managed Sharing with surgical RowCause — finer-grained than Sharing Rules; lower recalc cost.
- Don't add sharing rules carelessly — each rule increases recalc time.
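
Apex Managed Sharing with a dedicated RowCause can be sketched as follows. The custom object `Invoice__c`, the Apex sharing reason `Regional_Review__c`, and the class name are hypothetical. The sketch also assumes the object's OWD is Private or Public Read Only, since the `Invoice__Share` object only exists in that case.

```apex
// Hypothetical example: grants read access to specific users on specific invoices.
public without sharing class InvoiceSharing {

    // Keys are Invoice__c record Ids, values are the User Ids to share with.
    public static void shareWithUsers(Map<Id, Id> userIdByInvoiceId) {
        List<Invoice__Share> shares = new List<Invoice__Share>();
        for (Id invoiceId : userIdByInvoiceId.keySet()) {
            shares.add(new Invoice__Share(
                ParentId = invoiceId,
                UserOrGroupId = userIdByInvoiceId.get(invoiceId),
                AccessLevel = 'Read',
                // The custom reason keeps these rows separate from manual shares.
                RowCause = Schema.Invoice__Share.RowCause.Regional_Review__c
            ));
        }
        // allOrNone = false: a failed share row does not roll back the rest.
        Database.insert(shares, false);
    }
}
```

Because each row carries its own RowCause, shares created this way can be queried and deleted by reason without touching rows from other sharing mechanisms.
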
Testing:
- Full Sandbox is mandatory for performance testing. Dev sandbox with 100 records doesn't reveal LDV issues.
- Apex tests with 200+ records to confirm bulk safety.
- Production-like data volume testing — load realistic data to a Full sandbox before launch.
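
A bulk-safety test along the lines described above might look like this minimal sketch. The class name is hypothetical, and it assumes Account has no required custom fields or validation rules that would reject the test rows.

```apex
@isTest
private class AccountBulkTest {

    @isTest
    static void insertSpansTwoTriggerChunks() {
        // 201 records forces triggers to run in two chunks (200 + 1).
        List<Account> accounts = new List<Account>();
        for (Integer i = 0; i < 201; i++) {
            accounts.add(new Account(Name = 'Bulk Test ' + i));
        }

        Test.startTest();
        insert accounts; // fails with a limit exception if trigger logic is not bulkified
        Test.stopTest();

        System.assertEquals(
            201,
            [SELECT COUNT() FROM Account WHERE Name LIKE 'Bulk Test %'],
            'All records should be inserted without hitting governor limits'
        );
    }
}
```

A test like this confirms bulk safety only. It says nothing about query selectivity or sharing recalculation time, which still need production-like volumes in a Full sandbox.
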
Monitoring:
- Event Monitoring — slow query logs.
- System Overview — row counts approaching LDV thresholds.
- Alerting when sharing recalc queues build up.
LDV-aware design is architect-level work. Many decisions made early (OWD = Private, Roll-Up Summaries on hot objects) are extremely expensive to reverse at scale. Plan for LDV during initial design, not in remediation.
