10M records is firmly in Batch Apex territory. A single synchronous transaction caps out at 50,000 queried rows and 10,000 DML rows, so you can't process that volume in one transaction.
Architecture:
- Implement `Database.Batchable<sObject>`. Three methods: `start()`, `execute()`, `finish()`.
```apex
global class BulkProcessor implements Database.Batchable<sObject>, Database.Stateful {
    global Integer recordsProcessed = 0;

    global Database.QueryLocator start(Database.BatchableContext bc) {
        return Database.getQueryLocator(
            [SELECT Id, Field__c FROM TargetObject__c WHERE Status__c = 'Pending']
        );
    }

    global void execute(Database.BatchableContext bc, List<TargetObject__c> scope) {
        for (TargetObject__c rec : scope) {
            // process each record
        }
        update scope;
        recordsProcessed += scope.size();
    }

    global void finish(Database.BatchableContext bc) {
        // Send completion email, log results, optionally chain another batch
    }
}
```
- Pick batch size carefully. `Database.executeBatch(new BulkProcessor(), 200)` — 200 is the default; 2,000 is the maximum. Smaller scopes (50, 100) reduce governor pressure per `execute()` but increase the total number of batches.
- Use `Database.QueryLocator` in `start()` for >50k records — it iterates lazily and can feed up to 50 million records into the job, whereas an `Iterable` is capped by the 50,000-row query limit.
- The `Database.Stateful` marker preserves instance variables across `execute()` calls — useful for accumulating totals or referencing a master record set.
- Bulk-pattern within `execute()` — same rules: no SOQL or DML in loops; build collections; one DML per chunk.
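For illustration, here is a minimal sketch of a bulkified `execute()` body. The `Parent__c` lookup is a hypothetical stand-in for your actual per-record work; flipping `Status__c` to 'Processed' ties into the recovery strategy below:

```apex
global void execute(Database.BatchableContext bc, List<TargetObject__c> scope) {
    // Collect lookup Ids for the whole chunk (Parent__c is a hypothetical lookup field)
    Set<Id> parentIds = new Set<Id>();
    for (TargetObject__c rec : scope) {
        parentIds.add(rec.Parent__c);
    }
    // One SOQL query per chunk, never one per record
    Map<Id, Parent__c> parents = new Map<Id, Parent__c>(
        [SELECT Id, Name FROM Parent__c WHERE Id IN :parentIds]
    );
    for (TargetObject__c rec : scope) {
        rec.Field__c = parents.get(rec.Parent__c)?.Name;
        rec.Status__c = 'Processed';
    }
    // One DML statement per chunk
    update scope;
}
```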
- Error handling per-record:
```apex
Database.SaveResult[] results = Database.update(scope, false); // allOrNone=false: failures don't abort the chunk
List<Failed_Record__c> failures = new List<Failed_Record__c>();
for (Integer i = 0; i < results.size(); i++) {
    if (!results[i].isSuccess()) {
        failures.add(new Failed_Record__c(...));
    }
}
if (!failures.isEmpty()) insert failures;
```
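Each failed `SaveResult` exposes `getErrors()`, so the failure row can capture `results[i].getErrors()[0].getMessage()` and the record Id for later reprocessing. (`Failed_Record__c` and its fields are a hypothetical custom object.)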
- Recovery strategy. If a batch fails mid-flight, your `start()` query should be re-runnable — it should target only records that haven't been processed yet (e.g., `WHERE Status__c = 'Pending'`, with `execute()` flipping `Status__c` to 'Processed'). Re-running the job then picks up only the remainder.
- Schedule it. Use `System.schedule()` with a `Schedulable` wrapper (or Setup -> Apex Classes -> Schedule Apex) to fire at off-hours when org load is low.
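A minimal wrapper, assuming the `BulkProcessor` class above (the job name and cron expression are placeholders):

```apex
global class BulkProcessorScheduler implements Schedulable {
    global void execute(SchedulableContext sc) {
        Database.executeBatch(new BulkProcessor(), 200);
    }
}
```

```apex
// Cron fields: seconds minutes hours day-of-month month day-of-week (optional year).
// This fires daily at 2 AM.
System.schedule('Nightly bulk processing', '0 0 2 * * ?', new BulkProcessorScheduler());
```

For a one-off delayed run, `System.scheduleBatch()` skips the wrapper entirely.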
- Monitoring. Implement notifications on success/failure. Log job runs to a custom `Job_Log__c` object.
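One place to wire this up is `finish()`, which can read the job's stats from `AsyncApexJob`; the `Job_Log__c` fields below are hypothetical:

```apex
global void finish(Database.BatchableContext bc) {
    AsyncApexJob job = [
        SELECT Status, NumberOfErrors, JobItemsProcessed, TotalJobItems
        FROM AsyncApexJob
        WHERE Id = :bc.getJobId()
    ];
    // Hypothetical fields on the Job_Log__c custom object
    insert new Job_Log__c(
        Status__c = job.Status,
        Errors__c = job.NumberOfErrors,
        Batches_Processed__c = job.JobItemsProcessed,
        Total_Batches__c = job.TotalJobItems
    );
}
```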
- Test thoroughly. A test class can only simulate one chunk: seed fewer records than the batch scope so `execute()` runs exactly once between `Test.startTest()` and `Test.stopTest()`. For full-scale testing, use a Full Sandbox with realistic data volume.
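A sketch of such a test, assuming `execute()` flips `Status__c` to 'Processed' as described above:

```apex
@isTest
private class BulkProcessorTest {
    @isTest
    static void processesOneChunk() {
        // Seed fewer records than the batch scope so the job fits in a single execute()
        List<TargetObject__c> recs = new List<TargetObject__c>();
        for (Integer i = 0; i < 50; i++) {
            recs.add(new TargetObject__c(Status__c = 'Pending'));
        }
        insert recs;

        Test.startTest();
        Database.executeBatch(new BulkProcessor());
        Test.stopTest(); // the async job completes synchronously here

        // Assumes execute() marks records 'Processed' (see recovery strategy above)
        System.assertEquals(0, [SELECT COUNT() FROM TargetObject__c WHERE Status__c = 'Pending']);
    }
}
```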
Trade-offs / alternatives:
- Bulk API from outside — for one-time loads, consider running the job outside Salesforce via Bulk API 2.0. Same processing, no Apex governor exposure. Good for migrations.
- CDC + external processing — for ongoing streams, Change Data Capture plus a MuleSoft/Snowflake processor scales well beyond any Apex limit.
- Big Object archiving — if you're processing to delete old data, archive to Big Object first.
For 10M records as a one-off, Batch Apex with proper monitoring is the right answer.
