
Batch Apex


§ 01

Definition

Batch Apex is a Salesforce execution context that processes large data sets by splitting work into manageable chunks (batches), each running as its own transaction with its own governor limits. It is the right tool when a job needs to read or update millions of records, perform long-running aggregations, or chain async work across multiple steps. The platform handles chunking and queuing, leaving the developer to focus on the per-chunk business logic. It does not retry failed chunks automatically, so error handling remains the developer's responsibility.

A Batch Apex class implements the Database.Batchable<T> interface with three methods: start (returns a query locator or iterable identifying the records to process), execute (runs on each chunk with the records as input), and finish (runs once when all chunks complete). The platform splits the start query result into batches of 200 records by default, and each chunk runs in its own transaction. This pattern lets a single batch job process 50 million records without hitting any single-transaction limit, at the cost of slower wall-clock time compared to synchronous processing.

§ 02

How Batch Apex handles large-scale processing

The three methods: start, execute, finish

Every Batch Apex class implements three methods. Start returns a Database.QueryLocator (for SOQL-based jobs) or an Iterable<T> (for complex data sources). Execute receives a chunk of records (List<sObject> by default) and performs the business logic. Finish runs once after the last chunk and is the right place to send notifications, kick off the next batch job, or log results. Each method has access to a Database.BatchableContext parameter that provides the job ID for tracking and logging.
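The three methods can be sketched as a minimal class. This is an illustration rather than a reference implementation: the class name StaleAccountBatch and the use of the Description field are assumptions made for the example, and Stale__c is a custom field that would have to exist in the org.

```apex
// Minimal sketch of the start / execute / finish structure.
// Assumptions: StaleAccountBatch is a hypothetical name and Stale__c
// is a custom checkbox field on Account.
public with sharing class StaleAccountBatch implements Database.Batchable<sObject> {

    // start: runs once and defines the full set of records for the job.
    public Database.QueryLocator start(Database.BatchableContext bc) {
        return Database.getQueryLocator(
            'SELECT Id, Name FROM Account WHERE Stale__c = true'
        );
    }

    // execute: runs once per chunk, each time in its own transaction.
    public void execute(Database.BatchableContext bc, List<sObject> scope) {
        List<Account> toUpdate = new List<Account>();
        for (sObject record : scope) {
            Account acct = (Account) record;
            acct.Description = 'Reviewed by batch job ' + bc.getJobId();
            toUpdate.add(acct);
        }
        update toUpdate; // one DML statement for the whole chunk
    }

    // finish: runs once after the last chunk has been processed.
    public void finish(Database.BatchableContext bc) {
        System.debug('Batch job ' + bc.getJobId() + ' finished.');
    }
}
```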

Scope size and chunk tuning

The default scope size is 200 records per execute invocation. For most jobs this is the right balance between transaction efficiency and governor limit headroom. Smaller scope (50, 100) helps when each record requires heavy processing or many SOQL queries. Larger scope (up to 2,000 for QueryLocator-based jobs) helps when per-record processing is light and start query latency dominates. Tune scope size in sandbox with realistic data volumes; the right value depends on the work being done per record.
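Scope size is passed as the optional second argument to Database.executeBatch. The Anonymous Apex lines below are a sketch of the three cases described above; MyBatchJob stands for any hypothetical class that implements Database.Batchable<sObject>, and the values 50 and 2000 are illustrative starting points, not recommendations.

```apex
// MyBatchJob is a hypothetical batch class; substitute your own.

// Heavy per-record work: a small chunk leaves governor-limit headroom.
Id heavyJobId = Database.executeBatch(new MyBatchJob(), 50);

// Light per-record work on a QueryLocator job: larger chunks mean fewer transactions.
Id lightJobId = Database.executeBatch(new MyBatchJob(), 2000);

// Omitting the second argument uses the default scope of 200.
Id defaultJobId = Database.executeBatch(new MyBatchJob());
```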

Database.Stateful and state preservation across chunks

By default, changes to instance variables on the batch class are not preserved across chunks. Each execute call starts from the state the class had when the job was submitted. Implementing the Database.Stateful interface preserves instance state so totals, counters, and accumulated results carry from one chunk to the next (static variables are never preserved). This is the standard pattern for jobs that compute aggregates across the full data set, such as the total number of records updated or a list of records that failed validation. Without Stateful, finish sees none of the values that execute assigned.
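A running counter is the simplest illustration of the pattern. In this sketch the class name AccountCountBatch is hypothetical; removing Database.Stateful from the declaration would cause finish to report zero.

```apex
// Sketch: counting records across chunks with Database.Stateful.
// AccountCountBatch is a hypothetical name used for illustration.
public with sharing class AccountCountBatch
        implements Database.Batchable<sObject>, Database.Stateful {

    // Carried from one execute call to the next only because the class is Stateful.
    private Integer recordsProcessed = 0;

    public Database.QueryLocator start(Database.BatchableContext bc) {
        return Database.getQueryLocator('SELECT Id FROM Account');
    }

    public void execute(Database.BatchableContext bc, List<sObject> scope) {
        recordsProcessed += scope.size();
    }

    public void finish(Database.BatchableContext bc) {
        System.debug('Total records processed: ' + recordsProcessed);
    }
}
```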

Governor limits per chunk versus total job

Each chunk runs in its own transaction with the asynchronous governor limits (200 SOQL queries, 150 DML statements, 60,000 ms CPU time). The total job can process arbitrary record counts because each chunk resets the counters. This is the foundational reason Batch Apex exists: it sidesteps single-transaction limits by splitting work into smaller transactions. However, org-wide limits still apply across the whole job: every execute invocation counts toward the daily asynchronous Apex execution allowance, only five batch jobs can be queued or active at once, and up to 100 more can wait in the Apex flex queue. Long-running jobs with small scope sizes can still hit these.
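One way to see how much headroom a chunk has is to log consumption from inside execute using the standard Limits class. The sketch below assumes a hypothetical class name, LimitLoggingBatch, and leaves the real per-chunk work as a placeholder comment.

```apex
// Sketch: logging per-chunk governor limit consumption.
// LimitLoggingBatch is a hypothetical name; the business logic is elided.
public with sharing class LimitLoggingBatch implements Database.Batchable<sObject> {

    public Database.QueryLocator start(Database.BatchableContext bc) {
        return Database.getQueryLocator('SELECT Id FROM Account');
    }

    public void execute(Database.BatchableContext bc, List<sObject> scope) {
        // ... per-chunk business logic would go here ...

        // The counters below cover this transaction only and reset for the next chunk.
        System.debug('SOQL queries: ' + Limits.getQueries()
            + ' of ' + Limits.getLimitQueries());
        System.debug('DML statements: ' + Limits.getDmlStatements()
            + ' of ' + Limits.getLimitDmlStatements());
        System.debug('CPU time (ms): ' + Limits.getCpuTime()
            + ' of ' + Limits.getLimitCpuTime());
    }

    public void finish(Database.BatchableContext bc) {
        System.debug('Job ' + bc.getJobId() + ' complete.');
    }
}
```

Running this once at the intended scope size in a sandbox gives a rough basis for deciding whether the scope can be raised or should be lowered.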

Job tracking via AsyncApexJob

Every batch job execution creates an AsyncApexJob record with a unique job ID, status, progress counters, completion time, and error count. Query AsyncApexJob in SOQL to monitor jobs programmatically. Setup > Apex Jobs shows the same data in the UI. Build a dashboard that surfaces failed or stalled batch jobs: a production batch job that fails silently leaves records partially processed and totals out of sync, and the damage grows with every run until someone notices.
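A monitoring query might look like the following Anonymous Apex sketch. The seven-day window and the 50-row cap are arbitrary choices for the example.

```apex
// Sketch: find recent batch jobs that failed or reported chunk errors.
List<AsyncApexJob> troubledJobs = [
    SELECT Id, ApexClass.Name, Status, JobItemsProcessed, TotalJobItems,
           NumberOfErrors, ExtendedStatus, CompletedDate
    FROM AsyncApexJob
    WHERE JobType = 'BatchApex'
      AND (Status = 'Failed' OR NumberOfErrors > 0)
      AND CreatedDate = LAST_N_DAYS:7
    ORDER BY CreatedDate DESC
    LIMIT 50
];

for (AsyncApexJob job : troubledJobs) {
    System.debug(job.ApexClass.Name + ' [' + job.Status + '] '
        + job.NumberOfErrors + ' failed of ' + job.TotalJobItems + ' chunks: '
        + job.ExtendedStatus);
}
```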

Chaining batch jobs and recursive scheduling

The finish method can kick off the next batch job by calling Database.executeBatch on a new batch class. This pattern chains a sequence of batch jobs into a multi-step pipeline (Cleanup, Enrich, Notify). Recursive scheduling means a batch job can re-schedule itself with updated criteria. Both patterns are powerful but can produce infinite loops if termination criteria are wrong. Always include explicit stop conditions and monitor the AsyncApexJob queue for runaway chains.
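A self-chaining job with an explicit stop condition could be sketched as below. The class name CleanupBatch, the MAX_RUNS cap of 5, and the Stale__c custom field are assumptions for illustration; the point is that finish launches another run only when work remains and the cap has not been reached.

```apex
// Sketch: a batch job that re-runs itself from finish, with two stop conditions.
// CleanupBatch, MAX_RUNS, and Stale__c are hypothetical.
public with sharing class CleanupBatch implements Database.Batchable<sObject> {

    private static final Integer MAX_RUNS = 5; // hard cap against runaway chains
    private final Integer runNumber;           // set before the job is submitted

    public CleanupBatch() {
        this(1);
    }

    public CleanupBatch(Integer runNumber) {
        this.runNumber = runNumber;
    }

    public Database.QueryLocator start(Database.BatchableContext bc) {
        return Database.getQueryLocator('SELECT Id FROM Account WHERE Stale__c = true');
    }

    public void execute(Database.BatchableContext bc, List<sObject> scope) {
        List<Account> toUpdate = new List<Account>();
        for (sObject record : scope) {
            Account acct = (Account) record;
            acct.Stale__c = false;
            toUpdate.add(acct);
        }
        update toUpdate;
    }

    public void finish(Database.BatchableContext bc) {
        // Stop condition 1: nothing left to process.
        List<Account> leftovers = [SELECT Id FROM Account WHERE Stale__c = true LIMIT 1];
        // Stop condition 2: the chain has reached its maximum length.
        if (!leftovers.isEmpty() && runNumber < MAX_RUNS) {
            Database.executeBatch(new CleanupBatch(runNumber + 1));
        }
    }
}
```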

Batch Apex versus Queueable Apex versus Schedulable

Batch Apex handles large data sets with automatic chunking. Queueable Apex handles smaller async work with rich data structures and chaining (no chunking, no QueryLocator restrictions, easier to pass complex inputs). Schedulable Apex runs on a cron-style schedule; it usually delegates to a batch or queueable job for the actual work. The three execution contexts compose: a scheduled Schedulable job kicks off a Batch Apex chain, which in turn enqueues Queueable jobs for follow-up work. Most production async pipelines use all three.
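The Schedulable-to-Batch hand-off is usually only a few lines. In this sketch NightlyBatchScheduler and MyBatchJob are hypothetical class names, and the cron expression in the comment (2:00 AM every day) is just an example.

```apex
// Sketch: a Schedulable class that delegates the real work to a batch job.
// NightlyBatchScheduler and MyBatchJob are hypothetical names.
public with sharing class NightlyBatchScheduler implements Schedulable {

    public void execute(SchedulableContext sc) {
        Database.executeBatch(new MyBatchJob(), 200);
    }
}

// To register the schedule, run once from Anonymous Apex:
// System.schedule('Nightly batch run', '0 0 2 * * ?', new NightlyBatchScheduler());
```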

§ 03

How to write a Batch Apex job

Batch Apex is the standard way to process large data volumes on Salesforce. The structure is simple (three methods, one interface) but the design choices matter: scope size, state preservation, error handling, and chaining all need deliberate decisions. Build and tune in a sandbox with realistic data volumes before running in production.

  1. Confirm Batch Apex is the right tool

    Use Batch Apex when you need to process more records than a synchronous transaction can handle. For smaller async work with complex data, Queueable Apex is simpler. For cron-based scheduling, use Schedulable Apex that delegates to a batch job for the actual work.

  2. Create the class with Database.Batchable interface

    public with sharing class MyBatchJob implements Database.Batchable<sObject> { ... }. Implement start, execute, and finish methods. Use sObject as the generic type for most jobs; use a custom type for non-SObject iterables.

  3. Implement start to return the records to process

    Return a Database.QueryLocator for SOQL-based selection: return Database.getQueryLocator('SELECT Id, Name FROM Account WHERE Stale__c = true'). QueryLocator supports up to 50 million records. For custom data sources, return an Iterable<sObject> instead; any SOQL it runs remains subject to the normal governor limit on total records retrieved, so it scales far less.

  4. Implement execute with per-chunk business logic

    Receive List<sObject> records as the second parameter. Process them with normal Apex logic: SOQL queries, DML operations, callouts (if Database.AllowsCallouts is also implemented). Each chunk has its own governor limits, so bulkify carefully within the chunk.

  5. Implement finish for post-job cleanup

    Send completion notifications, log job summary to a custom Error_Log__c, or chain the next batch job via Database.executeBatch. Use the Database.BatchableContext parameter to get the job ID for queries against AsyncApexJob.

  6. Add Database.Stateful if aggregation is needed

    Declare the class as implements Database.Batchable<sObject>, Database.Stateful. Instance variables are then preserved across chunks. Use this for running totals, accumulated error lists, or any state that must survive from execute to finish.

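  7. Write a test class sized to a single chunk

    Write an @isTest class whose test methods build up to one chunk of records (200 at the default scope size) and call Database.executeBatch between Test.startTest and Test.stopTest. A test method can run only one execute invocation, so the start method must not return more records than the scope size. Confirm that execute processed the records and that finish ran once. Cover failure paths and edge cases such as an empty result set.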

  8. Schedule or invoke and monitor the job

    Run from Anonymous Apex: Database.executeBatch(new MyBatchJob(), 200). Schedule via Schedulable Apex for recurring execution. Monitor Setup > Apex Jobs for status. Build a dashboard against AsyncApexJob for production visibility.
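The testing step above can be sketched as follows. The sketch assumes that MyBatchJob selects Accounts where the custom field Stale__c is true, as in the start example from step 3, and that the org has no validation or duplicate rules that would block the test inserts. It asserts only on the AsyncApexJob record, since the job's business outcome is not specified here.

```apex
// Sketch: a test that runs one chunk through a batch job.
// Assumptions: MyBatchJob queries Accounts with Stale__c = true;
// Stale__c is a custom checkbox field on Account.
@isTest
private class MyBatchJobTest {

    @isTest
    static void runsSingleChunkWithoutErrors() {
        // Keep the record count within one scope so execute runs only once.
        List<Account> accounts = new List<Account>();
        for (Integer i = 0; i < 200; i++) {
            accounts.add(new Account(Name = 'Test Account ' + i, Stale__c = true));
        }
        insert accounts;

        Test.startTest();
        Id jobId = Database.executeBatch(new MyBatchJob(), 200);
        Test.stopTest(); // the queued job runs to completion here

        AsyncApexJob job = [
            SELECT Status, NumberOfErrors
            FROM AsyncApexJob
            WHERE Id = :jobId
        ];
        System.assertEquals('Completed', job.Status);
        System.assertEquals(0, job.NumberOfErrors);
    }
}
```

A fuller test would also re-query the Accounts after Test.stopTest and assert on whatever fields the job is supposed to change.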

Key options
Scope Size

Records per execute invocation. Default 200, up to 2,000 for QueryLocator jobs. Tune based on per-record work and governor limit headroom.

Database.Stateful

Optional interface for preserving instance state across chunks. Required for cross-chunk aggregation and accumulated results.

Database.AllowsCallouts

Optional interface that lets execute methods make HTTP callouts to external systems. Required for jobs that need external integration.
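As an illustration of the Database.AllowsCallouts option, the sketch below makes one HTTP request per record. The class name AccountSyncBatch, the named credential Example_Service, and the URL path are all hypothetical. Because each transaction allows a limited number of callouts, a job shaped like this needs a small scope size.

```apex
// Sketch: a batch job that calls an external system from execute.
// AccountSyncBatch, the Example_Service named credential, and the
// /accounts/ path are hypothetical.
public with sharing class AccountSyncBatch
        implements Database.Batchable<sObject>, Database.AllowsCallouts {

    public Database.QueryLocator start(Database.BatchableContext bc) {
        return Database.getQueryLocator('SELECT Id, Name FROM Account');
    }

    public void execute(Database.BatchableContext bc, List<sObject> scope) {
        Http http = new Http();
        for (sObject record : scope) {
            HttpRequest req = new HttpRequest();
            req.setEndpoint('callout:Example_Service/accounts/' + record.Id);
            req.setMethod('GET');

            HttpResponse res = http.send(req);
            if (res.getStatusCode() != 200) {
                System.debug('Callout failed for ' + record.Id
                    + ' with status ' + res.getStatusCode());
            }
        }
    }

    public void finish(Database.BatchableContext bc) {
        System.debug('Sync job ' + bc.getJobId() + ' finished.');
    }
}
```

Invoking it as Database.executeBatch(new AccountSyncBatch(), 50) keeps each chunk at 50 callouts.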

Gotchas
  • QueryLocator supports up to 50 million records, but an Iterable does not bypass the normal governor limit on total records retrieved by SOQL. If the data source is not directly queryable via SOQL, profile the iterable performance before assuming it scales.
  • Each chunk runs in its own transaction. Instance variables on the batch class do not preserve across chunks unless Database.Stateful is implemented. Forgetting this is the leading cause of mysterious zero-totals at finish.
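  • Org-wide limits apply across the whole job: every execute invocation counts toward the daily asynchronous Apex execution allowance, and only five batch jobs can be queued or active at once. Long-running jobs with small scope sizes can still hit these even though per-chunk limits reset.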
  • Failed chunks roll back their DML but do not stop the job. The job continues processing remaining chunks, and the only default trace of the failure is the error count on AsyncApexJob. Build error tracking into execute (try/catch with Error_Log__c) so partial failures are recorded with enough detail to act on.
  • Chained batch jobs from finish can produce infinite loops if termination criteria are wrong. Always include explicit stop conditions and monitor AsyncApexJob for runaway chains in production.
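The error-tracking gotcha above can be addressed with partial-success DML. The sketch below is one possible shape, not the only one: the class name SafeUpdateBatch and the Stale__c field are assumptions, and because the fields of Error_Log__c are not specified, failures are collected as strings in a Stateful list instead of being written to that object.

```apex
// Sketch: record per-row failures instead of losing the whole chunk.
// SafeUpdateBatch and Stale__c are hypothetical.
public with sharing class SafeUpdateBatch
        implements Database.Batchable<sObject>, Database.Stateful {

    // Accumulated across chunks because the class is Stateful.
    private List<String> failures = new List<String>();

    public Database.QueryLocator start(Database.BatchableContext bc) {
        return Database.getQueryLocator('SELECT Id FROM Account WHERE Stale__c = true');
    }

    public void execute(Database.BatchableContext bc, List<sObject> scope) {
        List<Account> toUpdate = new List<Account>();
        for (sObject record : scope) {
            Account acct = (Account) record;
            acct.Stale__c = false;
            toUpdate.add(acct);
        }

        // allOrNone = false: rows that succeed are saved, rows that fail are reported.
        List<Database.SaveResult> results = Database.update(toUpdate, false);
        for (Integer i = 0; i < results.size(); i++) {
            if (!results[i].isSuccess()) {
                for (Database.Error err : results[i].getErrors()) {
                    failures.add(toUpdate[i].Id + ': ' + err.getMessage());
                }
            }
        }
    }

    public void finish(Database.BatchableContext bc) {
        System.debug(failures.size() + ' record(s) failed to update.');
        for (String failure : failures) {
            System.debug(failure);
        }
    }
}
```

A very large failure list can strain heap limits, so a production job would more likely insert log records in each chunk and keep only a count in memory.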

About the Author

Dipojjal Chakrabarti is a B2C Solution Architect with 29 Salesforce certifications and over 13 years in the Salesforce ecosystem. He runs salesforcedictionary.com to help admins, developers, architects, and cert/interview candidates sharpen their fundamentals.

§

Test your knowledge

Q1. What is the maximum batch size in Batch Apex?

Q2. What are the three methods of the Database.Batchable interface?

Q3. Why can Batch Apex process millions of records when a single transaction cannot?
