Bulk API 2.0
Definition
Bulk API 2.0 is Salesforce's high-volume data processing API, designed for loading or extracting millions of records in a single job. It is the modern replacement for the original Bulk API (released in 2009) and the right tool whenever data volumes exceed what REST or SOAP API can handle efficiently. Data Loader, MuleSoft, Informatica, and most enterprise integration platforms use Bulk API 2.0 for bulk loads behind the scenes.
A Bulk API 2.0 job is a multi-step lifecycle. The integration creates a job with the target object, operation (insert, update, upsert, delete, hardDelete, query), and content type. It uploads CSV data in chunks. It marks the job ready for processing. Salesforce processes the chunks in parallel, returns success and failure records, and reports completion through job status polling. The API handles the chunking, parallelism, and error reporting automatically; integration code just submits CSVs and reads results. Bulk API 2.0 dramatically simplified the original Bulk API's manual chunking and makes high-volume operations approachable without specialized integration tooling.
How Bulk API 2.0 handles high-volume Salesforce operations
When to use Bulk API 2.0 versus REST or SOAP
Bulk API 2.0 fits jobs with more than 2,000 records. Below that, REST or SOAP API is faster because the per-job overhead of Bulk does not pay off on small loads. Above 10,000 records, Bulk is the only sensible option because REST and SOAP hit batch limits or rate caps. Typical Bulk use cases: nightly ETL, customer onboarding data loads, monthly compliance archives, mass updates of legacy data. For interactive or near-real-time integration, REST API is the better fit; Bulk is for batch-style work.
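The thresholds above can be written as a small decision helper. This is only a sketch of the rule of thumb in the text; the function name `choose_api` and its return labels are hypothetical, and real cutoffs depend on the org and workload.

```python
def choose_api(record_count: int, interactive: bool = False) -> str:
    """Pick an API family using the rough thresholds from the text.

    Returns "rest" (standing in for REST or SOAP) or "bulk".
    """
    if record_count <= 2_000:
        # Per-job overhead of Bulk does not pay off on small loads.
        return "rest"
    if record_count > 10_000:
        # Large loads run into batch limits or rate caps on REST/SOAP.
        return "bulk"
    # Middle ground: interactive work leans REST, batch work leans Bulk.
    return "rest" if interactive else "bulk"


if __name__ == "__main__":
    for count, live in [(500, False), (5_000, True), (5_000, False), (250_000, True)]:
        print(count, live, choose_api(count, live))
```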
Job lifecycle: create, upload, close, monitor, retrieve
Every Bulk job goes through five steps. Create the job with POST /services/data/vXX.X/jobs/ingest, specifying object, operation, and content type. Upload CSV data with PUT to the job's contentUrl. Close the job with PATCH to mark it ready for processing. Salesforce processes asynchronously; poll the job state until it reaches JobComplete, Failed, or Aborted. Retrieve successful records, failed records, and unprocessed records as separate CSV outputs. The lifecycle is more elaborate than REST but lets Salesforce optimize processing in ways per-record APIs cannot.
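As a rough map of the lifecycle, the sketch below lists the HTTP calls in order without sending anything. The helper name `ingest_lifecycle`, the version string "60.0", and the job ID are illustrative assumptions. The upload path ending in /batches is what contentUrl typically looks like, but real code should use the contentUrl value returned when the job is created.

```python
from typing import NamedTuple


class Step(NamedTuple):
    name: str
    method: str
    path: str


def ingest_lifecycle(api_version: str, job_id: str) -> list[Step]:
    """Return the ingest lifecycle as an ordered list of HTTP calls."""
    base = f"/services/data/v{api_version}/jobs/ingest"
    job = f"{base}/{job_id}"
    return [
        Step("create", "POST", base),
        # Assumption: contentUrl normally resolves to <job>/batches.
        Step("upload", "PUT", f"{job}/batches"),
        Step("close", "PATCH", job),
        Step("monitor", "GET", job),
        Step("successful records", "GET", f"{job}/successfulResults"),
        Step("failed records", "GET", f"{job}/failedResults"),
        Step("unprocessed records", "GET", f"{job}/unprocessedrecords"),
    ]


if __name__ == "__main__":
    # "60.0" and the job ID below are placeholders for illustration.
    for step in ingest_lifecycle("60.0", "750xx0000000001"):
        print(f"{step.name:20} {step.method:6} {step.path}")
```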
Insert, update, upsert, delete operations
Bulk API 2.0 supports all standard DML. Insert creates new records. Update modifies existing ones, requiring the Id column in the CSV. Upsert inserts or updates based on a specified external ID field, the standard pattern for integrating with external systems that have their own primary keys. Delete soft-deletes (Recycle Bin); hardDelete bypasses the Recycle Bin and is irreversible. A single job can carry up to 150 MB of CSV data measured after base64 encoding, which in practice means keeping the raw CSV payload to roughly 100 MB.
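A minimal sketch of building the job-creation body for these operations follows. The function name `build_ingest_job_body` and the field name External_Key__c are hypothetical; the JSON keys (object, operation, contentType, lineEnding, externalIdFieldName) are the ones the create-job request accepts.

```python
INGEST_OPERATIONS = {"insert", "update", "upsert", "delete", "hardDelete"}


def build_ingest_job_body(sobject, operation, external_id_field=None, line_ending="LF"):
    """Build the JSON-serializable body for creating an ingest job."""
    if operation not in INGEST_OPERATIONS:
        raise ValueError(f"unsupported ingest operation: {operation}")
    if operation == "upsert" and not external_id_field:
        raise ValueError("upsert requires externalIdFieldName")
    body = {
        "object": sobject,
        "operation": operation,
        "contentType": "CSV",
        "lineEnding": line_ending,
    }
    if operation == "upsert":
        # Only upsert uses the external ID field for matching.
        body["externalIdFieldName"] = external_id_field
    return body


if __name__ == "__main__":
    # External_Key__c is a made-up custom field used for illustration.
    print(build_ingest_job_body("Account", "upsert", "External_Key__c"))
    print(build_ingest_job_body("Account", "hardDelete"))
```

Validating the operation on the client side is a design choice here: it surfaces a missing external ID field before any request is sent.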
Parallel processing and chunking behavior
Behind the scenes, Bulk API 2.0 splits the input CSV into chunks of around 10,000 records each and processes them in parallel. Parallel processing dramatically speeds throughput but produces ordering surprises: records that depend on earlier records in the same job may fail because the chunks run independently. Bulk API 2.0 does not offer a serial option for ingest jobs (the job's concurrencyMode field is informational and reports Parallel); serial processing is a Bulk API 1.0 setting. Most jobs tolerate parallel processing. For parent-child dependencies between records, load parents and children in separate jobs run one after the other, or fall back to Bulk API 1.0 in Serial mode.
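To see why ordering surprises arise, the sketch below models chunking as simple integer division on row position and flags records whose dependency lands in a different chunk. This is a simplified illustration only: actual chunk boundaries are internal to Salesforce, and the helper names are hypothetical.

```python
CHUNK_SIZE = 10_000  # approximate figure from the text


def chunk_of(row_index: int, chunk_size: int = CHUNK_SIZE) -> int:
    """Chunk number a row would fall into under the simplified model."""
    return row_index // chunk_size


def cross_chunk_dependencies(rows, chunk_size=CHUNK_SIZE):
    """rows is a list of (key, depends_on_key_or_None) in CSV order.

    Returns keys whose dependency sits in a different chunk, which are
    the records most exposed to independent parallel processing.
    """
    position = {key: index for index, (key, _) in enumerate(rows)}
    risky = []
    for index, (key, parent) in enumerate(rows):
        if parent is None or parent not in position:
            continue
        if chunk_of(position[parent], chunk_size) != chunk_of(index, chunk_size):
            risky.append(key)
    return risky


if __name__ == "__main__":
    sample = [("A", None), ("B", "A"), ("C", "A"), ("D", "C")]
    # A tiny chunk size makes the effect visible on four rows.
    print(cross_chunk_dependencies(sample, chunk_size=2))
```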
Lock contention and the master-detail problem
Bulk operations under the same master record can produce UNABLE_TO_LOCK_ROW errors because the master is locked during each detail update. Sorting the CSV by master ID before upload puts records targeting the same parent in the same chunk, dramatically reducing lock contention. This is the single most important optimization for Bulk jobs on master-detail data. Without it, jobs fail with locking errors that look mysterious to debug; with it, the same data loads cleanly.
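Sorting by master ID can be done with the standard csv module before upload. In this sketch the function name `sort_csv_by_parent` and the column name Invoice__c are assumptions for illustration; substitute the lookup or master-detail column from your own data.

```python
import csv
import io


def sort_csv_by_parent(csv_text: str, parent_column: str) -> str:
    """Return the CSV with rows grouped by the parent ID column."""
    reader = csv.DictReader(io.StringIO(csv_text))
    fieldnames = reader.fieldnames
    if not fieldnames or parent_column not in fieldnames:
        raise ValueError(f"column not found: {parent_column}")
    # sorted() is stable, so rows under one parent keep their order.
    rows = sorted(reader, key=lambda row: row[parent_column])
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()


if __name__ == "__main__":
    data = "Name,Invoice__c\nline1,B\nline2,A\nline3,B\n"
    print(sort_csv_by_parent(data, "Invoice__c"))
```

For files too large to hold in memory, an external sort would be needed; this version assumes the CSV fits in RAM.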
Querying via Bulk: the query and queryAll operations
Bulk API 2.0 supports query operations through /services/data/vXX.X/jobs/query. Submit a SOQL query, poll for completion, retrieve results as CSV. This is the right pattern for extracting large data sets that would exceed REST query limits (over 50,000 records). queryAll includes deleted and archived records, useful for data migration scenarios. The Bulk query model is asynchronous like the ingest model: submit, poll, retrieve.
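The submit and retrieve halves of a query job might be sketched as follows. `build_query_job_body` and `iter_result_pages` are hypothetical helpers, and `fetch_page` stands in for caller-supplied code that performs the GET on the job's results and returns the CSV body together with the next locator. The sketch assumes the locator arrives in the Sforce-Locator response header and reads "null" on the last page.

```python
from typing import Callable, Iterator, Optional, Tuple


def build_query_job_body(soql: str, include_deleted: bool = False) -> dict:
    """Body for POST /services/data/vXX.X/jobs/query."""
    return {
        "operation": "queryAll" if include_deleted else "query",
        "query": soql,
    }


def iter_result_pages(
    fetch_page: Callable[[Optional[str]], Tuple[str, Optional[str]]],
) -> Iterator[str]:
    """Yield CSV pages until the service stops returning a locator.

    fetch_page(locator) is supplied by the caller and returns
    (csv_text, next_locator).
    """
    locator = None
    while True:
        csv_page, next_locator = fetch_page(locator)
        yield csv_page
        if not next_locator or next_locator == "null":
            break
        locator = next_locator


if __name__ == "__main__":
    # Fake two-page result set standing in for real HTTP calls.
    pages = {None: ("Id\n001A\n", "abc"), "abc": ("Id\n001B\n", "null")}
    print(build_query_job_body("SELECT Id FROM Account", include_deleted=True))
    for page in iter_result_pages(lambda loc: pages[loc]):
        print(page, end="")
```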
Comparison to the original Bulk API
The original Bulk API (now called Bulk API 1.0) required the caller to manually chunk data into batches of up to 10,000 records and submit each batch separately. Each batch was tracked separately, error handling was per-batch, and orchestration was the caller's responsibility. Bulk API 2.0 handles chunking internally, exposes a single job per operation, and simplifies error retrieval. For new integrations, default to 2.0. Bulk API 1.0 is still supported, but the simplified API surface of 2.0 is faster to integrate against and easier to monitor.
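For a sense of the caller-side work that Bulk API 1.0 required, the sketch below splits rows into batches of at most 10,000. The helper name `split_into_batches` is hypothetical; with Bulk API 2.0 this step disappears because the whole CSV is uploaded once.

```python
def split_into_batches(rows: list, batch_size: int = 10_000) -> list[list]:
    """Split rows into consecutive batches of at most batch_size items."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    return [rows[start:start + batch_size] for start in range(0, len(rows), batch_size)]


if __name__ == "__main__":
    rows = list(range(25_000))
    batches = split_into_batches(rows)
    # Each of these batches would have been submitted and tracked separately.
    print([len(batch) for batch in batches])
```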
How to use Bulk API 2.0
Using Bulk API 2.0 follows a predictable five-step lifecycle: create the job, upload CSV, close the job, poll for completion, retrieve results. Most enterprise integration platforms wrap this lifecycle into a single configuration step. For custom integration code, follow the steps in order and handle the asynchronous job completion correctly with polling and exponential backoff.
- Confirm Bulk is the right tool
Use Bulk API 2.0 for jobs over 2,000 records. Below that, REST or SOAP API is faster. For under 50,000 records and interactive needs, REST may still fit; Bulk shines on truly large data loads.
- Create the job
POST /services/data/vXX.X/jobs/ingest with a JSON body specifying object, operation (insert, update, upsert, delete, hardDelete), contentType, externalIdFieldName (for upsert), and lineEnding. The response includes the job id and contentUrl.
- Upload the CSV data
PUT the CSV body to the contentUrl returned in the previous step, with Content-Type: text/csv. The limit is 150 MB per job after base64 encoding, so keep the raw CSV to roughly 100 MB. Sort by master ID for master-detail loads to reduce lock contention.
- Close the job to mark ready for processing
PATCH /services/data/vXX.X/jobs/ingest/JOB_ID with the JSON body {"state": "UploadComplete"}. Salesforce starts processing asynchronously after this step.
- Poll the job state until processing completes
GET /services/data/vXX.X/jobs/ingest/JOB_ID. The state field moves from Open to UploadComplete to InProgress and ends in JobComplete, Failed, or Aborted. Start polling at about 30 seconds and back off exponentially for very large jobs.
- Retrieve successful and failed records
GET /services/data/vXX.X/jobs/ingest/JOB_ID/successfulResults, /failedResults, and /unprocessedrecords. All three return CSV bodies. Log the failed-record CSV for manual review and retry where applicable.
- Build error-handling logic for the failed records
Each failed record includes the original CSV fields plus error columns (sf__Id and sf__Error). Parse the errors, classify by type (validation, lock, network), and decide whether to retry. Some errors, such as validation failures, are not retriable without data fixes.
- Clean up completed jobs
Salesforce retains job metadata for 7 days; older jobs are removed automatically. For audit purposes, store the job ID and result CSVs in your own logging system before the retention window closes.
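The polling step in the list above could be sketched as a loop with exponential backoff. `wait_for_job` and its parameters are hypothetical, and `get_state` stands in for caller-supplied code that performs the GET on the job and returns its state field. The 30-second starting delay follows the text; the doubling factor and the 300-second cap are assumptions.

```python
import time
from typing import Callable

TERMINAL_STATES = {"JobComplete", "Failed", "Aborted"}


def wait_for_job(
    get_state: Callable[[], str],
    initial_delay: float = 30.0,
    factor: float = 2.0,
    max_delay: float = 300.0,
    max_polls: int = 50,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll until the job reaches a terminal state, backing off each time."""
    delay = initial_delay
    for _ in range(max_polls):
        state = get_state()
        if state in TERMINAL_STATES:
            return state
        sleep(delay)
        # Grow the wait, but never beyond max_delay.
        delay = min(delay * factor, max_delay)
    raise TimeoutError(f"job not finished after {max_polls} polls")


if __name__ == "__main__":
    # Simulated states and a no-op sleep so the demo finishes instantly.
    states = iter(["UploadComplete", "InProgress", "JobComplete"])
    print(wait_for_job(lambda: next(states), sleep=lambda seconds: None))
```

Passing `sleep` as a parameter keeps the loop testable without real waiting.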
Operation: The DML operation the job performs. Upsert needs an externalIdFieldName; hardDelete bypasses the Recycle Bin.
Concurrency mode: Parallel is the mode Bulk API 2.0 uses, and the job reports it in the concurrencyMode field. Serial, which trades speed for ordering guarantees, is a Bulk API 1.0 setting and cannot be selected on a 2.0 job.
External ID field (externalIdFieldName): The field used to match incoming records against existing records. Must be marked External ID and Unique on the target object.
- Lock contention on master-detail data produces UNABLE_TO_LOCK_ROW errors. Sort the CSV by master ID before upload to put records targeting the same parent in the same chunk.
- Job retention is 7 days. Failed and successful result CSVs are also retained for 7 days. Store the results in your own logging system if you need them longer.
- Parallel processing can produce ordering surprises. Records that depend on earlier records in the same job are at risk; load them in separate jobs run in sequence, or use Bulk API 1.0 in Serial concurrency mode.
- hardDelete bypasses the Recycle Bin and is irreversible. Triple-check the input CSV before running a hardDelete job because there is no way to recover deleted records.
- Bulk API daily limits apply alongside REST and SOAP limits. Monitor combined API consumption to avoid blowing the daily cap during nightly ETL windows.
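Lock errors such as UNABLE_TO_LOCK_ROW are usually worth retrying, while validation failures need data fixes first. The sketch below splits a failed-results CSV along that line. The helper names are hypothetical, the error text is assumed to sit in an sf__Error column, and the keyword matching is a deliberately crude heuristic rather than a complete mapping of Salesforce status codes.

```python
import csv
import io

RETRIABLE = {"lock"}


def classify_error(message: str) -> str:
    """Map an error message to a coarse category."""
    text = message.upper()
    if "UNABLE_TO_LOCK_ROW" in text:
        return "lock"
    if "VALIDATION" in text or "REQUIRED_FIELD_MISSING" in text:
        return "validation"
    return "other"


def split_failed_results(failed_csv: str, error_column: str = "sf__Error"):
    """Return (rows_to_retry, rows_needing_review) from a failed-results CSV."""
    retry, review = [], []
    for row in csv.DictReader(io.StringIO(failed_csv)):
        kind = classify_error(row.get(error_column) or "")
        (retry if kind in RETRIABLE else review).append(row)
    return retry, review


if __name__ == "__main__":
    sample = (
        "sf__Id,sf__Error,Name\n"
        ",UNABLE_TO_LOCK_ROW:unable to obtain exclusive access,Acme\n"
        ",FIELD_CUSTOM_VALIDATION_EXCEPTION:Name too short,X\n"
    )
    retry, review = split_failed_results(sample)
    print(len(retry), "to retry;", len(review), "to review")
```

Rows in the retry list can be written back out as a new CSV and submitted as a follow-up job; anything unclassified goes to review by default so that unknown errors are not retried blindly.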
About the Author
Dipojjal Chakrabarti is a B2C Solution Architect with 29 Salesforce certifications and over 13 years in the Salesforce ecosystem. He runs salesforcedictionary.com to help admins, developers, architects, and cert/interview candidates sharpen their fundamentals.
Test your knowledge
Q1. What is a main advantage of Bulk API 2.0 over 1.0?
Q2. Which operations does Bulk API 2.0 support?
Q3. When might Bulk API 1.0 still be preferred over 2.0?