Uploading
Definition
Uploading in Salesforce data management is the process of importing data from external files (CSV, XML, JSON) into Salesforce records using one of several tools: Data Import Wizard for small to medium loads through the UI, Data Loader for desktop-driven CRUD on any object including custom ones, the Bulk API or Bulk API 2.0 for large-volume programmatic loads, and third-party tools like Workato, MuleSoft, or Jitterbit for orchestrated workflows. The term covers both first-time data migration (loading historical records into a new org) and ongoing synchronization (nightly pushes from an ERP or warehouse).
Uploading is not the same as inserting through the UI one record at a time. The defining characteristic is bulk: tens, thousands, or millions of records moved through a file-based pipeline rather than typed by hand. The choice of tool depends on the record count, the object, the user persona, and whether the load needs to repeat on a schedule.
The Salesforce upload toolkit: pick the right tool, prepare the file, recover from errors
The four main uploading tools and when to pick each
Data Import Wizard is the UI-driven tool for loads of up to 50,000 records. It handles a fixed set of standard objects (Account, Contact, Lead, Solution) plus custom objects, through a wizard with field mapping and duplicate handling built in. Data Loader is the desktop application for any object, up to 5 million records per load. It supports insert, update, upsert, delete, and hard delete operations, and includes a command-line interface for scheduled runs. Bulk API and Bulk API 2.0 are the programmatic equivalents for very large volumes; they run asynchronously through a job queue and can process tens of millions of records per day. Third-party tools wrap one or more of these and add transformation, scheduling, and error handling that Data Loader and the Bulk API do not provide natively.
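The selection rules above can be sketched as a small decision helper. This is only an illustration of the thresholds quoted in this section (50,000 and 5 million records); the function name, the parameters, and the order in which the rules are checked are assumptions, not part of any Salesforce API.

```python
from enum import Enum


class UploadTool(Enum):
    DATA_IMPORT_WIZARD = "Data Import Wizard"
    DATA_LOADER = "Data Loader"
    BULK_API = "Bulk API / Bulk API 2.0"
    THIRD_PARTY = "Third-party integration tool"


# Thresholds taken from the text above; check current Salesforce docs before relying on them.
WIZARD_MAX_RECORDS = 50_000
DATA_LOADER_MAX_RECORDS = 5_000_000


def suggest_tool(record_count, wizard_supports_object=True,
                 programmatic=False, needs_orchestration=False):
    """Hypothetical helper: map load characteristics to one of the four tools."""
    if needs_orchestration:
        # Transformation, scheduling, or multi-system chaining is needed.
        return UploadTool.THIRD_PARTY
    if programmatic or record_count > DATA_LOADER_MAX_RECORDS:
        return UploadTool.BULK_API
    if wizard_supports_object and record_count <= WIZARD_MAX_RECORDS:
        return UploadTool.DATA_IMPORT_WIZARD
    return UploadTool.DATA_LOADER


if __name__ == "__main__":
    print(suggest_tool(12_000).value)
    print(suggest_tool(800_000).value)
```

Encoding the choice this way is mostly useful as runbook documentation: the thresholds live in one place and can be updated when limits change.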
File formats, encoding, and the gotchas at the file level
CSV is the dominant upload format, with UTF-8 encoding the safest default. CSV files that start with a BOM (byte order mark) can cause Data Loader to misread the first column header; strip the BOM before upload. Commas inside quoted fields are fine; commas inside unquoted fields shift every later value one column to the right and are the most common cause of off-by-one column issues during load. Date and DateTime values must be in ISO 8601 format or in the format that matches the uploading user's locale setting; mismatches cause silent date corruption on insert (for example, day and month swapped). Boolean values are safest written as true or false (case-insensitive); Data Loader also accepts yes/no, y/n, on/off, and 1/0, but not every tool does. Picklist values must match the org's defined picklist values exactly (case-sensitive). Validate the file against these rules before kicking off the upload to avoid the slow round-trip of partial loads and rollbacks.
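A pre-upload check for these file-level rules might look like the sketch below. It uses only the Python standard library. The column names, the picklist dictionary, and the choice to accept only plain `YYYY-MM-DD` dates and literal true/false are assumptions made for illustration; adapt them to the target object.

```python
import csv
import io
from datetime import datetime

BOM = "\ufeff"


def strip_bom(text):
    """Drop a leading byte order mark so the first header is read cleanly."""
    return text[1:] if text.startswith(BOM) else text


def is_iso_date(value):
    try:
        datetime.strptime(value, "%Y-%m-%d")
        return True
    except ValueError:
        return False


def validate_csv(text, date_cols=(), bool_cols=(), picklists=None):
    """Return a list of (line_number, message) problems found in the CSV text."""
    picklists = picklists or {}
    problems = []
    reader = csv.DictReader(io.StringIO(strip_bom(text)))
    header = reader.fieldnames or []
    for col in (*date_cols, *bool_cols, *picklists):
        if col not in header:
            problems.append((1, f"missing column: {col}"))
    if problems:
        return problems
    for row in reader:
        line = reader.line_num
        # Extra fields land under the None key; missing fields come back as None.
        if None in row or any(v is None for v in row.values()):
            problems.append((line, "column count does not match header"))
            continue
        for col in date_cols:
            if row[col] and not is_iso_date(row[col]):
                problems.append((line, f"{col}: not an ISO 8601 date: {row[col]!r}"))
        for col in bool_cols:
            if row[col] and row[col].lower() not in ("true", "false"):
                problems.append((line, f"{col}: not true/false: {row[col]!r}"))
        for col, allowed in picklists.items():
            if row[col] and row[col] not in allowed:  # case-sensitive match
                problems.append((line, f"{col}: not a defined picklist value: {row[col]!r}"))
    return problems
```

A typical call would pass the file contents plus the columns to check, for example `validate_csv(text, date_cols=["CloseDate"], bool_cols=["Active__c"], picklists={"Rating": {"Hot", "Warm", "Cold"}})`, and refuse to start the load while the returned list is non-empty.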
Insert versus Update versus Upsert
Insert creates new records and fails with a DUPLICATE_VALUE or similar error if the record carries a value that already exists in a unique field. Update modifies existing records and requires the Salesforce record ID in the source file; rows whose ID matches no existing record fail. Upsert is the safest option for ongoing syncs because it matches on an external ID field (or the record ID), updating the records it finds and inserting the ones it does not, so a rerun does not create duplicates. To match on an external ID, the field must be flagged as External ID on the object schema and the file must contain that column. For one-time historical migrations, insert is fine; for any recurring sync from an ERP or warehouse, upsert is the right choice. Delete (which sends records to the Recycle Bin) and hard delete (which bypasses it) are the reverse operations and have their own gotchas around Recycle Bin behavior.
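For programmatic loads, the operation is chosen when the job is created. The sketch below builds the JSON body that a Bulk API 2.0 ingest job request carries (sent to the `/jobs/ingest` resource under the org's versioned REST base URL). It makes no network call; the helper name and the `ERP_Id__c` field are hypothetical, and the body fields should be confirmed against the Bulk API 2.0 Developer Guide.

```python
import json

# Operations this section discusses, spelled as Bulk API 2.0 expects them.
VALID_OPERATIONS = {"insert", "update", "upsert", "delete", "hardDelete"}


def build_ingest_job_body(sobject, operation, external_id_field=None):
    """Hypothetical helper: build the request body for creating an ingest job."""
    if operation not in VALID_OPERATIONS:
        raise ValueError(f"unsupported operation: {operation}")
    body = {
        "object": sobject,
        "operation": operation,
        "contentType": "CSV",
        "lineEnding": "LF",
    }
    if operation == "upsert":
        # Upsert needs to know which field to match on.
        if not external_id_field:
            raise ValueError("upsert requires an external ID field name")
        body["externalIdFieldName"] = external_id_field
    return body


if __name__ == "__main__":
    # ERP_Id__c is a made-up custom field used only for illustration.
    print(json.dumps(build_ingest_job_body("Account", "upsert", "ERP_Id__c"), indent=2))
```

Failing fast when the external ID field is missing mirrors the rule above: a recurring sync that relies on upsert cannot be scoped until that field exists on the object.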
Error handling and the partial-load recovery pattern
Uploads rarely run clean the first time. Data Loader and Bulk API both produce success and error CSV files alongside the job log. The error file contains every record that failed plus the error reason from Salesforce. The standard recovery pattern is: read the error file, fix the data issues in the source, and run a second load that includes only the failed records. Iterate until the error file is empty. Common errors include picklist value mismatches, nulls in required fields, duplicate detection rule blocks, validation rule failures, broken lookup references (the parent ID in the file does not exist in Salesforce), and field-level security blocking the upload user from writing certain fields. Build the error-handling pattern into the upload runbook so the operator does not invent it each time.
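The recovery loop can be partly automated. The sketch below reads an error CSV, tallies the failure reasons so the operator can see what to fix first, and writes a retry file containing only the failed rows with the tool-added columns removed. The error column name `ERROR` is an assumption; check the header of the actual error file (Bulk API 2.0 failed-results files, for instance, use `sf__`-prefixed columns) and pass the right names.

```python
import csv
import io
from collections import Counter


def build_retry_file(error_csv_text, error_col="ERROR", also_drop=()):
    """Return (retry_csv_text, Counter of error reasons) from an error file.

    error_col and also_drop name the columns the upload tool added;
    everything else is treated as original source data.
    """
    reader = csv.DictReader(io.StringIO(error_csv_text))
    tool_cols = {error_col, *also_drop}
    data_cols = [c for c in (reader.fieldnames or []) if c not in tool_cols]

    reasons = Counter()
    out = io.StringIO()
    # extrasaction="ignore" silently drops the tool-added columns on write.
    writer = csv.DictWriter(out, fieldnames=data_cols,
                            lineterminator="\n", extrasaction="ignore")
    writer.writeheader()
    for row in reader:
        reasons[row[error_col]] += 1
        writer.writerow(row)
    return out.getvalue(), reasons


if __name__ == "__main__":
    errors = (
        "Name,Rating,ERROR\n"
        "Beta,hot,bad value for restricted picklist field: hot\n"
    )
    retry_text, reasons = build_retry_file(errors)
    print(retry_text)
    print(reasons.most_common())
```

The retry file still contains the bad values; the tally only tells the operator which fixes to make in the source before the corrective load.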
Permissions, governor limits, and what the upload user needs
The Salesforce user who runs the upload needs Create or Edit permissions on the target objects, field-level security access to every column the file references, and (for Data Loader and the Bulk API) API access through the API Enabled permission. Lookup field permissions matter: if the upload user cannot read the parent record, the lookup fails even when the lookup value in the file is correct. Bulk API loads count against the org's daily API request limit but use fewer calls per record than equivalent REST loads (one batch of 10,000 records uses far fewer calls than 10,000 individual REST inserts). Hitting the daily limit blocks uploads for the rest of the day, so monitor usage. Field history tracking, sharing recalculation, and Apex triggers all fire on uploaded records; large uploads can hit governor limits if the triggers are not bulk-safe.
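The call-count comparison is simple arithmetic, sketched below. It counts only the data-carrying calls (one per batch versus one per record) and ignores job creation, status polling, and result downloads, so treat the numbers as a rough lower bound. The function names and the 20 percent reserve are illustrative assumptions.

```python
import math


def calls_for_load(record_count, records_per_call):
    """Rough number of data-carrying API calls needed for a load."""
    if records_per_call < 1:
        raise ValueError("records_per_call must be at least 1")
    return math.ceil(record_count / records_per_call)


def fits_in_budget(planned_calls, used_today, daily_limit, reserve=0.20):
    """True if the load leaves `reserve` of the daily limit free for other integrations."""
    usable = daily_limit * (1 - reserve)
    return used_today + planned_calls <= usable


if __name__ == "__main__":
    records = 1_000_000
    print("batched:", calls_for_load(records, 10_000))
    print("one call per record:", calls_for_load(records, 1))
```

For a million records, batching in groups of 10,000 needs 100 data calls where record-at-a-time inserts need 1,000,000, which is why large loads belong on the Bulk API.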
Scheduled uploads and the integration-tool angle
Repeat uploads (nightly ERP sync, hourly warehouse refresh, weekly pricing update) are usually run through a scheduled pipeline rather than manual Data Loader runs. Salesforce's own Data Loader has a command-line mode that can be scheduled through the operating system's scheduler (Windows Task Scheduler or cron, for example). Third-party tools (MuleSoft, Workato, Jitterbit, Informatica, Boomi) add visual orchestration, error notification, transformation rules, and the ability to chain Salesforce uploads with non-Salesforce systems in the same pipeline. For any sync that runs more than weekly or that touches more than one system, evaluate a third-party tool rather than scripting Data Loader runs by hand. The total cost of ownership for a custom scripted pipeline often exceeds the licensing cost of a managed integration platform within 12 months.
Running a successful Salesforce upload end to end
Running a successful upload into Salesforce is a six-step routine: pick the right tool, prepare the file with clean data, validate the source against the target schema, run a test load against a sandbox first, run the production load with monitoring, and process the error file. The order matters. Most upload failures come from skipping the validation step and discovering the data issues only at production load time. Build the routine into a runbook and follow it every time, regardless of whether the load is a one-off historical migration or a nightly sync.
- Pick the right tool for the volume and the operator
Match the tool to the load characteristics. Data Import Wizard for supported standard objects and custom objects, up to 50,000 records, run by a non-technical user. Data Loader for any object up to 5 million records run by an admin or developer comfortable with the desktop UI. Bulk API or Bulk API 2.0 for programmatic loads from an integration pipeline. Third-party tools for scheduled syncs or multi-system orchestration. Picking too lightweight a tool forces a redo when volume grows; picking too heavyweight a tool wastes setup effort on a one-off load. Document the choice in the runbook so future operators know which tool to reach for.
- Prepare and validate the source file
Open the CSV in a text editor (not Excel, which silently mangles long numbers and dates) and confirm the encoding is UTF-8 without BOM. Confirm every column matches a Salesforce field by API name. Confirm date and DateTime values are in the expected format. Confirm picklist values match the org definitions exactly (case-sensitive). Confirm lookup field values reference real Salesforce record IDs or external IDs that exist. Run a small sample (10 to 100 rows) through Data Loader against a sandbox; the success file should match the input row count. If the sample fails, fix the file before proceeding.
- Run a sandbox test load and review results
From Data Loader (or the chosen tool), connect to a sandbox refreshed from production. Load the full file or a representative sample (10 to 25 percent for very large files). Review the success and error CSVs. Open Salesforce and spot-check the loaded records: Are field values correct? Are lookups linked properly? Did validation rules fire as expected? Did triggers run without governor-limit errors? If anything is wrong, fix the file and run again. Do not move to production until the sandbox load is clean. This is the single step that saves the most pain across the lifecycle of the upload.
- Run the production load and process errors
Run the upload against production with the same configuration that worked in sandbox. Monitor the job log for progress. After the job completes, immediately review the error CSV. Fix the source data for any failed records and run a corrective load that includes only the failed records. Iterate until the error file is empty. Document the final record counts (inserted, updated, errored) in the upload log along with the date, the operator, and any noteworthy issues. For recurring uploads, automate the error notification: if more than N records fail, email the runbook owner.
- Excel silently mangles long numbers (turns long numeric identifiers such as account numbers into scientific notation and drops leading zeros) and dates (reformats them to the local locale). Open CSVs in a text editor or in a dedicated CSV tool, never in Excel for upload preparation.
- Picklist values must match the org definitions exactly, case-sensitive. Inconsistent spellings (Hot vs hot vs HOT) are the most common reason records fail with INVALID_OR_NULL_FOR_RESTRICTED_PICKLIST.
- Apex triggers, validation rules, and sharing rules all fire on uploaded records. Large loads can hit governor limits if the triggers are not bulk-safe; disable non-essential triggers or run loads off-peak.
- Bulk API counts against the daily API request limit but uses fewer calls than equivalent REST loads. Watch the daily usage gauge if the upload is part of a larger integration pipeline; one runaway load can starve every other integration for the rest of the day.
- Upsert on anything other than the Salesforce record ID requires a field flagged as External ID in the schema. Without it, there is no external key to match on and the sync cannot use upsert as intended. Confirm the field flag is set before scoping a recurring sync that relies on upsert.
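The record-keeping and alerting step from the production-load stage can be sketched as follows. It captures the final counts, formats one line for the upload log, and decides whether the failure count crosses the alert threshold (the "N" mentioned above). The class, the function names, and the log format are assumptions; the actual notification (email or otherwise) is left to the surrounding pipeline.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class LoadResult:
    inserted: int
    updated: int
    errored: int

    @property
    def total(self):
        return self.inserted + self.updated + self.errored


def should_alert(result, max_failures):
    """True when more than max_failures records failed."""
    return result.errored > max_failures


def log_line(result, run_date, operator, note=""):
    """One line for the upload log: date, operator, counts, optional note."""
    line = (f"{run_date.isoformat()} operator={operator} "
            f"inserted={result.inserted} updated={result.updated} "
            f"errored={result.errored} total={result.total}")
    return f"{line} note={note}" if note else line


if __name__ == "__main__":
    result = LoadResult(inserted=9_400, updated=550, errored=50)
    print(log_line(result, date.today(), "jdoe"))
    if should_alert(result, max_failures=25):
        print("ALERT: failure threshold exceeded; notify the runbook owner")
```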
About the Author
Dipojjal Chakrabarti is a B2C Solution Architect with 29 Salesforce certifications and over 13 years in the Salesforce ecosystem. He runs salesforcedictionary.com to help admins, developers, architects, and cert/interview candidates sharpen their fundamentals.
Test your knowledge
Q1. What is Uploading?
Q2. What tools support it?
Q3. Why validate before uploading?