Bad data corrupts every downstream feature. A data quality strategy spans prevention at the point of entry, ongoing cleansing and maintenance, and measurement.
1. Prevention (best leverage):
- Validation rules at every save — required fields, format checks, cross-field constraints (see the formula sketch after this list).
- Picklists, not free text — constrain value sets; use Global Picklist Value Sets so values stay consistent across objects.
- Lookup relationships, not text — foreign keys, not name strings.
- Field-level help — guide users on what to enter.
- Required at the API level — enforce requiredness on the field or via validation rules, not just the page layout, so integrations have no "optional via the API" loophole.
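
A minimal sketch of such a cross-field validation rule, assuming an Opportunity where Closed Won should always carry an Amount (the object, fields, and constraint are illustrative, not a prescribed rule):

```
/* Block an Opportunity from being saved as Closed Won without a positive Amount.
   The rule fires (blocks the save) when the formula evaluates to true. */
AND(
  ISPICKVAL(StageName, "Closed Won"),
  OR(ISBLANK(Amount), Amount <= 0)
)
```
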
2. Duplicate prevention:
- Matching Rules + Duplicate Rules on Lead, Contact, Account.
- Merge UI for users to consolidate dupes.
- External Id fields — stable keys that integrations use to upsert without creating dupes (sketch below).
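
A minimal Apex sketch of that upsert pattern, assuming a hypothetical ERP_Id__c external Id field on Contact (field name and data are illustrative):

```
// ERP_Id__c is a hypothetical custom field marked as an External Id (ideally Unique).
List<Contact> incoming = new List<Contact>{
    new Contact(ERP_Id__c = 'ERP-1001', LastName = 'Ng',   Email = 'a.ng@example.com'),
    new Contact(ERP_Id__c = 'ERP-1002', LastName = 'Osei', Email = 'k.osei@example.com')
};
// Match on the external Id; allOrNone = false so one bad row does not roll back the rest.
Database.UpsertResult[] results = Database.upsert(incoming, Contact.Fields.ERP_Id__c, false);
```

The same external-Id upsert is available to Data Loader and the SOAP/REST APIs, so middleware can stay idempotent without querying for matches first.
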
3. Enrichment:
- Third-party data — Clearbit, ZoomInfo, D&B Hoovers append firmographic data.
- Pardot/MCAE — form submissions and progressive profiling add detail to Lead records over time.
- Geocoding and address standardisation — add latitude/longitude and normalise address formats.
4. Cleansing (the initial clean-up):
- Profile current data — assess completeness, accuracy, consistency (see the profiling sketch after this list).
- Identify duplicates — surface and merge.
- Standardise formats — country codes, phone formats, capitalisation.
- Fill gaps — populate required fields with sensible defaults or via enrichment.
- Archive stale records — move old, inactive Leads/Contacts to a Big Object or an external archive.
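
Profiling does not need heavy tooling to get started; a few aggregate queries establish a baseline. A rough anonymous Apex sketch, with the field choices purely illustrative:

```
// Completeness baseline for a couple of key Account fields (field choices are examples).
// For large data volumes, run these as reports or in Batch Apex to stay inside query limits.
Integer total           = [SELECT COUNT() FROM Account];
Integer missingIndustry = [SELECT COUNT() FROM Account WHERE Industry = null];
Integer missingPhone    = [SELECT COUNT() FROM Account WHERE Phone = null];

if (total > 0) {
    Decimal industryPct = 100.0 * (total - missingIndustry) / total;
    Decimal phonePct    = 100.0 * (total - missingPhone) / total;
    System.debug('Industry filled: ' + industryPct.setScale(1) + '%');
    System.debug('Phone filled: ' + phonePct.setScale(1) + '%');
}
```
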
5. Ongoing maintenance:
- Daily duplicate checks — automated reports flagging recent dupes.
- Stale data alerts — flag Accounts not updated in the last 12 months (see the scheduled-job sketch after this list).
- Validation rule audit — periodically review which rules fire most (often signals where data is bad).
- Field utilisation audit — fields never populated (consider deprecating).
- Data quality dashboard — track completeness, duplicate counts, record age, and key-field fill rates.
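
One way to automate the stale-data alert is a small scheduled job (a report subscription or scheduled Flow works just as well). A hedged sketch, where the 12-month window and the notification step are assumptions:

```
global class StaleAccountCheck implements Schedulable {
    global void execute(SchedulableContext ctx) {
        // Accounts last touched before the start of the trailing 12-month window.
        Integer staleCount = [
            SELECT COUNT() FROM Account
            WHERE LastModifiedDate < LAST_N_MONTHS:12
        ];
        // Replace the debug line with an email or Chatter post to the data steward.
        System.debug('Accounts untouched for 12+ months: ' + staleCount);
    }
}
// Schedule weekly, e.g.:
// System.schedule('Stale account check', '0 0 6 ? * MON', new StaleAccountCheck());
```
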
6. Tooling:
- DemandTools (Validity) — bulk dedup, mass update and cleanup actions.
- Apsona — admin power tools.
- Custom Apex scripts for one-off cleanups (see the Batch Apex sketch below).
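
For a one-off cleanup that touches more rows than anonymous Apex comfortably handles, Batch Apex is the usual shape. A sketch that normalises a few common country spellings; the field and the mapping are illustrative assumptions:

```
global class CountryCleanupBatch implements Database.Batchable<SObject> {
    // Illustrative mapping; a real cleanse would drive this from a full reference list.
    private static final Map<String, String> FIXES = new Map<String, String>{
        'usa' => 'United States', 'us' => 'United States', 'uk' => 'United Kingdom'
    };

    global Database.QueryLocator start(Database.BatchableContext bc) {
        return Database.getQueryLocator(
            'SELECT Id, MailingCountry FROM Contact WHERE MailingCountry != null');
    }

    global void execute(Database.BatchableContext bc, List<SObject> scope) {
        List<Contact> toUpdate = new List<Contact>();
        for (Contact c : (List<Contact>) scope) {
            String key = c.MailingCountry.trim().toLowerCase();
            if (FIXES.containsKey(key)) {
                c.MailingCountry = FIXES.get(key);
                toUpdate.add(c);
            }
        }
        update toUpdate;
    }

    global void finish(Database.BatchableContext bc) {}
}
// Run with: Database.executeBatch(new CountryCleanupBatch(), 200);
```

Batching keeps each transaction inside governor limits; pairing the cleanup with the prevention measures above (picklists, validation rules) stops the same drift from recurring.
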
7. Governance:
- Data steward role — owns data quality across the org.
- Field ownership — each field has a named owner who approves changes to it.
- Change requests — adding/removing fields goes through review.
Common pitfalls:
- Treating data quality as one-time — it degrades immediately without an ongoing process.
- No metrics — "we have a data quality problem" is qualitative; track quantitatively.
- Cleansing without prevention — cleaning the same dupes monthly without fixing the source.
Senior consultants build the data quality flywheel into the implementation, not bolt it on later.
