Salesforce Dictionary

Salesforce QA / Tester

medium

How do you anonymise production data for non-production use?

Copying production data into sandboxes creates compliance and privacy risk. Anonymise it before anyone uses it.

What needs anonymising:

PII: names, emails, phones, addresses, SSN.
PCI: payment card data.
PHI: health records.
Other sensitive: salary, performance reviews, etc.

Approaches:

1. Mask values.

Replace each real value with a fake one in the same format:

"John Doe" -> "Anonymous-1"
"john@example.com" -> "user-1@example.com"
"555-123-4567" -> "555-000-0001"

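A minimal sketch of the masking idea, in Python, working on records already exported as dictionaries. The field names (`Name`, `Email`, `Phone`) and the helper `make_masker` are assumptions for illustration, not part of any Salesforce tool.

```python
from itertools import count


def make_masker():
    """Return a function that masks one record with sequential fake values."""
    seq = count(1)

    def mask(record):
        n = next(seq)
        # Copy the record and overwrite only the sensitive fields.
        return {
            **record,
            "Name": f"Anonymous-{n}",
            "Email": f"user-{n}@example.com",
            "Phone": f"555-000-{n:04d}",
        }

    return mask


mask = make_masker()
masked = mask({
    "Id": "003xx0000000001",  # hypothetical record Id, left untouched
    "Name": "John Doe",
    "Email": "john@example.com",
    "Phone": "555-123-4567",
})
```

The counter keeps each masked value unique, which matters when a field has a uniqueness or duplicate rule in the target org.
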
2. Hash values.

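Replace each value with a hash; the same input always gives the same output, so uniqueness and matching across records are preserved without revealing the real value. Use a keyed or salted hash: plain hashes of low-entropy values such as emails and phone numbers can be reversed by brute force.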
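
One way to sketch this in Python is a keyed hash (HMAC-SHA256) from the standard library. The function names and the normalisation step are assumptions for illustration; the key would need to be stored outside the sandbox.

```python
import hashlib
import hmac
import secrets


def pseudonymise(value: str, key: bytes) -> str:
    """Keyed hash of a value: same input and key always give the same digest."""
    normalised = value.strip().lower()  # treat "John@X.com" and "john@x.com" alike
    return hmac.new(key, normalised.encode("utf-8"), hashlib.sha256).hexdigest()


def hashed_email(email: str, key: bytes) -> str:
    """Build a syntactically valid email from the digest so field validation passes."""
    return f"{pseudonymise(email, key)[:16]}@example.com"


key = secrets.token_bytes(32)  # in practice, load a stored secret instead
first = hashed_email("john@example.com", key)
second = hashed_email("John@Example.com", key)  # same person, same output
```

Without the key, an attacker cannot simply hash a list of candidate emails and compare, which is the weakness of an unkeyed hash.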

3. Tokenise.

Replace each value with a random token; the token-to-value map is stored securely so values can be de-tokenised if needed (rare in test contexts).
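
A rough in-memory sketch of the idea in Python follows. The `TokenVault` class is hypothetical; a real vault would persist the map in a secured store outside the sandbox.

```python
import secrets


class TokenVault:
    """Maps real values to random tokens and back (in memory, illustration only)."""

    def __init__(self):
        self._token_by_value = {}
        self._value_by_token = {}

    def tokenise(self, value: str) -> str:
        # Reuse the existing token so the same value always maps to the same token.
        if value not in self._token_by_value:
            token = "tok_" + secrets.token_hex(8)
            self._token_by_value[value] = token
            self._value_by_token[token] = value
        return self._token_by_value[value]

    def detokenise(self, token: str) -> str:
        return self._value_by_token[token]


vault = TokenVault()
token = vault.tokenise("john@example.com")
```

Unlike a hash, the token is random and carries no information about the original value; only the vault can reverse it.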

4. Delete.

Remove fields entirely.

5. Anonymise in place.

Apply the transformations directly to the records already in the sandbox, rather than to a separate copy.

6. Sample.

Take a subset of production and anonymise only that subset.
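
As a sketch, sampling and anonymising can be combined in one step. The function below is hypothetical and samples flat records at random; real Salesforce data usually also needs parent and child records kept together, which this does not attempt.

```python
import random


def sample_and_anonymise(records, fraction, anonymise, seed=0):
    """Pick a random fraction of records and anonymise only the chosen ones."""
    rng = random.Random(seed)  # fixed seed so the same subset is picked each run
    k = min(len(records), max(1, round(len(records) * fraction)))
    chosen = rng.sample(records, k)
    return [anonymise(r) for r in chosen]


records = [{"Id": i, "Name": f"Person {i}"} for i in range(100)]
subset = sample_and_anonymise(
    records, 0.1, lambda r: {**r, "Name": "REDACTED"}
)
```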

Tools:

OwnBackup Sandbox Seeding — automated anonymisation.
Gearset Compare — selective deploy with masking.
Custom Apex for one-off anonymisation.
DataPrivacyManager — third-party.

Process:

Identify sensitive fields — data classification.
Define anonymisation rules per field.
Apply during sandbox refresh or post-refresh.
Verify — spot-check that no PII remains.
Document for compliance.
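
The rule-definition and verification steps above can be sketched in Python as a per-field rule table plus a simple leak scan. The field names (including the custom field `SSN__c`), the rule table, and the regular expressions are all assumptions for illustration; a real check would cover more patterns.

```python
import re

# One rule per sensitive field: (original value, row number) -> replacement.
RULES = {
    "Name": lambda v, n: f"Anonymous-{n}",
    "Email": lambda v, n: f"user-{n}@example.com",
    "Phone": lambda v, n: f"555-000-{n:04d}",
    "SSN__c": lambda v, n: None,  # hypothetical custom field: delete outright
}

# Any email outside example.com, or anything shaped like a US SSN, is a leak.
EMAIL_RE = re.compile(r"[\w.+-]+@(?!example\.com\b)[\w-]+\.[\w.-]+")
SSN_RE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


def apply_rules(records, rules=RULES):
    """Apply each field's rule; fields without a rule pass through unchanged."""
    return [
        {f: rules[f](v, n) if f in rules else v for f, v in rec.items()}
        for n, rec in enumerate(records, start=1)
    ]


def find_leaks(records):
    """Return (row index, field) pairs whose text still looks like PII."""
    return [
        (i, field)
        for i, rec in enumerate(records)
        for field, value in rec.items()
        if isinstance(value, str)
        and (EMAIL_RE.search(value) or SSN_RE.search(value))
    ]


raw = [{
    "Name": "John Doe",
    "Email": "john@acme.org",
    "Phone": "555-123-4567",
    "SSN__c": "123-45-6789",
    "Description": "Call john@acme.org after 5pm",
}]
leaks = find_leaks(apply_rules(raw))
```

Here the scan flags the free-text `Description` field, which had no rule. Free-text fields are a common place for PII to survive field-level masking.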

Common pitfalls:

Production data in sandbox unmasked — compliance violation.
Inadequate anonymisation — reverse-engineerable.
Field mismatch — anonymise some fields, leave others sensitive.

Senior QA insight: anonymisation is mandatory in regulated industries. Test environments shouldn't carry PII.

The senior framing: trust no one with production data; anonymise.

Why this answer works

It reads as a senior answer: the catalogue of approaches and the "trust no one" framing show maturity.

Follow-ups to expect

What is OwnBackup Sandbox Seeding?
How do you verify anonymisation worked?
When is sample-based anonymisation acceptable?

Related dictionary terms