Production data in sandboxes is risky (compliance, privacy). Anonymise before use.
What needs anonymising:
- PII: names, emails, phones, addresses, SSN.
- PCI: payment card data.
- PHI: health records.
- Other sensitive: salary, performance reviews, etc.
Approaches:
1. Mask values.
Replace with fake values that keep the original format:
- "John Doe" -> "Anonymous-1"
- "john@example.com" -> "user-1@example.com"
- "555-123-4567" -> "555-000-0001"
2. Hash values.
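Replace with hash; same input -> same output. Preserves uniqueness and joins between records without storing real values. Use a keyed hash (e.g. HMAC with a secret key): a plain hash of low-entropy values such as phone numbers or SSNs can be reversed by brute force.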
3. Tokenise.
Replace with a random token; the token-to-value map is stored securely, outside the sandbox, for de-tokenisation if needed (rare in test contexts).
4. Delete.
Remove fields entirely.
5. Anonymise in place.
Apply the transformations to records already in the sandbox, rather than to a copy made before load.
6. Sample.
Take subset of production; anonymise only that subset.
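A minimal Python sketch of approaches 1 to 3, for illustration only. The function names, the `Tokeniser` class, and the `HASH_KEY` placeholder are assumptions made for this example and do not come from any of the tools listed below; a real key would come from a secrets manager.

```python
import hashlib
import hmac
import secrets

# Placeholder key for the example; never hard-code a real secret.
HASH_KEY = b"replace-with-a-real-secret"


def mask_email(index: int) -> str:
    """Approach 1: sequential, format-valid placeholder email."""
    return f"user-{index}@example.com"


def mask_phone(index: int) -> str:
    """Approach 1: sequential placeholder phone number."""
    return f"555-000-{index:04d}"


def hash_value(value: str, key: bytes = HASH_KEY) -> str:
    """Approach 2: keyed hash (HMAC-SHA256); same input and key -> same output."""
    normalised = value.strip().lower()
    return hmac.new(key, normalised.encode("utf-8"), hashlib.sha256).hexdigest()


class Tokeniser:
    """Approach 3: random tokens, with the mapping kept for de-tokenisation."""

    def __init__(self):
        self._token_by_value = {}
        self._value_by_token = {}

    def tokenise(self, value: str) -> str:
        # Reuse the existing token so the same value always maps to one token.
        if value not in self._token_by_value:
            token = "tok_" + secrets.token_hex(8)
            self._token_by_value[value] = token
            self._value_by_token[token] = value
        return self._token_by_value[value]

    def detokenise(self, token: str) -> str:
        return self._value_by_token[token]


print(mask_email(1))   # user-1@example.com
print(mask_phone(1))   # 555-000-0001

tokeniser = Tokeniser()
token = tokeniser.tokenise("John Doe")
print(tokeniser.detokenise(token))  # John Doe
```

Masking discards the original value, hashing keeps a stable stand-in that cannot be looked up without the key, and tokenising is the only one of the three that is reversible by design, which is why the map has to be protected as carefully as the original data.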
Tools:
- OwnBackup Sandbox Seeding — automated anonymisation.
- Gearset data deployment: selective data deploy with masking.
- Custom Apex for one-off anonymisation.
- DataPrivacyManager — third-party.
Process:
- Identify sensitive fields — data classification.
- Define anonymisation rules per field.
- Apply during sandbox refresh or post-refresh.
- Verify: spot-check records and scan free-text fields for patterns (emails, phone numbers, SSNs) to confirm no PII remains.
- Document for compliance.
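The process can be sketched in Python as a per-field rule table, an apply step, and a verification scan. This is a rough illustration under stated assumptions: the field names (`SSN__c`, `Notes__c`), the `RULES` table, and the regular expressions are hypothetical, and the patterns are deliberately simple rather than exhaustive.

```python
import re

# Hypothetical classification: field name -> rule. None means "delete the field".
RULES = {
    "Name": lambda value, i: f"Anonymous-{i}",
    "Email": lambda value, i: f"user-{i}@example.com",
    "Phone": lambda value, i: f"555-000-{i:04d}",
    "SSN__c": None,
}


def anonymise(records):
    """Apply the per-field rules; fields without a rule pass through unchanged."""
    cleaned = []
    for i, record in enumerate(records, start=1):
        out = {}
        for field, value in record.items():
            if field not in RULES:
                out[field] = value
            elif RULES[field] is not None:
                out[field] = RULES[field](value, i)
            # Fields whose rule is None are dropped.
        cleaned.append(out)
    return cleaned


EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")
SSN_RE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


def find_leaks(records, allowed_email_domain="example.com"):
    """Verify step: flag values that still look like real emails or SSNs."""
    leaks = []
    for idx, record in enumerate(records):
        for field, value in record.items():
            text = str(value)
            for email in EMAIL_RE.findall(text):
                if not email.endswith("@" + allowed_email_domain):
                    leaks.append((idx, field, "email"))
            if SSN_RE.search(text):
                leaks.append((idx, field, "ssn"))
    return leaks


records = [{
    "Name": "John Doe",
    "Email": "john@acme.test",
    "Phone": "555-123-4567",
    "SSN__c": "123-45-6789",
    "Notes__c": "Customer asked to be contacted at john@acme.test",
}]

clean = anonymise(records)
print(find_leaks(clean))  # [(0, 'Notes__c', 'email')]
```

The scan flags the free-text field because no rule covered it, which is the kind of gap a manual spot-check can miss and a reason to keep the rule table alongside the compliance documentation.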
Common pitfalls:
- Production data in sandbox unmasked — compliance violation.
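- Inadequate anonymisation: values can be reverse-engineered (e.g. unkeyed hashes of phone numbers), or records re-identified from a combination of unmasked fields.
- Field mismatch: some fields are anonymised while others (free-text notes, custom fields) still hold sensitive data.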
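To make the "reverse-engineerable" pitfall concrete, here is a small Python sketch showing how an unkeyed hash of a phone number can be recovered by enumerating candidates. The known prefix, the helper names, and the key are assumptions for the example; a real attacker would simply enumerate a larger space.

```python
import hashlib
import hmac


def plain_hash(value: str) -> str:
    """Unkeyed SHA-256: deterministic and reproducible by anyone."""
    return hashlib.sha256(value.encode("utf-8")).hexdigest()


def keyed_hash(value: str, key: bytes) -> str:
    """HMAC-SHA256: cannot be reproduced without the key."""
    return hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()


def recover_phone(target_hash: str, prefix: str = "555-123-"):
    """Try every 4-digit suffix; return the matching number, or None."""
    for n in range(10_000):
        candidate = f"{prefix}{n:04d}"
        if plain_hash(candidate) == target_hash:
            return candidate
    return None


leaked = plain_hash("555-123-4567")
print(recover_phone(leaked))  # 555-123-4567

protected = keyed_hash("555-123-4567", b"hypothetical-secret-key")
print(recover_phone(protected))  # None
```

The same enumeration works against any small, structured value space (SSNs, dates of birth, postcodes), so hashing alone is only adequate when the key stays outside the sandbox.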
Senior QA insight: in regulated industries anonymisation is mandatory, not optional; test environments shouldn't carry PII. Treat every copy of production data as sensitive and anonymise it before anyone works with it.
