SQL Patterns for GDPR Compliance

Defensive SQL, data validation, null handling, and compliance-safe joins. Here are all the patterns I rely on to safely segment email lists in highly regulated industries.

Defensive SQL in Practice

To stay compliant with regulations (GDPR, CASL, Bill 64, you name it), means you can’t just run queries on a whim. A missing condition can mean someone receives an email they shouldn’t, or worse, legal headaches. So I’ve built a few reliable patterns.

1. Null Handling

Always check for NULLs. Simple, but so easy to forget. I usually do:

SELECT
  COALESCE(email, '') AS email_address,
  COALESCE(first_name, 'Valued Customer') AS first_name
FROM contacts
WHERE status = 'active';
  

Ensures no errors in joins or personalization tokens in SFMC.

2. Compliance-Safe Joins

Never join on tables that might have unverified opt-ins. I usually pre-filter the base table:

WITH verified_contacts AS (
  SELECT id, email
  FROM contacts
  WHERE consent_flag = 1
)
SELECT *
FROM verified_contacts vc
JOIN orders o ON vc.id = o.contact_id;
  

This way, any downstream workflow can’t accidentally include unsubscribed or unconsented users.

3. Gradual Segmentation

Even with a few hundred thousand rows, I like to step through segments:

-- Step 1: 5% sample
SELECT * FROM verified_contacts
WHERE MOD(id, 20) = 0;

-- Step 2: 25% sample
SELECT * FROM verified_contacts
WHERE MOD(id, 4) = 0;
  

Gives a chance to catch any edge cases before blasting the full audience. If anything misfires, you’ve only exposed a tiny subset.

4. Validation Queries

Before a send, I run checks:


This mirrors the same defensive mindset I use in Tealium/mParticle pipelines. Basically treat SQL like a safety net, not just a query engine.

5. Cross-Referencing with Workflows

Segment outputs feed SFMC workflows. I map each SQL output to a send classification: transactional, triggered, or nurture. This reduces human error.

Rough sketch:

[Raw Contacts Table] 
      |
      v
[Filtered / Verified Contacts] -- COALESCE & Null checks
      |
      v
[Segmented Lists] -- step-wise sample / percentage
      |
      v
[SFMC Triggered Send / Workflow]
  

These patterns have kept email sends clean at scale, protected deliverability, and avoided any regulatory mistakes.

Also see my Tealium vs mParticle post for how these defensive principles apply upstream in CDPs before the data even reaches SFMC.