Enrichment Data Testing Guide

May 5, 2025

Purpose

Welcome! We’re excited that you are testing People Data Labs (PDL) to see how well it fits your core workflow. A well‑designed test establishes a baseline match‑rate expectation, highlights blind spots in your sample, and prevents last‑minute surprises when you move to production. Treat this as due diligence: verify fit, quantify coverage for your own data, and decide early whether you need to adjust inputs, field bundles, or downstream logic.

Pre‑test Checklist

Create an API key in the API Dashboard
Put together a random, representative sample that is large enough to be statistically significant
Ensure each row contains at least one high‑confidence identifier, such as an email, LinkedIn URL, or website
Ensure you aren’t using role‑based or catch‑all inboxes, e.g. info@acmeco.com
Deduplicate the file so every record counts once
Conduct standard data cleaning, e.g. fix obvious typos and last names entered in the first name field
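
The checklist items on identifiers, role‑based inboxes, and deduplication can be sketched in a few lines of Python. This is a minimal illustration, not an official PDL tool: the column names email and linkedin_url and the ROLE_PREFIXES list are assumptions you should adapt to your own file.

```python
# Illustrative, non-exhaustive list of role-based local parts (assumption).
ROLE_PREFIXES = {"info", "sales", "support", "admin", "contact", "hello"}


def is_role_based(email: str) -> bool:
    """Return True when the part before '@' looks like a shared inbox."""
    local_part = email.split("@", 1)[0].strip().lower()
    return local_part in ROLE_PREFIXES


def prepare_sample(rows):
    """Normalize identifiers, drop weak rows, and deduplicate.

    `rows` is a list of dicts with hypothetical keys 'email' and
    'linkedin_url'. A row is kept only if it still has at least one
    strong identifier after role-based emails are discarded.
    """
    seen = set()
    cleaned = []
    for row in rows:
        email = (row.get("email") or "").strip().lower()
        linkedin = (row.get("linkedin_url") or "").strip().lower().rstrip("/")
        if email and is_role_based(email):
            email = ""  # a role-based inbox is not a person-level identifier
        if not email and not linkedin:
            continue  # no high-confidence identifier left
        key = email or linkedin  # dedupe on the strongest identifier present
        if key in seen:
            continue
        seen.add(key)
        cleaned.append({**row, "email": email, "linkedin_url": linkedin})
    return cleaned
```

Deduplicating on a single normalized identifier is a deliberate simplification; if your file mixes identifiers for the same person, a stricter merge step may be needed.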

Best practices

Check out the API Dashboard Quickstart guide to get up and running quickly
Once you have a solid grasp of the API Dashboard, check out the API Quickstart Guide to start sending your first API calls
Your sample should be drawn from your typical live production sources rather than demo records for more accurate results
Some inputs, such as LinkedIn URL, give the highest probability of returning a match. Use them where possible
The data doesn’t need to be perfectly clean, since our algorithm will do some of the work for you, but misspelled company names or mismatched first and last names will not typically return a result
Verify inbox validity before assuming the dataset is the culprit. Regulatory removals under GDPR or CCPA reduce EU and California coverage, so adjust expectations accordingly

Sample Size

Statistical confidence depends on sample size (n) and on the diversity of the records chosen for the test. Use the table below to pick a minimum record count. Narrow, homogeneous lists require a larger n because correlated attributes (e.g., all employees from one startup) inflate variance and distort averages. Each time you filter by a new variable, such as country or seniority, double the sample again to maintain confidence.
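
As a rough cross‑check on whatever minimum you pick, the sketch below uses the standard sample‑size formula for estimating a proportion, n = z² · p(1 − p) / e². This is a general statistics formula, not PDL's own guidance, and the defaults (95% confidence, ±5% margin, worst‑case p = 0.5) are assumptions. The second function applies the doubling rule from the paragraph above.

```python
import math


def min_sample_size(margin_of_error=0.05, z=1.96, expected_rate=0.5):
    """Minimum n to estimate a match rate within +/- margin_of_error.

    Standard proportion formula; z=1.96 corresponds to 95% confidence and
    expected_rate=0.5 is the most conservative (largest n) assumption.
    """
    n = (z ** 2) * expected_rate * (1 - expected_rate) / (margin_of_error ** 2)
    return math.ceil(n)


def scaled_sample_size(base_n, num_filters):
    """Double the sample once per filter variable (country, seniority, ...)."""
    return base_n * 2 ** num_filters


base = min_sample_size()              # 385 with the default assumptions
print(scaled_sample_size(base, 2))    # two filter variables -> 1540
```

The formula assumes independent records, so treat its output as a floor for homogeneous lists rather than a target.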

“Range of Normal” Match Rates*

LinkedIn URLs → 95‑100%
B2B contact enrichment with valid work email → 40‑70%
Consumer/social enrichment with fresh personal email → 60‑85%
Niche segments (stealth, early‑stage, non‑US) → 15‑40%

*Ranges are highly dependent on sample size and use case

Evaluate Results

When responses return, filter the records where status equals 200 to get your total matches. You can compute your overall match rate by dividing your total matches by total inputs.

Match rate = (Total matches / Total inputs) × 100

(475 / 500) × 100 = 95% match rate
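
The same calculation can be sketched in Python. The example assumes you have saved each API response as a dict with a status key; the non‑200 status used for unmatched records in the usage lines is only an illustration.

```python
def match_rate(responses):
    """Percentage of responses whose status equals 200.

    `responses` is a list of dicts, one per input record, each with a
    'status' key (assumed storage format for this sketch).
    """
    total_inputs = len(responses)
    if total_inputs == 0:
        return 0.0
    total_matches = sum(1 for r in responses if r.get("status") == 200)
    return total_matches * 100 / total_inputs


# Usage: 475 matches out of 500 inputs.
sample = [{"status": 200}] * 475 + [{"status": 404}] * 25
print(f"{match_rate(sample):.1f}% match rate")
```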

Segment the results by the dimensions that matter to your business (e.g. industry, company size, region, seniority) to expose coverage dips that a blended strategy or an extra identifier might fix. Compare each segment to the “Range of Normal” above; if numbers fall outside the expected band, review the “Lift Actions” section below.
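
A minimal sketch of that segmentation, assuming each result record carries its response status alongside your own segment attributes (the region key and the band limits below are hypothetical examples, not fixed values):

```python
from collections import defaultdict


def match_rate_by_segment(records, segment_key):
    """Return {segment: match rate in %} for one dimension, e.g. 'region'."""
    totals = defaultdict(int)
    matches = defaultdict(int)
    for rec in records:
        segment = rec.get(segment_key) or "unknown"
        totals[segment] += 1
        if rec.get("status") == 200:
            matches[segment] += 1
    return {seg: matches[seg] * 100 / totals[seg] for seg in totals}


def outside_band(rates, low, high):
    """Keep only the segments whose rate falls outside [low, high]."""
    return {seg: rate for seg, rate in rates.items() if rate < low or rate > high}


# Usage with made-up records: compare each region to a 60-85% band.
records = [
    {"region": "US", "status": 200},
    {"region": "US", "status": 200},
    {"region": "US", "status": 200},
    {"region": "US", "status": 404},
    {"region": "EU", "status": 200},
    {"region": "EU", "status": 404},
]
rates = match_rate_by_segment(records, "region")
print(outside_band(rates, low=60, high=85))
```

Small segments produce noisy rates, so check the per‑segment counts before acting on a dip.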

It’s important to understand field fill rates across our dataset so you can set reasonable expectations about what a match will contain. For example, as of the April 2025 release, only 7.5% of profiles with at least a LinkedIn URL also have a mobile_phone. Expecting 80% of your matches to return a mobile_phone would therefore be unreasonable, because that field doesn’t exist for most of the dataset.
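
To compare your own results against a published fill rate, you can measure how often a field is populated among your matches. The sketch below assumes each matched profile is a dict and treats null, empty, or false values as unfilled; the sample profiles are made up.

```python
def field_fill_rate(matched_profiles, field):
    """Percentage of matched profiles with a non-empty value for `field`."""
    if not matched_profiles:
        return 0.0
    filled = sum(1 for profile in matched_profiles if profile.get(field))
    return filled * 100 / len(matched_profiles)


# Usage with made-up profiles: one of four has a mobile_phone.
profiles = [
    {"full_name": "a", "mobile_phone": "+15555550100"},
    {"full_name": "b", "mobile_phone": None},
    {"full_name": "c"},
    {"full_name": "d", "mobile_phone": ""},
]
print(f"{field_fill_rate(profiles, 'mobile_phone'):.1f}% fill rate")
```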

Lift Actions if Coverage Feels Low

Enrich each record with a second strong identifier such as LinkedIn URL or a sanitized company domain if available; additional signals typically boost matches by several points.
Standardize company domains (acme-inc.com → acme.com)
Remove accents or special characters from names
Ensure emails are active inboxes rather than stale aliases
Use the required parameter to help surface a better match
Set a reasonable min_likelihood score
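
Two of these actions, removing accents from names and setting required and min_likelihood, can be sketched as a small request‑building helper. The helper only assembles a parameter dict and makes no API call. The input keys (email, profile, company, first_name, last_name) and the default min_likelihood of 6 are assumptions for illustration; check parameter names and sensible values against the current API reference before use.

```python
import unicodedata


def strip_accents(text):
    """Remove combining accent marks, e.g. 'José' -> 'Jose'.

    Letters without a decomposed form (such as 'ø') are left unchanged.
    """
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def build_enrich_params(record, min_likelihood=6, required=None):
    """Assemble query parameters for one enrichment request.

    `record` is one cleaned input row; the keys read here are assumed
    names. The default min_likelihood is illustrative, not a recommendation.
    """
    params = {"min_likelihood": min_likelihood}
    for key in ("email", "profile", "company"):
        value = (record.get(key) or "").strip()
        if value:
            params[key] = value
    for key in ("first_name", "last_name"):
        value = strip_accents((record.get(key) or "").strip())
        if value:
            params[key] = value
    if required:
        params["required"] = required  # e.g. a field the match must contain
    return params
```

Raising min_likelihood or adding required fields will usually lower the raw match rate while improving the quality of what does match, so re‑run the match‑rate calculation after each change.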

If enrichment still underperforms, run the list through /person/search, which trades precision for broader recall and can sometimes surface otherwise hidden profiles.
