Batch Enrichment: Keep Your CRM's Dutch Company Data Up to Date
How to use KVKBase's batch lookup API to enrich and verify Dutch company records in your CRM at scale.
Company data decays faster than most teams realize. Businesses move offices, change trade names, merge with other entities, or shut down entirely. If your CRM contains Dutch company records, a significant percentage of them are probably outdated right now.
The fix is not to hire someone to manually check each record. It is to run batch enrichment against the KVK registry on a regular schedule.
The Cost of Stale Data
Bad company data is not just an annoyance — it has real business impact:
- Sales teams waste time calling companies that have moved or closed
- Marketing campaigns bounce when sent to outdated addresses
- Invoices get rejected when the legal name or address does not match the official registry
- Compliance audits fail when your records do not reflect the current state of your business relationships
A study by Gartner estimated that poor data quality costs organizations an average of $12.9 million per year. Even if your costs are a fraction of that, the ROI on automated data enrichment is hard to argue with.
What Batch Enrichment Looks Like
The process is straightforward:
- Export your Dutch company records (with KVK numbers) from your CRM
- For each record, call the KVKBase API to get the current data
- Compare the API response with your stored data
- Update any records where the data has changed
- Flag records where the company is no longer active
You can run this as a one-time cleanup or schedule it to run weekly or monthly.
Python Implementation
Here is a complete script that reads a CSV of company records, enriches them via the KVKBase API, and writes the results to a new CSV:
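import csv
import os
import time

import requests

# Read the key from the environment so it never lands in version control.
API_KEY = os.environ.get("KVKBASE_API_KEY", "your-api-key")
BASE_URL = "https://api.kvkbase.nl/api/v1"
MAX_RETRIES = 5  # cap on 429 retries so a persistent rate limit cannot loop forever


def lookup_company(kvk_number, retries=MAX_RETRIES):
    """Look up a single company by KVK number."""
    try:
        response = requests.get(
            f"{BASE_URL}/lookup/{kvk_number}",
            headers={"Authorization": f"Bearer {API_KEY}"},
            timeout=10,  # do not hang the whole batch on one stalled request
        )
    except requests.RequestException:
        return {"error": "request_failed"}
    if response.status_code == 200:
        return response.json()
    elif response.status_code == 404:
        return {"error": "not_found"}
    elif response.status_code == 429:
        if retries <= 0:
            return {"error": "rate_limited"}
        # Rate limited -- wait and retry (fall back to 60s if the header is not a number)
        retry_after = response.headers.get("Retry-After", "60")
        time.sleep(int(retry_after) if retry_after.isdigit() else 60)
        return lookup_company(kvk_number, retries - 1)
    else:
        return {"error": f"status_{response.status_code}"}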
def enrich_csv(input_file, output_file):
    """Read a CSV, enrich each row, write results."""
    with open(input_file, "r", newline="", encoding="utf-8") as infile, \
            open(output_file, "w", newline="", encoding="utf-8") as outfile:
        reader = csv.DictReader(infile)
        fieldnames = list(reader.fieldnames or []) + [
            "api_trade_name",
            "api_address",
            "api_postal_code",
            "api_city",
            "api_legal_form",
            "api_is_active",
            "data_changed",
        ]
        writer = csv.DictWriter(outfile, fieldnames=fieldnames)
        writer.writeheader()

        for i, row in enumerate(reader):
            # Progress
            if i and i % 100 == 0:
                print(f"Processed {i} records")

            # A KVK number is exactly 8 digits
            kvk = (row.get("kvk_number") or "").strip()
            if len(kvk) != 8 or not kvk.isdigit():
                row["data_changed"] = "invalid_kvk"
                writer.writerow(row)
                continue

            result = lookup_company(kvk)
            # Rate limiting: small delay after every request, including failed ones
            time.sleep(0.1)

            if "error" in result:
                row["data_changed"] = result["error"]
                writer.writerow(row)
                continue

            # Map API fields
            addr = result.get("address") or {}
            row["api_trade_name"] = result.get("tradeName", "")
            row["api_address"] = f"{addr.get('street', '')} {addr.get('houseNumber', '')}".strip()
            row["api_postal_code"] = addr.get("postalCode", "")
            row["api_city"] = addr.get("city", "")
            row["api_legal_form"] = result.get("legalForm", "")
            row["api_is_active"] = str(result.get("isActive", ""))

            # Check if anything changed
            changed = (
                row.get("company_name", "") != result.get("tradeName", "") or
                row.get("city", "") != addr.get("city", "")
            )
            row["data_changed"] = "yes" if changed else "no"
            writer.writerow(row)

    print(f"Done. Results written to {output_file}")


# Usage
if __name__ == "__main__":
    enrich_csv("companies.csv", "companies_enriched.csv")
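Your input CSV needs at least a kvk_number column. The change check also reads the company_name and city columns, so include them if you want the data_changed flag to be meaningful. The script adds columns for the API data and a data_changed flag so you can quickly filter for records that need updating.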
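The script compares strings exactly, so a difference in capitalization or spacing alone is reported as a change. A minimal sketch of a more forgiving comparison is shown here. The helper names (normalize, field_changed, changed_fields) are illustrative and not part of the KVKBase API. The response fields tradeName and address.city are taken from the script above.

```python
import unicodedata


def normalize(value):
    """Return a comparison key: Unicode-normalized, whitespace-collapsed, case-folded."""
    text = unicodedata.normalize("NFKC", str(value or ""))
    return " ".join(text.split()).casefold()


def field_changed(stored, current):
    """True only when the two values still differ after normalization."""
    return normalize(stored) != normalize(current)


def changed_fields(row, result):
    """List the CRM columns whose value differs from the API result.

    Assumes the CRM export uses company_name and city columns,
    as in the script above.
    """
    addr = result.get("address") or {}
    current_values = {
        "company_name": result.get("tradeName", ""),
        "city": addr.get("city", ""),
    }
    return [
        name
        for name, current in current_values.items()
        if field_changed(row.get(name, ""), current)
    ]
```

Returning the list of changed columns, rather than a single yes/no flag, also tells whoever reviews the output which field to look at.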
JavaScript / Node.js Implementation
If you prefer JavaScript, here is the equivalent:
// Requires Node.js 18+ for the built-in fetch
const fs = require('fs');
const { parse } = require('csv-parse/sync');
const { stringify } = require('csv-stringify/sync');

const API_KEY = process.env.KVKBASE_API_KEY;
const BASE_URL = 'https://api.kvkbase.nl/api/v1';
const MAX_RETRIES = 5; // cap on 429 retries so a persistent rate limit cannot loop forever

async function lookupCompany(kvkNumber, retries = MAX_RETRIES) {
  const response = await fetch(`${BASE_URL}/lookup/${kvkNumber}`, {
    headers: { 'Authorization': `Bearer ${API_KEY}` }
  });
  if (response.status === 200) return response.json();
  if (response.status === 404) return { error: 'not_found' };
  if (response.status === 429) {
    if (retries <= 0) return { error: 'rate_limited' };
    // Retry-After may be missing or not a number; fall back to 60 seconds
    const retryAfter = parseInt(response.headers.get('Retry-After'), 10) || 60;
    await new Promise(r => setTimeout(r, retryAfter * 1000));
    return lookupCompany(kvkNumber, retries - 1);
  }
  return { error: `status_${response.status}` };
}
// Every output row carries the same columns, so the CSV header is complete
// even when the first record fails validation or lookup.
const EMPTY_API_FIELDS = {
  api_trade_name: '',
  api_address: '',
  api_postal_code: '',
  api_city: '',
  api_legal_form: '',
  api_is_active: '',
  data_changed: ''
};

async function enrichCsv(inputFile, outputFile) {
  const input = fs.readFileSync(inputFile, 'utf-8');
  const records = parse(input, { columns: true });
  const results = [];

  for (let i = 0; i < records.length; i++) {
    const row = records[i];
    const kvk = (row.kvk_number || '').trim();

    // A KVK number is exactly 8 digits
    if (!/^\d{8}$/.test(kvk)) {
      results.push({ ...row, ...EMPTY_API_FIELDS, data_changed: 'invalid_kvk' });
      continue;
    }

    const result = await lookupCompany(kvk);
    if (result.error) {
      results.push({ ...row, ...EMPTY_API_FIELDS, data_changed: result.error });
    } else {
      const addr = result.address || {};
      const changed =
        (row.company_name || '') !== (result.tradeName || '') ||
        (row.city || '') !== (addr.city || '');
      results.push({
        ...row,
        api_trade_name: result.tradeName || '',
        api_address: `${addr.street || ''} ${addr.houseNumber || ''}`.trim(),
        api_postal_code: addr.postalCode || '',
        api_city: addr.city || '',
        api_legal_form: result.legalForm || '',
        api_is_active: String(result.isActive ?? ''),
        data_changed: changed ? 'yes' : 'no'
      });
    }

    if ((i + 1) % 100 === 0) console.log(`Processed ${i + 1} records`);
    // Rate limiting: small delay between requests
    await new Promise(r => setTimeout(r, 100));
  }

  fs.writeFileSync(outputFile, stringify(results, { header: true }));
  console.log(`Done. Results written to ${outputFile}`);
}

enrichCsv('companies.csv', 'companies_enriched.csv').catch(err => {
  console.error(err);
  process.exit(1);
});
Scheduling Regular Enrichment
A one-time cleanup is useful, but the real value comes from running enrichment on a schedule. Here are a few approaches:
Cron Job
The simplest option. Run the script weekly or monthly:
# Run every Sunday at 2 AM
0 2 * * 0 cd /path/to/scripts && python enrich.py >> /var/log/enrichment.log 2>&1
CI/CD Pipeline
If your team uses GitHub Actions or similar, you can run enrichment as a scheduled workflow:
name: CRM Enrichment

on:
  schedule:
    - cron: '0 2 * * 0'  # Weekly on Sunday at 2 AM UTC
  workflow_dispatch:     # Allow manual trigger

jobs:
  enrich:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: '3.12'
      - run: pip install requests
      - run: python scripts/enrich.py
        env:
          KVKBASE_API_KEY: ${{ secrets.KVKBASE_API_KEY }}
CRM-Native Integration
Most CRMs support webhooks or automation rules. You can trigger a KVK lookup whenever a new Dutch company record is created, or run a bulk update on all records that have not been verified in the last 30 days.
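As a rough illustration of the "not verified in the last 30 days" rule, the sketch below filters exported records by a verification date. The last_verified column and its YYYY-MM-DD format are assumptions made for this example. Your CRM will have its own field name and date format.

```python
from datetime import date, datetime, timedelta


def needs_verification(row, today, max_age_days=30):
    """True when a record has no usable verification date or the date is too old."""
    raw = (row.get("last_verified") or "").strip()  # hypothetical column name
    if not raw:
        return True
    try:
        verified = datetime.strptime(raw, "%Y-%m-%d").date()
    except ValueError:
        # Unparseable dates are treated as never verified
        return True
    return today - verified > timedelta(days=max_age_days)


def select_stale(rows, today=None, max_age_days=30):
    """Return the records that should go into the next enrichment run."""
    today = today or date.today()
    return [row for row in rows if needs_verification(row, today, max_age_days)]
```

Passing only the stale records to the enrichment script keeps each scheduled run small and leaves more of your API quota for new records.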
Handling Edge Cases
A few situations to plan for:
- Missing KVK numbers. Some records might not have a KVK number at all. You can try searching by company name using the /search endpoint, but be prepared for ambiguous matches.
- Duplicate records. Two records with different data but the same KVK number are duplicates. Use enrichment as an opportunity to merge them.
- Inactive companies. When the API returns isActive: false, decide whether to flag the record, archive it, or notify the account owner.
- Rate limits. With large datasets (10,000+ records), spread your requests over time. The 100ms delay in the examples above caps throughput at about 600 lookups per minute (somewhat less once request latency is counted), which is a safe pace for most plans.
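Duplicate detection can run on the exported records before any API call is made. The sketch below groups rows by KVK number and assumes the kvk_number column used in the scripts above. The function name find_duplicate_kvk is illustrative.

```python
from collections import defaultdict


def find_duplicate_kvk(rows):
    """Map each KVK number that occurs more than once to the row indexes using it."""
    groups = defaultdict(list)
    for index, row in enumerate(rows):
        kvk = (row.get("kvk_number") or "").strip()
        if kvk:  # rows without a KVK number cannot be matched this way
            groups[kvk].append(index)
    return {kvk: indexes for kvk, indexes in groups.items() if len(indexes) > 1}
```

Looking up each KVK number once and applying the result to every row in its group also saves API calls on datasets with many duplicates.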
Measuring Results
After your first enrichment run, check these metrics:
- Records updated — how many had outdated data
- Inactive companies — how many were no longer active
- Missing data filled — how many records gained new fields (address, legal form, etc.)
- Duplicates found — how many KVK numbers appeared more than once
These numbers tell you how much value regular enrichment delivers and help you justify the investment to stakeholders.
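Most of these metrics can be read straight from the enriched CSV. The sketch below assumes the data_changed and api_is_active columns written by the scripts above. The function names are illustrative. It does not count "missing data filled", because the output does not record which fields were empty before the run.

```python
import csv
from collections import Counter


def summarize(rows):
    """Count updated, unchanged, failed, inactive, and duplicated records."""
    rows = list(rows)
    flags = Counter((row.get("data_changed") or "").strip() for row in rows)
    kvk_counts = Counter(
        kvk
        for kvk in ((row.get("kvk_number") or "").strip() for row in rows)
        if kvk
    )
    return {
        "total": len(rows),
        "updated": flags["yes"],
        "unchanged": flags["no"],
        # Anything other than yes/no is an error marker such as not_found
        "failed": len(rows) - flags["yes"] - flags["no"],
        # Python writes "False" and JavaScript writes "false", so compare case-insensitively
        "inactive": sum(
            1
            for row in rows
            if (row.get("api_is_active") or "").strip().lower() == "false"
        ),
        "duplicate_kvk_numbers": sum(1 for n in kvk_counts.values() if n > 1),
    }


def summarize_csv(path):
    """Summarize an enriched CSV file produced by the enrichment script."""
    with open(path, newline="", encoding="utf-8") as infile:
        return summarize(csv.DictReader(infile))
```

Running summarize_csv("companies_enriched.csv") after each scheduled run and logging the result gives you a trend line to show stakeholders.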
Getting Started
Sign up at kvkbase.nl for an API key. Start with a small test batch of 50-100 records to validate your pipeline before running it on your full database. The API documentation covers rate limits, response formats, and error codes in detail.