KVKBase

Batch Enrichment: Keep Your CRM's Dutch Company Data Up to Date

How to use KVKBase's batch lookup API to enrich and verify Dutch company records in your CRM at scale.

Tags: batch, crm, enrichment, api

Company data decays faster than most teams realize. Businesses move offices, change trade names, merge with other entities, or shut down entirely. If your CRM contains Dutch company records, a significant percentage of them are probably outdated right now.

The fix is not to hire someone to manually check each record. It is to run batch enrichment against the KVK registry on a regular schedule.

The Cost of Stale Data

Bad company data is not just an annoyance — it has real business impact:

  • Sales teams waste time calling companies that have moved or closed
  • Marketing campaigns bounce when sent to outdated addresses
  • Invoices get rejected when the legal name or address does not match the official registry
  • Compliance audits fail when your records do not reflect the current state of your business relationships

A study by Gartner estimated that poor data quality costs organizations an average of $12.9 million per year. Even if your costs are a fraction of that, the ROI on automated data enrichment is hard to argue with.

What Batch Enrichment Looks Like

The process is straightforward:

  1. Export your Dutch company records (with KVK numbers) from your CRM
  2. For each record, call the KVKBase API to get the current data
  3. Compare the API response with your stored data
  4. Update any records where the data has changed
  5. Flag records where the company is no longer active

You can run this as a one-time cleanup or schedule it to run weekly or monthly.
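
Steps 3 to 5 come down to a small decision per record. The function below is a minimal sketch of that decision, assuming the response fields used later in this article (tradeName, address.city, isActive) and CRM columns named company_name and city. The function name and the case-insensitive comparison are illustrative choices, not part of the KVKBase API.

```python
def classify_record(stored, api):
    """Return 'inactive', 'changed', or 'unchanged' for one CRM record."""

    def norm(value):
        # Ignore case and surrounding whitespace when comparing
        return (value or "").strip().casefold()

    # Step 5: flag companies the registry no longer lists as active
    if api.get("isActive") is False:
        return "inactive"

    # Steps 3 and 4: compare stored fields with the registry values
    api_city = (api.get("address") or {}).get("city")
    if (norm(stored.get("company_name")) != norm(api.get("tradeName"))
            or norm(stored.get("city")) != norm(api_city)):
        return "changed"
    return "unchanged"


stored = {"company_name": "Acme BV", "city": "Utrecht"}
api = {"tradeName": "Acme BV", "address": {"city": "Amsterdam"}, "isActive": True}
print(classify_record(stored, api))
```

Checking the active flag first means a closed company is reported as inactive even if its name or address also changed, which is usually the more urgent signal.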

Python Implementation

Here is a complete script that reads a CSV of company records, enriches them via the KVKBase API, and writes the results to a new CSV:

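import csv
import os
import time
import requests

# Read the key from the environment so it stays out of source control
API_KEY = os.environ.get("KVKBASE_API_KEY", "your-api-key")
BASE_URL = "https://api.kvkbase.nl/api/v1"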

def lookup_company(kvk_number, max_retries=3):
    """Look up a single company by KVK number."""
    for _ in range(max_retries + 1):
        try:
            response = requests.get(
                f"{BASE_URL}/lookup/{kvk_number}",
                headers={"Authorization": f"Bearer {API_KEY}"},
                timeout=30,
            )
        except requests.RequestException:
            # Network error or timeout: report it instead of crashing the batch
            return {"error": "request_failed"}

        if response.status_code == 200:
            return response.json()
        if response.status_code == 404:
            return {"error": "not_found"}
        if response.status_code != 429:
            return {"error": f"status_{response.status_code}"}

        # Rate limited -- wait and retry, but only a bounded number of times
        retry_after = int(response.headers.get("Retry-After", 60))
        time.sleep(retry_after)

    return {"error": "rate_limited"}

def enrich_csv(input_file, output_file):
    """Read a CSV, enrich each row, write results."""
    with open(input_file, "r", newline="", encoding="utf-8") as infile:
        reader = csv.DictReader(infile)
        fieldnames = reader.fieldnames + [
            "api_trade_name",
            "api_address",
            "api_postal_code",
            "api_city",
            "api_legal_form",
            "api_is_active",
            "data_changed"
        ]

        with open(output_file, "w", newline="", encoding="utf-8") as outfile:
            writer = csv.DictWriter(outfile, fieldnames=fieldnames)
            writer.writeheader()

            for i, row in enumerate(reader):
                kvk = row.get("kvk_number", "").strip()
                # A KVK number is exactly 8 digits
                if len(kvk) != 8 or not kvk.isdigit():
                    row["data_changed"] = "invalid_kvk"
                    writer.writerow(row)
                    continue

                result = lookup_company(kvk)

                if "error" in result:
                    row["data_changed"] = result["error"]
                    writer.writerow(row)
                    continue

                # Map API fields
                addr = result.get("address") or {}
                row["api_trade_name"] = result.get("tradeName", "")
                row["api_address"] = f"{addr.get('street', '')} {addr.get('houseNumber', '')}".strip()
                row["api_postal_code"] = addr.get("postalCode", "")
                row["api_city"] = addr.get("city", "")
                row["api_legal_form"] = result.get("legalForm", "")
                row["api_is_active"] = str(result.get("isActive", ""))

                # Check if anything changed
                changed = (
                    row.get("company_name", "") != result.get("tradeName", "") or
                    row.get("city", "") != addr.get("city", "")
                )
                row["data_changed"] = "yes" if changed else "no"

                writer.writerow(row)

                # Progress
                if (i + 1) % 100 == 0:
                    print(f"Processed {i + 1} records")

                # Rate limiting: small delay between requests
                time.sleep(0.1)

    print(f"Done. Results written to {output_file}")

# Usage
enrich_csv("companies.csv", "companies_enriched.csv")

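Your input CSV needs a kvk_number column. For a meaningful data_changed flag it also needs company_name and city columns, because those are the fields the script compares against the API response; without them, almost every record that is found will be flagged as changed. The script adds columns for the API data and a data_changed flag so you can quickly filter for records that need updating.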
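
Once the enriched file exists, pulling out the rows that need attention is a short filter. The sketch below uses an in-memory sample in place of companies_enriched.csv so it runs on its own; the function name and the sample rows are made up for illustration.

```python
import csv
import io


def rows_needing_update(csv_file):
    """Yield rows from an enriched CSV whose data_changed flag is 'yes'."""
    for row in csv.DictReader(csv_file):
        if row.get("data_changed") == "yes":
            yield row


# Stand-in for open("companies_enriched.csv", newline="")
sample = io.StringIO(
    "kvk_number,company_name,data_changed\n"
    "12345678,Acme BV,yes\n"
    "87654321,Globex BV,no\n"
    "123,Broken Entry,invalid_kvk\n"
)

changed = list(rows_needing_update(sample))
print([row["kvk_number"] for row in changed])
```

Rows with any other value in data_changed (for example invalid_kvk or not_found) are lookup failures rather than changes, so they are worth reviewing separately.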

JavaScript / Node.js Implementation

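If you prefer JavaScript, here is the equivalent. It relies on the global fetch available in Node.js 18 and later, plus the csv-parse and csv-stringify packages: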

const fs = require('fs');
const { parse } = require('csv-parse/sync');
const { stringify } = require('csv-stringify/sync');

const API_KEY = process.env.KVKBASE_API_KEY;
const BASE_URL = 'https://api.kvkbase.nl/api/v1';

async function lookupCompany(kvkNumber, retriesLeft = 3) {
  const response = await fetch(`${BASE_URL}/lookup/${kvkNumber}`, {
    headers: { 'Authorization': `Bearer ${API_KEY}` }
  });

  if (response.status === 200) return response.json();
  if (response.status === 404) return { error: 'not_found' };
  if (response.status === 429) {
    // Rate limited: give up after a few attempts instead of retrying forever
    if (retriesLeft <= 0) return { error: 'rate_limited' };
    const retryAfter = parseInt(response.headers.get('Retry-After') || '60', 10);
    await new Promise(r => setTimeout(r, retryAfter * 1000));
    return lookupCompany(kvkNumber, retriesLeft - 1);
  }
  return { error: `status_${response.status}` };
}

async function enrichCsv(inputFile, outputFile) {
  const input = fs.readFileSync(inputFile, 'utf-8');
  const records = parse(input, { columns: true });
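  const results = [];
  // Give every output row the same columns, so the CSV header is complete
  // even when the first record fails validation or lookup
  const emptyApi = {
    api_trade_name: '', api_address: '', api_postal_code: '', api_city: '',
    api_legal_form: '', api_is_active: '', data_changed: ''
  };

  for (let i = 0; i < records.length; i++) {
    const row = records[i];
    const kvk = (row.kvk_number || '').trim();

    // A KVK number is exactly 8 digits
    if (!/^\d{8}$/.test(kvk)) {
      results.push({ ...row, ...emptyApi, data_changed: 'invalid_kvk' });
      continue;
    }

    const result = await lookupCompany(kvk);

    if (result.error) {
      results.push({ ...row, ...emptyApi, data_changed: result.error });
    } else {
      const addr = result.address || {};
      // Same comparison as the Python version: trade name and city
      const changed =
        (row.company_name || '') !== (result.tradeName || '') ||
        (row.city || '') !== (addr.city || '');
      results.push({
        ...row,
        api_trade_name: result.tradeName || '',
        api_address: `${addr.street || ''} ${addr.houseNumber || ''}`.trim(),
        api_postal_code: addr.postalCode || '',
        api_city: addr.city || '',
        api_legal_form: result.legalForm || '',
        api_is_active: String(result.isActive ?? ''),
        data_changed: changed ? 'yes' : 'no'
      });
    }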

    if ((i + 1) % 100 === 0) console.log(`Processed ${i + 1} records`);
    await new Promise(r => setTimeout(r, 100));
  }

  fs.writeFileSync(outputFile, stringify(results, { header: true }));
  console.log(`Done. Results written to ${outputFile}`);
}

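enrichCsv('companies.csv', 'companies_enriched.csv').catch(err => {
  // Surface failures with a non-zero exit code so schedulers notice them
  console.error(err);
  process.exit(1);
});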

Scheduling Regular Enrichment

A one-time cleanup is useful, but the real value comes from running enrichment on a schedule. Here are a few approaches:

Cron Job

The simplest option. Run the script weekly or monthly:

# Run every Sunday at 2 AM
0 2 * * 0 cd /path/to/scripts && python enrich.py >> /var/log/enrichment.log 2>&1

CI/CD Pipeline

If your team uses GitHub Actions or similar, you can run enrichment as a scheduled workflow:

name: CRM Enrichment
on:
  schedule:
    - cron: '0 2 * * 0' # Weekly on Sunday at 2 AM UTC
  workflow_dispatch: # Allow manual trigger

jobs:
  enrich:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: '3.12'
      - run: pip install requests
      - run: python scripts/enrich.py
        env:
          KVKBASE_API_KEY: ${{ secrets.KVKBASE_API_KEY }}

CRM-Native Integration

Most CRMs support webhooks or automation rules. You can trigger a KVK lookup whenever a new Dutch company record is created, or run a bulk update on all records that have not been verified in the last 30 days.
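
The "not verified in the last 30 days" rule can be sketched as a small date check. How a CRM stores the verification date varies, so the last_verified value and the function name below are hypothetical; the check assumes timezone-aware datetimes.

```python
from datetime import datetime, timedelta, timezone


def needs_verification(last_verified, now=None, max_age_days=30):
    """Return True if a record was never verified or is older than max_age_days."""
    if last_verified is None:
        return True  # Never verified: always include it
    now = now or datetime.now(timezone.utc)
    return now - last_verified > timedelta(days=max_age_days)


# Fixed "now" so the example is reproducible
now = datetime(2024, 6, 30, tzinfo=timezone.utc)
recent = datetime(2024, 6, 15, tzinfo=timezone.utc)
stale = datetime(2024, 5, 1, tzinfo=timezone.utc)

print(needs_verification(recent, now=now))
print(needs_verification(stale, now=now))
```

Selecting only stale records keeps each scheduled run small, which also makes it easier to stay within rate limits.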

Handling Edge Cases

A few situations to plan for:

  • Missing KVK numbers. Some records might not have a KVK number at all. You can try searching by company name using the /search endpoint, but be prepared for ambiguous matches.
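  • Duplicate records. Two records with the same KVK number but different data are usually duplicates, although they can also be separate branch locations of one company, since a KVK number identifies the registration rather than a single address. Use enrichment as an opportunity to review and merge them.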
  • Inactive companies. When the API returns isActive: false, decide whether to flag the record, archive it, or notify the account owner.
  • Rate limits. With large datasets (10,000+ records), spread your requests over time. The 100ms delay in the examples above caps throughput at about 600 lookups per minute (fewer in practice, since each request also takes time to complete), which is a safe pace for most plans.
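
Duplicate detection in particular is easy to automate before or after an enrichment run. The sketch below groups records by KVK number and keeps only the numbers that occur more than once; the crm_id field and the function name are hypothetical and stand in for whatever identifier your CRM export provides.

```python
from collections import defaultdict


def find_duplicates(records):
    """Group records by KVK number and return the groups with more than one record."""
    groups = defaultdict(list)
    for record in records:
        kvk = (record.get("kvk_number") or "").strip()
        if kvk:  # Records without a KVK number cannot be matched this way
            groups[kvk].append(record)
    return {kvk: rows for kvk, rows in groups.items() if len(rows) > 1}


records = [
    {"crm_id": "A1", "kvk_number": "12345678", "company_name": "Acme BV"},
    {"crm_id": "B7", "kvk_number": "12345678", "company_name": "Acme B.V."},
    {"crm_id": "C3", "kvk_number": "87654321", "company_name": "Globex BV"},
    {"crm_id": "D9", "kvk_number": "", "company_name": "No Number Ltd"},
]

for kvk, rows in find_duplicates(records).items():
    print(kvk, [row["crm_id"] for row in rows])
```

The output is a candidate list for review, not an automatic merge list: a person should still confirm that the grouped records describe the same customer.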

Measuring Results

After your first enrichment run, check these metrics:

  • Records updated — how many had outdated data
  • Inactive companies — how many were no longer active
  • Missing data filled — how many records gained new fields (address, legal form, etc.)
  • Duplicates found — how many KVK numbers appeared more than once

These numbers tell you how much value regular enrichment delivers and help you justify the investment to stakeholders.
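
Most of these metrics can be read straight from the enriched CSV. The following is a rough sketch that assumes the column names produced by the Python script above (data_changed, api_is_active as the strings "True" or "False", kvk_number); the function name and the not_enriched counter are additions for illustration. Counting "missing data filled" depends on which fields your CRM already stores, so it is left out here.

```python
from collections import Counter


def enrichment_metrics(rows):
    """Summarize an enrichment run from a list of enriched row dicts."""
    rows = list(rows)
    kvk_counts = Counter(
        kvk for kvk in ((r.get("kvk_number") or "").strip() for r in rows) if kvk
    )
    return {
        "total": len(rows),
        # Records whose stored data differed from the registry
        "updated": sum(r.get("data_changed") == "yes" for r in rows),
        # Companies the API reported as no longer active
        "inactive": sum(r.get("api_is_active") == "False" for r in rows),
        # KVK numbers that appear on more than one record
        "duplicate_kvk_numbers": sum(1 for n in kvk_counts.values() if n > 1),
        # Invalid numbers, not-found results, and other lookup errors
        "not_enriched": sum(r.get("data_changed") not in ("yes", "no") for r in rows),
    }


rows = [
    {"kvk_number": "12345678", "data_changed": "yes", "api_is_active": "True"},
    {"kvk_number": "12345678", "data_changed": "no", "api_is_active": "True"},
    {"kvk_number": "87654321", "data_changed": "no", "api_is_active": "False"},
    {"kvk_number": "123", "data_changed": "invalid_kvk"},
]
print(enrichment_metrics(rows))
```

Tracking these numbers across runs shows whether the share of changed records settles down once enrichment happens regularly.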

Getting Started

Sign up at kvkbase.nl for an API key. Start with a small test batch of 50-100 records to validate your pipeline before running it on your full database. The API documentation covers rate limits, response formats, and error codes in detail.
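
Cutting a test batch out of a larger export takes only a few lines. The helper below is a sketch with a hypothetical name (write_sample); it copies the header and the first N rows, and the example uses in-memory files so it runs without touching disk.

```python
import csv
import io
from itertools import islice


def write_sample(infile, outfile, limit=50):
    """Copy the header and the first `limit` rows of a CSV; return the row count."""
    reader = csv.DictReader(infile)
    writer = csv.DictWriter(outfile, fieldnames=reader.fieldnames)
    writer.writeheader()
    count = 0
    for row in islice(reader, limit):
        writer.writerow(row)
        count += 1
    return count


# Stand-ins for open("companies.csv", newline="") and open("sample.csv", "w", newline="")
source = io.StringIO(
    "kvk_number,company_name\n"
    "12345678,Acme BV\n"
    "87654321,Globex BV\n"
    "11223344,Initech BV\n"
)
target = io.StringIO()
print(write_sample(source, target, limit=2))
```

Taking the first rows is the simplest option; a random sample would give a more representative picture if the export is sorted by age or region.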