MinusOneDB
Case Study

Identity Resolution

A global advertising holdco’s customer records, shredded and matched into one identity graph — on one MinusOneDB environment.

1 system: ingestion, indexing, entity resolution, and a queryable identity graph, all on the same node.

The engagement

Atlantic Fox — a data-strategy agency — was engaged by a global advertising holdco to consolidate customer records from across its agencies into a single, queryable identity graph.

The goal: stop treating each agency’s customer list as its own island. Start knowing which customer is which — across every agency, across every email address they’ve ever used — in near real time.

Objectives & Results

Objective

Consolidate customer records from every agency into a single, queryable identity graph.

Result

All records shredded and ingested into one MinusOneDB environment. Every field — emails, names, addresses, account IDs — indexed at write time for structured, full-text, and fuzzy matching.

Objective

Resolve duplicate and near-duplicate customer identities: the same person with different emails across different agencies.

Result

Matching runs as a sequence of search queries against pre-built inverted, doc-value, and range indexes — not as full-table scans. Every resolved identity carries provenance and confidence.
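To make "matching as search queries" concrete, here is a minimal in-memory sketch in Python. It is an illustration only, not MinusOneDB's implementation or API. The sample records, the `shred`, `build_index`, and `match` helpers, and the confidence weights are all assumptions made for this example. Provenance here is simply the list of indexed terms that produced each hit.

```python
from collections import defaultdict

# Hypothetical sample records; ids, agencies, and field names are illustrative only.
RECORDS = [
    {"id": "a1", "agency": "agency_a", "email": "Jane.Doe@Example.com", "name": "Jane Doe"},
    {"id": "b7", "agency": "agency_b", "email": "jane.doe@example.com", "name": "J. Doe"},
    {"id": "c3", "agency": "agency_c", "email": "sam@example.org", "name": "Sam Lee"},
]

def shred(record):
    """Break a record into (field, normalised term) pairs."""
    yield ("email", record["email"].strip().lower())
    for token in record["name"].lower().replace(".", "").split():
        yield ("name", token)

def build_index(records):
    """Inverted index: (field, term) -> set of record ids, built once at write time."""
    index = defaultdict(set)
    for rec in records:
        for key in shred(rec):
            index[key].add(rec["id"])
    return index

def match(record, index):
    """Find candidate duplicates by index lookup instead of scanning every record."""
    hits = defaultdict(list)
    for key in shred(record):
        for other_id in index.get(key, ()):
            if other_id != record["id"]:
                hits[other_id].append(key)
    results = []
    for other_id, keys in hits.items():
        fields = {field for field, _ in keys}
        # Assumed weights: an email hit is strong evidence, a name token alone is weak.
        confidence = 0.95 if "email" in fields else 0.4
        results.append(
            {"match": other_id, "matched_on": sorted(keys), "confidence": confidence}
        )
    return sorted(results, key=lambda r: r["match"])

INDEX = build_index(RECORDS)
MATCHES = match(RECORDS[0], INDEX)
```

For the first record, `MATCHES` holds a single candidate, `b7`, matched on the normalised email and the shared surname token. The third record is never touched, because no index entry links it to the query.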

Objective

Let the holdco iterate on matching rules without re-ingesting data or triggering per-query cost spikes.

Result

Every rule change reruns against the existing indexes. No scans, no re-ingest, no meter — just another query against the search layer.
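A rough way to picture rule iteration: the index is built once, and each rule variant only changes which terms get looked up. The Python sketch below assumes a toy email index and a hypothetical domain-alias table. None of the names come from MinusOneDB.

```python
from collections import defaultdict

# Hypothetical emails keyed by record id.
EMAILS = {
    "a1": "jane.doe@gmail.com",
    "b7": "jane.doe@googlemail.com",
    "c3": "janedoe@gmail.com",
    "d9": "sam@example.org",
}

# Built once at ingest: email -> record ids. No rule below rebuilds it.
EMAIL_INDEX = defaultdict(set)
for rec_id, email in EMAILS.items():
    EMAIL_INDEX[email.lower()].add(rec_id)

# Assumed alias table for the domain-normalisation rule.
DOMAIN_ALIASES = {"googlemail.com": "gmail.com", "gmail.com": "googlemail.com"}

def rule_exact(email):
    """Rule v1: look up the exact email only."""
    return [email.lower()]

def rule_domain_alias(email):
    """Rule v2: also look up the same local part on the aliased domain."""
    local, _, domain = email.lower().partition("@")
    queries = [f"{local}@{domain}"]
    if domain in DOMAIN_ALIASES:
        queries.append(f"{local}@{DOMAIN_ALIASES[domain]}")
    return queries

def run_rule(rule, email):
    """Expand the email into query terms, then union the index lookups."""
    found = set()
    for term in rule(email):
        found |= EMAIL_INDEX.get(term, set())
    return sorted(found)

V1 = run_rule(rule_exact, "jane.doe@gmail.com")
V2 = run_rule(rule_domain_alias, "jane.doe@gmail.com")
```

Here `V1` finds only `a1`, while `V2` finds `a1` and `b7`. Switching from one rule to the other changed the query, not the stored data or the index.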

  • Records ingested: every row, shredded. Customer records from every agency, broken into fielded entities and indexed at write time.
  • Identities resolved: one graph, many brands. Every resolved identity carries provenance and match-confidence. Subsidiary lists unified into one queryable graph.
  • Rule iteration: no re-ingest. Every rule tweak reruns against the existing indexes. No scans, no data movement, no meter.
  • Cost model: flat capacity. 50 rule variants cost the same as one. Iteration isn’t a budget conversation.

Why this is hard anywhere else

1. Pay-per-query warehouse

Every matching pass is a full-table scan, and every scan runs the meter. Iterating on rules (trying fuzzy thresholds, domain-normalisation passes, phonetic matches) gets priced out. The project dies in the budget, not the tech.

2. Specialised ER tool

You end up with a standalone entity-resolution cluster that doesn’t talk to your analytics or your applications. Weeks of setup, a second data pipeline, and the resolved identities still have to be re-imported wherever they’re used.

3. MinusOneDB

Every row is indexed (inverted, doc-value, range) at ingest. Matching is a sequence of search queries against those indexes. Iteration is free at the margin. The resolved identity graph is queryable by every agency’s application through the same REST API + JS SDK.

Enterprise identity resolution, on an enterprise timeline.

Shredding a holdco-wide customer database and matching emails across every agency is one of the hardest things a data platform can be asked to do. The brute-force approach compares every record in one list against every record in another: N×M row comparisons for lists of N and M records. The smart approach requires inverted indexes, doc-value columns, and range indexes, all co-located with the data, plus the budget to iterate on matching rules without paying per question.
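The gap between the two approaches can be shown with a small counting exercise. The Python sketch below uses made-up email lists and exact matching only, with a plain `set` standing in for an index. It is meant to show the shape of the cost difference, not to measure any real platform.

```python
def brute_force_matches(list_a, list_b):
    """Compare every item in list_a with every item in list_b: N x M comparisons."""
    comparisons = 0
    matches = []
    for a in list_a:
        for b in list_b:
            comparisons += 1
            if a == b:
                matches.append(a)
    return matches, comparisons

def indexed_matches(list_a, list_b):
    """Build a hash-based index over list_b once, then do one lookup per item in list_a."""
    index = set(list_b)  # stand-in for an index built at write time
    lookups = 0
    matches = []
    for a in list_a:
        lookups += 1
        if a in index:
            matches.append(a)
    return matches, lookups

# Hypothetical data: 200 and 300 addresses with 50 in common.
LIST_A = [f"user{i}@example.com" for i in range(200)]
LIST_B = [f"user{i}@example.com" for i in range(150, 450)]

BRUTE = brute_force_matches(LIST_A, LIST_B)
INDEXED = indexed_matches(LIST_A, LIST_B)
```

Both functions return the same 50 matches. The brute-force version needs 200 × 300 = 60,000 comparisons, while the indexed version needs 200 lookups after building the index once.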

On a pay-per-query warehouse, the matching passes exhaust the budget before the first rule-tuning cycle finishes. On a dedicated entity-resolution tool, you end up with a parallel pipeline that doesn’t feed the applications that need the resolved data.

Atlantic Fox’s engagement ships on a single MinusOneDB environment — the same environment that stores the raw records, runs the matching, and exposes the resolved identity graph to every agency’s application through one REST API and JS SDK.

Atlantic Fox is running enterprise identity resolution on a timeline and a budget the enterprise can actually afford, because the heavy lifting is a search query, not a scan.

The Solution

  • A single MinusOneDB environment holding every agency’s customer records under a unified schema
  • Each record shredded into fielded entities and indexed at write time — inverted, doc-value, and range indexes all built once
  • Entity-resolution passes run as search queries against those indexes, not full-table scans — and cost nothing extra per run
  • Rule iteration without re-ingest, without data movement, without a query meter
  • Resolved identity graph queryable from every agency’s application through the same REST API + JS SDK

In closing

Entity resolution at enterprise scale — on one system, with room to iterate.

Atlantic Fox’s engagement is the kind of project that normally dies in the budgeting meeting: take multiple agencies’ customer records, shred them into fielded entities, resolve overlapping identities across every email and every agency, then keep iterating the rules until the match confidence is production-grade. The work is unavoidable. The platform usually isn’t affordable.

On MinusOneDB, every row is indexed at write time, every matching pass is a search query against those indexes, and every rule iteration costs the same as the last. The alternative — a warehouse to store it, a separate ER tool to match it, a cache to serve it, and a budget per iteration — is the version that never ships.