
What Is Identity Resolution? A Technical Primer

Identity resolution matches anonymous signals to known people. Learn how it works, deterministic vs probabilistic methods, and why it matters for B2B sales.

Elene Marjanidze · 9 min read

Somewhere between “anonymous visitor” and “qualified lead” sits identity resolution — the process of figuring out who someone actually is from the digital breadcrumbs they leave behind.

If you’ve ever wondered how tools like Leadpipe can tell you the name, email, and company of someone who never filled out a form, this is how. No magic. No guessing. Just data infrastructure working at a scale most people never think about.

This guide breaks down what identity resolution is, how the underlying technology works, the two fundamental matching approaches (and why the difference matters more than you’d think), and where the market is heading in 2026.


Table of Contents

  1. What Is Identity Resolution?
  2. How Identity Resolution Works: The 3-Step Process
  3. The Identity Graph: The Engine Behind the Curtain
  4. Deterministic vs Probabilistic Matching
  5. Data Signals Used in Identity Resolution
  6. B2B vs B2C Identity Resolution
  7. The Market Landscape
  8. Privacy and Compliance
  9. The Shift to API-First Identity
  10. FAQ

What Is Identity Resolution?

Here’s the definition, plain and simple:

Identity resolution is the process of matching anonymous or fragmented data signals — IP addresses, device IDs, cookies, behavioral patterns — to verified real-world identities (name, email, phone, company). It collapses multiple data points into a single, unified profile.

Think of it this way. Someone visits your website. Your analytics tool sees a session: an IP address, a browser fingerprint, some pages viewed, a referrer URL. That’s data, but it’s not actionable. You can’t email a browser fingerprint. You can’t call an IP address.

Identity resolution bridges that gap. It takes the anonymous data and resolves it to a person — a real name, a work email, a phone number, a job title, a company, a LinkedIn profile.

The term comes from the data management world, where it originally described the process of deduplicating and merging customer records. In 2026, it’s become the foundation for visitor identification, sales intelligence, ad targeting, and the data layer powering AI sales agents.

Why does it matter? Because 97% of website visitors leave without filling out a form. Without identity resolution, that traffic is invisible. With it, every visitor becomes a potential lead with a name, company, and intent signal attached.


How Identity Resolution Works: The 3-Step Process

Every identity resolution system — whether it’s a $300K/yr enterprise platform or a self-serve API — follows the same three steps.

Step 1: Data Collection

A pixel or JavaScript tag on your website captures signals from every visitor. These signals include:

  • IP address — identifies the network the visitor is on
  • User agent — browser type, OS, device
  • Referrer URL — where they came from (Google search, LinkedIn ad, direct)
  • Pages viewed — which URLs they visited, in what order, for how long
  • Device fingerprint — a composite of screen resolution, installed fonts, timezone, and other browser attributes
  • Cookies / first-party identifiers — if the visitor has been seen before

This data is collected passively. The visitor doesn’t fill out a form, log in, or take any explicit action. They just browse.
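To make the list above concrete, here is a minimal Python sketch of the record a tracking tag might assemble for one visit. The `VisitorSignals` class, its field names, and the `device_fingerprint` helper are illustrative assumptions, not any vendor's actual schema, and real fingerprinting typically draws on many more browser attributes.

```python
import hashlib
from dataclasses import dataclass, field
from typing import Optional


def device_fingerprint(screen: str, fonts: list[str], timezone: str, user_agent: str) -> str:
    """Hash a composite of browser attributes into one short, stable identifier."""
    # Sort fonts so that enumeration order does not change the hash.
    parts = [screen, ",".join(sorted(fonts)), timezone, user_agent]
    return hashlib.sha256("|".join(parts).encode("utf-8")).hexdigest()[:16]


@dataclass
class VisitorSignals:
    """Signals captured passively for a single visit (hypothetical schema)."""
    ip_address: str
    user_agent: str
    referrer: Optional[str]
    pages_viewed: list[str] = field(default_factory=list)
    fingerprint: Optional[str] = None
    first_party_cookie: Optional[str] = None  # present only for returning visitors


fp = device_fingerprint("2560x1440", ["Helvetica", "Arial"], "America/New_York", "Chrome")
visit = VisitorSignals(
    ip_address="203.0.113.7",  # documentation-range IP, used here as a placeholder
    user_agent="Chrome",
    referrer="https://www.google.com/",
    pages_viewed=["/pricing", "/docs"],
    fingerprint=fp,
)
```

Nothing in this record identifies a person yet. It is only the input that the matching step works from.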

Step 2: Matching

The collected signals are compared against an identity graph — a massive database that links data points to verified identities.

The system asks: “Given this IP address, this device fingerprint, this cookie, and this browsing behavior, does this combination match a known person in our graph?”

If the signals match strongly enough, the system returns an identity. If they don’t meet the confidence threshold, no match is returned (this is how you avoid false positives).

Step 3: Resolution

The final output is a resolved identity — a unified profile that collapses the anonymous signals into a real person:

  • Full name (e.g., Jane Smith)
  • Work email (jane@acme.com)
  • Personal email (optional, depending on provider)
  • Phone number (direct dial)
  • Company (Acme Corp)
  • Job title (VP of Marketing)
  • LinkedIn profile URL
  • Firmographics (company size, industry, revenue)

This resolved identity is then delivered to your CRM, webhook, Slack, or workflow tool like Clay — typically within seconds of the visit.


The Identity Graph: The Engine Behind the Curtain

The identity graph is the core asset behind any identity resolution system. Without it, you’re just collecting signals with nothing to match them against.

What is an identity graph? A massive database of relationships between identifiers. Think of it as a network:

  • Nodes = individual data points (email addresses, phone numbers, cookies, IP addresses, device IDs, LinkedIn profiles)
  • Edges = verified connections between those data points

Here’s a simplified example: Email A (jane@acme.com) is connected to Phone B (+1-555-0123), which is connected to Device C (Chrome on MacBook, 2560x1440), which visited Site D (yourwebsite.com) from IP E (74.125.xxx.xxx).

When a visitor hits your site, the system finds the node in the graph that corresponds to the visitor’s signals and traverses the edges to resolve the full identity.
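The traversal described above can be sketched as a breadth-first walk over an undirected graph. This is a toy model under stated assumptions: the `IdentityGraph` class and the `type:value` node labels are hypothetical, and the identifiers are taken from the simplified example above.

```python
from collections import defaultdict, deque


class IdentityGraph:
    """Toy identity graph: nodes are identifiers, edges are verified links."""

    def __init__(self) -> None:
        self._edges: dict[str, set[str]] = defaultdict(set)

    def link(self, a: str, b: str) -> None:
        """Record an undirected connection between two identifiers."""
        self._edges[a].add(b)
        self._edges[b].add(a)

    def resolve(self, start: str) -> set[str]:
        """Return every identifier reachable from `start`, or an empty set if unknown."""
        if start not in self._edges:
            return set()
        seen = {start}
        queue = deque([start])
        while queue:
            node = queue.popleft()
            for neighbor in self._edges[node]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    queue.append(neighbor)
        return seen


graph = IdentityGraph()
graph.link("email:jane@acme.com", "phone:+1-555-0123")
graph.link("phone:+1-555-0123", "device:chrome-macbook-2560x1440")
graph.link("device:chrome-macbook-2560x1440", "ip:74.125.xxx.xxx")

# Starting from the device seen on the site, walk the edges to the full profile.
profile = graph.resolve("device:chrome-macbook-2560x1440")
```

An unbounded walk like this would over-merge in practice, since shared nodes such as an office IP connect many people. A production graph would presumably weight edges or limit traversal depth.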

Not all identity graphs are equal. This is where the market diverges significantly:

  • Leadpipe builds and maintains its own proprietary identity graph. This means the data quality, freshness, and matching logic are fully controlled. No reselling third-party data.
  • Most competitors license identity data from the same handful of third-party providers (the “same identity graph, different wrapper” problem). This is why you’ll see similar match rates and the same wrong answers across multiple tools.
  • Enterprise providers like LiveRamp, Experian, and Oracle Data Cloud maintain the largest graphs — but charge $300K+/yr and require 6-month implementation cycles.

The size of the graph matters, but the quality of the edges matters more. A graph with 500 million nodes and weak connections will produce worse results than a graph with 200 million nodes and verified connections. That’s why accuracy testing matters so much.


Deterministic vs Probabilistic Matching

This is the most important technical distinction in identity resolution. Everything else — match rate, accuracy, data quality — flows from which approach a provider uses.

Deterministic Matching

Deterministic matching relies on exact matches against verified identifiers. The system isn’t guessing. It knows.

“We KNOW this is Jane because her verified email in our identity graph is linked to this device, which is the device visiting your site right now.”

How it works: The visitor’s signals (cookie, device, IP) are matched against verified, first-party data in the identity graph. The match is binary — it either meets the verification threshold or it doesn’t. No gray area.
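As a rough illustration of that binary behavior, the sketch below returns a person only when the verified identifiers agree on exactly one record. The `VERIFIED_LINKS` table and the `deterministic_match` function are hypothetical stand-ins for a real identity graph lookup.

```python
from typing import Optional

# Hypothetical verified links: identifier -> person record.
VERIFIED_LINKS = {
    "cookie:abc123": {"name": "Jane Smith", "email": "jane@acme.com"},
    "device:9f2c": {"name": "Jane Smith", "email": "jane@acme.com"},
}


def deterministic_match(identifiers: list[str]) -> Optional[dict]:
    """Return a person only if the verified hits point to exactly one email."""
    hits = [VERIFIED_LINKS[i] for i in identifiers if i in VERIFIED_LINKS]
    emails = {hit["email"] for hit in hits}
    if len(emails) == 1:
        return hits[0]
    return None  # no verified hit, or hits that conflict with each other


# The unverified IP is ignored; the verified cookie decides the match.
match = deterministic_match(["cookie:abc123", "ip:203.0.113.7"])
```

Returning `None` on conflicting hits is a design choice in this sketch: it trades match rate for a lower chance of naming the wrong person.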

Pros:

  • High accuracy — low false positive rate
  • Reliable contact data — the email and phone returned are verified
  • Safe to automate against — you can feed this data to an AI SDR without worrying about sending emails to the wrong person

Cons:

  • Lower match rate — only returns results when there’s a verified match (typically 30-40% of traffic)
  • Requires a strong identity graph — building and maintaining verified connections is expensive

Who uses it: Leadpipe (8.7/10 accuracy in independent testing)

Probabilistic Matching

Probabilistic matching uses statistical inference from multiple weak signals to guess an identity.

“Based on the IP range, device type, browsing pattern, and time of day, there’s a 73% chance this is Jane.”

How it works: The system collects dozens of weak signals and runs them through a statistical model. Each signal alone isn’t enough to identify anyone, but the combination produces a probability score. If the score crosses a threshold, the system returns its best guess.
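One common way to combine weak signals is a naive Bayes style update on log-odds, shown below. This is an assumed model chosen for illustration; the article does not say which statistical method any provider uses, and the prior, likelihood ratios, and threshold here are made-up numbers.

```python
import math


def combine_signals(prior: float, likelihood_ratios: list[float]) -> float:
    """Update a prior probability with independent signals, working in log-odds."""
    log_odds = math.log(prior / (1.0 - prior))
    for ratio in likelihood_ratios:
        # A ratio above 1 supports the candidate; below 1 counts against it.
        log_odds += math.log(ratio)
    return 1.0 / (1.0 + math.exp(-log_odds))


def probabilistic_match(
    prior: float, likelihood_ratios: list[float], threshold: float = 0.7
) -> tuple[float, bool]:
    """Return the combined score and whether it clears the threshold."""
    score = combine_signals(prior, likelihood_ratios)
    return score, score >= threshold


# Three weak signals together push a 1% prior to roughly 0.75.
score, is_match = probabilistic_match(0.01, [20.0, 5.0, 3.0])

# One signal alone stays far below the threshold.
weak_score, weak_match = probabilistic_match(0.01, [20.0])
```

The independence assumption is the weak point: correlated signals (for example, IP range and timezone) get counted twice, which inflates the score.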

Pros:

  • Higher match rate — more visitors get “identified” (more guesses = more results)
  • Works with less data — doesn’t require verified first-party connections

Cons:

  • Higher false positive rate — wrong names, wrong companies, wrong emails
  • Confidence varies wildly: even if the score is well calibrated, a “73% match” is wrong about 27% of the time, or roughly one match in four
  • Dangerous to automate against — feeding probabilistic data into automated outreach means your AI agent will email the wrong person regularly

Who uses it: RB2B (5.2/10 accuracy), Warmly (4.0/10), most tools using third-party identity graphs

For a deeper breakdown of these two approaches and when each makes sense, see Deterministic vs Probabilistic Matching Explained.


Data Signals Used in Identity Resolution

Not all signals carry the same weight. Here’s how the major data points stack up:

| Signal | Strength | Deterministic? | Example |
|---|---|---|---|
| Email match | Very high | Yes | Login, form fill, identity graph |
| Phone match | High | Yes | Verified in identity graph |
| Cookie / 1st party | Medium | Partially | Returning visitor match |
| IP address | Low-medium | No | Company identification |
| Device fingerprint | Low | No | Browser + OS + resolution |
| Behavioral pattern | Very low | No | Browsing habits similarity |

The key insight: deterministic signals (email, phone) produce high-confidence identifications. Probabilistic signals (IP, device fingerprint, behavior) produce educated guesses.

The best identity resolution systems use a layered approach — starting with the strongest deterministic signals and only falling back to weaker signals when necessary. The worst systems treat all signals equally and return whatever clears the lowest confidence bar.
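A layered approach can be sketched as a waterfall: try the strongest matcher first and fall back only when it returns nothing. The matcher functions, index tables, and the `matched_by` field below are hypothetical names used for illustration.

```python
from typing import Callable, Optional

Matcher = Callable[[dict], Optional[dict]]

# Hypothetical lookup tables standing in for an identity graph.
EMAIL_INDEX = {"hash:jane": {"name": "Jane Smith", "company": "Acme Corp"}}
IP_INDEX = {"203.0.113.7": {"company": "Acme Corp"}}


def by_email(signals: dict) -> Optional[dict]:
    return EMAIL_INDEX.get(signals.get("email_hash"))


def by_ip(signals: dict) -> Optional[dict]:
    return IP_INDEX.get(signals.get("ip"))


def layered_resolve(signals: dict, matchers: list[tuple[str, Matcher]]) -> Optional[dict]:
    """Try matchers strongest-first; tag the first hit with the tier that produced it."""
    for tier, matcher in matchers:
        result = matcher(signals)
        if result is not None:
            return {**result, "matched_by": tier}
    return None


TIERS = [("email", by_email), ("ip", by_ip)]

person = layered_resolve({"email_hash": "hash:jane", "ip": "203.0.113.7"}, TIERS)
company_only = layered_resolve({"ip": "203.0.113.7"}, TIERS)
```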

This is why you can have two tools that both claim “30% match rate” but deliver wildly different data quality. One is returning 30% deterministic matches. The other is returning 30% probabilistic guesses. The match rate is the same. The value is not.
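The arithmetic behind that comparison is simple enough to work through. The precision figures below (90% and 50%) are illustrative assumptions, not measured results for any tool.

```python
def match_outcomes(visitors: int, match_rate: float, precision: float) -> dict:
    """Split matched visitors into correct and wrong identifications."""
    matches = round(visitors * match_rate)
    correct = round(matches * precision)
    return {"matches": matches, "correct": correct, "wrong": matches - correct}


# Same traffic, same 30% match rate, different (assumed) precision.
tool_a = match_outcomes(10_000, 0.30, 0.90)  # 3000 matches: 2700 correct, 300 wrong
tool_b = match_outcomes(10_000, 0.30, 0.50)  # 3000 matches: 1500 correct, 1500 wrong
```

Under these assumptions both tools report 3,000 identified visitors, but the second one hands a sales team five times as many wrong contacts.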


B2B vs B2C Identity Resolution

Identity resolution serves fundamentally different purposes depending on the context.

B2B Identity Resolution

Goal: Identify the person and company visiting your website.

Key signals: Business email, LinkedIn profile, company domain, job title, firmographics.

Use cases:

What matters most: Getting the right person at the right company. In B2B, emailing the wrong person at the right company is almost as bad as emailing a stranger. Identity resolution needs to nail both the person and the company.

B2C Identity Resolution

Goal: Unify a single customer across devices and channels.

Key signals: Email, phone, loyalty ID, device ID, app installs.

Use cases: Personalization, cross-device retargeting, attribution, customer journey analytics.

What matters most: Stitching together the same person on mobile, desktop, tablet, and in-store. The focus is on unification across touchpoints, not identification from scratch.

The Critical Difference

B2B identity resolution starts from zero — a completely anonymous visitor — and tries to figure out who they are. B2C identity resolution usually starts with a known customer (they’ve logged in, bought something, or signed up) and tries to connect their activity across channels.

Leadpipe focuses exclusively on B2B, with the deepest set of business signals: work email, direct phone, job title, company, LinkedIn, and firmographics. If you need B2C identity resolution, you’re looking at platforms like LiveRamp, Amperity, or Segment.


The Market Landscape

The identity resolution market spans a wide range of price points and capabilities. Here’s how it breaks down:

| Tier | Provider | Price | Approach | Best For |
|---|---|---|---|---|
| Enterprise | LiveRamp | $300K+/yr | Graph licensing | Large enterprises, ad platforms |
| Enterprise | Experian | $200K+/yr | Consumer + business data | Financial services, regulated industries |
| Enterprise | Oracle Data Cloud | Custom | Third-party data marketplace | Advertising, enterprise CDP |
| Mid-market | Leadpipe | $147/mo | Deterministic, own graph | B2B sales teams, agencies, platforms |
| Mid-market | Clearbit (Breeze) | Custom | Probabilistic enrichment | Marketing teams, HubSpot users |
| Mid-market | 6sense | $60K+/yr | Probabilistic + ML | Enterprise ABM |
| SMB | RB2B | $99/mo | Probabilistic | Individual reps, small teams |
| SMB | Warmly | $900/mo | Probabilistic | Mid-size sales teams |
| SMB | Leadfeeder | ~$99/mo | IP-based (company only) | Company-level identification |

Notice the gap. Enterprise solutions are powerful but cost $200K-$300K+/yr and require months to implement. SMB tools are cheap but sacrifice accuracy. The mid-market — where you need enterprise-quality data at a self-serve price — is where the action is.

For a detailed comparison of the leading tools, see Top 10 Visitor Identification Tools. For pricing deep-dives, see Visitor Identification Pricing Compared.


Privacy and Compliance

Identity resolution lives at the intersection of powerful technology and privacy regulation. Here’s what you need to know.

CCPA (California Consumer Privacy Act)

  • Businesses must disclose that data collection is happening
  • Visitors have the right to opt out of data sale/sharing
  • You must honor opt-out requests and maintain suppression lists
  • Leadpipe is fully CCPA compliant with built-in suppression list support

GDPR (EU General Data Protection Regulation)

  • Person-level identification requires explicit consent in the EU — this is the law, not a recommendation
  • Company-level identification (e.g., “someone from Acme Corp visited”) is permissible under legitimate interest
  • Leadpipe handles this automatically: company-level only for EU visitors, person-level for US visitors where permitted
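The region rule described above (company-level only for EU visitors) could be enforced with a filter like the one below. This is a sketch of the policy as the article states it, not Leadpipe's implementation; the field names are hypothetical and `EU_COUNTRIES` is a deliberately incomplete sample.

```python
# Fields that identify an individual rather than a company (hypothetical names).
PERSON_FIELDS = {"name", "work_email", "phone", "job_title", "linkedin_url"}

# Illustrative subset only; a real deployment needs the full list of jurisdictions.
EU_COUNTRIES = {"DE", "FR", "IE", "NL", "ES", "IT"}


def apply_region_policy(profile: dict, country_code: str) -> dict:
    """Drop person-level fields when the visitor is located in a listed EU country."""
    if country_code.upper() in EU_COUNTRIES:
        return {k: v for k, v in profile.items() if k not in PERSON_FIELDS}
    return dict(profile)


resolved = {
    "name": "Jane Smith",
    "work_email": "jane@acme.com",
    "company": "Acme Corp",
    "industry": "Software",
}

eu_view = apply_region_policy(resolved, "de")  # company-level fields only
us_view = apply_region_policy(resolved, "US")  # full profile
```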

Best Practices

  1. Use suppression lists — remove anyone who opts out, permanently
  2. Respect Do Not Contact preferences — both legal and ethical
  3. Be transparent — your privacy policy should disclose visitor identification
  4. Limit data retention — don’t store identified visitor data longer than you need it
  5. Choose compliant providers — ask your vendor about their compliance posture before signing
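The first practice, a suppression list, amounts to a normalized set lookup applied before any lead is delivered. The `SuppressionList` class and the `work_email` key below are assumptions made for this sketch.

```python
def normalize_email(email: str) -> str:
    """Compare emails case-insensitively and without stray whitespace."""
    return email.strip().lower()


class SuppressionList:
    """Emails that must never be delivered or contacted again."""

    def __init__(self) -> None:
        self._emails: set[str] = set()

    def add(self, email: str) -> None:
        self._emails.add(normalize_email(email))

    def is_suppressed(self, email: str) -> bool:
        return normalize_email(email) in self._emails

    def filter_leads(self, leads: list[dict]) -> list[dict]:
        """Return only the leads whose email is not on the list."""
        return [lead for lead in leads if not self.is_suppressed(lead["work_email"])]


suppression = SuppressionList()
suppression.add("  Jane@Acme.com ")  # opt-out request as the visitor typed it

leads = [
    {"name": "Jane Smith", "work_email": "jane@acme.com"},
    {"name": "Sam Lee", "work_email": "sam@example.com"},
]
deliverable = suppression.filter_leads(leads)
```

Normalizing on both insert and lookup keeps an opt-out from failing just because the same address arrives with different casing.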

The cost of ignoring compliance goes beyond fines. It erodes trust with the exact audience you’re trying to reach.


The Shift to API-First Identity

This is the biggest structural change in the identity resolution market right now, and it’s worth understanding if you’re building anything in the B2B stack.

The Old World

Enterprise contracts. Six-month implementations. $300K+ annual fees. Custom integrations requiring dedicated engineering teams. Identity resolution was a product that only large companies could afford and only IT departments could implement.

The result: a small number of companies had access to high-quality identity data, and everyone else was locked out.

The New World

API-first providers like Leadpipe have collapsed the entire stack into a self-serve product. Install a pixel, get an API key, start resolving identities in minutes. $147/mo. No contracts. No implementation team.

This shift isn’t just about price. It’s about what becomes possible when identity resolution is infrastructure instead of a standalone product:

  • AI agents can call an identity API in real-time, enriching every website visit with a resolved identity before deciding whether to engage. That’s the missing data layer most AI SDR platforms need.
  • SaaS platforms can embed identity resolution natively, adding visitor identification as a feature without building their own graph.
  • Clay workflows can pull identity data as a waterfall enrichment step, combining it with other data sources.
  • Agencies can white-label the entire experience, offering identity resolution to their clients under their own brand.
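To illustrate the AI agent case from the list above, here is a minimal sketch in which the agent resolves the visitor first and only then decides what to do. The `resolve` callable stands in for whatever identity API is used. Its signature, the qualification rule, and the action names are all hypothetical.

```python
from typing import Callable, Optional

# Hypothetical resolver signature: takes a visit dict, returns a profile dict or None.
Resolver = Callable[[dict], Optional[dict]]

TARGET_TITLES = ("vp", "director", "head of")  # illustrative qualification rule


def decide_engagement(visit: dict, resolve: Resolver) -> str:
    """Resolve the visitor, then choose an action based on who they are."""
    identity = resolve(visit)
    if identity is None:
        return "skip"  # unresolved traffic is left alone
    title = identity.get("job_title", "").lower()
    viewed_pricing = "/pricing" in visit.get("pages", [])
    if viewed_pricing and any(t in title for t in TARGET_TITLES):
        return "notify_sales"
    return "add_to_nurture"


def fake_resolver(visit: dict) -> Optional[dict]:
    """Stand-in for a real API call, used only to exercise the logic."""
    if visit.get("cookie") == "abc123":
        return {"name": "Jane Smith", "job_title": "VP of Marketing"}
    return None


action = decide_engagement({"cookie": "abc123", "pages": ["/pricing"]}, fake_resolver)
```

Passing the resolver in as a parameter keeps the decision logic testable without a network call and makes the provider swappable.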

Identity resolution is becoming a utility — something you plug into, not something you build or buy as a standalone platform. The providers who understand this shift are building APIs and embeddable SDKs. The ones who don’t are still selling enterprise contracts and wondering why their pipeline is shrinking.

Try Leadpipe free with 500 leads —>


FAQ

How is identity resolution different from IP lookup?

IP lookup identifies the company behind a visit (e.g., “someone from Acme Corp visited your pricing page”). Identity resolution goes further — it identifies the individual person (e.g., “Jane Smith, VP of Marketing at Acme Corp, visited your pricing page at 2:14 PM, viewed 3 pages, and came from a Google search for ‘visitor identification tools’”). IP lookup is one signal used in identity resolution, but it’s far from the whole picture.

Is identity resolution legal?

Yes, when done correctly. In the US, identity resolution using publicly available and commercially licensed data is legal under CCPA, provided you disclose the practice and honor opt-out requests. In the EU, person-level identification requires explicit consent under GDPR. Compliant providers like Leadpipe handle this automatically: person-level in the US, company-level only in the EU.

What match rate should I expect?

For deterministic matching (the kind that produces accurate, actionable data), expect 30-40%+ of US website traffic to be resolved to a named individual. This varies based on traffic quality — B2B traffic from LinkedIn ads will match at higher rates than consumer traffic from TikTok. Probabilistic tools may claim higher match rates, but a significant percentage of those matches will be wrong. Accuracy matters more than match rate.

Can I use identity resolution data with my CRM?

Absolutely. Most identity resolution providers integrate with major CRMs (HubSpot, Salesforce, Pipedrive) either natively or via webhook/Zapier. Leadpipe offers 200+ integrations out of the box. The typical workflow: visitor hits your site, identity is resolved, contact is automatically created or updated in your CRM with the visit data attached. Your reps see who visited, what they looked at, and when — right inside the tools they already use.
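The create-or-update workflow described above can be sketched against an in-memory dictionary standing in for the CRM. Real CRM APIs differ. The `upsert_contact` function and the record layout are assumptions for illustration only.

```python
def upsert_contact(crm: dict, identity: dict, visit: dict) -> str:
    """Create the contact if new, otherwise refresh it; always attach the visit."""
    key = identity["work_email"].strip().lower()
    if key in crm:
        # Refresh profile fields but keep the existing visit history.
        crm[key].update({k: v for k, v in identity.items() if k != "visits"})
        crm[key]["visits"].append(visit)
        return "updated"
    crm[key] = {**identity, "visits": [visit]}
    return "created"


crm: dict = {}
jane = {"name": "Jane Smith", "work_email": "jane@acme.com", "company": "Acme Corp"}

first = upsert_contact(crm, jane, {"page": "/pricing"})
second = upsert_contact(crm, jane, {"page": "/docs"})
```

Keying on the normalized work email is what prevents a returning visitor from producing duplicate contacts.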


The Bottom Line

Identity resolution is the infrastructure layer that turns anonymous website traffic into named, contactable leads. The technology isn’t new — enterprise companies have been doing this for years. What’s new is that it’s now accessible to any B2B team with a website and $147/mo.

The key decisions you need to make:

  1. Deterministic or probabilistic? If accuracy matters (and it should), go deterministic.
  2. Own graph or resold data? Providers who build their own identity graph control quality. Everyone else is reselling the same data.
  3. API-first or standalone? If you want identity data flowing into your existing stack, choose a provider with a real API, not just a dashboard.

If you’re evaluating identity resolution tools, start by testing accuracy with real traffic. Match rates are marketing — accuracy is what determines whether the data actually drives revenue.

Try Leadpipe free — 500 identified leads, no credit card required —>