Definition
An identity graph is a database that connects multiple identifiers - email addresses, phone numbers, device IDs, IP addresses, cookies, and offline records - to unified individual profiles. Each node in the graph represents an identifier, and the edges represent verified links between them. When a visitor identification tool matches an anonymous visitor to a real person, it is querying an identity graph to find the connection between browser-level signals and a known identity.
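The node-and-edge structure described above can be sketched in a few lines. This is a minimal illustration, not how any particular vendor stores its graph: the `Identifier` and `IdentityGraph` names, the identifier kinds, and the sample values are all assumptions made for the example.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Identifier:
    """A node in the graph (hypothetical type for illustration)."""
    kind: str   # e.g. "email", "device_id", "cookie", "ip"
    value: str


class IdentityGraph:
    """Undirected graph: nodes are identifiers, edges are links between them."""

    def __init__(self):
        self._edges = defaultdict(set)

    def link(self, a, b):
        # Store the edge in both directions so lookups work from either side.
        self._edges[a].add(b)
        self._edges[b].add(a)

    def neighbors(self, node):
        # Return a copy; unknown nodes simply have no neighbors.
        return set(self._edges.get(node, ()))


graph = IdentityGraph()
email = Identifier("email", "sarah.chen@example.com")
device = Identifier("device_id", "device-123")
cookie = Identifier("cookie", "ck-789")
graph.link(email, device)
graph.link(device, cookie)
```

Here `graph.neighbors(device)` returns both the email and the cookie, which is the kind of hop a visitor identification tool relies on when it moves from a browser-level signal toward a known identity.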
How It Works
Identity graphs are built by ingesting data from many sources and linking identifiers together over time. The raw inputs include email lists from opt-in data partnerships, device registrations, public records, transactional data, cookie syncs, and login events across websites and apps.
The graph-building process starts with a seed identifier - typically an email address or phone number that is tied to a verified individual. From there, the system links additional identifiers. If that email was used to log into a website from a specific device, the device ID gets linked to the profile. If that device later visits another site, and the IP address is associated with a business, the company record gets linked too.
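One common way to model this kind of incremental linking is a union-find (disjoint-set) structure, where every identifier that has been linked, directly or indirectly, ends up in the same profile. The sketch below assumes that approach for illustration only; the `ProfileLinker` class, the tuple format for identifiers, and the sample events are hypothetical.

```python
class ProfileLinker:
    """Groups identifiers into profiles using union-find (illustrative sketch)."""

    def __init__(self):
        self._parent = {}

    def _find(self, x):
        # Register unseen identifiers as their own single-member profile.
        self._parent.setdefault(x, x)
        root = x
        while self._parent[root] != root:
            root = self._parent[root]
        # Path compression: point every node on the path directly at the root.
        while self._parent[x] != root:
            next_x = self._parent[x]
            self._parent[x] = root
            x = next_x
        return root

    def link(self, a, b):
        # Merge the profiles containing a and b.
        root_a, root_b = self._find(a), self._find(b)
        if root_a != root_b:
            self._parent[root_b] = root_a

    def profile(self, x):
        # All identifiers that share a root with x.
        root = self._find(x)
        return {n for n in list(self._parent) if self._find(n) == root}


linker = ProfileLinker()
seed = ("email", "sarah.chen@example.com")                      # seed identifier
linker.link(seed, ("device_id", "laptop-01"))                   # login event
linker.link(("device_id", "laptop-01"), ("ip", "203.0.113.5"))  # later site visit
linker.link(("ip", "203.0.113.5"), ("company", "Acme Corp"))    # IP tied to a business
```

After these three events, `linker.profile(seed)` contains the email, the device, the IP address, and the company record. A production system would likely weight or expire links rather than merge unconditionally, since a shared office IP address would otherwise pull many people into one profile.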
Over time, a single person might accumulate dozens of linked identifiers: two email addresses, a work phone, a personal phone, three device IDs, a LinkedIn profile, multiple cookies, and several IP addresses. The identity graph maintains all of these connections and keeps them current as data changes - people switch jobs, get new phones, move offices.
The quality of an identity graph depends on three factors. Coverage describes how many people and businesses are in the graph; major graphs are reported to contain 1-3 billion individual profiles. Accuracy measures how often the links between identifiers are correct. Freshness reflects how quickly the graph updates when someone changes jobs, switches email providers, or moves locations. A graph with great coverage but stale data will return outdated contacts - which can be worse than returning nothing, because outreach then goes to the wrong address or the wrong person.
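To make the three factors concrete, here is a rough sketch of how each could be computed over a set of links. The `Link` record, the audit flag, the reference list of known people, and the 90-day freshness window are all assumptions chosen for the example, not an industry-standard definition.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional


@dataclass
class Link:
    """One identifier attached to a profile (hypothetical record layout)."""
    profile_id: str
    identifier: str
    last_verified: date
    audited_correct: Optional[bool] = None  # None means the link was not audited


def coverage(links, reference_ids):
    """Share of a reference population that appears in the graph."""
    if not reference_ids:
        return 0.0
    found = {link.profile_id for link in links}
    return len(found & set(reference_ids)) / len(reference_ids)


def accuracy(links):
    """Share of audited links that were confirmed correct."""
    audited = [link for link in links if link.audited_correct is not None]
    if not audited:
        return 0.0
    return sum(link.audited_correct for link in audited) / len(audited)


def freshness(links, today, max_age_days=90):
    """Share of links verified within the last max_age_days."""
    if not links:
        return 0.0
    cutoff = today - timedelta(days=max_age_days)
    return sum(link.last_verified >= cutoff for link in links) / len(links)


today = date(2024, 6, 1)
links = [
    Link("p1", "a@example.com", date(2024, 5, 20), True),
    Link("p1", "device-1", date(2023, 1, 10), False),
    Link("p2", "b@example.com", date(2024, 4, 15), True),
    Link("p3", "c@example.com", date(2024, 5, 30)),
]
```

With this sample data, `coverage(links, ["p1", "p2", "p3", "p4"])` is 0.75, `accuracy(links)` is two out of three audited links, and `freshness(links, today)` is 0.75 because one link was last verified in early 2023.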
Why It Matters
The identity graph is the engine behind every visitor identification tool, every people search engine, and every data enrichment API. Without it, there is no way to connect an anonymous website session to a real person. The graph is what separates a tool that says “someone from Acme Corp visited” from one that says “Sarah Chen, VP of Marketing at Acme Corp, visited your pricing page.”
For B2B teams, the quality of the underlying identity graph determines everything downstream. A strong graph means higher match rates, more accurate contact data, and fresher information. A weak graph means missed visitors, wrong emails, and outdated job titles.
This is also why not all visitor identification tools are equal, even if they describe their capabilities similarly. The pixel and the dashboard are commodities. The identity graph is the moat. It takes years and massive data partnerships to build a graph with broad coverage, high accuracy, and real-time freshness. Vendors that license thin or outdated graphs will tend to underperform on match rate and data quality.
Examples
- Visitor identification: A JavaScript pixel captures a device fingerprint and IP address from a website visitor. The identity graph links that fingerprint to a cookie from a previous session, which is linked to an email address from a data partnership, which is linked to a verified person record. The system returns the person’s name, work email, phone, and company.
- Cross-device resolution: A person researches a product on their phone during lunch, then revisits the vendor’s website from their work laptop in the afternoon. The identity graph connects both sessions to the same individual because the device IDs, IP transitions, and behavioral patterns match a single profile.
- Data enrichment: A CRM contains a lead with just a name and company. An enrichment API queries the identity graph using those inputs and returns the person’s current email, direct phone number, LinkedIn URL, and updated job title, filling in the missing fields.
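The visitor identification example amounts to walking the graph from an observed signal until a person record is reached. A minimal sketch of that walk, using breadth-first search, is shown here. The `(kind, value)` tuple format, the `build_edges` and `resolve_person` helpers, and the sample identifiers are hypothetical and stand in for whatever a real graph stores.

```python
from collections import deque


def build_edges(pairs):
    """Build an undirected adjacency map from (identifier, identifier) pairs."""
    edges = {}
    for a, b in pairs:
        edges.setdefault(a, set()).add(b)
        edges.setdefault(b, set()).add(a)
    return edges


def resolve_person(edges, start):
    """Breadth-first search from an observed identifier to the nearest person record.

    Returns the chain of identifiers that was followed, or None if no person is reachable.
    """
    seen = {start}
    queue = deque([[start]])
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node[0] == "person":
            return path
        # Sort neighbors so the traversal order is deterministic.
        for neighbor in sorted(edges.get(node, ())):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(path + [neighbor])
    return None


edges = build_edges([
    (("fingerprint", "fp-42"), ("cookie", "ck-789")),
    (("cookie", "ck-789"), ("email", "sarah.chen@example.com")),
    (("email", "sarah.chen@example.com"), ("person", "Sarah Chen")),
])
path = resolve_person(edges, ("fingerprint", "fp-42"))
```

The returned `path` runs fingerprint, cookie, email, person, mirroring the chain of links in the first example. A fingerprint with no edges in the graph returns `None`, which corresponds to an unmatched visitor.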
Related Concepts
| Concept | Description | Learn More |
|---|---|---|
| Identity Resolution | The process of matching signals against the graph | What Is Identity Resolution? |
| Match Rate | How often the graph produces a successful match | What Is Match Rate? |
| Deterministic Matching | Linking identifiers with exact matches (email to email) | Deterministic vs Probabilistic Matching |
| Data Enrichment | Using the graph to add missing data fields | What Is Data Enrichment? |
| Visitor Identification | The user-facing application of identity graphs | What Is Visitor Identification? |