Your guide to entity resolution
Every decision made in your enterprise relies on accurate and complete data.
And while you’ve got access to more data than ever before, connecting today’s volumes of data and turning it into actionable, valuable insight is a big challenge.
This essential guide tells you how to overcome prominent data challenges, why current approaches to data matching no longer work, and which capabilities are the most important to look for when investing in entity resolution software.
You’ve got all the data. You just need the right technology to harness its value.
Read on to find out how to effectively use entity resolution to transform your organization.
Imagine you have a customer called John Citizen and a corporate data record for ABC Inc that has a director called Jonathan Citizen. Are they the same person?
If you can’t answer that basic question, how can you make an accurate decision about the risks or opportunities associated with John Citizen?
Enter entity resolution.
It’s the best way to connect billions of data points spread across multiple systems into a trusted, accurate single view.
It creates a complete, meaningful view of data across the enterprise that reflects real-world people, places, and organizations—and the relationships between them.
And it builds a contextual data foundation that enables you to enhance decision making across the customer lifecycle, uncover hidden risk, and discover new unexpected opportunities.
For most organizations, more data equals more problems. From duplicate customer records to never-ending data science experiments, there are endless challenges. And trying to overcome these challenges is difficult: connecting disparate data sets into a comprehensive single view of data is time-intensive and laborious, with a maze of security issues.
No organization can build a 360-degree view of customers, prospects, partners and organizations without a reliable data foundation. And if you don’t have this complete view, there’s no way to turn this data into insights.
Only 10% of data deemed potentially useful by enterprises is being analyzed today
Source: Maximize Your Decision Intelligence by Analyzing Contextual Data, IDC, 2020
All of this stems from data-based challenges.
Challenges that entity resolution can help solve.
Why is connecting data
Data is trapped in silos across internal and external systems, resulting in teams repeating work and customers receiving a disjointed experience. Siloed data makes it impossible to see the full picture—which leads to inaccurate decision-making. These data silos are built independently and are not designed to be connected, so they have their own formats and structures. This leads to major challenges when teams try to build a single view, and users often resort to manually assembling the data—which is time-intensive, laborious, and unreliable.
Each user (or use case) requires specific data sources and has different requirements around how fuzzy or strict the matching must be. For example, a credit risk team making automated lending decisions needs a far tighter level of matching than a financial crime team, which wants to cast the net wide to ensure it doesn’t miss any links. Plus, the financial crime team has highly sensitive sources (such as money laundering cases) which legally can’t be used outside of financial crime use cases. Different requirements therefore call for multiple views of the same entities.
Data quality challenges
Poor data quality results from duplicated entities, missing information, and intentional manipulation by criminals. Traditional matching, with its record-to-record approach, struggles to deal with sparse and inaccurate data. It’s also ill-suited to the natural variations that are seen within data. For example, you can write an address in several ways, with information omitted, added, or abbreviated without causing issues in receiving the post. Furthermore, if global data is present, challenges arise around languages, dialects, and scripts.
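To make the address example concrete, here is a minimal sketch of how natural variations can be smoothed out before comparison. It uses Python's standard `difflib` module; the `ABBREVIATIONS` table, the function names, and the sample addresses are illustrative assumptions, not part of any particular product.

```python
import re
from difflib import SequenceMatcher

# Hypothetical abbreviation table; a production system would need far more
# entries and locale-specific rules.
ABBREVIATIONS = {"st": "street", "rd": "road", "ave": "avenue", "apt": "apartment"}

def normalize_address(raw: str) -> str:
    """Lowercase, replace punctuation with spaces, and expand abbreviations."""
    tokens = re.sub(r"[^\w\s]", " ", raw.lower()).split()
    return " ".join(ABBREVIATIONS.get(token, token) for token in tokens)

def address_similarity(a: str, b: str) -> float:
    """Similarity in [0, 1] between two addresses after normalization."""
    return SequenceMatcher(None, normalize_address(a), normalize_address(b)).ratio()

# Two ways of writing the same (made-up) address, and one unrelated address.
same = address_similarity("12 High St., London", "12 high street london")
different = address_similarity("12 High St., London", "98 Park Road, Leeds")
```

After normalization the first pair becomes identical, while the unrelated address scores lower. Real-world data also needs handling for omitted parts, transliteration, and different scripts, which this sketch does not attempt.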
59% of organizations see the lack of a single version of the truth as a key challenge
Source: Maximize Your Decision Intelligence by Analyzing Contextual Data, IDC, 2020
Turning today’s volumes of data into commercially useful, business-oriented, valuable insights is a big challenge.
But it’s one that entity resolution can take on.
This is how.
Turning records into reality
This animation shows three customer records in an organization’s database: John Smith, John Citizen, and J. Citizen. Using entity resolution, the organization can connect these records to gain a single view of the data to show that John Citizen and J. Citizen are actually the same person.
It overcomes variations in address, phone number, and company name records to match them—then draws on these attributes to merge John Citizen and J. Citizen customer records into a single entity. The process also reveals a connection between John Citizen and John Smith which wasn’t evident before.
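The merge described above can be sketched in a few lines. This is an illustration only: the records, the matching rule in `same_person`, and the company-based link are assumptions made for the example, and exact comparison stands in for the fuzzy attribute matching a real tool would use.

```python
from itertools import combinations

# Hypothetical records, loosely modelled on the three customers in the text.
records = [
    {"id": 1, "name": "John Smith",   "phone": "555-0100", "company": "ABC Inc"},
    {"id": 2, "name": "John Citizen", "phone": "555-0199", "company": "ABC Inc"},
    {"id": 3, "name": "J. Citizen",   "phone": "555-0199", "company": "XYZ Ltd"},
]

def name_key(name):
    """Reduce a name to (first initial, surname): 'J. Citizen' ~ 'John Citizen'."""
    parts = name.replace(".", "").lower().split()
    return parts[0][0], parts[-1]

def same_person(a, b):
    """Assumed rule: compatible names plus a shared phone number."""
    return name_key(a["name"]) == name_key(b["name"]) and a["phone"] == b["phone"]

# Union-find keeps track of which records belong to the same entity.
parent = {r["id"]: r["id"] for r in records}

def find(x):
    while parent[x] != x:
        parent[x] = parent[parent[x]]  # path compression
        x = parent[x]
    return x

for a, b in combinations(records, 2):
    if same_person(a, b):
        parent[find(b["id"])] = find(a["id"])

# Group records by their entity representative.
entities = {}
for r in records:
    entities.setdefault(find(r["id"]), []).append(r)

# Distinct entities that share a company are linked, not merged.
links = set()
for a, b in combinations(records, 2):
    ea, eb = find(a["id"]), find(b["id"])
    if ea != eb and a["company"] == b["company"]:
        links.add((min(ea, eb), max(ea, eb)))
```

With these sample records, records 2 and 3 collapse into one entity, and that entity is linked to John Smith through the shared company.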
Use data sources to maximize business value
Importing external data on a project-by-project basis isn’t efficient. Instead, use a central integration point to make integrated internal and external data available to all decision makers.
Leverage entity resolution to integrate and connect internal and external data sources using the most sophisticated integration algorithms to automatically create a single analytical view
Source: Maximize Your Decision Intelligence by Analyzing Contextual Data, IDC, 2020
1. Traditional data matching
Traditional record linking tools have been around for decades and offer a basic approach to resolving entities. However, the accuracy achieved by these products is significantly lower than that of true entity resolution tools, and they struggle to handle the reality of low-quality data.
2. DIY custom builds
Some organizations try to solve the problem in-house, tasking their development teams to build a data matching tool from the ground up. In most cases, these initiatives deliver some early success with ‘simple’ matches, but they miss a lot. Teams often struggle to operationalize their initial proofs of concept into a production-grade system that can cope with complex challenges, which means long lead times before any value is seen and high development costs.
3. Doing nothing
Many organizations cite quality issues as a blocker to connecting their data—and instead, decide to wait until their data is perfect. This forces organizations to make do with their disconnected silos, as attempting to manually improve data quality is far too arduous and time-consuming.
How entity resolution differs from traditional data matching:
Traditional data matching
- Compares data records directly using a paired matching approach
- Relies on many attributes matching to produce a high enough match score
- Struggles with sparsely populated records or those where there is variance in the information
Entity resolution
- Uses an iterative matching approach that continually enriches records with additional data to provide the most accurate view possible
- Makes connections between data even when quality is low or where there has been manipulation
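The difference between the two approaches can be shown with a small sketch. It is a toy model under stated assumptions: the records, the `MIN_SHARED` threshold, and the rule "match when at least two attribute values are shared" are invented for illustration and do not describe any vendor's algorithm.

```python
MIN_SHARED = 2  # assumed threshold: attributes that must agree for a match

# Hypothetical, sparsely populated records about the same person.
records = {
    "A": {"phone": "555-0199", "address": "22 oak avenue", "dob": "1980-01-01"},
    "B": {"phone": "555-0199", "address": "22 oak avenue", "email": "jc@example.com"},
    "C": {"email": "jc@example.com", "dob": "1980-01-01"},
}

def as_profile(record):
    """Turn a record into attribute -> set of known values."""
    return {key: {value} for key, value in record.items()}

def shared_count(p, q):
    """Number of attributes on which two profiles have a value in common."""
    return sum(1 for key in p.keys() & q.keys() if p[key] & q[key])

def merge(p, q):
    """Enrich: the merged profile keeps every value seen on either side."""
    return {key: p.get(key, set()) | q.get(key, set()) for key in p.keys() | q.keys()}

def pairwise_matches(recs):
    """Traditional approach: compare records directly, two at a time."""
    names = sorted(recs)
    return {(a, b) for i, a in enumerate(names) for b in names[i + 1:]
            if shared_count(as_profile(recs[a]), as_profile(recs[b])) >= MIN_SHARED}

def iterative_resolve(recs):
    """Iterative approach: merge, then re-compare using the enriched profile."""
    entities = [({name}, as_profile(rec)) for name, rec in sorted(recs.items())]
    merged = True
    while merged:
        merged = False
        for i in range(len(entities)):
            for j in range(i + 1, len(entities)):
                if shared_count(entities[i][1], entities[j][1]) >= MIN_SHARED:
                    combined = (entities[i][0] | entities[j][0],
                                merge(entities[i][1], entities[j][1]))
                    entities = [e for k, e in enumerate(entities) if k not in (i, j)]
                    entities.append(combined)
                    merged = True
                    break
            if merged:
                break
    return [members for members, _ in entities]
```

Here `pairwise_matches(records)` only finds the pair A and B, because record C shares just one attribute with each of them. `iterative_resolve(records)` first merges A and B, and the enriched profile then shares both an email and a date of birth with C, so all three resolve to one entity.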
An independent assessment from Aite Group
“Best-in-class solutions often bring data enrichment capabilities as well, which are useful not only in entity resolution, but also to provide additional context about the customer that is useful in both financial crime and marketing analytics.”
Source: Entity Resolution and Linking: Enabling Next-Generation Financial Crime Detection, Aite Group, 2019
The best entity resolution tools resolve entities in the same way a human would — but automatically, at massive scale, and with a full understanding of the data. They don’t need to be programmed to know how common a name is or how large a business is: they mine this from the data.
You’ll place a lot of trust in your entity resolution tool, so it must be accurate. Look for independent validation, client testimonials and proven metrics to ensure your tool of choice meets the highest accuracy standards. Entity resolution software can range from 30% accuracy all the way up to 99%. Look for a solution that’s been proven in the fraud and financial crime space as these are built to overcome challenges like intentionally manipulated data, so are better at dealing with poor quality data and incomplete information.
With a trusted foundation of data, you can do everything from improving operational agility to automating decision-making. But you need to understand and trust how your system works—this is transparency. Choose an entity resolution tool that’s white-box by design. This ensures the underpinning logic is accessible, transparent, and explainable so all decisions are aligned to policy, and you can verify how any data-driven decisions are made. Look for regulator-approved products that have been through model risk governance processes.
Real-time and batch ingestion
Batch ingestion enables large scale resolution for data science use cases, while real-time ensures you’re always getting the most up-to-date and accurate view possible. Ensure you get the best of both worlds when it comes to data processing: choose a tool that offers both real-time and batch.
Security is one of the main reasons why entity resolution tools are deployed within specific areas of the business rather than across the entire enterprise. Look for software that supports dynamic processing, which resolves entities based on the data each use case requires and the user has the right to access—all within a single build.
Entity resolution is designed to bring all your data together into a single view, so it’s vital that the technology you choose can scale to the largest volumes of data possible. Look for a solution that’s proven and in production at large Tier 1 organizations. Also, ensure it can scale linearly with hardware via a distributed architecture—otherwise, you’ll end up with long batch times, delayed insights, and ugly interim processing workarounds. It’s critical that entity resolution software doesn’t hit its limits as you bring in more data.
Time to value
One of the main challenges with entity resolution is the time taken to onboard new data. Some solutions require all data to be normalized into a standard schema, which is time-consuming. An entity resolution tool that is able to accept data in almost any format means you can onboard new data sources quickly and easily. Tools with out-of-the-box integrations to a wide range of trusted external data sources will give you a highly accurate, complete view of every entity—at speed.
Use case flexibility
While many organizations use entity resolution for one initial use case, it’s actually a foundational layer suited to many parts of the business—from faster customer onboarding and sharper financial crime detection, to single view augmentation for MDM. As each use case has different requirements, it’s vital you implement entity resolution software that can support multiple use cases—with all their different matching and data source requirements—without requiring data replication.
Dynamic Entity Resolution is the next evolution of real-time entity resolution.
Rather than keeping an existing single view of an entity up to date, Dynamic Entity Resolution re-generates the entity in real-time from the underlying raw data.
This lets the software dynamically include or exclude particular data points and allows users to specify the match confidence they require for their specific use case.
It’s this unique capability that lets you deploy entity resolution across your entire organization for any use case.
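As a rough illustration of the idea, the sketch below builds an entity at request time from raw records, using only the sources the caller may see and the match confidence the caller asks for. The `Record` type, the source names, the toy `match_score` rule, and the thresholds are all assumptions made for this example.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Record:
    source: str
    name: str

# Hypothetical raw data held once, never copied per use case.
RAW = [
    Record("crm", "John Citizen"),
    Record("registry", "Jonathan Citizen"),
    Record("aml_cases", "J Citizen"),
]

def match_score(query: str, candidate: str) -> float:
    """Toy scoring rule (an assumption): 1.0 for identical names,
    0.7 when surname and first initial agree, otherwise 0.0."""
    q, c = query.lower().split(), candidate.lower().split()
    if q == c:
        return 1.0
    if q[-1] == c[-1] and q[0][0] == c[0][0]:
        return 0.7
    return 0.0

def resolve(query: str, allowed_sources: set, min_score: float) -> list:
    """Re-generate the entity on demand from the underlying raw records."""
    return [record for record in RAW
            if record.source in allowed_sources
            and match_score(query, record.name) >= min_score]

# Strict matching, no access to sensitive sources.
credit_view = resolve("John Citizen", {"crm", "registry"}, min_score=0.9)

# Wider net, including the sensitive source.
fincrime_view = resolve("John Citizen", {"crm", "registry", "aml_cases"}, min_score=0.6)
```

The same raw data yields a one-record view for the credit risk request and a three-record view for the financial crime request, without maintaining separate copies.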
The evolution of data matching
Traditional data matching
Focuses on direct matching, which results in poor match quality
Batch entity resolution
Offers improved matching, producing a single entity view across all data
Real-time entity resolution
Keeps the single view up to date as new data comes in
Dynamic Entity Resolution
Adds the ability to re-generate entities at the time of request from the underlying data
What makes Dynamic Entity Resolution different?
‘One size fits all’ is rarely a good thing—especially when it comes to entity resolution. All other systems assume a single view of ‘John Citizen’ can be used for every use case within your enterprise.
However, it can’t. Here are two reasons why it isn’t possible:
Different use cases have different requirements.
Different use cases require different levels of matching and have varying restrictions around different data sources. For example, if you use a highly sensitive source in the entity resolution process, the resulting entities will also be classified as highly sensitive and therefore can’t be used elsewhere. Controlling which sources feed each view is equally critical if you have external data that’s only been licensed for use within one part of the business.
Multiple use cases lead to multiple instances.
Other systems can support different use cases, but only by replicating data. This might seem like a reasonable approach, but as you start using entity resolution more broadly across your enterprise, it quickly becomes unmanageable. Plus, when you add the matrix of data source permissions to the mix, a ‘multiple instances’ approach quickly becomes impossible.
A Dynamic Entity Resolution tool builds on demand, at the time of request.
Which means you can:
Deploy one instance of the platform to serve all the needs in your organization
Specify the level of fuzziness required per use case, at the time of request
Control access to data sources depending on what the user or use case can see
The results speak for themselves. This isn’t just about creating a single view across your enterprise—it’s about improving efficiency, laying the foundations for accurate analytics, and scaling to your needs.
Dynamic Entity Resolution lets you:
Create a single, complete view of customers, prospects and organizations—across
Deduplicate records across your entire database, giving you a single source of truth.
Improve the quality of your data and automatically fill in missing information.
Centralize access to both internal and external data throughout your organization.
Drive up productivity by providing access to accurate, consolidated information.
Create the foundation for data-driven decisions and automated/augmented decision-making.
Dynamic Entity Resolution across your enterprise:
Drive business growth with a complete single customer view.
Here’s a snapshot of what’s possible with a complete view of your customers and prospects:
- Discover new opportunities within your data
- Forge new relationships with more relevant offers
- Increase wallet share and improve retention
- Maintain data quality by automatically finding duplicates
- Speed up processes, creating a better customer experience
- Automate checks and implement continuous monitoring
Customer risk assessments
- See customer context, including connections and supply chains
- Build full, deep customer profiles
- Spot hidden risks and red flags
Financial crime detection
- Reveal hidden risks and detect criminal activity faster
- Reduce false positives to manage the cost of compliance
- Improve investigations to make faster and more consistent decisions at scale
With an accurate, trusted single view in place, you can:
Understand the connections and relationships between entities, and form networks of connected data
Use both the entities and networks within analytical models to enable better decision making across your enterprise
This gives your organization the power of Contextual Decision Intelligence.
Applying entity resolution and network analysis to your data produces context. This context lets your organization see the complete picture, and spot risks and opportunities that were previously hidden.
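A network of resolved entities can be represented as a simple graph. The sketch below is a minimal example under assumptions: the entity names and relationships are invented, and a breadth-first search stands in for whatever network generation a real platform performs.

```python
from collections import defaultdict, deque

# Hypothetical relationships between already-resolved entities.
edges = [
    ("John Citizen", "ABC Inc"),     # e.g. director of
    ("John Smith", "ABC Inc"),       # e.g. also linked to the company
    ("John Smith", "22 Oak Avenue"), # e.g. registered address
]

# Undirected adjacency map: entity -> set of directly connected entities.
graph = defaultdict(set)
for a, b in edges:
    graph[a].add(b)
    graph[b].add(a)

def network(start, max_hops):
    """Entities reachable from `start` within `max_hops`, with hop distance."""
    seen = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if seen[node] == max_hops:
            continue  # do not expand beyond the hop limit
        for neighbour in graph[node]:
            if neighbour not in seen:
                seen[neighbour] = seen[node] + 1
                queue.append(neighbour)
    return seen

context = network("John Citizen", max_hops=2)
```

With this sample data, the two-hop network around John Citizen contains ABC Inc at one hop and John Smith at two hops, which is the kind of context an analytical model could then score.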
Contextual Decision Intelligence (CDI) combines this context with analytics to augment or automate decisions. CDI enables your entire organization to make faster, more accurate decisions.
With Contextual Decision Intelligence, your organization can:
Drive automation and deliver greater business value from enterprise data.
Process operational decisions faster and more accurately.
Spot hidden risks and identify high-value growth opportunities.
Want to see how entity resolution can help your organization?
Find out how you can utilize Quantexa’s Dynamic Entity Resolution software to make data meaningful so you can drive faster and more accurate decisions across your organization.
It’s time to maximize the value of your data.
Book a demo
Get in touch and we’ll help you take the first step towards building a data fabric that unlocks a single view, empowers your people, and enhances decision-making.
Financial Crime Analytics: Aite Report
Banks’ customer data stores are typically siloed, with widely varying degrees of data quality. Aite Group looks at the best entity resolution and linking tools to combat these challenges to drive actionable value from data.
IDC Report: Maximize Your Decision Intelligence by Analyzing Contextual Data
By adopting best practices for AI and analytics, companies can enable data-driven decision intelligence to become more agile and competitive.
How Government Agencies Can Improve Pandemic Relief Fraud Prevention Using Data Analytics
Government agencies using data-driven, analytical approaches to fraud prevention can discover risks and threats faster compared to traditional methods – a critical advantage that helps reduce fraud losses and increase the likelihood of recovering funds.
Dun & Bradstreet Partner With Big Data Company Quantexa to Tackle Financial Crime
Dun & Bradstreet and Quantexa have announced a new partnership that aims to empower businesses to make better, context driven decisions using innovative data and analytics to provide a detailed picture of connected relationships.
AML Detection & Investigation Management Report: Standard Chartered Bank [Case Study]
Financial institutions across the globe are facing increasing and evolving money laundering and financial crime threats. Regulators have imposed a […]
A Guide to Using Contextual KYC to Better Understand Your Customers
With laborious onboarding, refresh and remediation processes, the challenge of KYC compliance is continuously growing. Find out how a contextual approach helps you to reduce the time and cost of KYC by increasing automation and leveraging decision intelligence for continual monitoring.
Contextual Decision Intelligence (CDI)
CDI technology creates a connected view of data to reveal the relationships between people, places and organizations, helping users unlock the context they need to drive better decisions at scale. It is underpinned by Entity Resolution, Network Generation, Analytics, and Visualization.
Data Resilience
Data resilience is an organization’s ability to ensure business continuity despite any unexpected disruptions. It leverages an automated approach that standardizes data protection and provides centralized visibility and management across all workloads and locations. When data is resilient, it can’t be accessed or modified by unauthorized entities.
Dynamic Entity Resolution
Re-generates entities in real time from the underlying raw data, letting the software dynamically include or exclude particular data points and allowing users to specify the match confidence they require for their specific use case.
Entity
An entity can be anything “real world” that is referenced in your data. For example: a person, business, address, account, phone number, email, device, place, IP address.
Entity Resolution
The process of working out whether multiple records are referencing the same real-world entity, such as a person, organization, address, phone number, bank account, or device. Entity resolution takes multiple, disparate data points from external and internal sources and resolves them into a single, unique entity.
External Data
Any data mastered outside your organization. For example: corporate registry sources and external watchlists.
Internal Data
Any information that your organization creates and manages. For example: customers, transactions, products, communications.
Master Data Management
Master data management (MDM) is the process of business and IT working together to ensure uniformity, accuracy, stewardship, consistency and accountability of an enterprise’s official shared master data.
Network Generation
Once entities are resolved, Network Generation can be applied to show real-world connections in your customers’ network to better understand who they are, what their behavior looks like, and how they relate to the government agency, program, or payment they’re administering.
Single Customer View
The result of collecting and connecting data from disparate external and internal data sources to form a single, complete, accurate record for every customer.
Traditional Data Matching
Compares data records directly using a paired matching approach, and relies on many attributes matching to produce a high enough match score.