I have been talking to data teams across many organisations, and a few GDPR technical issues seem to be bubbling up to the surface. In particular, teams are struggling with a data subject’s right of access, rectification or erasure. Under GDPR such requests have to be dealt with in a timely manner (generally within one month of receipt), and the handling must also account for changes in a subject’s consent.
What is the problem?
The challenge in most organisations is that customer data is spread across many systems, and it is not always easy to track down. This is compounded if the data is of variable quality. For example, some systems may not hold a unique customer ID, some may have dropped the date of birth, and a customer may have changed address since a record was created. How do you search for a customer’s records and be confident you have found all the right ones across all the systems?
In looking for a solution, it helps to see the problem as akin to the ‘single view of customer’ issue. A similar challenge is avoiding the creation of duplicate customer records at onboarding, as many organisations find it hard to check whether a customer record already exists in the system. The approach to solving these issues is entity resolution: linking the records that refer to the same customer into a single view, and making that view searchable using a range of different facts that could identify a customer. For example, a fuzzy search can combine fragments of name, address, employer, date of birth, email, contact phone, ID numbers and so on. Combining several partial facts narrows down the results far more effectively than searching on any one field alone.
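To make the idea of combining fuzzy facts concrete, here is a minimal sketch in Python using only the standard library. It is an illustration under stated assumptions, not a description of any particular product: the field names, the weights, the 0.75 threshold and the sample records are all hypothetical, and string similarity from difflib stands in for whatever matching technique a real system would use.

```python
from difflib import SequenceMatcher

# Hypothetical field weights; real values would need tuning against known matches.
WEIGHTS = {"name": 0.4, "address": 0.2, "dob": 0.2, "email": 0.2}


def similarity(a: str, b: str) -> float:
    """Return a 0..1 similarity ratio, ignoring case and surrounding spaces."""
    return SequenceMatcher(None, a.strip().lower(), b.strip().lower()).ratio()


def score(query: dict, record: dict) -> float:
    """Weighted average similarity over fields present in both query and record."""
    total = 0.0
    weight_used = 0.0
    for field, weight in WEIGHTS.items():
        q, r = query.get(field), record.get(field)
        if q and r:  # skip facts that either side is missing
            total += weight * similarity(q, r)
            weight_used += weight
    return total / weight_used if weight_used else 0.0


def search(query: dict, records: list, threshold: float = 0.75) -> list:
    """Return (score, record) pairs at or above the threshold, best first."""
    scored = [(score(query, rec), rec) for rec in records]
    hits = [pair for pair in scored if pair[0] >= threshold]
    return sorted(hits, key=lambda pair: pair[0], reverse=True)


# Made-up records from three hypothetical systems, with no shared customer ID.
records = [
    {"system": "crm", "name": "Jonathan Smith",
     "address": "12 High Street, Leeds", "dob": "1980-04-02",
     "email": "jon.smith@example.com"},
    {"system": "billing", "name": "Jon Smith",
     "address": "12 High St, Leeds", "email": "jon.smith@example.com"},
    {"system": "support", "name": "Joan Smythe",
     "address": "4 Mill Lane, York", "dob": "1975-11-30"},
]

query = {"name": "Jon Smith", "address": "12 High Street Leeds", "dob": "1980-04-02"}

for match_score, rec in search(query, records):
    print(f"{rec['system']}: {match_score:.2f}")
```

Because the score is averaged only over the facts both sides hold, a record that is missing the date of birth can still be found. The flip side is that a record sharing just one field with the query can score highly on that field alone, so a real implementation would probably require a minimum number of overlapping facts before trusting a match.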
How does a data lake help?
Organisations that have started down the journey of dumping all their data into a data lake on a regular and automated basis have a great place to land a searchable single customer view built with entity resolution. As an aside, this is a valuable capability for many use cases within customer-centric organisations. Perhaps, finally, a GDPR upside? If you don’t have a data lake, the availability of open source technologies such as Hadoop makes building one a relatively low-cost and accessible exercise. However, do not underestimate the effort if you have a lot of diverse systems, as every one of them will need to feed data into the data lake.
The next step is to deploy suitable technology to create the searchable single customer view in the data lake. There are a few options; Quantexa provides such technology, and has achieved this in large data lakes for global financial institutions in a relatively short time period. A further benefit is the ability to quickly check whether a customer or counterparty already exists before creating a new, unconnected record. However, the largest upside is having these single customer views and the linked customer data available for all manner of other analytics and AI use cases.
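As a rough illustration of what building a single customer view involves, the sketch below groups records from different systems whenever they share a strong identifier. It is a deliberately simplified assumption, not Quantexa's method: the record layout, the choice of email and phone as linking keys, and the normalisation rules are all hypothetical, and a union-find structure is used here simply because it is a common way to merge linked records into clusters.

```python
from collections import defaultdict


def link_keys(record: dict) -> list:
    """Return normalised (kind, value) identifiers that can link two records."""
    keys = []
    if record.get("email"):
        keys.append(("email", record["email"].strip().lower()))
    if record.get("phone"):
        # Keep digits only; this does not reconcile country-code prefixes.
        digits = "".join(ch for ch in record["phone"] if ch.isdigit())
        if digits:
            keys.append(("phone", digits))
    return keys


def resolve(records: list) -> list:
    """Cluster records that share at least one identifier, directly or transitively."""
    parent = list(range(len(records)))

    def find(i: int) -> int:
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving keeps lookups short
            i = parent[i]
        return i

    def union(i: int, j: int) -> None:
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[max(ri, rj)] = min(ri, rj)

    first_seen = {}  # identifier -> index of the first record carrying it
    for idx, rec in enumerate(records):
        for key in link_keys(rec):
            if key in first_seen:
                union(idx, first_seen[key])
            else:
                first_seen[key] = idx

    clusters = defaultdict(list)
    for idx, rec in enumerate(records):
        clusters[find(idx)].append(rec)
    return list(clusters.values())


# Made-up records landed in the lake from three hypothetical source systems.
lake_records = [
    {"system": "crm", "id": "C-1", "email": "Jon.Smith@example.com",
     "phone": "020 7946 0123"},
    {"system": "billing", "id": "B-9", "phone": "02079460123"},
    {"system": "support", "id": "S-4", "email": "jon.smith@example.com "},
    {"system": "crm", "id": "C-2", "email": "a.jones@example.com"},
]

for cluster in resolve(lake_records):
    print([rec["id"] for rec in cluster])
```

With a view like this in place, an access or erasure request resolves to one cluster rather than a search of every source system, and an onboarding check can look for an existing cluster before a new record is created. Exact identifier matching will miss customers whose details have changed, which is where fuzzy matching on several facts comes in.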
Imam Hoque, COO and Global Head of Product, Quantexa