Data breaches and misuse of private information continue to erode consumer trust. In response, companies are pouring resources into implementing security controls to block or restrict access to their data. However, the bigger question looms around how the data is being used and why, and many of these inquiries are coming in the form of Data Subject Requests (DSRs).
What’s more, several complexities make the onslaught of DSRs even more challenging. For example, the massive growth in data collection and proliferation has not been matched by an equal effort in data management and governance.
Organizations today are increasingly concerned about the volume of DSRs, especially in a world where data is king but compliance is ruling the kingdom. It’s more important than ever to know how to address the growing number of data subject requests, which, if handled poorly, could cripple an organization.
Recent compliance mandates are forcing organizations to respond
Regulations like GDPR and CCPA are forcing companies to respond to DSRs and answer consumer concerns over privacy. But achieving compliance requires that companies understand what personal information they have, where it’s located and how it’s being used.
Until now, the basic data inventory process has been a manual one built on application data owner surveys and spreadsheets. The Integris Software 2019 Data Privacy Maturity Study found that 77% of respondents were still relying on manual processes to manage sensitive data. As a result, businesses have been pushed to their breaking point trying to assess where personal data lives and what it is used for. To add further complexity, the rules around personal information (PI) are changing. CCPA defines personal information as data that “could reasonably be linked, directly or indirectly, with a consumer or household.” For example, IP addresses and GPS location data are now considered personal information. Resolving identities across hundreds of sources is a data processing and data quality nightmare.
In the era of big data and shifting regulation, a new approach is required to process petabytes of data, extract key data points and derive the relationships between them. For many organizations, the most complex, tedious and resource-intensive step is finding PI and tying it back to the data subject.
Here are five key ways to solve the big data problem behind data subject rights, and one thing you should never do!
Reduce your DSR surface area. I recommend a three-step approach. The first step is to use sampling scans to discover which systems contain personal information. This helps identify high-risk systems, locate PI and detect classification and labeling issues. Next, use deeper scans to identify the tables that contain personal data and any data handling issues. In addition to the timely fulfillment of DSRs, regulations like GDPR and CCPA also require good data handling practices. The third step is to remediate data handling issues. Continuously monitor your sensitive data against your data handling policies and flag violations so that you can take the appropriate actions. Fixing these data handling issues has the added benefit of further reducing your DSR surface area.
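To make the first step concrete, here is a minimal sketch of a sampling scan in Python. Everything in it is an assumption made for illustration: the function names, the input shape (a dictionary mapping a system name to a list of rows) and the two regular expressions, which cover only email addresses and U.S. Social Security number formats. Real PI discovery needs far broader detection than this.

```python
import random
import re

# Illustrative patterns only; real PI detection needs much wider coverage.
PI_PATTERNS = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "us_ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}


def sample_scan(rows, sample_size=100, seed=0):
    """Return the set of PI types found in a random sample of rows."""
    rng = random.Random(seed)
    # Scan everything when the table is small, otherwise draw a random sample.
    sample = rows if len(rows) <= sample_size else rng.sample(rows, sample_size)
    found = set()
    for row in sample:
        for value in row:
            for name, pattern in PI_PATTERNS.items():
                if pattern.search(str(value)):
                    found.add(name)
    return found


def scan_systems(systems, sample_size=100):
    """Map each system name to the PI types its sample revealed.

    `systems` is a hypothetical input shape: {system_name: [row, row, ...]}.
    """
    return {name: sample_scan(rows, sample_size) for name, rows in systems.items()}


if __name__ == "__main__":
    systems = {
        "crm": [["Ann", "ann@example.com"], ["Bob", "123-45-6789"]],
        "web_logs": [["GET /index", 200]],
    }
    for name, pi_types in scan_systems(systems).items():
        print(name, sorted(pi_types))
```

Systems that come back with a non-empty result are the candidates for the deeper, table-level scans in step two.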
Prepare for a DSR “Denial of Service” attack. If you get flooded with thousands of DSRs at once, the effect is that of a denial-of-service attack that overwhelms your customer service and IT staff. Under this scenario, your manual processes reach the breaking point and you can’t respond to requests within the required time frame (usually 30 to 45 days depending on the regulation). Consider solutions that will allow you to automate your DSR processes, so you can fulfill thousands of requests automatically.
Adhere to the data handling best practice of de-identification. De-identification prevents data analysts from connecting an individual to their personal information. This enables the data analyst to access useful data without compromising customer privacy. Many organizations are de-identifying data for continued analytics after a right-to-forget request is received from a consumer. But not all sensitive information is directly linked to an identity. Eighty-seven percent of the U.S. population can be identified using only their zip code, gender and birthdate. Each of these data points is benign on its own, but combined they become toxic. These combinations can reveal customer identities along with highly sensitive personal information. That’s why it is critical to inspect down to the data element level so that you know exactly what’s in your data lake, not just what the metadata implies. When you operate at this level, you can identify highly sensitive combinations of data across your ecosystem.
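One way to picture a “toxic combination” is to count how many records share the same zip code, gender and birthdate. The Python sketch below flags records whose combination is shared by fewer than k records, which is the basic idea behind a k-anonymity check. The function name, field names and threshold are assumptions chosen for illustration, not a prescribed method.

```python
from collections import Counter


def risky_records(records, quasi_identifiers, k=2):
    """Return records whose quasi-identifier combination appears fewer than k times."""

    def key(record):
        return tuple(record[field] for field in quasi_identifiers)

    counts = Counter(key(record) for record in records)
    return [record for record in records if counts[key(record)] < k]


if __name__ == "__main__":
    # Hypothetical "de-identified" rows: no names, but the combination can still single people out.
    rows = [
        {"zip": "98101", "gender": "F", "birthdate": "1980-02-14", "diagnosis": "A"},
        {"zip": "98101", "gender": "F", "birthdate": "1980-02-14", "diagnosis": "B"},
        {"zip": "98052", "gender": "M", "birthdate": "1975-07-30", "diagnosis": "C"},
    ]
    for row in risky_records(rows, ["zip", "gender", "birthdate"]):
        print("unique combination:", row["zip"], row["gender"], row["birthdate"])
```

In this toy data set the third row is unique on all three fields, so it would need generalizing (for example, keeping only the birth year) before it could be treated as de-identified.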
Comply with privacy and security design principles. Any DSR fulfillment process and associated systems must comply with privacy and security by design principles. For example, companies need to prepare for potential fraudulent DSRs aimed at stealing personal data. Last summer, the BBC reported that a security expert contacted dozens of companies to test how they would handle a DSR made in someone else’s name. He was able to secure all of the data held on his fiancée, including her social security number, credit card information, account logins and passwords, travel details, and even a criminal activity check. Companies in highly regulated industries like financial services already have sophisticated ID verification systems in place. If your firm doesn’t have one of these systems, you may want to explore adding this capability to your DSR workflow. Also make sure to mask any PI so that customer service reps cannot see it.
Apply the concept of identity resolution to improve accuracy. Use identity resolution to identify your data subjects across multiple data sources. Data subject information changes over time, and your data subjects may use different information in their interactions with your company, such as nicknames, maiden names, address changes, initials and suffixes (Jr./Sr.). When you receive a DSR, it’s helpful to run a quick search to confirm that the data subject exists in your data ecosystem. This validates the request and provides instant access to additional attributes that help disambiguate their identity.
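As a rough illustration of identity resolution, the Python sketch below normalizes names before comparing them, so that a nickname or a suffix does not hide a match. The nickname table, suffix list, record fields and matching rule (same email, or same normalized name plus birthdate) are all assumptions made for this example; production identity resolution typically relies on much richer reference data and fuzzy matching.

```python
import re

# Tiny illustrative lookup tables; real systems use much larger reference data.
NICKNAMES = {"bill": "william", "liz": "elizabeth", "bob": "robert"}
SUFFIXES = {"jr", "sr", "ii", "iii"}


def normalize_name(name):
    """Lowercase, strip punctuation and suffixes, and expand known nicknames."""
    tokens = re.sub(r"[^\w\s]", " ", name.lower()).split()
    tokens = [t for t in tokens if t not in SUFFIXES]
    return " ".join(NICKNAMES.get(t, t) for t in tokens)


def find_subject(request, records):
    """Return records that match the request on email, or on name plus birthdate."""
    wanted_name = normalize_name(request["name"])
    wanted_email = (request.get("email") or "").lower()
    wanted_birthdate = request.get("birthdate")
    matches = []
    for record in records:
        same_email = bool(wanted_email) and (record.get("email") or "").lower() == wanted_email
        same_person = (
            wanted_birthdate is not None
            and record.get("birthdate") == wanted_birthdate
            and normalize_name(record["name"]) == wanted_name
        )
        if same_email or same_person:
            matches.append(record)
    return matches


if __name__ == "__main__":
    records = [
        {"source": "crm", "name": "William Smith Jr.", "birthdate": "1970-01-01", "email": None},
        {"source": "billing", "name": "Bill Smith", "birthdate": "1970-01-01", "email": "bsmith@example.com"},
        {"source": "support", "name": "Robert Smith", "birthdate": "1970-01-01", "email": None},
    ]
    request = {"name": "Bill Smith, Jr.", "birthdate": "1970-01-01"}
    print([r["source"] for r in find_subject(request, records)])
```

Requiring a second attribute alongside the name is a deliberate choice in this sketch: a name match alone would be too weak a basis for releasing or deleting someone’s data.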
Never make the problem worse by creating additional copies of customer data or violating your own security policies by moving sensitive data between secure network zones. Not only are copies of data inherently outdated, but they exacerbate data sprawl, and open you up to additional security risks.
Big data, coupled with an era of modern data privacy and protection regulations such as GDPR and CCPA, has created the perfect DSR nightmare, as organizations scramble to resolve identities across hundreds of sources. Companies need the right tools in place to assess and monitor the volume, variety and velocity of personal data flowing in, out and across their organizations.