Over the past twenty years, a digital revolution has produced enormous volumes of data. Concurrently, litigation, regulatory demands and cybersecurity incidents have increased.
These growing volumes overwhelm risk managers, pushing them to react rather than develop new ways to control and prioritize sensitive data with less risk, at a fraction of the cost. Better methods of managing data, backed by innovative technology, can replace a reactive posture toward data with a proactive one.
But how can leaders become more proactive in their approach? And how can they manage their data proactively enough to meet the growing challenge of comprehensive data risk management?
Let us look at four points to understand the present data environment and how leaders can be more proactive in data risk management.
1. Our data environment has changed and continues to change. So should our approach to data management.
First, we need to understand that our data environment has changed and continues to change. The steady growth of enterprise data, which can be created by anyone, anywhere, makes proactive data risk management necessary but challenging. In addition, rising regulatory requirements have forced industry leaders to rethink their data risk management practices.
After an incident such as a data breach or malware infection, data managers must act. But they often react in ineffective ways.
They survey their enterprise users seeking vulnerable, sensitive data. Then, perhaps, they develop a new policy and ask users to sign a pledge to safeguard private and sensitive data. It may scratch the itch to take action, but it does not address the underlying problem or the risk of sensitive data exposure. It is simply not enough.
Sometimes they try to index all data. “Index everything” was a logical objective and business approach in the early 2000s, and with good reason. It offers incredible precision, ensures confidence in its results and is supported by court-issued discovery orders.
Then reality set in. There is so much data! How can anyone ever manage it all? Is it even feasible to keep up with such volume, combing through millions of pieces of data and spending countless hours and dollars indexing everything?
The changing nature of enterprise data, infrastructure and jurisdictional requirements, together with the adoption of cloud technologies, has caused data to accumulate faster than we can react to it. GDPR and similar privacy regulations have compounded the challenge. Where companies once handled only hundreds of requests per year, data subject access requests (DSARs) have dramatically increased the number of requests to locate data. Keeping up with these changes became difficult, time-consuming and cost-prohibitive. Something had to break. This created an opportunity to use data sampling to control data and be proactive while still satisfying the demands of discovery, compliance and security.
2. Data sampling has emerged as an effective data risk management best practice.
Using advanced tools and technology, statistical sampling of data offers a better approach than exhaustive data indexing. Data sampling, via AI-based segmentation, meets the demands of regulators and other authorities, and it makes risk management more proactive and methodical.
AI-based segmentation or sampling collects representative data and allows for the development of models across a wide range of data classification use cases. Such sampling provides a way to statistically home in on the vulnerable areas within your data fabric and iteratively test for risk and compliance. It is an efficient and effective means of reducing overall risk, in less time and at less cost.
The Census Bureau samples selective demographic records and draws inferences about the entire population. Manufacturers have long used statistical quality control, selectively sampling widgets to determine, with confidence, that a population of manufactured parts meets specification without checking every single widget.
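The manufacturing analogy can be sketched in a few lines. This is a minimal illustration, not a production tool; the lot, the defect check and the thresholds are all hypothetical:

```python
import random

def accept_lot(lot, n, max_defects, is_defective, seed=None):
    """Acceptance sampling: inspect a random sample of n items and
    accept the lot only if the defect count stays within tolerance."""
    rng = random.Random(seed)
    sample = rng.sample(lot, min(n, len(lot)))
    defects = sum(1 for item in sample if is_defective(item))
    return defects <= max_defects
```

The same pattern applies to data: inspect a statistically chosen subset, then draw a confident conclusion about the whole population without touching every record.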
For example, in a three-year project where the first six months of data carry the greatest risk, address that data first. Sample it and review it until you reach a confidence level about the underlying risk of that sample. Then proceed to the next riskiest area and address it. Continue until you have covered your entire data footprint.
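That riskiest-first workflow can be sketched as a loop over risk-ordered segments. A minimal sketch, assuming the segments are already ordered from riskiest to least risky and that `classify_record` is some hypothetical sensitive-data detector:

```python
import random

def estimate_violation_rates(segments, classify_record, n=100, seed=0):
    """Walk segments from riskiest to least risky, estimating each
    segment's violation rate from a simple random sample."""
    rng = random.Random(seed)
    rates = {}
    for name, records in segments:  # assumed pre-sorted by risk
        sample = rng.sample(records, min(n, len(records)))
        hits = sum(1 for r in sample if classify_record(r))
        rates[name] = hits / len(sample)
    return rates
```

Each segment's estimated rate then tells you whether to widen the review in that segment or move on to the next one.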
Using sampling in this way is statistically sound and acceptable to regulators and other authorities. When used in combination with statistical confidence intervals, it yields accurate results in less time, at a fraction of the cost.
3. Proactive data management is best implemented with AI-based segmentation and sampling.
AI-based segmentation for sampling data employs machine learning, effective algorithms and cloud computing to make data sampling practical and effective. The algorithms pass through the data iteratively, drawing inferences from each pass until they reach a target confidence level and a conclusion.
Typically, a statistical sampling pass establishes a confidence level and margin of error. If you ask the system for 99 percent confidence with a 1 percent interval, and a particular scan returns 5 percent policy violations, then 99 times out of 100 the true violation rate will fall between 4 and 6 percent.
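The arithmetic behind those numbers can be checked directly. A minimal sketch using the standard normal-approximation interval for a proportion (the function names are illustrative, not taken from any particular product):

```python
import math

Z_99 = 2.576  # z-score for 99 percent confidence

def sample_size(margin, z=Z_99, p=0.5):
    """Sample size needed for a given margin of error; p=0.5 is the
    conservative worst case, independent of population size."""
    return math.ceil(z * z * p * (1 - p) / (margin * margin))

def proportion_ci(p_hat, n, z=Z_99):
    """Normal-approximation confidence interval for an observed rate."""
    se = math.sqrt(p_hat * (1 - p_hat) / n)
    return (p_hat - z * se, p_hat + z * se)
```

At 99 percent confidence and a 1 percent margin, roughly 16,600 sampled records suffice no matter how many millions of records the population holds, which is the economic case for sampling over indexing everything.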
Such sampling and segmentation does not require that 100 percent of the data be reviewed. Courts and regulators understand the sheer volume and velocity of information and accept the statistical method. This approach is proactive.
Proactive risk management is effective. It begins with finding sensitive data through effective sampling, then addresses future data as it is created, to better understand it and mitigate risk before an adverse event ever happens.
4. Be vigilant in terms of the people, process and technology needed to move forward and mature.
Proactive data management has become mandatory given our present business climate and the ever-growing proliferation of data. It starts with establishing cohesive goals, followed by a plan to achieve them.
Data sampling with available tools, such as AI, helps execute your plan swiftly, effectively and economically to achieve your goals.
Organizations must understand their data fabric, the environment in which they work and the increasing demands on their data. Concurrently, they must be vigilant regarding the people, process and technology needed to move forward with proactive, rather than reactive, data risk management.