How Important Is Your Data?

Data encryption. It’s a bit like insurance. We all know we need it but it’s difficult to decide what we protect and select the right policies. So, we tend to get cover only when we feel it’s absolutely necessary – for our property, our cars and when we travel, for example.

Many organizations feel the same about data encryption – they consider it a necessary evil, so they do just the bare minimum. Full disk encryption is frequently deployed across endpoints, but often only to check the ‘Do you encrypt everything?’ checkbox, and this creates a false sense of security. It’s great protection if someone loses their laptop, but it is of absolutely no use in protecting data against unauthorized access or theft from a running system, where the disk is already unlocked.

The next easy win is encrypting databases. Transparent Data Encryption (TDE) add-ons are available from most database vendors, and these systems live up to their name – they encrypt the database without noticeable impact on the applications that use it.

And then there is everything in the cloud. Is it secure? In general, yes, but stories in the news should caution us all to be wary of becoming complacent; clever hackers, disgruntled insiders and human error all abound.

Using these kinds of technologies to establish protected locations, or security silos, makes us feel that the sensitive data is better secured. But what happens to the data when it’s moved or copied outside its security silo? Staff need to run reports, analyze data, make presentations and work on proposals, for example, and all of these tasks extract data from applications and databases. This exported ‘ad-hoc’ data, copied out from its security silo, is now unprotected and scattered throughout the corporate network on endpoints, file server storage and so on.

How do we go about securing all this ad-hoc data? Most organizations recognize this as a problem, and also have to admit that they do not know where all this information is. Just one successful ransomware attack that spreads through the corporate network is capable of siphoning off all this locally stored data.

In a 2020 Ponemon report, 67% of respondents say discovering where sensitive data resides in the organization is the number one challenge in planning and executing a data encryption strategy. Data classification technology is often used to identify ‘important’ or ‘sensitive’ data so that it can be encrypted. But this is itself a significant challenge; the report found that 31% cited classifying which data to encrypt as difficult. If information classification is used to drive data encryption policy, then a significant amount of sensitive information will be missed.

Let’s take a look at the steps that are required to classify information so that the most important may be more strongly protected. The first step is to perform a thorough assessment of the data held by the organization, such as intellectual property, source code, merger and acquisition plans, financial records, customer records, personally identifiable information (PII) and human resources records.

Then for each type of information, a detailed risk and business impact analysis must be executed, measuring the value of data to the business, taking into account aspects such as financial and operational considerations, regulatory requirements and the cost to reputation and brand in the event of a breach.

These first few steps raise some significant issues. Firstly, as we’ve established, most organizations don’t know where this data is stored. And even if it can all be located, how accurate is the classification process? Manual classification is impractical for most organizations, but automation means that search patterns and rules must be developed. Each of these has its own inaccuracies, so it is highly likely that a proportion of sensitive data will be misclassified.
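To make the accuracy problem concrete, here is a minimal sketch of pattern-based classification in Python. The rule names, patterns and sample strings are all hypothetical and invented for illustration; they are not taken from any classification product, and real tools use far richer rules. Even so, the same two failure modes tend to appear: harmless text that happens to fit a pattern, and sensitive text that fits none.

```python
import re

# Hypothetical rule set: each label maps to a search pattern.
# These patterns are illustrative assumptions, not any vendor's rules.
RULES = {
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "email": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b"),
}


def classify(text: str) -> str:
    """Return 'sensitive' if any rule matches the text, else 'public'."""
    for pattern in RULES.values():
        if pattern.search(text):
            return "sensitive"
    return "public"


# Invented sample documents, keyed by the outcome we expect to see.
samples = {
    "true positive": "Employee SSN: 123-45-6789",
    "false positive": "Order part no. 555-10-2000 before Friday",
    "false negative": "SSN is 123 45 6789 (spaces, not dashes)",
    "also missed": "Draft terms for the acquisition of our main rival",
}

results = {name: classify(text) for name, text in samples.items()}
for name, label in results.items():
    print(f"{name}: {label}")
```

The part number is flagged as sensitive, while the reformatted number and the acquisition note pass as public. Tightening or loosening the patterns only shifts the errors from one kind to the other.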

The initial effort to catalog and assign classifications to all existing data must then become an ongoing process for users to assign classification tags to data as new information is created, modified and shared. This is likely to be automated – with the same potential for misclassification as before – but often, the user is allowed to override the assigned classification.

And this raises the next problem: Classification and Data Loss Prevention (DLP) rules are unfair. They penalize everyone because of a few bad actors, making employees less efficient and encouraging risky behavior. Staff who just want to get their jobs done will often subvert or circumvent the system, or intentionally misclassify data to avoid draconian policies and procedures.

Let’s now assume that the organization has performed a successful deployment of a classification and DLP system. What happens when the world changes? Perhaps data privacy legislation is altered; or a new line of business is opened; or you notice that some kinds of sensitive data have been misclassified. So now the classification and security rules need to be updated. In fact, this should be an ongoing process. If the organization is small, or it holds a relatively small amount of data, this approach may be feasible. But, for most organizations, it is challenging – bordering on impossible – to implement effective data labeling policies for the purpose of assigning security measures and to maintain accurate asset tagging at scale.

Going back to basics, what is it that we’re trying to achieve? Data needs protection from theft by external parties, from insider exfiltration and from accidental exposure. And isn’t all data inside the organization important? Otherwise, what is its purpose? Even seemingly trivial information can be useful to a cybercriminal, since they are adept at amalgamating small pieces of data to form a bigger picture, for example to build a spear phishing attack.

So, why is it that the accepted norm is to encrypt only the ‘most important’ data? I believe that this stems from the abundance of access controls and authentication mechanisms that put control barriers in front of information. But adding more stringent access controls and authenticating at every step with multi-factor systems is just like building higher security fences with stronger locks. The data behind the fence is still unprotected if someone manages to digitally pick the lock, or to cut through the fence.

Data encryption has been with us for decades. It’s tried and trusted technology and should be used to protect all data – not just that which is classified as the most important. This way, classification can be used for what it’s good at, leaving data encryption to ensure that stolen information remains protected and useless to the thief.

By actively choosing to encrypt all data, all files, no matter where they are stored, we are finally building security into the only piece that has value – the data. After all, firewalls, smartcards, networks, switches and applications are of no value to the business without information.