The recent decision by the Court of Justice of the European Union declaring the EU-US Privacy Shield framework invalid stripped American companies of a useful GDPR compliance mechanism. With approximately $1.7 trillion in transatlantic trade tied to companies that relied on Privacy Shield, the ruling sharpens the need for organizations to determine where their data lives and how that data is used.
Standard contractual clauses (SCCs) were upheld as the primary means to safeguard data-sharing. There is a major caveat, though: the onus is on data controllers to ensure appropriate security protections are in place.
With companies no longer able to rely on Privacy Shield for protection when exporting data out of the EU, and further scrutiny promised around the implementation of SCCs, companies have two main options available to them: to localize data storage and/or to strengthen their SCCs.
Localizing data storage
One way to avoid misstepping when transferring data outside of Europe is to keep the data right where it is instead of storing it on servers located in the US. Big tech companies like Microsoft, a major cloud operator, already have data centers in Europe. As Chief Privacy Officer Julie Brill explains on the Microsoft blog, the tech giant was not reliant on Privacy Shield to begin with (participation was always voluntary). Meanwhile Dashlane, the password manager and digital wallet provider, has stored data in Europe since its inception, weaving the tighter privacy regulations into its value proposition to customers.
The problem is that 70% of companies previously relying on Privacy Shield are small or medium-sized businesses that lack even a fraction of Microsoft's resources and global infrastructure, and that have not, like Dashlane, proactively located their data in Europe. For these companies, building data centers in Europe or establishing new relationships with European-based cloud providers can cut deep into the bottom line.
And it’s an investment that may not be necessary for the long term, as leading officials on both sides of the Atlantic commit to finding a new solution. EU officials like Margrethe Vestager, leader of digital policy and competition at the European Commission, are actively trying to calm concerns on the heels of the ruling. “We will work hard to make sure that data can be transferred,” she said. “We are in a data-driven economy.”
Strengthening SCCs
The Court of Justice made it clear that the current state of SCCs is largely not good enough. The recent ruling extends compliance beyond the transfer of data itself, putting exporters on the hook for storage practices at the destination, too. Legal guidance is broad, pointing to a "case-by-case basis" for review and leaving companies in the arduous position of analyzing and scrutinizing compliance measures on their own whenever they want to send data abroad.
And we can't forget that while the reversal on Privacy Shield presents a clear vulnerability for transatlantic data sharing, improving privacy and data security should already be a top priority for business leaders. After all, IBM found that the average data breach costs a company $3.86 million.
Specific policies and clauses can define the appropriate security measures for each step of data transfer and storage. But rather than applying case-by-case scrutiny to every operation, companies can find the right data security partner to tie secure encryption to the data itself, no matter its location or position in the transfer process.
With the stakes high, and responsibility for regulatory compliance lying firmly in the lap of the data exporter, what should companies look for in a technology partner?
Here are three features to require in a data security solution:
It's not enough to protect against deliberate data breaches like hacking. Proper encryption must also account for inadvertent leaks that stem from procedural blind spots or employee error and neglect, a problem 79% of CIOs report having experienced within a 12-month period.
The most comprehensive solution will encrypt sensitive information across various data storage and sharing platforms and devices — as well as be able to detect a potential breach — helping teams and even different organizations grow comfortable with sharing data and making decisions for safer and more powerful data science.
Federated data science
Federated data science is any data science done in a federated way, meaning that individual data points and data sets are never sent over the network; only the computed updates are. It allows multiple companies, or multiple departments and locations within a company, to build machine-learning models while keeping proprietary data with its original owner and in its original location (for example, Europe).
Federated learning, introduced in 2017, enables developers to train machine learning (ML) models across many devices without centralized data collection, ensuring that only the user holds a copy of their data. According to Google, it is used to power experiences like suggesting "next words" and improving the quality of smart replies.
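The core loop is easy to sketch: each client trains briefly on its own data, and a server averages the resulting models weighted by data size (the "federated averaging" idea). Below is a minimal toy illustration with a linear model and two hypothetical clients; the data, function names, and hyperparameters are illustrative assumptions, not any particular library's API.

```python
import numpy as np

def local_update(weights, X, y, lr=0.1, epochs=5):
    """One client's local step: a few rounds of gradient descent on a
    linear model, using only that client's own data."""
    w = weights.copy()
    for _ in range(epochs):
        grad = 2 * X.T @ (X @ w - y) / len(y)  # MSE gradient
        w -= lr * grad
    return w

def federated_average(client_weights, client_sizes):
    """Server step: combine client models, weighted by data volume."""
    total = sum(client_sizes)
    return sum(w * (n / total) for w, n in zip(client_weights, client_sizes))

# Two clients hold disjoint data drawn from the same relationship y = 3x.
rng = np.random.default_rng(0)
X1 = rng.normal(size=(20, 1)); y1 = 3 * X1[:, 0]
X2 = rng.normal(size=(30, 1)); y2 = 3 * X2[:, 0]

w = np.zeros(1)
for _ in range(10):  # communication rounds
    w1 = local_update(w, X1, y1)
    w2 = local_update(w, X2, y2)
    w = federated_average([w1, w2], [len(y1), len(y2)])
# w converges toward 3 without either data set leaving its owner.
```

Only the small weight vectors cross the network; the raw rows in `X1` and `X2` stay put, which is the property that makes the technique attractive for data kept inside Europe.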
Federated learning sends small updates to a centralized aggregator, but if those updates are themselves deemed sensitive, one can also employ secure aggregation, a technique that allows the updates to be sent and combined while encrypted. Only the aggregated result is ever decrypted and shared with participants; no individual update is revealed.
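One way to get intuition for secure aggregation is pairwise masking: each pair of clients shares a random mask that one adds and the other subtracts, so every individual contribution looks like noise while the masks cancel exactly in the sum. This is a deliberately simplified sketch; real protocols derive the masks from cryptographic key agreement and handle client dropouts.

```python
import random

def masked_updates(updates, seed=42):
    """Apply pairwise cancelling masks: client i adds r_ij, client j
    subtracts it, so each masked value is meaningless on its own but
    the server's sum equals the true aggregate."""
    n = len(updates)
    rng = random.Random(seed)  # stands in for shared pairwise secrets
    masks = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            r = rng.uniform(-1000.0, 1000.0)
            masks[i][j] = r    # client i adds r
            masks[j][i] = -r   # client j subtracts r
    return [u + sum(masks[i]) for i, u in enumerate(updates)]

updates = [0.12, -0.05, 0.33]   # each client's private model update
masked = masked_updates(updates)
# Individual masked values reveal nothing useful about any one client,
# but summing them recovers sum(updates) because the masks cancel.
```

The aggregator learns only the total, which is exactly what the averaging step in federated learning needs.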
And let's not forget federated analytics (also called federated aggregates or computations), another recent advance in federated technologies, one that Google has invested in heavily, which enables better accuracy and privacy for a range of data science needs. It's a method of applying data science to raw data stored on users' devices, but it can also be used across siloed data within an organization, or across several organizations that would like to share data for analytics or other data science computations.
Like federated learning, it runs computations over each device's data and makes only the aggregated results, never any data from an individual device, usable by product engineers. Conceptually, it is federated learning without the learning: the machine-learning steps are dropped, and only basic data science computations are supported.
A use case for federated analytics arises when you have many different data sources and want to calculate a sum or average, for example the average bank account balance for customers in specific age ranges. Federated analytics lets each jurisdiction tally and securely send its own sums and counts, which are then combined into the global average. Secure aggregation can also be applied to federated analytics, allowing the small updates to be computed while encrypted, with only the final result revealed to the members.
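The bank-balance example boils down to each source reporting a (sum, count) pair instead of raw records. The sketch below uses made-up regional balances for illustration; the function names are ours, not part of any federated analytics framework.

```python
def local_aggregate(balances):
    """Each data source reports only (sum, count), never raw balances."""
    return sum(balances), len(balances)

def global_average(aggregates):
    """Combine the per-source tallies into one global mean."""
    total = sum(s for s, _ in aggregates)
    count = sum(c for _, c in aggregates)
    return total / count

# Hypothetical balances for customers aged 30-39 in two jurisdictions;
# only the aggregate pairs ever cross the network.
eu = local_aggregate([1200.0, 950.0, 3100.0])   # stays in the EU
us = local_aggregate([2200.0, 640.0])           # stays in the US
print(global_average([eu, us]))  # 1618.0
```

Note that combining (sum, count) pairs gives the exact global mean, whereas averaging the two regional averages directly would weight the smaller region too heavily.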
Differential privacy
Differential privacy is also a relatively new privacy-enhancing technique, only around for the last dozen years or so, and still little known to many of the people governing privacy law and regulations, and even to some data scientists. Yet it's a robust de-identification technique that deserves more air time, as it can help organizations follow the GDPR's guidelines on state-of-the-art anonymization.