To succeed in the digital economy, organizations need to view their data as an asset and unlock its value. This drive has led to ‘data is the new oil’ headlines that are conceptually true but vastly oversimplify the resource itself. Unlike oil, which can be concretely described and defined, data has depth, layers, and, quite often, sensitivities. And its value, meaning, and protection requirements shift depending on the circumstances in which it is obtained and used. Frequently a snapshot of something much larger, data is only as valuable as the insights that can be extracted from it. Machine learning (ML) is proving to be a powerful tool for uncovering the value locked within data. But with power comes responsibility, and to leverage ML effectively, businesses must also understand and mitigate the risks these capabilities can introduce.
A tactical subset of Artificial Intelligence, ML broadly refers to a set of algorithmic techniques for building and using ‘smart’ constructs called models to extract insights from data. Models are built by training them on a body of data so that they can produce high-quality, meaningful insights; when those models are then utilized or evaluated, they offer unparalleled capabilities for harvesting intelligence, hence the broad interest in the space. A 2020 survey of IT and line-of-business executives conducted by Deloitte found that 67% of respondents were already using ML, and 97% were using or planning to use it within the next year.
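To make the train-then-evaluate workflow concrete, here is a minimal sketch in Python. It assumes scikit-learn and substitutes synthetic data for a real (and potentially sensitive) business dataset; the article itself names no particular toolkit.

```python
# A minimal sketch of training a model on a body of data, then
# evaluating it on held-out data. scikit-learn is an assumed toolkit.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Synthetic stand-in for a body of (potentially sensitive) business data.
X, y = make_classification(n_samples=1_000, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = LogisticRegression(max_iter=1_000)
model.fit(X_train, y_train)  # "training": the model absorbs patterns in the data
print(f"held-out accuracy: {model.score(X_test, y_test):.2f}")  # "evaluation"
```

Note that the fitted model is itself a distillation of the training data, which is exactly why, as discussed below, a model can inherit the sensitivities of the data behind it.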
ML capabilities prove their full worth when they move out of experimental environments run by data science or research teams and start delivering business value that can’t otherwise be attained. This critical shift is already happening. In its recent report, the U.S. National Security Commission on Artificial Intelligence (NSCAI) offered numerous examples of how AI and ML are being deployed across industries like healthcare, education, and smart cities. MIT computer science professor Aleksander Madry, director of the MIT Center for Deployable Machine Learning, recently summed up the transformational power of this increasingly pervasive field: “Machine learning is changing, or will change, every industry, and leaders need to understand the basic principles, the potential, and the limitations.”
Two limitations stand between ML and its full potential as a key business capability: privacy and trust. When asked for their top three priorities for AI applications, respondents to PricewaterhouseCoopers’ 2021 AI Predictions Survey named responsible AI tools to improve privacy, explainability, bias detection, and governance as their top choice. Because ML models are a reflection of the data over which they were trained, data that may carry privacy or other sensitivities, the models themselves are both valuable and vulnerable. The privacy and security of ML was a topic of discussion during the annual Cryptographers’ Panel at the 2021 RSA Conference, with several speakers suggesting that current ML systems need stronger trust guarantees.
Perhaps unsurprisingly, one solution to ML’s trust and privacy problem comes in the form of further technology breakthroughs. An increasingly visible category known as Privacy-Enhancing Technologies (PETs) is delivering business-enabling capabilities for privacy-challenged use cases across verticals. When leveraged for ML applications, these technologies manifest as a specialty called Privacy-Preserving Machine Learning (PPML). PPML uses PETs to ensure that privacy is both protected and prioritized when building and utilizing models. That protection also extends to any sensitivities surrounding the data itself, including regulatory requirements and competitive advantage.
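As one illustration of a PET at work, the sketch below applies the Laplace mechanism from differential privacy, one of several techniques under the PPML umbrella, so that an aggregate statistic can be released without exposing any individual record. The dataset and parameter choices here are hypothetical.

```python
import numpy as np

def dp_mean(values: np.ndarray, lo: float, hi: float, epsilon: float) -> float:
    """Release the mean of `values` with epsilon-differential privacy
    via the Laplace mechanism (one PET among several used in PPML)."""
    clipped = np.clip(values, lo, hi)       # bound each record's influence
    sensitivity = (hi - lo) / len(clipped)  # max change from altering one record
    noise = np.random.laplace(scale=sensitivity / epsilon)
    return float(clipped.mean() + noise)

# Hypothetical sensitive attribute (e.g., customer ages), never released raw.
ages = np.random.default_rng(0).integers(18, 90, size=10_000)
print(dp_mean(ages, lo=18, hi=90, epsilon=0.5))
```

The design point is that the noise is calibrated to how much any single record could move the result, so the released statistic stays useful while the contribution of any one individual is masked.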
PPML enables robust, reliable insight extraction that overcomes many of the barriers limiting broader business application today. Utilizing PETs in this capacity can also uniquely facilitate decentralized frameworks for collaboration and monetization, applications that are drawing the attention of business leaders because they lead directly to new revenue opportunities. A decentralized approach is especially critical to achieving maximum value because it broadens the pool of data assets and partners that can be leveraged as part of data monetization efforts.
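Federated learning is one such decentralized framework: each party trains on its own data locally and shares only model updates, which a coordinator averages. The following is a toy sketch in plain NumPy, with made-up parties and a simple linear model, not a production protocol.

```python
import numpy as np

def local_update(weights, X, y, lr=0.1, steps=50):
    """One party refines the shared linear model on its own private data."""
    w = weights.copy()
    for _ in range(steps):
        grad = X.T @ (X @ w - y) / len(y)  # least-squares gradient
        w -= lr * grad
    return w

rng = np.random.default_rng(1)
true_w = np.array([2.0, -1.0])
# Three parties, each holding data that never leaves their environment.
parties = []
for _ in range(3):
    X = rng.normal(size=(200, 2))
    y = X @ true_w + rng.normal(scale=0.1, size=200)
    parties.append((X, y))

w_global = np.zeros(2)
for _ in range(10):
    # Each party trains locally; only model updates are shared and averaged.
    updates = [local_update(w_global, X, y) for X, y in parties]
    w_global = np.mean(updates, axis=0)
print("recovered weights:", np.round(w_global, 2))
```

The raw records never cross organizational boundaries; only the averaged model parameters do, which is what makes this pattern attractive for multi-party collaboration.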
But while establishing trust, minimizing risk, and prioritizing privacy in ML are all worthy pursuits in their own right, the value of the category will ultimately be measured by its impact, which brings us back to PPML’s unique ability to unlock real, measurable value. The overarching business drivers for PPML fall into two broad categories: rich insight extraction and new revenue generation. Organizations interested in insight extraction use PPML to leverage third-party data for model evaluation and training, protecting their interests, intentions, competitive advantage, and IP without sacrificing usability. When it comes to uncovering new revenue streams, PPML opens the door to previously untapped data monetization opportunities by allowing other entities to leverage existing data assets for model evaluation and training; PPML protects interests, intentions, and IP while allowing data owners to retain positive control of their data assets. This applies across industries including financial services, healthcare, and manufacturing, as well as marketing and advertising.
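One way a data owner can retain positive control while still letting partners compute over its assets is secure multiparty computation. The sketch below uses additive secret sharing, a basic building block of that PET, to compute a joint sum without either owner revealing its input; the values and party count are illustrative only.

```python
import secrets

PRIME = 2**61 - 1  # arithmetic is done modulo this prime

def share(value: int, n_parties: int):
    """Split `value` into n additive shares; any n-1 shares reveal nothing."""
    shares = [secrets.randbelow(PRIME) for _ in range(n_parties - 1)]
    shares.append((value - sum(shares)) % PRIME)
    return shares

# Two data owners contribute private values to a joint aggregate
# (e.g., a statistic feeding model evaluation) without revealing either input.
a_shares = share(1200, 3)
b_shares = share(3400, 3)
# Each of three compute parties adds the two shares it holds...
partial = [(a + b) % PRIME for a, b in zip(a_shares, b_shares)]
# ...and only the recombined result is ever disclosed.
print(sum(partial) % PRIME)  # 4600
```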
Despite the momentum in the space and the widening use of ML, effectively translating these capabilities into real-world applications remains a challenge for business leaders. In any machine learning initiative, maintaining and respecting trust, privacy, and sensitivity around the data itself is critical to success. Privacy-Preserving Machine Learning is a critical component in delivering these outcomes, accelerating the path to broad, real-world adoption.