The pace of change in the field of artificial intelligence (AI) is accelerating so quickly that regulators and legislators are having a hard time keeping up. That’s especially true in the healthcare industry, where rapid advances in AI technology are already prompting healthcare professionals to re-think the efficacy of the landmark 1996 Health Insurance Portability and Accountability Act (HIPAA) and to consider possible new protections for health data privacy.
How HIPAA established the modern foundations of health data privacy
One major result of HIPAA was the creation of the Privacy Rule, which establishes national standards to protect individuals’ medical records and other personal health information. According to HIPAA, covered entities include health plans, health care clearinghouses, and health care providers who electronically transmit any health information. Under the terms of this federal law, individually identifiable health information is known as protected health information (PHI). Covered entities must safeguard this PHI and take timely measures to alert individuals if any identifying information has been compromised, such as through a data breach. The goal was simple: to protect health data privacy in the digital era.
At the time, HIPAA was considered a real breakthrough for health data privacy, since it forced health organizations to think in terms of protecting digital records, not just written records. But that was back in 1996, before the Internet boom, before the social media era, and certainly before the current era of artificial intelligence. In short, a lot has changed since then, and the stakes for health data privacy are higher than ever.
To underscore that fact, researchers at the University of California, Berkeley recently published a new study in the journal JAMA Network Open, describing just how easy it is for modern AI systems to sift through thousands of medical records, connect that information with other readily accessible health data, and then recover the identity of specific individuals. The lead researcher, Anil Aswani, noted that the study used two years’ worth of data from 15,000 Americans.
In the case of the Berkeley study, the health data being used by AI systems was step data: the sort of innocuous data that a Fitbit activity tracker might collect during a workout or on a walk around town. The AI is able to identify individuals by learning daily patterns in step data, and then correlating that data with large repositories of demographic data. In short, as hard as healthcare institutions work to protect healthcare data, AI systems appear capable enough to “crack the code” and determine the identity of a person. As one researcher noted, the AI system is essentially “putting it all back together,” no matter the efforts made to pull out identifying health information.
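The correlation step described above can be illustrated with a deliberately simplified sketch. This is not the Berkeley team’s method, which relied on machine learning over step patterns; it is a plain linkage join, assuming that “de-identified” activity records still carry a few demographic attributes. All records, names, field names, and the `link_records` helper below are hypothetical.

```python
from collections import defaultdict

# Hypothetical "de-identified" activity records: names are removed, but
# demographic attributes (age band, sex, 3-digit ZIP prefix) remain.
deidentified = [
    {"record_id": "r1", "age_band": "30-39", "sex": "F", "zip3": "947", "avg_steps": 11200},
    {"record_id": "r2", "age_band": "30-39", "sex": "M", "zip3": "947", "avg_steps": 4300},
    {"record_id": "r3", "age_band": "60-69", "sex": "F", "zip3": "021", "avg_steps": 6100},
]

# Hypothetical auxiliary demographic dataset that does include names.
auxiliary = [
    {"name": "Alice Example", "age_band": "30-39", "sex": "F", "zip3": "947"},
    {"name": "Bob Example", "age_band": "30-39", "sex": "M", "zip3": "947"},
    {"name": "Carl Example", "age_band": "30-39", "sex": "M", "zip3": "947"},
    {"name": "Dana Example", "age_band": "60-69", "sex": "F", "zip3": "021"},
]

# Attributes shared by both datasets (an assumption of this sketch).
SHARED_KEYS = ("age_band", "sex", "zip3")


def link_records(deidentified, auxiliary, keys=SHARED_KEYS):
    """Return {record_id: name} for records that match exactly one person."""
    # Index the named dataset by its combination of shared attributes.
    index = defaultdict(list)
    for person in auxiliary:
        index[tuple(person[k] for k in keys)].append(person["name"])

    matches = {}
    for rec in deidentified:
        candidates = index.get(tuple(rec[k] for k in keys), [])
        # Only a unique match re-identifies the record with confidence.
        if len(candidates) == 1:
            matches[rec["record_id"]] = candidates[0]
    return matches


reidentified = link_records(deidentified, auxiliary)
print(reidentified)
```

In this toy data, two of the three records match exactly one named person, while the third stays ambiguous because two people share its attributes. Learned step patterns would act as additional attributes, narrowing such ambiguous groups further.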
The AI arms race
The Berkeley study is worrisome, of course, and points to the development of an unofficial AI arms race in the healthcare industry. On one side will be the AI systems attempting to break into healthcare data records and link names and identities to underlying health information, and on the other side will be AI systems trying to stop them. Now that more people than ever before are using fitness trackers and smartphones for health and wellness, the risks to health data privacy appear to be escalating.
At last year’s World Medical Innovation Forum in Boston, a panel of healthcare practitioners and AI experts discussed just how fast this AI arms race is starting to take shape. According to some experts, healthcare AI will grow at a compound annual growth rate (CAGR) of 60% through 2022. That’s a blistering rate of growth. IBM and MIT, for example, have pledged nearly $250 million to build a state-of-the-art AI research lab in Boston that will focus primarily on issues relevant to IBM’s Watson Health AI system.
Future scenarios for health data privacy
Concern over health data privacy is growing primarily because data breaches at covered entities are becoming more and more common. It’s not just that hackers are going after large medical companies known to have thousands of patient records; they are also going after just about any entity with large repositories of healthcare data, including the U.S. Department of Health and Human Services (HHS). In the era of AI, what is important is the raw data. Once that data can be correlated and compared against other sets of health data, unique identities can be discovered.
In one scenario mentioned by researchers, employers, credit card companies and insurers might use this data to discriminate against certain classes of individuals. For example, an employer might not extend a job offer to someone if that person has a history of substance abuse. A credit card company might not extend credit to someone who is pregnant or who has a disability. Before AI made this kind of re-identification practical, all of that information was protected by HIPAA.
At the same time, the costs of data breaches continue to mount for healthcare institutions. According to a new Ponemon Institute/IBM Security study, the average cost of a single lost or stolen health care record is $408. That’s roughly two to three times higher than the per-record cost of data breaches in other industries. A healthcare provider that loses 2,500 records, for example, would potentially be facing a cost of about $1 million for every single data breach. You can begin to appreciate why health data privacy is such a hot topic right now: there are very real costs involved here.
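The arithmetic behind that estimate is easy to check. The short sketch below simply multiplies the per-record figure quoted above by the number of records; the `breach_cost` function name is hypothetical, and a linear per-record cost is a simplifying assumption.

```python
# Average cost per lost or stolen health care record, as quoted above.
COST_PER_RECORD_USD = 408


def breach_cost(records, cost_per_record=COST_PER_RECORD_USD):
    """Estimate breach cost, assuming it scales linearly with record count."""
    return records * cost_per_record


estimate = breach_cost(2_500)
print(f"${estimate:,}")  # $1,020,000
```

Under that assumption, 2,500 records come to $1,020,000, which is where the figure of roughly $1 million per breach comes from.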
The problem, quite frankly, is that artificial intelligence is what can be called a black box technology. Researchers can control what goes into that black box, but they no longer know for certain how or why an AI system arrives at a decision. If a person makes a decision, you can ask how they reached it. It’s a lot harder to ask a machine the same question. Thus, AI researchers talk a lot about “poisonous” biases, or the fact that feeding the wrong data to the wrong AI system can lead to some pretty negative outcomes.
Impending regulation for health data privacy
Based on the above, it’s easy to see how both state laws and federal law will need to be re-thought and re-imagined for the AI era. For the American medical and healthcare establishment, there will need to be a modern AI version of the HIPAA Privacy Rule that will require covered entities to ensure health data privacy, no matter how intelligent AI systems become. The reality for now is that AI has opened health data privacy to attack, and something needs to be done to deter and defend against those attacks.