Italy was one of the first EU nations to take OpenAI and ChatGPT to task over data privacy violations, even briefly banning the app from the country, and it has now issued the bloc’s first GDPR fine of this nature to the company. OpenAI has been ordered to pay €15 million for failing to identify a valid legal basis for processing users’ personal information, failing to be adequately transparent with users, and failing to properly notify the Italian Garante privacy regulator of a March 2023 data breach.
First EU penalty for generative AI chatbots over data privacy violations
OpenAI says it will challenge the GDPR fine, calling it “disproportionate.” The penalty was assessed primarily over the company’s collection and use of personal information as users queried ChatGPT. In addition to the fine, OpenAI has been ordered to run a six-month public media campaign in the country to educate users about how the chatbot uses personal data to train its models.
ChatGPT was briefly banned in Italy in 2023 for similar reasons: concerns about data transparency, and the lack of an age verification system to prevent data privacy violations involving users under the age of 13. The ban was lifted after OpenAI implemented an age verification tool, added a user content opt-out form, and revised its privacy policy to make it more visible. But the incident also triggered an extended Garante probe into potential data privacy violations by ChatGPT, which resulted in this GDPR fine roughly a year and a half later.
OpenAI argues that the GDPR fine is excessive, amounting to nearly 20 times the revenue the company earned in Italy with its product over this period. The Garante countered that OpenAI’s cooperation throughout the process was taken into account, and that the data privacy violations were serious enough to have merited an even higher penalty had the company been more recalcitrant.
GDPR fine may just be the opening act for OpenAI
Chatbot developers saw some relief from a recent opinion by the European Data Protection Board (EDPB), which indicated that data privacy violations arising from user input would not attract GDPR fines and penalties if the information was anonymized before the model was deployed. But developers also face a range of new regulations just beginning to come online, as well as uncertainty about how existing rules apply to their models.
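Much turns on what “anonymized” means in engineering terms. As a rough, hypothetical illustration of the kind of scrubbing step a developer might bolt onto a training pipeline (the pattern names and placeholder tokens below are illustrative, not drawn from the EDPB opinion or any vendor’s actual method), consider a minimal sketch like this:

```python
import re

# Toy sketch only: mask a few obvious PII patterns in a user prompt
# before it is retained for model training. GDPR-grade anonymization is
# a far higher bar than masking -- the data must be irreversibly
# unlinkable to a person -- so this illustrates the idea, not compliance.

EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")
PHONE_RE = re.compile(r"\+?\d[\d\s().-]{7,}\d")

def scrub_prompt(prompt: str) -> str:
    """Replace email addresses and phone-like numbers with placeholders."""
    prompt = EMAIL_RE.sub("[EMAIL]", prompt)
    prompt = PHONE_RE.sub("[PHONE]", prompt)
    return prompt

if __name__ == "__main__":
    raw = "Reach Mario Rossi at mario.rossi@example.it or +39 06 1234 5678."
    print(scrub_prompt(raw))
    # Prints: Reach Mario Rossi at [EMAIL] or [PHONE].
    # The name still leaks through -- a reminder that naive pattern
    # masking alone does not meet the GDPR's anonymization standard.
```

Production pipelines would typically go further, using named-entity recognition to catch names and addresses, but whether any such step legally amounts to anonymization remains a question for the regulators.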
The use of copyrighted content for training has been contentious since the first mass-market generative AI chatbots launched in late 2022. European Union lawmakers have been working on copyright rules for these AI systems since early 2023, and recently settled on terms requiring developers to disclose any such material used to build and train their systems. This relates to the broader terms of the bloc’s AI Act, which entered into force this past August (though individual provisions will continue to roll out on a staggered schedule until August 2026). The Act clarifies what counts as a “harmful” AI system, enumerates prohibited uses, and sets special terms for both high-risk systems and generative systems that might pose a systemic risk.
OpenAI is specifically dealing with an EDPB task force set up to investigate ChatGPT. Other generative systems, such as Google’s assorted AI models, face similar probes from EU regulators. Others, such as Meta’s models and X’s Grok, have indefinitely suspended training on EU user data or delayed product availability in the bloc after being threatened with GDPR fines and other regulatory action.
Regulators have also shown concern about data privacy violations caused by breaches of these systems. A portion of ChatGPT’s GDPR fine stems from the March 2023 data breach, in which a bug exposed some “Plus” premium subscribers’ contact and payment information to other users at random. In August, the Dutch data protection authority issued public guidance indicating that employers can be held responsible if employees enter protected personal information into these systems as part of their workplace duties.
Privacy group noyb has also been taking on the issue of chatbot “hallucinations” that produce false information in response to queries about a person. These may be considered data privacy violations if the hallucinated information is harmful to the subject’s reputation or career. The noyb complaints center on the fact that ChatGPT seemingly has great difficulty admitting when it cannot find information about a person, for example making up birth dates in response to queries rather than answering that it cannot determine the correct date. A study of assorted chatbots cited by the New York Times earlier this year found that they make up information at least 3% of the time, with some hallucinating in as many as 27% of queries.