A complaint filed with Poland’s data protection authority (DPA), the Personal Data Protection Office, alleges that OpenAI has committed multiple GDPR violations, ranging from unlawful processing of training data to failure to meet transparency standards.
The complaint was filed by Lukasz Olejnik, a privacy and security researcher, with the assistance of a Warsaw-based law firm. It alleges GDPR violations in ChatGPT’s handling of lawful basis for data processing, data access, fairness, transparency and personal privacy. The popular chatbot operates throughout Europe but has hit a number of snags, including a temporary ban in Italy over privacy concerns and several ongoing investigations.
Alleged OpenAI GDPR violations center on ChatGPT’s training methods, opaque handling of data
The complaint runs some 17 pages in total and alleges that OpenAI has been racking up GDPR violations since the moment it began operating in Europe, opening with a failure under Article 36 to properly consult with regulators in the bloc before engaging with users and collecting their personal data. The company, it says, did not conduct an initial risk assessment of ChatGPT to determine whether there were potential issues that would require mitigating measures to be put in place.
The complaint further asserts that OpenAI has no legal basis for processing personal data under the GDPR, nor does it attempt to communicate one to the end user. The company also fails the regulation’s transparency test, the complaint argues, by not providing adequate detail to the user about what personal data is being collected or how it is being used.
OpenAI has stated that it does not feed personal data into its training model and that it makes efforts to screen it out. In practice, however, the system does take in personal data, which at times appears to “stick” in ChatGPT’s memory and can surface in later outputs. Olejnik’s complaint stems from just such a situation: he attempted to generate a biography of his own life using the chatbot, then asked OpenAI to meet its GDPR obligations by providing the source of certain personal information it had obtained about him.
Olejnik went back and forth with OpenAI in an extended email exchange from late March to June of this year, receiving only a portion of the Subject Access Request (SAR) data guaranteed under the GDPR. The response was particularly short on anything that might have even indirectly revealed the internal workings of ChatGPT.
Olejnik also ran into another common chatbot issue: “hallucinations,” the confident declaration of information that is clearly wrong. The biography he generated contained a number of errors, which he asked OpenAI to correct. OpenAI’s first response was not to correct the information but to block any requests to ChatGPT involving his name. When he followed up, OpenAI told him it was not able to make corrections of that nature. This sequence of events could itself represent GDPR violations, as the regulation guarantees the right to access and rectify personal data that companies hold.
ChatGPT: GDPR violations by design?
The wide-ranging pattern of potential GDPR violations has some convinced that OpenAI simply does not care about the regulation. While no company would want to court fines and potential bans that span Europe, OpenAI may be looking at what others like Facebook and Google are able to get away with and deciding that barreling forward while perhaps paying some fines along the way is a viable strategy for something so popular and potentially revolutionary.
The company does not have the same comfortable situation as other tech firms headquartered in Dublin, however. Without an EU establishment to call home and a single lead supervisory authority to answer to, OpenAI is presently open to regulatory action from any DPA in the bloc. It saw its biggest trouble thus far in Italy, where it was banned for a short period until it addressed similar concerns about its legal basis for data processing and transparency, along with its ability to identify and offer special protections to children.
OpenAI negotiated its way out of its immediate problems in Italy by disclosing more about how data is processed, instituting age verification and adding a set of opt-out toggles that can keep user input out of its training data. But it remains under investigation by the Italian DPA, as well as by authorities in France, Spain and the Netherlands, among others. All of this takes place against the backdrop of the European Data Protection Board (EDPB)’s work toward broader rules for AI-powered products.
OpenAI doesn’t have to worry about GDPR violations in the US, but it is facing an assortment of civil actions there for allegedly using a broad variety of intellectual property as training material without permission. Most of these claims have come from book authors alleging that their works were included without OpenAI obtaining a proper legal license (or even purchasing a copy), but NPR has reported that lawyers at the New York Times are mulling a lawsuit over the AI’s scraping of newspaper archives.
Back in Poland, Olejnik may be left waiting some time for a resolution, as his lawyer anticipates the DPA’s investigation will take anywhere from six months to two years to complete.