
Meta’s Generative AI Tools Paused in Brazil Due to Data Protection Restrictions on Training Data

Meta is pulling its generative AI tools out of Brazil, at least for the moment, after the country’s data protection regulator banned its use of the personal data of residents for training.

The National Data Protection Authority (ANPD) ordered Meta to stop using the personal data of Brazilians due to “imminent risk of serious harm and irreparable or difficult-to-repair damage to … fundamental rights.” The company would have faced a daily fine of 50,000 reais, or about $9,000, if it had continued to violate this ruling. Meta is currently facing similar challenges to its rollout of generative AI tools in the EU and other markets.

Meta generative AI tools face another setback

Brazil, with a population of 200 million people, is one of Meta’s largest single markets. The country generally ranks in the top five by total user count for each of Meta’s major apps and services, and is the #2 audience for WhatsApp as well as the #4 for Facebook. The company has been actively courting businesses there of late, holding a conference in Sao Paulo in early June to announce a new AI-driven ad targeting program for WhatsApp.

Meta has suspended its generative AI tools in the country indefinitely as it seeks talks with ANPD about the issue. ANPD has said that Meta must change its privacy policy to exclude a section related to the processing of personal data for generative AI training in order to come into compliance. Meta has also issued a statement indicating that it is “disappointed” by the decision and believes that it will hurt competition and the pace of AI development in the country.

The backbone of Meta’s generative AI tools is “Llama 3,” its most advanced large language model (LLM) to date. The primary app that Llama 3 powers is Meta AI, a generative AI product pitched primarily as a virtual assistant. Llama 3 was released in April and has since become quite popular, as it is widely seen as the most powerful LLM available for free and is even capable of outperforming the paid version of GPT-4 in terms of language skills (though it is largely outperformed in other categories). The most advanced version of Llama 3, 400B, is still in its initial training phase and not yet publicly available.

Leading companies willing to pull generative AI tools from countries with strict regulation

Thus far, some of the major players have proven willing to pull their generative AI tools from major markets rather than make the changes to training methods that regulators prescribe. Meta followed up its action in Brazil by declaring that it will also withhold its latest AI model from the EU for now, citing an “unpredictable” regulatory environment. This follows a rash of complaints filed in early June by prolific privacy group “noyb” in 11 EU member nations, claiming that Meta’s use of user data violates the General Data Protection Regulation (GDPR). Meta joins Apple, which said that the EU can expect to miss out on Apple Intelligence for at least the remainder of 2024 due to concerns about actions being brought against it under the Digital Markets Act (DMA).

Going forward, AI developers will also be contending with the EU AI Act, the world’s first comprehensive AI law. The Act was agreed upon politically this past December and published in the Official Journal of the EU on July 12. Its rollout is somewhat complicated: “entry into force” is set for August 1, but many of the Act’s prohibitions and obligations do not go into effect until 2025. Obligations for “high risk” systems do not go into effect until August 2026, and some are delayed to as late as 2030.

Though it is complex to navigate, the EU AI Act may well serve as a template adopted around the world, as the GDPR has for data protection. Brazil is one of the countries currently weighing regulations for generative AI tools (and taking some inspiration from the EU), and a draft law has been circulating since early 2023. The latest development has been a preliminary report presented to the country’s senate in May, and a number of legislators have called it a priority item for the remainder of 2024.

The pressure was turned up by a June report from Human Rights Watch documenting that photos of Brazilian children had been scraped from the open web and used to train assorted generative AI tools. The photos were found primarily in LAION-5B, a dataset used by Stability AI and Midjourney among other companies and services. Some of the scraped photos had personally identifiable information attached, and they turned up in a sample of just 0.0001 percent (roughly 5,850 entries) of the 5.85 billion images and captions the dataset contains.