A number of companies, Clearview AI most prominent among them, have found themselves in trouble recently due to mass scraping of personal data from social media sites and other public platforms. A whistleblower has revealed that the practice is taking place in China as well, but big data firm Zhenhua is thinking globally instead of locally. The Zhenhua data leak reveals that the firm has scraped the personal data of at least 2.4 million people from around the world, with the subject selection indicating that it is an operation designed to gather intelligence and determine potential targets of influence.
Zhenhua data leak: Global espionage?
While data scraping violates both the terms of service of most sites and the trust that individual users place in them, there are more than just procedural and privacy concerns in play in this particular case. Zhenhua has been linked to the Chinese military, amplifying concerns that the data it is gathering is meant for use in espionage and potentially in acquiring foreign intelligence assets. The CCP and the People’s Liberation Army are both clients of the firm.
The database of 2.4 million people was provided to Australian cybersecurity consultancy Internet 2.0 by Shenzhen-based academic Christopher Balding, an American citizen who received it from an anonymous source who he said was concerned about CCP surveillance. Internet 2.0 found the records of about 52,000 Americans, 35,000 Australians, 10,000 Britons and Indians, and 5,000 Canadians in the data set. These ranged from private persons who do not have much of a public profile to major figures such as national leaders, the British royal family, celebrities and members of the military. The Zhenhua data leak records have not been fully explored, as a large portion was found to be corrupted when turned over to Balding.
Database search queries were also available, indicating that someone at the Chinese tech company had been probing for information on the UK military and British politics, as well as searching for employees at Australia’s Gilmour Space Technologies who might have a criminal record. A database index was also found that listed thousands of UK citizens convicted of crimes related to drugs, fraud or terrorism, as well as directors of UK companies who had been removed from their positions for various reasons. The term “politically exposed” was also attached to some profiles without further explanation of what it was supposed to mean.
The personal data collected on thousands of academics almost exclusively deals with research done in the areas of hard sciences such as biology, as well as technology and business. The fact that other academic disciplines such as linguistics or history are barely included provides further circumstantial evidence of a government-backed intelligence-gathering operation.
Zhenhua denied that there was a database of two million people, claimed that the personal data it collected was protected by “trade secrets” and denied that it had any links to the Chinese government. But Balding’s initial analysis of the database found that it was very sophisticated and that the individuals included were highly targeted and placed in influential positions in various institutions of political interest. Zhenhua referred to it as the “Overseas Key Information Database” and framed it as a standard research database collected for its private business and research organization clients. It said the primary purpose was to link individuals to their social media accounts.
The information in the Zhenhua data leak was scraped from a number of social media sites including LinkedIn, Twitter, Facebook, TikTok and Crunchbase. Most of these organizations ordered Clearview AI to cease similar scraping earlier this year, citing violation of each platform’s terms of service. While the majority of the information included in the database seems to have been scraped from public postings on these services, certain items indicate that confidential financial, employment and health information has been included in some of the profiles.
The increasing security risks of posting public personal data
Public figures have always had to think more carefully than most about what they choose to share online, but the Zhenhua data leak shows that nation-states are now actively sniffing around in the affairs of common people in foreign countries looking for any public data they might be able to exploit.
The new security risk that the Zhenhua data leak represents has been summed up by Dr. Zac Rogers of Australia’s Flinders University as “agglomeration.” Revealing bits of potentially sensitive personal or contact information here and there in scattered public places was previously not something to worry about, but aggressive scraping methods are sniffing out and collating all of these scraps of information in a way that makes them much more potentially threatening. Even more sensitive personal data obtained via leaks and breaches can be added to this recipe to make things much worse.
Nation-states might use this personal data for intelligence or attempts to recruit an asset, or they might turn it over to hackers for use in targeted spearphishing or identity fraud campaigns. The Zhenhua data leak also illustrates the added danger of collated personal data being in the hands of a private company, one that is prone to being breached at any time.
Specific to the Zhenhua data leak is the revelation that the world may be underestimating China’s attempts to digitally project influence and run foreign interference throughout the world, to the point of sizing up even relatively low-level private citizens who have criminal records or some sort of checkered past that could open them up to potential exploitation. It would appear that one no longer needs to be a high-profile figure to be the target of a nation’s intelligence apparatus.