
Twitter Now Claims Data Leak of 200 Million Profiles Was Phony, Data Set Was Assembled From Pre-Existing Sources

An apparent data leak of private information from 200 million user profiles is a hoax, at least according to Twitter. The company claimed that the available information could not be correlated with any data originating from a breach of the platform’s systems, and that it could find no evidence that a previously patched vulnerability from early 2022 was used to gather it.

Confusion and controversy have surrounded the data set since it first appeared for sale on the dark web in early January. It purported to be a cleaned-up version of an earlier data dump of some 400 million profiles, with all duplicate entries removed. The data was thought to have been taken via an API scraping vulnerability that Twitter patched in January 2022. Security researchers had matched email addresses to account names, providing an indication that the data leak was legitimate, but Twitter says that the data was gathered via a variety of publicly available sources.

Status of Twitter data leak remains unclear as company denies legitimacy

Twitter’s denial of the data leak came in a January 11 blog post, in which it said that “there is no evidence” the data being sold was obtained via a breach of its systems. It found that the original set of 400 million files for sale and the pared-down version containing 200 million files were not connected to previous incidents, and that neither contained any information related to user passwords.

Twitter’s theory is that the alleged data leak is actually a collection of information gathered via public sources, timed to allow someone to make a few quick dollars off of dark web buyers. However, the collection apparently does contain at least some legitimate information and Twitter has warned users to expect scam messages in attempts to compromise their credentials.

The whole incident traces back to a vulnerability that was present in Twitter’s API from June 2021 to sometime in early January 2022. It was a fairly standard API scraping bug, allowing malicious parties to feed lists of email addresses or phone numbers to the system and receive back any user names they were connected to (along with a small assortment of other profile information). Twitter has acknowledged that prior vulnerability and confirmed that at least 5.4 million profiles were exposed during the breach window, with that prior data leak appearing on an underground forum with an asking price of $30,000 USD.
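As a rough illustration of why this class of bug is so useful to scrapers, the sketch below contrasts a lookup that reveals which handle an identifier maps to with one that does not. It is a toy in-memory model, not Twitter's API: the account data, the function names (leaky_lookup, enumerate_handles), the HardenedLookup class and its response strings are all hypothetical and chosen only for the example.

```python
from dataclasses import dataclass, field

# Hypothetical in-memory account store; every entry here is made up.
ACCOUNTS = {
    "alice@example.com": "@alice",
    "bob@example.com": "@bob",
}


def leaky_lookup(email: str):
    """Vulnerable pattern: tells the caller which handle an email maps to."""
    return ACCOUNTS.get(email.lower())


def enumerate_handles(candidates):
    """What a scraper does with a leaky lookup: feed in a list, keep the hits."""
    return {e: h for e in candidates if (h := leaky_lookup(e)) is not None}


@dataclass
class HardenedLookup:
    """Assumed mitigation: identical responses plus a per-client request cap."""

    max_requests: int = 3
    _counts: dict = field(default_factory=dict)

    def lookup(self, client_id: str, email: str) -> str:
        # Count every request from this client, whether or not it matches.
        count = self._counts.get(client_id, 0) + 1
        self._counts[client_id] = count
        if count > self.max_requests:
            return "rate_limited"
        # The response is the same for registered and unregistered emails,
        # so it cannot be used to confirm that an account exists.
        return "if_account_exists_we_sent_instructions"


if __name__ == "__main__":
    guesses = ["alice@example.com", "nobody@example.com", "BOB@example.com"]
    print(enumerate_handles(guesses))

    api = HardenedLookup(max_requests=2)
    for guess in guesses:
        print(api.lookup("client-1", guess))
```

The point of the hardened variant is that the response carries no signal about whether an identifier is registered, while the request cap limits how many guesses a single client can make. A real service would need considerably more than this.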

There has been quite a price drop since then; the hacker was asking a mere $2 in early January for access to the purported cleaned-up set of 200 million files. That might point to the data leak being bogus, but it might also point to the information already being in such broad circulation that it has virtually no value beyond the convenience of the packaging.

Wording of Twitter denial, independent analysis raise some doubts

Some independent security professionals have performed their own analysis of the Twitter data leak, found it to seemingly originate from a vulnerability and to contain unique data, and stand by their assertion that the 200 million files are as advertised by the hacker. Some also note that the wording of Twitter’s denial is not entirely definitive, leaving legal wiggle room for a future turnaround if “new evidence” should emerge.

One question that remains open is exactly how many attackers were able to take advantage of the 2021 vulnerability before Twitter was able to close it. The company reportedly only discovered it as the result of a bug bounty submission in December 2021, and confirmed it had been exploited by “multiple” attackers before it was patched. In this case, the person offering the files for sale claimed not to be responsible for the data leak, saying they had simply come across a collection of profile information that “looked like a script kiddie had dumped it.”

Twitter has had a checkered cybersecurity record in recent years, though most of the incidents are from before the controversial tenure of Elon Musk. The most notorious was a social engineering scheme in mid-2020 that saw high profile accounts taken over to tweet out links to a crypto scam, causing all of the platform’s verified “blue check” accounts to be temporarily suspended for the better part of a day as the issue was figured out. It also suffered a 2019 data leak that was specific to Android app users, causing tweets labeled as “private” to be accessible to non-followers via a software oversight.


While Twitter does not make the news for regulatory issues quite as often as some of its big tech contemporaries, it is currently under investigation by Ireland’s Data Protection Commission for both the 2019 Android issue and the more recent data leaks. Twitter says that it is maintaining contact with the relevant authorities and supporting the investigations. It remains to be seen if US authorities will take interest in the data leak.