
National Public Data Breach May Be Packed With Incorrect and Duplicate Information

The news of the data breach of broker National Public Data last week came with splashy headlines attached, in some cases implying that every Social Security number in the United States had been stored in plaintext and was now exposed to criminals. After having had some time to pore through the massive collection of data, some security researchers now believe that the breach may contain a good deal of inaccurate information and that well over half of its 2.9 billion records of data are duplicates.

The updated analysis of the data breach is not entirely good news, as researchers have verified that there are still probably about 899 million unique entries and that legitimate Social Security numbers are among the records. But some, such as Troy Hunt of “Have I Been Pwned,” believe that the leaked data contains many records from deceased individuals, records that are simply email addresses with incorrect personal information attached to them, and numerous records that consist of public and non-sensitive personal information likely scraped from sources on the open internet.

Actual damage level of data breach thrown into question

All of this is not to say that legitimate Social Security numbers, including some that may not have been previously exposed, were not included in the data breach. Individuals with credit and dark web monitoring service subscriptions have been getting notifications of this information being leaked to the dark web since the breach was announced. But the splashy claims about “every Social Security number” being exposed do not appear to be holding up to closer analysis.

So what exactly is in the National Public Data breach? Given the huge volume of information, it’s hard to break everything down exactly. But Troy Hunt’s analysis finds that what most likely happened is that some number of Social Security numbers were legitimately stolen from the data broker some months ago, and that the stolen set was later “enriched” with other information taken from public sources or prior breaches. The total amount of completely new information included in the breach thus remains a mystery at this point.

Hunt notes that the first public rumblings of the data breach appeared on the dark web and social media back in April, with the attackers looking to sell the data privately for $3.5 million. His analysis found that about 899 million of the 2.9 billion records offered had what appeared to be unique Social Security numbers attached, but that would be nearly triple the current population of the US; even if “Social Security number” was used as shorthand to also cover comparable Canadian and UK identifiers, the figure is still about double the combined population of the three countries.
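The arithmetic behind these comparisons can be checked with a quick sketch. The record counts are the ones reported above; the population figures are rough assumptions added for illustration (roughly 335 million for the US, 40 million for Canada and 68 million for the UK) and are not taken from Hunt's analysis.

```python
# Record counts as reported in the article.
TOTAL_RECORDS = 2_900_000_000
UNIQUE_SSN_RECORDS = 899_000_000

# Rough population figures. These are assumptions for illustration only,
# not numbers from the article or from Hunt's analysis.
US_POPULATION = 335_000_000
CANADA_POPULATION = 40_000_000
UK_POPULATION = 68_000_000

# Share of all records that carry an apparently unique Social Security number.
share_with_ssn = UNIQUE_SSN_RECORDS / TOTAL_RECORDS

# Unique numbers per resident, for the US alone and for all three countries.
ssn_per_us_resident = UNIQUE_SSN_RECORDS / US_POPULATION
combined_population = US_POPULATION + CANADA_POPULATION + UK_POPULATION
ssn_per_combined_resident = UNIQUE_SSN_RECORDS / combined_population

print(f"Share of records with a unique SSN: {share_with_ssn:.0%}")
print(f"Unique SSNs per US resident: {ssn_per_us_resident:.1f}")
print(f"Unique SSNs per US/Canada/UK resident: {ssn_per_combined_resident:.1f}")
```

Under those assumed populations, the unique numbers make up a little under a third of all records, nearly three per US resident, and about two per resident of the three countries combined, which is consistent with the "triple" and "double" comparisons.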

What Hunt found to explain this discrepancy is that many of these records appear to belong to deceased individuals, some of whom have been dead for as long as 20 years. National Public Data was likely storing this data in connection with a service it offers to identify and locate relations, including those who are deceased.

If only about a third of the 2.9 billion records contain some kind of Social Security number, what is the rest of the data breach made up of? Hunt found that about 134 million records were a unique email address paired with basic contact information, such as full names and residential addresses. However, by checking his own information, he found multiple entries for a legitimate email address that had wildly incorrect personal information attached. Nor does every entry have a complete set of this information (whether or not any parts of it are wrong).

Leaked data may not be entirely new, but centralization continues

While the data breach was initially offered for private sale in April, it was openly dumped to the dark web by a hacker calling themselves “Fenice” earlier this month, prompting the headlines that made national news. National Public Data has yet to publicly confirm the breach, but has told media sources that it is investigating “third party claims” about the issue. However, there is one big clue that the leaked information did come from a data broker of this type: there have been no known reports of breach notifications from people who used a data opt-out service to require these companies to remove their personal information from their stores.

Regardless of the amount of new information leaked, the incident once again highlights the fact that hackers are actively combining the contents of publicly available breaches for convenience (even when doing so brings them no apparent profit). Some form of credit monitoring may serve as an emergency measure, but in the long term the issue will likely not be resolved until US data brokers are faced with some sort of national-level regulation. As Nick Tausek, Lead Security Automation Architect at Swimlane, observes: “This incident emphasizes the necessity for stricter data protection regulations. Whether information is scraped or voluntarily provided, organizations must take greater responsibility in securing the vast amounts of personal data they collect. By adopting stronger regulations and proactive security measures, organizations can better protect sensitive information and mitigate the risks posed by such breaches.”

Paul Laudanski, Director of Security Research at Onapsis, notes that businesses must also account for this new reality: “Businesses need to remain vigilant for potential crimes such as IRS tax refund fraud. Monitoring financial accounts, credit reports, and IRS correspondence is essential. Businesses, meanwhile, must ensure the security of their supply chains, infrastructures, and applications. This includes conducting regular security assessments, implementing strong encryption, and training employees to follow security best practices. While complete prevention is challenging due to the evolving nature of the landscape, proactive measures can be taken to significantly reduce the risk of attacks at this scale. Investing in strong cybersecurity defenses, employee training, and incident response planning is essential. By staying informed and adaptable, organizations can better protect themselves against these attacks and mitigate these threats swiftly.”

Bala Kumar, Chief Product & Technology Officer at Jumio, adds: “As hackers continue to breach massive amounts of information, businesses must adopt more advanced and secure identity verification strategies. In this data breach climate, organizations must implement robust identity verification systems that go beyond basic credentials, as it is significantly difficult to determine if information is real or fraudulent. Biometric verification, for example, can help ensure that only legitimate users gain access to sensitive information, reducing the risk of unauthorized access and subsequent harm. As cyber threats continue to evolve, so must the defenses employed by companies responsible for protecting users’ identities.”