It is pretty much an open secret that various intelligence agencies and military branches in the United States have a vested interest in monitoring social network posts. Terrorist organizations such as Al Qaeda and Islamic State (amongst others) often use these posts to pass coded messages between members, promote their agenda and trawl for new recruits. This type of Internet surveillance is now common. However, the security of the data collected is essential – a leaked database alerts terrorist organizations to vulnerabilities in their use of social media as a communications tool.
Leaked databases may provide clues as to the keywords or algorithms used to identify behavior that raises flags with U.S. agencies. This may help ‘foreign intelligence’ (including non-government players such as terrorist organizations) gain insight into how to avoid behavior that raises a red flag with the agencies conducting the online surveillance.
Internet surveillance slipup
In mid-November 2017, longtime security breach hunter Chris Vickery of UpGuard revealed that three misconfigured AWS buckets containing ‘dozens of terabytes’ worth of social media messages, including Facebook posts and Twitter feeds, were open to the public. This information was gathered by the U.S. military as part of its ongoing efforts to identify so-called ‘persons of interest’.
Vickery’s examination of the leaked database revealed that one of the buckets contained a staggering 1.8 billion social media posts, automatically gathered over the past eight years. That’s just the tip of the iceberg. The leaked data consists of compressed text files which, in his estimation, can be expanded by a factor of ten.
‘There’s dozens and dozens of terabytes out there and that’s a conservative estimate,’ said Vickery.
The leaked database was discovered during a routine scan of Amazon-hosted data buckets, which were neither hidden nor password protected. The buckets went by the slightly ominous names of centcom-backup, centcom-archive, and pacom-archive.
The CENTCOM connection
The names are ominous because CENTCOM is an abbreviation of U.S. Central Command, the umbrella body that oversees the army, navy, air force and marines, as well as special operations, in the Middle East, North Africa and Central Asia. PACOM is the name for U.S. Pacific Command, covering the rest of South Asia, China and Australasia. Even a cursory knowledge of geography, politics and threat analysis indicates that some of these nations are among the global hotspots for terrorism-related activity. These are not regions where the United States will want to reveal information to those who are antagonistic to its presence.
The data appears to have been collected as part of the United States government’s ‘Outpost’ program, which monitored youths and tried to influence them to avoid the temptation of joining terrorist groups.
The sheer scale of the leak reveals some worrying trends in Internet surveillance. Although it’s common knowledge that social media activity is of great interest to U.S. authorities (and to other governments), the reach and extent of the surveillance makes many uncomfortable. Although the CENTCOM and PACOM areas of operation do not include the United States, users of social networks in that country should not rest easy. The leak showed that American social network user activity was also swept up by the keyword searches and algorithms used in the ‘Outpost’ program.
Leaked database – The Coral Reef connection
One of the files examined referred to ‘Coral’, most likely a reference to the ‘Coral Reef’ data mining program. Coral Reef is designed to analyze a major data source, giving analysts the ability to mine significant amounts of data and suggesting associations between individuals active on a social network.
As worrying as the public availability of the information is, the U.S. military has hardly been coy about its monitoring activities. In fact, it has even revealed that the ‘Coral Reef’ software has won awards in the past. But even this laissez-faire attitude hides something deeply disturbing. The sheer ease with which the leaked database saw the light of day, and just how poorly it was protected, raises the question: are organizations like CENTCOM, which are involved in Internet surveillance, seriously considering whether this sort of information should be stored on a public cloud at all?
It’s fairly obvious that those organizations that mean America harm would love to get their hands on this information.
Amazon is not ignoring the problem, in part because this is not the first time that the U.S. military has let the genie out of the database bottle. So the Internet giant is adding folder encryption tools, displaying warning messages to inform users that their buckets are not locked down, and tightening access controls. But in the end, it comes down to that vexing question of public vs private clouds. It seems obvious that the U.S. government should not be using publicly accessible storage for data like this – the risks are simply too great.
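To make the idea of a bucket that is ‘not locked down’ more concrete, here is a minimal sketch in Python. The article does not say exactly how the three buckets were misconfigured, so this assumes one common case: a bucket policy that allows access to any principal. The bucket name example-archive, the account number and the function names are all hypothetical, and the check deliberately ignores ACLs and account-level settings that also affect real-world access.

```python
import json


def _as_list(value):
    """Policy fields may hold a single item or a list; normalize to a list."""
    return value if isinstance(value, list) else [value]


def is_public_principal(principal):
    """True if the Principal element matches anyone ("*" or {"AWS": "*"})."""
    if principal == "*":
        return True
    if isinstance(principal, dict):
        return any("*" in _as_list(v) for v in principal.values())
    return False


def public_statements(policy_json):
    """Return Allow statements open to any principal with no Condition."""
    policy = json.loads(policy_json)
    flagged = []
    for stmt in _as_list(policy.get("Statement", [])):
        if (stmt.get("Effect") == "Allow"
                and is_public_principal(stmt.get("Principal"))
                and not stmt.get("Condition")):
            flagged.append(stmt)
    return flagged


# Hypothetical policy for illustration only (not taken from the leak).
EXAMPLE_POLICY = json.dumps({
    "Version": "2012-10-17",
    "Statement": [
        {"Sid": "PublicRead", "Effect": "Allow", "Principal": "*",
         "Action": "s3:GetObject",
         "Resource": "arn:aws:s3:::example-archive/*"},
        {"Sid": "AdminOnly", "Effect": "Allow",
         "Principal": {"AWS": "arn:aws:iam::111122223333:root"},
         "Action": "s3:*",
         "Resource": "arn:aws:s3:::example-archive/*"},
    ],
})

for stmt in public_statements(EXAMPLE_POLICY):
    print("Open to anyone:", stmt["Sid"], stmt["Action"])
```

In this example only the first statement is flagged, because it grants read access to every principal. A warning message of the kind Amazon is adding would presumably rest on a similar, though far more thorough, evaluation.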
The buckets are now hidden, thanks to Vickery notifying the U.S. military of the foul-up – but it seems a good bet that something like this will happen again if organizations like CENTCOM do not radically overhaul their approach to gathering and storing the results of online surveillance.