Python Module’s 15 Year-Old Flaw Introduces Supply Chain Vulnerability to Over 350,000 Open Source Projects

Trellix Advanced Research Center detailed a 15-year-old Python module flaw that introduces a software supply chain vulnerability to over 350,000 open source projects.

Tracked as CVE-2007-4559, the path traversal vulnerability exists in Python’s tarfile module and carries a medium severity score of 6.8 out of 10 under CVSS version 2.

Attackers could exploit the flaw by uploading a malicious tar archive, generated with a few lines of simple code, to achieve arbitrary code execution or take control of the target device.
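
To illustrate why a path traversal in an archive matters, the sketch below shows how a member name containing ".." segments resolves once it is joined to the extraction directory. It performs no extraction and touches no files. The helper names and the example paths are hypothetical, and the extraction directory is assumed to be an absolute POSIX path.

```python
import posixpath


def resolved_target(extract_dir: str, member_name: str) -> str:
    """Return the normalized path a naive extractor would end up writing to."""
    return posixpath.normpath(posixpath.join(extract_dir, member_name))


def escapes(extract_dir: str, member_name: str) -> bool:
    """True if the member would land outside extract_dir (assumed absolute)."""
    base = posixpath.normpath(extract_dir)
    target = resolved_target(base, member_name)
    # If the deepest shared ancestor is not the base itself, the target escaped.
    return posixpath.commonpath([base, target]) != base


if __name__ == "__main__":
    for name in ("docs/readme.txt", "../../../etc/cron.d/job"):
        print(name, "->", resolved_target("/srv/app/uploads", name),
              "escapes:", escapes("/srv/app/uploads", name))
```

An extractor that trusts member names will follow the second path out of the upload directory, which is the behavior the CVE describes.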

According to Trellix, the tarfile vulnerability is a “massive supply chain issue” because exploiting it requires little to no knowledge of computer security.

Trellix’s discovery coincided with the launch of its new Advanced Research Center, which mobilizes security researchers worldwide.

The center will provide real-time intelligence and actionable insights to help customers detect and respond to cyber threats.

Closed and open source projects have supply chain vulnerability

Trellix noted that some software developers, including open source projects, have addressed the issue independently.

However, a majority (61%) are introducing the supply chain vulnerability into their software by using the module unsafely.

“When we talk about supply chain threats, we typically refer to cyber-attacks like the SolarWinds incident, however building on top of weak code-foundations can have an equally severe impact,” Christiaan Beek, Head of Adversarial and Vulnerability Research at Trellix, said.

Trellix analyzed 590,000 unique open source projects, excluding proprietary software whose code was unavailable for analysis.

“Today, left unchecked, this vulnerability has been unintentionally added to hundreds of thousands of open- and closed-source projects worldwide, creating a substantial software supply chain attack surface,” Trellix said.

The most affected industries were software development, artificial intelligence and machine learning, web, data science, and IT management.

Many software development tools, SDKs, CI/CD tools, automation tools, and Docker containerization tools are affected.

For context, the Codecov supply chain attack exploited the company’s CI/CD pipeline. It’s not inconceivable that attackers could accomplish the same with the tarfile module by leveraging a vulnerable software development tool.

Even worse, the module is included by default in any Python project, including software by major companies such as AWS, Google, Intel, Facebook, and Netflix, among others.

If vulnerable, these products put millions of users at risk of supply chain attacks similar to SolarWinds.

“If an application can be forced to extract a malicious Tar file, an arbitrary file overwrite could lead to code execution,” said Siobhan Hunter, Security Research Manager at Synopsys Software Integrity Group. “This vulnerability is similar to an even older vulnerability CVE-2001-1267 in GNU Tar.”

While acknowledging that the vulnerability could be abused, Mike Parkin of Vulcan Cyber questioned whether attackers would find it worth exploiting.

“The library is widely used and there are ways to abuse intended functionality with it, but it’s unclear if anything in the original assessment has changed,” Parkin said. “One has to think that after 15 years, if attackers were going to leverage a known issue, they’d have done so by now.”

Online tutorials propagate supply chain vulnerability

Trellix researchers highlighted the failure of online tutorials to educate developers on the risks of using the vulnerable tarfile module.

Popular tutorial sites, including Python’s documentation, Tutorialspoint, and GeeksforGeeks, fail to address the tarfile path traversal vulnerability adequately.

Thus, by following online tutorials, programmers have unknowingly produced software with this supply chain vulnerability for years.

“This vulnerability’s pervasiveness is furthered by industry tutorials and online materials propagating its incorrect usage,” Beek said. “It’s critical for developers to be educated on all layers of the technology stack to properly prevent the reintroduction of past attack surfaces.”

Python’s documentation does carry a clear warning about using the tarfile module, but that warning has done little to stop most developers from using the module unsafely.

“Discussions at the time with the Python Foundation and major Linux vendors concluded that it was ‘working as intended,’” said Mike Parkin, Senior Technical Engineer at Vulcan Cyber. “While it was, and remains, possible to abuse the Python function in question, the circumstances required to do it were fairly specific, and there are ways to mitigate the risk.”

Mitigating software supply chain vulnerability in open source projects

Highlighting the importance of collaboration in patching vulnerabilities, Trellix is working to protect open source projects from the software supply chain vulnerability by pushing code via GitHub pull requests.

The cybersecurity company has identified at least 11,000 open source projects ready for patching.

Additionally, Trellix published the Creosote tool for free on GitHub to help developers check whether their Python projects suffer from the path traversal vulnerability in the tarfile module.
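
As a rough idea of what a first-pass check for risky usage might look like, the sketch below walks a Python source string and reports every call to a method named extract or extractall. It is not Creosote’s implementation, and the function name is hypothetical. It is only a heuristic: it also flags unrelated objects with the same method names (zipfile, for instance) and cannot tell whether the call is already guarded.

```python
import ast

# Method names worth a manual review when they appear in a call.
RISKY_METHODS = {"extract", "extractall"}


def find_extract_calls(source: str) -> list[tuple[int, str]]:
    """Return (line number, method name) for each .extract()/.extractall() call."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if (isinstance(node, ast.Call)
                and isinstance(node.func, ast.Attribute)
                and node.func.attr in RISKY_METHODS):
            hits.append((node.lineno, node.func.attr))
    return sorted(hits)


if __name__ == "__main__":
    sample = (
        "import tarfile\n"
        "with tarfile.open('upload.tar') as archive:\n"
        "    archive.extractall('out')\n"
    )
    for line, method in find_extract_calls(sample):
        print(f"line {line}: call to .{method}() should be reviewed")
```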

Developers can also mitigate the vulnerability by adding code to their project to confirm that the destination of the written files matches the target directory.
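
A minimal sketch of that kind of check follows, assuming the archive is opened with the standard tarfile module. The function names are hypothetical, and this is one possible approach rather than the exact patch Trellix submits. It validates member names only; symbolic and hard link members would need separate handling.

```python
import os
import tarfile


def is_within_directory(directory: str, target: str) -> bool:
    """True if target resolves to a location inside directory."""
    base = os.path.realpath(directory)
    dest = os.path.realpath(target)
    return os.path.commonpath([base, dest]) == base


def safe_extractall(archive_path: str, dest_dir: str) -> None:
    """Extract an archive only if every member stays inside dest_dir."""
    with tarfile.open(archive_path) as tar:
        # Validate every member before writing anything to disk.
        for member in tar.getmembers():
            member_path = os.path.join(dest_dir, member.name)
            if not is_within_directory(dest_dir, member_path):
                raise ValueError(f"blocked path traversal: {member.name!r}")
        tar.extractall(dest_dir)
```

Checking all members before extracting means a malicious archive is rejected as a whole instead of being partially unpacked. Python 3.12, and later security releases of some older versions, also accept a filter argument to extractall (for example filter="data") that rejects such members in the standard library itself.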

Josh Kocher, an Adversarial Engineer at LARES Consulting, advised developers to be “mindful of library dependencies,” track vulnerabilities, and apply relevant mitigations.

Additionally, they should consider third-party libraries as “untrusted,” sanitize all inputs and apply error handling for all conditions.

“Vulnerabilities found in libraries can often be far-reaching in their impact due to the number of projects that may make use of them and, as seen with CVE-2007-4559, these vulnerabilities can exist in projects long after the vulnerability has been discovered,” he warned.