Researchers Develop Self-Replicating Malware “Morris II” Exploiting GenAI

One of the great concerns about “GenAI,” or the technology that drives generative chatbots like ChatGPT and Microsoft Copilot, is its potential use as a weapon by hackers. Researchers with the Israel Institute of Technology, Cornell Tech and Intuit have developed self-replicating malware that can exploit at least three different types of GenAI email assistants.

The malware, which the researchers call Morris II, was successfully run against Gemini Pro, ChatGPT 4.0, and LLaVA. Morris II was able to exfiltrate personal data and take over email accounts for spamming purposes, additionally able to compromise these systems with a “zero-click” technique that does not require interaction with links or files by the victim.

GenAI exploit demonstrates reality of theoretical AI attacks

The self-replicating malware’s name refers back to the infamous “Morris worm” that tore through the early version of the internet in the late 80s, when things were largely limited to university and Defense Department access. But while that attack relied on exploiting the then-common use of very weak and easily-guessed passwords, Morris II focuses on tricking GenAI into turning input into output, pushed onward to other AI systems.

The researchers model focuses on attacking GenAI email assistants. The approach fills a malicious email with prompts, meant for the AI assistant rather than the human recipient. The self-replicating malware uses these prompts to convince the AI assistant to fetch malicious data that then corrupts its internal database. The GenAI model can then be jailbroken and convinced to pass the same sequence of instructions on to other hosts using the same or similar AI assistants.

While the main approach for the self-replicating malware uses an email full of typed prompts, the researchers also demonstrated a method by which an image or audio file could be used to pass this same set of malicious commands. All of this would be initiated simply by receiving a malicious email, with no need for the target to click on anything or download any attachments.

As to the self-replicating malware’s capability once it has infested target systems, the researchers demonstrated the ability to automatically generate and send emails (something of obvious interest to propagandists or spammers). But the malware can also potentially rifle through inboxes and attachments for sensitive information, with the researchers showing how it could capture credit card and Social Security numbers from victims.

GenAI regulation may need to address security holes

GenAI attacks of this type have not yet been seen in the wild, and the researchers demonstrated this approach under lab conditions. But security researchers have been warning that state-sponsored hackers have been observed experimenting with the offensive capability of ChatGPT and similar tools since they became available.

The self-replicating malware functions by identifying prompts that will generate output that serves as a further prompt, in a process that is not very different from how common buffer overflow attacks operate. The approach also exploits a feature of GenAI called “retrieval-augmented generation” (RAG), a method by which LLMs can be prompted to retrieve data that exists outside of their training model. Ultimately the researchers blamed poor design for opening the door to this approach, urging GenAI companies to go back to the drawing board and improve their architecture.

GenAI email assistants of the sort that were attacked here are already a popular type of automation and productivity tool, performing features that range from automatically forwarding incoming emails to relevant parties to generating replies. Market estimates tend to put the value of this industry at just under $1 billion as of 2023, but see steady growth to around $2 billion over the next few years. Companies are also in a hurry to add GenAI components to all sorts of other business systems, creating ample opportunities for similar types of self-replicating malware.

OpenAI responded to the researchers by saying that the technique relied on user input that has not been checked or filtered, and that it was working to address the issue. But some security researchers believe that filtering and API rules will never be able to adequately stop GenAI from being manipulated into processing user input as machine output, and that these systems may have to be divided by specific functions and siloed from each other to truly end the threat of self-replicating malware.

Whatever the ultimate solution is, clearly more testing and development is needed. Regulation is also on the horizon. It is much closer to implementation in the EU, but in the US the White House has put forth a blueprint for an AI Bill of Rights and issued an executive order on AI safety in late 2023. That has since been followed by the January formation of an AI Council that has been given 90 days to address primary AI security concerns, which include more reporting requirements for AI developers and for cloud developers that offer services to foreign parties training powerful AI models.

James McQuiggan, Security Awareness Advocate at KnowBe4, advises that this development also calls for revised thinking about how social engineering attacks might be deployed: “Whether it’s security researchers or cybercriminals, with the capability of a self-replicating worm via a generative AI prompt, it further supports the need for users to be aware and educated about new styles of social engineering. More than ever, organizations need to adopt proactive approaches to all aspects of cybersecurity programs. Addressing the security challenges posed by GenAI-driven threats requires a collaborative effort among stakeholders across all cybersecurity industries. Organizations and ISACs must share knowledge and best practices, develop standardized frameworks for secure AI deployment, and engage in joint efforts to identify and mitigate emerging threats. Collaboration between AI developers, cybersecurity professionals, industry leaders, and regulatory bodies is essential to ensure that GenAI technologies are used responsibly and securely.”

But Kev Breen, Senior Director Threat Research at Immersive Labs, thinks that self-replicating malware of this sort is still some time off from actually appearing in the wild: “The AI attack surface is still largely unknown and continues to grow every day, which is where we see research like this emerge. It’s important to note that this AI worm is largely theoretical, makes a lot of assumptions about how to embed AI and is designed to be vulnerable — so we likely wouldn’t see this in the real world. However, this new research does demonstrate the need for developers to understand how prompts flow between AI components to avoid these risks. Organizations and application developers have to keep pace, not just with the changes in technology, but the ways attackers are exploiting them as well. It’s not enough to assume your development teams or users can securely use AI. If it’s embedded in applications, you have to assume at some point it could be compromised, like any other CVE in any other application. While OWASP for LLMs helps developers to understand the threats, nothing is better than exposing developers to real world examples through continuous exercising. True resilience is knowing how to be both proactive (with users and developers) and reactive to effectively respond after an incident.”