Strange But True: Horror Stories of Cybersecurity - Issue #2

Strange But True: Horror Stories of Cybersecurity - Issue #2

ChatGPT creates mutating malware that evades detection by EDR | CSO Online

1.    A global sensation since its initial release at the end of last year, ChatGPT's popularity among consumers and IT professionals alike has stirred up cybersecurity nightmares about how it can be used to exploit system vulnerabilities. A key problem, cybersecurity experts have demonstrated, is the ability of ChatGPT and other large language models (LLMs) to generate polymorphic, or mutating, code to evade endpoint detection and response (EDR) systems.

Any signature-based detection mechanism, such as AV or EDR, has always been plagued by polymorphic strains. The response has typically been to make the signatures less specific in order to identify variants of the original malware, but that leads to false positives. Although AV and EDR providers have become better at creating signatures and have implemented more advanced heuristics, AI and ChatGPT will make it much easier for the malware coders to create variants that are both more frequent and harder to detect.

 

2.    ChatGPT and other LLMs have content filters that prohibit them from obeying commands, or prompts, to generate harmful content, such as malicious code. But content filters can be bypassed. 

You should always be leery of anyone that tells you that something can’t happen. The last 30 years of breaches, compromises and cybercrime illustrate that everything is hackable (See Kevin Mitnick’s book The Art of Deception). Content filters are created by humans, which means they will invariably have flaws that can be exploited. This reflects the challenge we have always faced as cybersecurity professionals; we have to be right everywhere, every time and the bad guys only have to be right once. 

 

3.    Almost all the reported exploits that can potentially be done through ChatGPT are achieved through what is being called "prompt engineering," the practice of modifying the input prompts to bypass the tool’s content filters and retrieve a desired output.

This sounds a bit like the logic behind SQL Injections. If you’re not familiar with how this attack works, basically a web server is expecting a certain type of input from the user such as a name, an address, or a telephone number. If the server is not performing proper input validation, then literally anything can be entered into those fields. So instead of an address, an attacker inputs a SQL command, or even better, a SQL command encoded in hex (which gets interpreted by the webserver). The system then receives that input as a command, and passes it to the back-end database, performing such as actions as displaying table contents, copying data, and accepting transfer requests. This is a bit of an oversimplification, but you get the idea.

So, this attack would involve a similar approach. The ChatGPT prompts are expecting some type of input, but what happens if you give it something that it’s not expecting? What will it do with that information? Reject it? Execute it? Respond to it? And will it take that action every time, or are the conditions that will cause the prompts to respond differently at different times or under different circumstances? The real challenge for developers will be to successfully anticipate every possible input variation, which is of course, not possible. So, for the time being, the vulnerability will remain. Maybe ChatGPT can provide a better solution?

 

4. “It is interesting to note that when using the API, the ChatGPT system does not seem to utilize its content filter. It is unclear why this is the case, but it makes our task much easier as the web version tends to become bogged down with more complex requests,” Shimony and Tsarfati wrote in their blog.

This is a great illustration of what we just covered. If you use the UI, the way the developers intended, the filters are present, and presumably, more difficult to manipulate. However, by using the API, the filters are not present and therefore can be bypassed. Oof.

The developers will likely fix this issue, but when and how many exploits will be written in that time frame? Also, we need to anticipate that the bad guys will assume this vulnerability will be addressed and will be actively looking for ways to manipulate the system after it’s been patched.

This is a particularly frightening scenario and has been the stuff of movies such as Blade Runner, 2001: A Space Odyssey and The Matrix, for generations. While these sorts of dystopian futures are not a guaranteed outcome for humanity, for the first time, we genuinely need to start thinking about such scenarios as more than science fiction.

To view or add a comment, sign in

Insights from the community

Others also viewed

Explore topics