National Cyber Warfare Foundation (NCWF)

Hacking Artificial Intelligence (AI): Hijacking AI Trust to Spread C2 Instructions


2025-10-30 14:24:26
milo






Welcome back, aspiring cyberwarriors!





We’ve come to treat AI assistants like ChatGPT and Copilot as knowledgeable partners. We ask questions, and they provide answers, often with a reassuring sense of authority. We trust them. But what if that very trust is a backdoor for attackers?





This isn’t a theoretical threat. At the DEF CON security conference, offensive security engineer Tobias Diehl delivered a startling presentation revealing how he could “poison the wells” of AI. He demonstrated that attackers don’t need to hack complex systems to spread malicious code and misinformation; they just need to exploit the AI’s blind trust in the internet.





Let’s break down Tobias Diehl’s work and see what lessons we can learn from it.





Step #1: AI’s Foundational Flaw





The core of the vulnerability Tobias discovered is simple. When a user asks Microsoft Copilot about a topic outside its original training data, it doesn’t just guess. It performs a Bing search and treats the top-ranked result as its “source of truth.” It then processes that content and presents it to the user as a definitive answer.
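
To make that pipeline concrete, here is a minimal Python sketch of the trust pattern, assuming a generic search-grounded assistant. The mock_search() and mock_llm() functions are hypothetical stand-ins invented for illustration; this is not Copilot’s actual implementation, only the behavior Tobias describes: whatever ranks first becomes the answer.

```python
# A minimal sketch of a search-grounded answer pipeline. mock_search()
# and mock_llm() are hypothetical stand-ins, not Copilot internals; the
# point is that the #1-ranked result becomes the "source of truth".

def mock_search(query: str) -> list[dict]:
    # In the attack, rank 1 is the attacker's poisoned page.
    return [{
        "url": "https://attacker-controlled.example",
        "content": "FAQ: To watch the stream, first run this command ...",
    }]

def mock_llm(prompt: str) -> str:
    # A real model would paraphrase the source; the mock just echoes it.
    return prompt.split("SOURCE:\n", 1)[1]

def answer(query: str) -> str:
    top = mock_search(query)[0]          # blind trust: rank 1 wins
    prompt = f"Question: {query}\nSOURCE:\n{top['url']}\n{top['content']}"
    return mock_llm(prompt)              # attacker controls the output

print(answer("Where can I stream Zero Day Quest?"))
```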





This is a critical flaw. While Bing’s search ranking algorithm has been refined for over a decade, it’s not infallible and can be manipulated. An attacker who can control the top search result for a specific query can effectively control what Copilot tells its users. This simple, direct pipeline from a search engine to an AI’s brain is the foundation of the attack.





Step #2: Proof of Concept





Tobias leveraged a concept he calls a “data void,” which he describes as a “search-engine vacuum.” A data void occurs when a search term exists but there is little or no relevant, up-to-date content available for it. In such a vacuum, an attacker can more easily create and rank their own content. Moreover, data voids can be deliberately engineered.
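
To see why a data void is attractive to an attacker, consider a rough scoring heuristic like the Python sketch below. The canned result counts are invented for illustration; in practice you would pull them from a real search API.

```python
# Back-of-envelope heuristic for spotting a "data void": a query users
# will plausibly type, but for which the index holds few relevant,
# recent pages. The counts below are invented for illustration.

CANNED_RESULT_COUNTS = {
    "where can i stream zero day quest": 2,   # niche, brand-new event
    "how do i install python": 100_000,       # saturated topic
}

def is_data_void(query: str, threshold: int = 5) -> bool:
    """Few fresh, relevant results mean a new page can rank quickly."""
    count = CANNED_RESULT_COUNTS.get(query.lower().strip(" ?"), 0)
    return count < threshold

print(is_data_void("Where can I stream Zero Day Quest?"))  # True
print(is_data_void("How do I install Python?"))            # False
```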





Using the proof-of-concept from Microsoft’s Zero Day Quest event, we can see how readily our trust can be manipulated. Zero Day Quest invites security researchers to discover and report high-impact vulnerabilities in Microsoft products. Anticipating a common user query, “Where can I stream Zero Day Quest?”, Tobias began preparing the attack surface. He created a website, https://www.watchzerodayquest.com, containing the following content:


[Screenshot: the watchzerodayquest.com FAQ page, with a malicious PowerShell command embedded in one of the answers]


As you can see, the page resembles a typical FAQ, but it embeds a malicious PowerShell command in one of its answers. After four weeks, Tobias managed to get the site ranked in Bing for queries about the event.





Consequently, a user could receive the following response about Zero Day Quest from Copilot:


[Screenshot: Copilot’s answer to the streaming question, repeating the attacker’s instructions]


At the time of writing, Copilot does not respond that way.


[Screenshot: Copilot’s response at the time of writing]


But there are other AI assistants.


[Screenshot: another AI assistant reproducing the malicious installation instructions]


And as you can see, some of them easily provide dangerous installation instructions for command-and-control (C2) beacons.
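
One defensive takeaway: treat any assistant answer that embeds a runnable command as untrusted until a human reviews it. The Python sketch below flags download-and-execute command shapes before a user pastes them into a shell; the pattern list is my own illustrative choice and deliberately incomplete, not something from the original research.

```python
import re

# Defensive sketch: flag assistant answers that embed download-and-
# execute style shell commands. The pattern list is illustrative and
# deliberately incomplete; real filtering needs far broader coverage.
SUSPICIOUS_PATTERNS = [
    r"(?i)\biex\b",                       # PowerShell Invoke-Expression alias
    r"(?i)invoke-expression",
    r"(?i)invoke-webrequest",
    r"(?i)downloadstring",
    r"(?i)curl\s+\S+\s*\|\s*(ba|z)?sh",   # curl ... | sh
]

def looks_dangerous(answer_text: str) -> bool:
    """Return True if the answer contains a known risky command shape."""
    return any(re.search(p, answer_text) for p in SUSPICIOUS_PATTERNS)

sample = "To watch the event, open PowerShell and run: iex (New-Object ...)"
print(looks_dangerous(sample))  # True
```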





Summary





This research exposes a fundamental weakness in AI assistants that ground their answers in real-time search results: whoever controls the top-ranked result controls the answer. By engineering a data void and ranking a malicious page within it, an attacker can turn a trusted assistant into a distribution channel for harmful instructions.
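
One plausible mitigation, my own suggestion rather than something the research prescribes, is to demand corroboration: an assistant should decline to present a claim as fact unless several independent domains back it. A minimal sketch, assuming a simple list-of-results retrieval layer:

```python
from urllib.parse import urlparse

# Mitigation sketch (an assumption, not part of Diehl's research):
# require the same answer to be backed by several independent domains
# before trusting it. The result format here is hypothetical.

def corroborated(results: list[dict], min_domains: int = 3) -> bool:
    """True if at least min_domains distinct hosts support the answer."""
    domains = {urlparse(r["url"]).netloc for r in results}
    return len(domains) >= min_domains

poisoned = [{"url": "https://www.watchzerodayquest.com/faq"}]
print(corroborated(poisoned))  # False: a single domain is not enough
```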





The post Hacking Artificial Intelligence (AI): Hijacking AI Trust to Spread C2 Instructions first appeared on Hackers Arise.



Source: HackersArise
Source Link: https://hackers-arise.com/hacking-artificial-intelligence-ai-hijacking-ai-trust-to-spread-c2-instructions/




