
Researchers Demonstrate AI Command Hijacking via Malicious Road Signs

Source: Schneier on Security

A new attack, dubbed CHAI, exploits Large Visual-Language Models in autonomous vehicles and drones, exposing critical vulnerabilities in embodied AI systems that the researchers validated through real-world testing.

AI Systems Vulnerable to Command Hijacking via Visual Prompts

Security researchers have identified a novel attack vector targeting embodied Artificial Intelligence (AI) systems, demonstrating how malicious actors could manipulate autonomous vehicles and drones through visual prompt injection. The attack, dubbed CHAI (Command Hijacking Against Embodied AI), exploits vulnerabilities in Large Visual-Language Models (LVLMs) to override AI decision-making processes.

Key Findings from the Research

The paper, titled "CHAI: Command Hijacking Against Embodied AI", reveals how attackers can embed deceptive natural-language instructions into visual inputs, such as altered road signs, to trigger unintended actions. The research team developed a systematic approach to:

  • Search the token space of LVLMs to identify exploitable patterns.
  • Build a dictionary of adversarial prompts that bypass AI safeguards.
  • Generate Visual Attack Prompts (VAPs) capable of hijacking AI commands.
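The dictionary-building step above can be sketched in miniature: enumerate candidate instruction phrases, score each against a surrogate for the target model, and keep the top performers. Everything below is illustrative; `surrogate_score` is a toy stand-in for querying an LVLM, and the word lists are hypothetical, not drawn from the paper.

```python
# Hedged sketch of an adversarial-prompt dictionary search.
# A real attack would score candidates by whether the target LVLM's
# planned action actually changes; here a toy heuristic stands in.
from itertools import product

COMMAND_WORDS = ["PROCEED", "STOP", "TURN", "IGNORE"]   # hypothetical vocabulary
MODIFIERS = ["NOW", "IMMEDIATELY", "SAFELY"]

def surrogate_score(prompt: str) -> float:
    """Toy stand-in for an LVLM query: favors short imperative phrases
    containing override-style words."""
    hits = prompt.count("IGNORE") + prompt.count("PROCEED")
    return hits / len(prompt.split())

def build_dictionary(top_k: int = 3) -> list[str]:
    """Enumerate candidate phrases and keep the top_k highest-scoring."""
    candidates = [f"{c} {m}" for c, m in product(COMMAND_WORDS, MODIFIERS)]
    return sorted(candidates, key=surrogate_score, reverse=True)[:top_k]

print(build_dictionary())
```

In the paper's actual pipeline, the search operates over the model's token space and the surviving phrases are then rendered into Visual Attack Prompts; this sketch only conveys the enumerate-score-keep structure.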

Technical Details of the CHAI Attack

The study evaluated CHAI across four LVLM-powered systems, including:

  • Autonomous driving platforms (real-world and simulated).
  • Drone emergency landing protocols.
  • Aerial object tracking systems.
  • A physical robotic vehicle for real-world validation.

Unlike traditional adversarial attacks that rely on pixel-level perturbations, CHAI leverages semantic and multimodal reasoning—core strengths of next-generation AI—to achieve higher success rates. The researchers found that the attack consistently outperformed existing state-of-the-art methods, raising concerns about the robustness of embodied AI in safety-critical applications.

Impact and Security Implications

The findings underscore a critical gap in AI security: defenses designed for conventional adversarial attacks may fail against prompt-based manipulations. Autonomous vehicles, drones, and robotic systems relying on LVLMs could be tricked into:

  • Misinterpreting road signs (e.g., a "STOP" sign altered to read "GO").
  • Ignoring emergency protocols (e.g., forcing a drone to land in an unsafe zone).
  • Deviating from intended paths (e.g., rerouting a delivery robot to a malicious destination).

Recommendations for Mitigation

While the research does not propose specific countermeasures, it highlights the urgent need for:

  • Enhanced input validation to detect and filter adversarial prompts.
  • Multimodal anomaly detection to identify inconsistencies between visual and contextual data.
  • Robustness testing frameworks tailored to embodied AI systems.
  • Collaboration between AI developers and cybersecurity experts to address emerging threats.

The full paper is available on arXiv, and further analysis can be found in The Register’s coverage.

This research was originally highlighted by cybersecurity expert Bruce Schneier.
