Researchers at SafeBreach Labs say they were able to manipulate Google Gemini by hiding instructions inside a WhatsApp message, showing how a routine notification could be turned into a vehicle for prompt injection.
The finding adds to growing concern about the security risks of AI assistants that can read messages, summaries and notifications from other apps. According to the researchers, Gemini on Android can examine incoming notifications to provide context-aware responses. That convenience, they said, also creates a pathway for malicious content to influence the assistant without the user taking any obvious action.
The attack did not rely on a user clicking a link or typing a command. Instead, the researchers embedded hidden instructions inside a crafted message. When Gemini read the notification, it reportedly followed the attacker’s instructions silently. SafeBreach said the technique worked across several messaging and social apps, including WhatsApp, Slack, Signal, SMS, Instagram and Messenger.
SafeBreach described the method as indirect prompt injection, a technique that places malicious instructions inside content an AI system is meant to process. In this case, the content arrived through a messaging notification. The researchers said they used a strategy they call Fake Context Alignment, which makes attack instructions appear to be part of a legitimate ongoing conversation. Their aim was to bypass Google’s existing defenses against this class of attack.
The researchers said they demonstrated five broad threat scenarios, including data theft, unauthorized actions, phishing relay, preparation for account takeover and silent surveillance. They also said that even when Gemini did not have access to external tools, the poisoned context alone could be enough to make the assistant generate fake system messages that could be used in phishing attempts.
The research is notable because it builds on a previous SafeBreach disclosure involving Gemini and Google Calendar invites. In both cases, the core issue was not a traditional software bug, but the possibility that Gemini could be steered by outside content it was designed to interpret.
SafeBreach said it disclosed the issue to Google before publishing its findings. Google’s own security guidance already identifies indirect prompt injection as a known threat and says the company has layered defenses in place to reduce the risk. The researchers’ results suggest those protections can still be bypassed in some scenarios.
The broader lesson, according to the researchers, is that the attack surface is tied to how AI assistants are integrated into everyday workflows. Any app or service that feeds information into an assistant can become a possible delivery channel for malicious instructions. The more access such a system has, the more serious the potential consequences if it is manipulated.
For users, the finding is a reminder that permissions matter. If an AI assistant can read messages, notifications or other personal data, it may also be exposed to content that was never meant for it. SafeBreach’s takeaway was not that users should avoid Gemini entirely, but that they should review what the assistant can access and disable features they do not need.
As AI assistants become more deeply embedded in phones and productivity tools, security researchers are likely to keep testing whether everyday content can be turned into a control channel. This latest demonstration suggests that a simple WhatsApp message may be enough to make a trusted assistant say or do the wrong thing.