The security researchers calling attention to AI agent vulnerabilities keep reaching the same uncomfortable conclusion: the technology is fundamentally gullible. An attacker does not need to break into a system, bypass firewalls, or exploit obscure software bugs. They simply need to persuade an AI agent to do something it should not do.
At the RSA Conference 2026 in San Francisco, Michael Bargury, CTO of security firm Zenity, demonstrated how comprehensively this works in practice. His team showed they could trick Cursor (a popular development tool) into leaking developer secrets, manipulate Salesforce agents into sending customer interactions to attacker-controlled servers, and convince ChatGPT to extract data from Google Drive. All with zero user interaction and without the victim clicking any link or opening any email.
The fundamental problem is architectural. Modern AI assistants have "grown arms and legs," gaining the ability to access emails, documents, and calendars and perform actions on users' behalf through integrations with enterprise environments like Microsoft, Google Workspace, and Salesforce. Once an attacker controls what the agent does, they inherit whatever access the agent has been granted. That is an extraordinarily powerful position.
Bargury reframes the vulnerability as a persuasion problem rather than a technical one. "AI is just gullible," he told The Register. "We are trying to shift the mindset from prompt injection because it is a very technical term, and convince people that this is actually just persuasion." The team demonstrated this by tricking Cursor into a "treasure hunt" where it needed to find items matching a certain format. By describing what secrets look like in that format, the researchers got the agent to voluntarily steal credentials. Cursor's own guardrails did not prevent this because the agent genuinely believed it was playing a game.
Cisco's State of AI Security 2026 found that while most organisations planned to deploy agentic AI, only 29% reported being prepared to secure those deployments. That gap exists because traditional security infrastructure cannot see these attacks coming. Endpoint detection tools look for malicious binaries and suspicious processes. They cannot flag a calendar invite containing hidden instructions.
Zenity Labs disclosed PleaseFix, a family of critical vulnerabilities affecting agentic browsers, including Perplexity Comet, that allow attackers to silently hijack AI agents, access local files and steal credentials within authenticated user sessions. The vulnerabilities can be triggered through malicious content embedded in routine workflows, enabling unauthorised actions without user awareness. In one example, a simple calendar invitation was enough to make the Comet browser exfiltrate files from the user's hard drive.
The challenge facing defenders is real but not insurmountable. Bargury emphasises that soft boundaries do not work. Asking an AI agent nicely not to perform sensitive operations achieves nothing. Instead, organisations need to build hard boundaries into the code itself, enforced at the execution level before the model's reasoning takes over. If an agent reads sensitive information, hard code it so the agent cannot transmit that data outside the organisation. If only certain operations should be possible, build those restrictions into the tool definitions the agent can access.
This requires a shift in thinking. Most agentic failures are not model failures. They are authority failures, in which the agent was permitted to do something it should never have been allowed to do. Because agents operate autonomously and at machine speed, failures can propagate across every connected system before any human can detect them.
For organisations already deploying AI agents in production without these controls in place, the situation is uncomfortable. A 2026 Gravitee survey found that only 24.4% of organisations have full visibility into which AI agents are communicating with each other. More than half of all agents run without any security oversight or logging. These undiscovered agents represent unguarded access paths into critical systems.
The broader industry is beginning to respond. Cisco announced security innovations designed for the agentic AI ecosystem at RSA Conference 2026, introducing solutions to address AI security issues and remove a top barrier to agent adoption. By establishing trusted identities, enforcing strict Zero Trust Access controls, hardening agents before deployment, enforcing guardrails at runtime, and giving security operations centre teams the tools to stop threats at machine speed, Cisco is building security into the foundation of the emerging AI economy.
The lesson is clear: deploying AI agents without rigorous security architecture is betting that attackers will not find your organisation, or if they do, they will not exploit what they find. Those are increasingly poor odds. The technology is too powerful, and the attack surface too broad, for trust to substitute for structural security.