Voice Phishing Now Leading Cloud Attack Vector

The threat landscape has shifted dramatically away from the email inbox. Voice phishing has emerged as the second most common initial access method for cybercriminals breaching global networks and, remarkably, the number one tactic when targeting cloud environments specifically. This is a sharp departure from historical attack patterns: vulnerability exploitation still leads overall, but among social engineering methods the phone has overtaken the inbox.

According to Google Cloud's annual M-Trends report released this week, based on more than 500,000 hours of incident response engagements worldwide, attackers used voice-based phishing as the initial infection vector in 11 percent of all 2025 intrusions. That's a dramatic rise for a threat vector that was barely tracked as significant just a few years ago. Meanwhile, traditional email phishing plummeted to just 6 percent of breaches, highlighting the effectiveness of modern email security controls.

What makes this shift especially concerning is the interactive nature of the attack. Unlike automated email phishing campaigns, voice phishing puts an attacker in live conversation with a person who can be persuaded. Criminals are calling IT help desks directly, often impersonating employees or contractors, to register attacker-controlled devices for multi-factor authentication or to reset passwords. The help desk's core function is to help people, and that instinct is exactly what attackers exploit. As Jurgen Kutscher, vice president of Mandiant Consulting at Google Cloud, explained: "An IT help desk, by default, tries to help. That's part of the reason why the social engineering attacks that are interactive are so powerful."

Organised cybercrime groups like ShinyHunters and Scattered Lapsus$ Hunters are not treating voice phishing as a minor tactic. They've developed methodical playbooks, building multiple scenarios to trick help desks and establishing custom infrastructure to intercept credentials. The sophistication level now rivals traditional malware delivery, except the attackers are simply talking their way past human gatekeepers.

The threat extends beyond phone calls. Google's security researchers documented a spike last year in what are known as ClickFix attacks, where criminals trick users into running malicious commands by clicking prompts that mimic system error messages or CAPTCHA verification screens. These interactive attacks represent a new level of attacker sophistication, driven by the simple economic calculation that engaging victims directly generates better returns than mass-mailing phishing emails.

Cloud environments are particularly exposed. Voice phishing bypasses many traditional perimeter defences because the attack happens outside technical infrastructure entirely. A determined attacker with a convincing script and basic social engineering training can reach any organisation. What's more, cloud-native architectures fragment the security landscape, making it harder for defenders to maintain visibility across multiple services and authentication systems.

Another troubling trend outlined in Google's report involves attack timelines at polar extremes. Some threat actors now move from initial access to ransomware deployment in under 30 seconds, leaving defenders no time to react at human speed. At the other end of the spectrum, Chinese government-linked groups and North Korean-affiliated scammers have achieved dwell times of around 400 days by targeting network edge devices like firewalls and routers. These operators install backdoors that allow them to intercept network traffic and capture credentials without ever needing to penetrate deeper into the victim's environment.

One documented case involved the Chinese spy group tracked as UNC6201, which deployed a backdoor called Brickstorm to break into edge devices lacking endpoint security. From that foothold, they stole valid credentials and accessed victims' VMware environments, remaining undetected for an average of 393 days. This represents a fundamental shift in how sophisticated attackers think about persistence; instead of racing against detection, some groups are investing in the patience to extract maximum value.

The emergence of ClickFix and other interactive tactics suggests that attackers are increasingly comfortable combining technical and social sophistication. At the same time, the barrier to entry for organised cybercrime is falling: criminal networks now recruit phone-based operators for modest fees, turning voice phishing into a scalable service rather than a bespoke tactic.

Defenders face a structural challenge. Traditional security controls excel at detecting technical anomalies, but they struggle with social engineering that operates outside technical channels. Vulnerability exploitation remains the leading cause of breaches at 32 percent, and patching remains essential, but organisations can no longer assume that locking down their technical infrastructure is sufficient. The human element has become the decisive factor.

The implications for Australian organisations are direct. Regulators and industry bodies increasingly focus on identity and access controls, yet many organisations continue to underinvest in voice channel security and real-time anomaly detection on authentication systems. Staff training remains valuable, but genuine security requires an architecture that assumes attackers will breach identity controls, and organisations must operate as though that compromise is imminent.