Is the CEO on my work video call actually a real person?
A $25 million deepfake scam highlights a new enterprise threat. Learn how blood-flow-based liveness verification can distinguish a live executive from a synthetic clone in real time.

The video call appears in your calendar, titled "Urgent: Confidential Transaction." The request seems to come from your CFO. On the call, you see your CFO, CEO, and other senior leaders. They look like themselves and sound like themselves. They instruct you to transfer a large sum of money to a new supplier. The urgency is palpable; the authority is clear. But is the person on the screen real?
This scenario is not theoretical. It is the new frontier of corporate fraud: the deepfake executive video call scam, a sophisticated threat that uses AI to create convincing, real-time synthetic versions of trusted individuals. For enterprise security and financial fraud teams, the core challenge is no longer just securing networks, but authenticating reality itself.
"In 2023, the number of deepfake fraud attempts detected by organizations surged by 3,000%, a clear signal that synthetic media is becoming an industrialized tool for financial crime."
The anatomy of a deepfake executive video call scam
The new wave of enterprise fraud goes far beyond manipulated emails or cloned voices. It involves multi-participant video calls where every individual, except the target, is an AI-generated deepfake. The case of the global engineering firm Arup, which lost $25 million in early 2024, provides a chillingly clear blueprint for how these attacks unfold. The targeted employee, a finance worker in the firm's Hong Kong office, was summoned to a video call that appeared to include the UK-based Chief Financial Officer and other executives. The deepfake actors on the call looked and sounded authentic, creating a situation of manufactured consensus that convinced the employee to execute 15 unauthorized transactions.
This incident was not the result of a compromised network or a stolen password in the traditional sense. It was an act of "technology-enhanced social engineering." The attackers used publicly available footage and advanced generative AI models to create synthetic clones of the executives, then deployed them in a live, interactive environment. The core of the deception in a deepfake executive video call scam is its ability to bypass human intuition. Our brains are wired to trust the evidence of our own eyes and ears, especially in a familiar context like a work video call. Attackers exploit this trust by creating a scenario that feels entirely normal, making the fraudulent request seem legitimate.
Liveness detection: the new baseline for trust
How can an organization defend against a threat that so perfectly mimics reality? The answer lies in detecting signals that a synthetic clone cannot replicate. Traditional identity verification methods, which confirm credentials rather than physical presence, are insufficient. The new security baseline is liveness detection, a specific form of biometric analysis designed to confirm that the person on the other side of the screen is a living, breathing human being present at the moment of verification. The table below compares the main detection approaches.
| Detection Method | How It Works | Strengths | Weaknesses |
|---|---|---|---|
| 2D Visual Analysis | Scans video frames for digital artifacts, pixel manipulation, inconsistent lighting, or unnatural movements (e.g., lack of blinking). | Can be fast and computationally inexpensive. Effective against lower-quality or "cheapfake" manipulations. | Easily defeated by advanced generative AI, which is constantly improving to eliminate these artifacts. Relies on a constantly updated list of known tells. |
| Interactive Challenges | Requires the user to perform an action, such as blinking, smiling, or turning their head, to prove they are live. | Simple to implement. Provides a basic level of assurance against a static photo attack. | Creates user friction. Can be spoofed by an attacker holding a device playing a video of the user performing the action. Deepfakes can now often replicate these simple movements. |
| Physiological Analysis (rPPG) | Uses a standard optical camera to detect the microscopic changes in skin color caused by blood flowing through the user's facial tissue. This reveals a live pulse signal. | Tied to a biological process that is very difficult to fake with digital tools. Passive and frictionless for the user. Reported to be effective against a broad range of presentation attacks, including deepfakes, masks, and replay attacks. | Requires a clear, well-lit video stream for optimal signal extraction. |
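
To make the rPPG row more concrete, here is a minimal sketch of one way a pulse-like rhythm could be tested for. It assumes a face region has already been located and reduced to one average skin-color value per frame; the function names, the 42 to 180 bpm search band, and the 0.5 threshold are illustrative assumptions, not part of any product or standard. The idea is that a trace driven by a heartbeat repeats at a regular interval, so its autocorrelation peaks at a lag matching the beat period, while a trace with no periodic component does not.

```python
import math
import random


def periodicity_score(trace, fps, lo_bpm=42.0, hi_bpm=180.0):
    """Peak normalized autocorrelation of `trace` over lags that correspond
    to plausible heart rates. Values near 1 suggest a strong rhythm."""
    n = len(trace)
    if n < 2:
        raise ValueError("trace needs at least two samples")
    mean = sum(trace) / n
    x = [v - mean for v in trace]          # remove average brightness
    energy = sum(v * v for v in x)
    if energy == 0:
        return 0.0                         # perfectly flat trace: no rhythm
    min_lag = max(1, int(fps * 60.0 / hi_bpm))      # shortest beat period, in frames
    max_lag = min(n - 1, int(fps * 60.0 / lo_bpm))  # longest beat period, in frames
    best = -1.0
    for lag in range(min_lag, max_lag + 1):
        corr = sum(x[i] * x[i + lag] for i in range(n - lag)) / energy
        best = max(best, corr)
    return best


def has_pulse_like_rhythm(trace, fps, threshold=0.5):
    # The threshold is a hypothetical value chosen for this illustration.
    return periodicity_score(trace, fps) >= threshold


# Synthetic demonstration data (not real camera measurements).
rng = random.Random(0)
pulse_trace = [100 + 0.5 * math.sin(2 * math.pi * 1.2 * i / 30) + rng.gauss(0, 0.05)
               for i in range(300)]        # 10 s at 30 fps, 72 bpm rhythm
noise_trace = [100 + rng.gauss(0, 0.5) for _ in range(300)]
```

A periodicity test like this is only one ingredient. A production system would also need face tracking, compensation for motion and lighting changes, and safeguards against an attacker who injects an artificial periodic signal.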
Industry applications
The need for physiological liveness detection cuts across multiple sectors dealing with high-stakes remote interactions.
Enterprise security teams
For internal security, the primary goal is preventing unauthorized access and fraudulent commands. By integrating rPPG-based liveness checks into internal communication platforms or single sign-on (SSO) systems, enterprises can ensure that a user logging in or issuing a sensitive directive is the actual employee, not a deepfake clone. This is especially critical for privileged users like IT administrators and finance department personnel.
Banks and financial institutions
In the financial sector, the use cases are extensive. Banks can deploy rPPG during high-value wire transfer authorizations, for changes to account details, or during video KYC (Know Your Customer) onboarding. This provides a robust defense against account takeover fraud and prevents the creation of fraudulent accounts using synthetic identities.
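The step-up idea described above can be sketched as a small policy gate. This is a hypothetical illustration: the `Request` type, the action names, and the 10,000 USD threshold are assumptions invented for the example, and the liveness result is passed in as a plain boolean rather than coming from any real rPPG service.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Request:
    action: str               # e.g. "wire_transfer", "change_account_details"
    amount_usd: float = 0.0


# Hypothetical policy values, chosen only for illustration.
ALWAYS_SENSITIVE = {"change_account_details", "video_kyc_onboarding"}
WIRE_THRESHOLD_USD = 10_000.0


def needs_liveness_check(req: Request) -> bool:
    """Decide whether this request should trigger a liveness check."""
    if req.action in ALWAYS_SENSITIVE:
        return True
    return req.action == "wire_transfer" and req.amount_usd >= WIRE_THRESHOLD_USD


def authorize(req: Request, liveness_passed: bool) -> bool:
    """Allow low-risk requests; require a passed liveness check otherwise."""
    if needs_liveness_check(req):
        return liveness_passed
    return True
```

Under this sketch, a small routine payment proceeds without friction, while a large transfer or an account-detail change is blocked unless the liveness check has passed.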
Fintech and payment platforms
Fintech companies that rely on speedy, remote onboarding are prime targets for fraud. A deepfake executive video call scam within a corporate client's organization could lead to the fraudulent redirection of funds. By offering rPPG as part of their own security stack, payment platforms can provide B2B clients with a higher level of assurance and protect the entire financial ecosystem.
Current research and evidence
The technology underpinning this new security layer is remote photoplethysmography (rPPG). Academic and commercial research has validated its efficacy as a defense against sophisticated presentation attacks. The core principle was established in studies showing that a standard RGB camera can capture the subtle, periodic changes in light reflection from human skin that correspond to the cardiac cycle.
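As a rough illustration of that principle, the sketch below estimates a pulse rate from a sequence of per-frame average green-channel values taken over a skin region. The choice of the green channel, the function name, and the 42 to 240 bpm search band are assumptions made for this example, and the approach (a direct frequency scan) is a simplified stand-in rather than the method of any specific paper or product.

```python
import math


def estimate_pulse_bpm(green_means, fps, lo_bpm=42, hi_bpm=240):
    """Return the rate (beats per minute) whose sinusoid best matches the
    trace, searched in 1 bpm steps within a plausible heart-rate band."""
    n = len(green_means)
    if n == 0:
        raise ValueError("green_means must not be empty")
    mean = sum(green_means) / n
    centered = [g - mean for g in green_means]   # drop average brightness

    best_bpm, best_power = None, -1.0
    for bpm in range(lo_bpm, hi_bpm + 1):
        freq_hz = bpm / 60.0
        # Project the trace onto a cosine and sine at this frequency
        # (equivalent to evaluating a single DFT bin).
        re = sum(c * math.cos(2 * math.pi * freq_hz * i / fps)
                 for i, c in enumerate(centered))
        im = sum(c * math.sin(2 * math.pi * freq_hz * i / fps)
                 for i, c in enumerate(centered))
        power = re * re + im * im
        if power > best_power:
            best_bpm, best_power = bpm, power
    return best_bpm
```

With about ten seconds of video at 30 frames per second, a clean trace gives a clear spectral peak. Real footage adds motion, compression, and lighting noise, so practical systems apply filtering and combine several color channels or face regions before making any decision.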
Researchers Ming-Zher Poh, Daniel J. McDuff, and Rosalind W. Picard at the MIT Media Lab published foundational work in this area (2010-2011), demonstrating that non-contact, automated pulse measurement was possible using basic webcams. More recent research, such as work presented at IEEE conferences, has focused specifically on "rPPG-based Face Anti-Spoofing." For example, a 2022 paper by Y-F Zhang et al. proposed using 3D Convolutional Neural Networks to analyze the spatio-temporal features of the blood flow signal and reported strong results in distinguishing live faces from replay and 3D mask attacks. A common view in the biometric security community, which evaluates such defenses under the ISO/IEC 30107 framework for Presentation Attack Detection (PAD), is that verifying a physiological liveness signal provides a much higher level of security than analyzing static visual artifacts. A deepfake can mimic the look of a person, but it cannot replicate the physiological signature of life.
The future of identity verification in the enterprise
The Arup incident was not an anomaly; it was a harbinger of a new normal. As generative AI becomes more powerful and accessible, the threat of the deepfake executive video call scam will only grow. The security "game" is no longer about static defenses but about a continuous, real-time "arms race." Organizations must assume that any remote interaction could be subject to an attempted spoof. This means shifting the security mindset from authenticating credentials to authenticating the human being behind the credentials. Technologies that measure physiological liveness, such as blood flow detection, are moving from a niche application to a foundational requirement for digital trust in the enterprise.
Frequently asked questions
- Q: Can't our existing video conferencing software detect this?
- A: Most commercial video platforms are designed for communication, not biometric security. They do not analyze video streams for physiological liveness signals and can be fooled by the high-quality output of modern deepfake generators.
- Q: Does this require special hardware for our employees?
- A: No. Advanced rPPG-based liveness detection works with standard webcams and smartphone cameras already built into employee devices. The analysis is performed by software that interprets the video feed.
- Q: How does this fit into a multi-factor authentication (MFA) strategy?
- A: Liveness detection serves as a powerful additional factor, sometimes described as "Factor L" for Liveness. It complements other factors like passwords (something you know) and phone tokens (something you have) by verifying "something you are": a live, present human.
- Q: What is the difference between active and passive liveness detection?
- A: Active liveness requires the user to perform a challenge, like blinking or turning their head, which adds friction and can be spoofed. Passive liveness, which includes rPPG, analyzes the user's natural presence without requiring any special actions, providing both higher security and a better user experience.
The challenge of synthetic media and deepfake fraud requires a new generation of security tools. Circadify is at the forefront of developing passive, blood-flow-based liveness detection to help enterprise security and financial services teams authenticate the human, not just the login. To learn more about how to protect your organization from a deepfake executive video call scam, visit our enterprise security solutions page at circadify.com/solutions/fraud-detection.
