Dual LLM Agent
Dual LLM is a built-in security workflow for tools that return untrusted content. It is one strategy Archestra uses to reduce lethal trifecta risk. Instead of letting the main agent read raw output from sources like web pages, email, or user-generated files, Archestra routes that output through two built-in agents with different responsibilities.
For a deeper explanation of the security pattern itself, see the Dual LLM overview.
How It Works
The workflow uses:
- Dual LLM Main Agent: sees the user request and the Q&A transcript, but never the raw tool output
- Dual LLM Quarantine Agent: sees the raw tool output, but can only answer with a constrained multiple-choice response
The main agent asks a constrained multiple-choice question. The quarantine agent picks the best option index. After a few rounds, the main agent produces a short safe summary based only on the answers it received.
This separation limits prompt injection risk because untrusted text never reaches the main agent directly.
When It Runs
Dual LLM runs when a tool's tool result policy is set to Dual LLM. The most common cases are:
- Web search or scraping tools
- Email readers
- File or document readers that return user-controlled content
- Any external source where exact raw text is unsafe but a safe summary is still useful
The Tool Policy Configuration Agent can recommend this automatically for tools that read from untrusted sources. See Tool Policy Configuration Agent.