When AI Becomes the Attack Surface (6/8): Resilient AI Design

Eran Goldman-Malka · April 3, 2026

After the risk discussion comes the architecture question: what does a resilient AI system actually look like?

Start with isolation and least privilege. Models should not have broad, implicit access to systems. Every tool call needs explicit scope, policy checks, and context-aware approval paths for sensitive operations. Treat agent permissions like service-account permissions, because functionally that is what they are.
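As a minimal sketch of scoped, approval-gated tool permissions (all names here — `AgentPolicy`, `PolicyError`, the `mail` tool — are hypothetical, not from any real framework):

```python
from dataclasses import dataclass

class PolicyError(PermissionError):
    """Raised when a tool call falls outside the agent's granted scope."""

@dataclass
class AgentPolicy:
    scopes: dict       # tool name -> frozenset of allowed actions
    sensitive: set     # actions that additionally require explicit approval

    def check(self, tool: str, action: str, approved: bool = False) -> bool:
        """Deny by default: a call succeeds only if the action is in the
        tool's granted scope, and sensitive actions also need approval."""
        allowed = self.scopes.get(tool, frozenset())
        if action not in allowed:
            raise PolicyError(f"{tool}:{action} is outside the granted scope")
        if action in self.sensitive and not approved:
            raise PolicyError(f"{tool}:{action} requires explicit approval")
        return True
```

The key design choice is deny-by-default: an unknown tool or action raises rather than silently passing, exactly as you would configure a service account.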

Then enforce auditability end to end. You need immutable logs for prompts, tool invocations, policy decisions, and outputs that triggered real actions. Without that chain, incident reconstruction is guesswork.
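One common way to make that chain tamper-evident is hash chaining, where each record commits to the digest of the previous one. A minimal in-memory sketch (a production system would persist records to write-once storage; the class and field names are illustrative):

```python
import hashlib
import json
import time

class AuditLog:
    """Append-only log: each record's digest covers the previous digest,
    so modifying any earlier record breaks verification of the chain."""

    GENESIS = "0" * 64

    def __init__(self):
        self._records = []          # list of (record, digest) pairs
        self._prev = self.GENESIS

    def append(self, event: dict) -> str:
        record = {"ts": time.time(), "event": event, "prev": self._prev}
        digest = hashlib.sha256(
            json.dumps(record, sort_keys=True).encode()
        ).hexdigest()
        self._records.append((record, digest))
        self._prev = digest
        return digest

    def verify(self) -> bool:
        """Recompute every digest and check the chain links in order."""
        prev = self.GENESIS
        for record, digest in self._records:
            if record["prev"] != prev:
                return False
            recomputed = hashlib.sha256(
                json.dumps(record, sort_keys=True).encode()
            ).hexdigest()
            if recomputed != digest:
                return False
            prev = digest
        return True
```

With this structure, incident reconstruction is a linear walk of verified records: prompt, tool invocation, policy decision, output, in the order they actually happened.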

Prompt sandboxing and runtime validation are equally critical. High-risk prompts, untrusted context, and external content should be processed in constrained environments with output filtering and action gating.

Finally, adopt continuous red-teaming: LLM-vs-LLM adversarial testing, conventional penetration methods, and regression tests for policy bypass.
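The filtering-and-gating step can be sketched as a single checkpoint between model output and real-world effect. The patterns and action names below are illustrative placeholders, not a complete or recommended filter; the point is the shape of the control, and that the same function doubles as a target for policy-bypass regression tests:

```python
import re

# Illustrative deny-list of patterns suggesting injected instructions or
# destructive commands in untrusted model output (not exhaustive).
BLOCKED_PATTERNS = [
    re.compile(r"ignore (all |any )?previous instructions", re.I),
    re.compile(r"\b(rm -rf|drop table|shutdown)\b", re.I),
]

# Hypothetical action names that must never run straight from model output.
HIGH_RISK_ACTIONS = {"delete", "transfer", "deploy"}

def gate_output(text, requested_action=None):
    """Return (allowed, reason). Every model output passes through this
    gate before any downstream action is allowed to execute."""
    for pattern in BLOCKED_PATTERNS:
        if pattern.search(text):
            return False, f"blocked pattern: {pattern.pattern}"
    if requested_action in HIGH_RISK_ACTIONS:
        return False, "high-risk action requires out-of-band approval"
    return True, "ok"
```

Each newly discovered bypass becomes a permanent assertion against `gate_output`, so the red-team suite only grows stricter over time.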

Resilience is not a one-time hardening project. It is a living discipline that combines security engineering with model operations. If your AI stack changes weekly, do your controls evolve at the same pace?
