AI changes risk in two ways at once: it introduces new technical failure modes, and it increases speed and scale—so small weaknesses become big incidents faster. Boards don’t need to become ML engineers to govern this, but they do need to recognize repeatable patterns.
Why AI “Can’t See the Edges” (A Practical Translation)
In classical statistics, rare events live in the tails—the “edges”—of a distribution. AI systems (especially generative AI) are often strong in the middle of the distribution (common patterns) and weak at the edges (rare, unusual, ambiguous, or out-of-context situations).
In plain language:
- AI can look brilliant on typical inputs and fail hard on atypical ones.
- AI can improvise plausible answers when it should say “I don’t know.”
- AI can struggle with real‑world physical constraints and causality, even when it sounds confident.
Stanford’s AI Index highlights this gap between capability and reliable real-world performance as a governance challenge: models can excel on benchmarks yet fail at basic tasks humans take for granted (AI Index 2026).
Four Board-Relevant Risk Patterns (With Real Incidents)
1) Security failures hidden behind “AI” branding
The “AI” part is often not the problem—the surrounding system is.
Example: Researchers reported that McDonald’s hiring platform (McHire, with Paradox’s chatbot “Olivia”) could be accessed with weak credentials and contained an insecure direct object reference (IDOR) flaw that allowed applicant records to be enumerated by ID, exposing massive volumes of applicant data (ICDC case study).
Board takeaway: vendor risk and basic security hygiene still dominate outcomes.
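For directors who want to see what the missing control looks like, here is a minimal sketch of an object-level authorization check, the standard defense against IDOR. It is not the McHire code: the `Applicant` record, the `franchise_id` field, and the in-memory `APPLICANTS` store are all hypothetical stand-ins.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Applicant:
    applicant_id: int
    franchise_id: str  # hypothetical: which tenant owns this record
    name: str


# Hypothetical in-memory store standing in for a real database.
APPLICANTS = {
    1001: Applicant(1001, "store-A", "Alex"),
    1002: Applicant(1002, "store-B", "Sam"),
}


class AccessDenied(Exception):
    """Raised when a requester may not see the requested record."""


def get_applicant(requester_franchise_id: str, applicant_id: int) -> Applicant:
    """Return a record only if it belongs to the requester's franchise."""
    record = APPLICANTS.get(applicant_id)
    # Same error for "does not exist" and "not yours", so guessing
    # sequential IDs reveals nothing about other tenants' records.
    if record is None or record.franchise_id != requester_franchise_id:
        raise AccessDenied("record not available")
    return record
```

The point for oversight: knowing a record ID should never be enough to read the record. A useful question for vendors is whether every data request is checked against the requester's identity in this way.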
2) Automation without guardrails (operational failure at scale)
When AI is connected to ordering, payments, or workflows, the system can be gamed or break in public ways.
Example: Taco Bell’s AI drive-through ordering produced viral failures, including absurd orders such as 18,000 water cups and looping conversations, prompting the company to reconsider its deployment approach (BBC report).
Board takeaway: “human oversight” must be designed as a real control, not a slogan.
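As an illustration of oversight built in as a control, the sketch below checks an AI-taken order before it reaches the kitchen and hands off to staff when something looks wrong. The limits (`MAX_QTY_PER_ITEM`, `MAX_CLARIFICATION_TURNS`) and the `OrderLine` type are assumptions chosen for the example, not values from any real deployment.

```python
from dataclasses import dataclass

# Assumed limits for illustration; a real deployment would set its own.
MAX_QTY_PER_ITEM = 20
MAX_CLARIFICATION_TURNS = 3


@dataclass
class OrderLine:
    item: str
    quantity: int


def review_order(lines: list[OrderLine], clarification_turns: int) -> str:
    """Return "accept" or "handoff_to_staff" for an AI-taken order."""
    # A conversation that keeps circling is handed to a person.
    if clarification_turns > MAX_CLARIFICATION_TURNS:
        return "handoff_to_staff"
    # Any quantity outside the plausible range is handed to a person.
    for line in lines:
        if line.quantity < 1 or line.quantity > MAX_QTY_PER_ITEM:
            return "handoff_to_staff"
    return "accept"
```

With these assumed limits, `review_order([OrderLine("water cup", 18000)], 0)` returns `"handoff_to_staff"`. The design choice is that the check runs outside the model, so the model cannot talk its way past it.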
3) Prompt injection and policy bypass
LLMs are susceptible to instructions embedded in user inputs that override intended behavior.
Example: A dealership chatbot was manipulated into “agreeing” to sell a 2024 Chevy Tahoe for $1, illustrating the reputational and control risks of customer-facing LLMs without proper constraints (Incident Database entry).
Board takeaway: never let free-form text control high-impact actions without validation and boundaries.
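One way to apply this takeaway is to treat whatever the model says as a proposal and validate it against business rules held outside the model. The sketch below is a hypothetical example: the action names, the `FLOOR_PRICES` table, and its placeholder value are invented for illustration and do not describe the dealership's system.

```python
from decimal import Decimal, InvalidOperation

# Hypothetical allowlist of actions the chatbot may trigger.
ALLOWED_ACTIONS = {"answer_question", "book_test_drive", "quote_price"}

# Hypothetical approved floor prices from a system of record.
# The number is a placeholder, not a real price.
FLOOR_PRICES = {"2024-tahoe": Decimal("50000")}


def validate_action(action: dict) -> tuple[bool, str]:
    """Check a model-proposed action against rules the model cannot edit."""
    name = action.get("name")
    if name not in ALLOWED_ACTIONS:
        return False, "action not on allowlist"
    if name == "quote_price":
        floor = FLOOR_PRICES.get(action.get("vehicle"))
        if floor is None:
            return False, "unknown vehicle"
        try:
            price = Decimal(str(action.get("price")))
        except InvalidOperation:
            return False, "price is not a number"
        if not price.is_finite() or price < floor:
            return False, "price below approved floor"
    return True, "ok"
```

Under these assumptions, a proposed quote of $1 for the Tahoe is rejected with "price below approved floor" no matter what the customer typed into the chat, because the rule lives in code rather than in the prompt.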
4) Systematic decision harm (high-impact domain risk)
AI risks are not only “chatbot embarrassment.” In high-impact domains, errors can become systemic.
Example: A class action complaint alleges UnitedHealthcare used an AI tool (nH Predict) in post-acute care coverage decisions, with claims that a high share of denials were reversed on appeal—raising governance and oversight questions about automation in consequential decisions (court filing PDF).
Board takeaway: consequential decisions require stronger safeguards, transparency, and escalation.
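A simple escalation trigger can make this concrete. The sketch below flags an automated decision tool for review when too many of its appealed decisions are overturned. The 25% threshold and the minimum of 30 appeals are assumed values for illustration, not figures from the complaint or from any regulator.

```python
def reversal_rate(appealed: int, reversed_on_appeal: int) -> float:
    """Share of appealed decisions that were overturned."""
    if appealed < 0 or reversed_on_appeal < 0 or reversed_on_appeal > appealed:
        raise ValueError("counts must satisfy 0 <= reversed_on_appeal <= appealed")
    if appealed == 0:
        return 0.0
    return reversed_on_appeal / appealed


def needs_escalation(
    appealed: int,
    reversed_on_appeal: int,
    threshold: float = 0.25,  # assumed tolerance, set by the organization
    min_appeals: int = 30,    # assumed minimum sample before alerting
) -> bool:
    """Flag the tool for review when reversals exceed the tolerance."""
    if appealed < min_appeals:
        return False
    return reversal_rate(appealed, reversed_on_appeal) > threshold
```

A metric like this is only a starting point, since many affected people never appeal, but it gives the board a number to ask for each quarter.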
The “Top 6” Controls Boards Should Demand
These controls consistently reduce risk across the patterns above:
- Inventory + ownership: every AI system has a named accountable owner.
- Least privilege: agents and integrations get the minimum access they need, with time-bound credentials.
- Human oversight gates: clear points where humans must approve before action/commitment.
- Logging + monitoring: detect abuse attempts, drift, repeated failure patterns, and tool misuse.
- Vendor governance: due diligence, contract clauses, change-notification, subprocessor visibility.
- Incident readiness: playbooks that include AI-specific incidents (misinformation, leakage, prompt injection, runaway automation).
For LLM-specific security risks, OWASP provides a practical taxonomy that boards can reference when asking “what could go wrong?” (OWASP Top 10 for LLM Apps 2025).
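Several of these controls can be checked mechanically once an inventory exists. The following is a minimal sketch, assuming a hypothetical `AISystem` record with a handful of fields; a real inventory would carry more detail, and the field names here are illustrative only.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class AISystem:
    name: str
    owner: Optional[str]       # named accountable owner, if any
    high_impact: bool          # affects money, safety, or people's rights
    human_approval_gate: bool  # a person approves before action is taken
    logging_enabled: bool


def control_gaps(system: AISystem) -> list[str]:
    """List which of the checked controls are missing for one system."""
    gaps = []
    if not system.owner:
        gaps.append("no accountable owner")
    if system.high_impact and not system.human_approval_gate:
        gaps.append("high-impact system without human approval gate")
    if not system.logging_enabled:
        gaps.append("logging disabled")
    return gaps
```

Running `control_gaps` over the full inventory yields a short exceptions report, which is often a more useful board artifact than a narrative status update.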
A Final Reality Check
Most “famous AI fails” are not mysterious. They repeat:
- weak controls around access and data
- missing guardrails on what the model is allowed to do
- no monitoring until social media or regulators notice
A good roundup of these patterns, with links to multiple incidents, is Monte Carlo’s “4 Famous AI Fails (& How To Avoid Them).”
My board courses include real-world AI risk patterns, response considerations, and board-level oversight questions—so you can govern AI without getting lost in technical noise. I also consult with boards on responsible AI adoption. Contact me.
Relevant Sources
- 4 Famous AI Fails (& How To Avoid Them) — Monte Carlo — https://montecarlo.ai/blog-famous-ai-fails
- Taco Bell rethinks AI drive-through after man orders 18,000 waters — BBC — https://www.bbc.com/news/articles/ckgyk2p55g8o
- IDOR Case Study: McHire / Paradox “Olivia” exposure — ICDC — https://research.cgu.edu/icdc/2025/07/01/mcdonalds-july-2025-breach/
- Incident 622: Chevrolet Dealer Chatbot Agrees to Sell Tahoe for $1 — Incident Database — https://incidentdatabase.ai/cite/622/
- Lokken et al. v. UnitedHealth Group Inc. et al. (complaint PDF) — CourtListener/RECAP — https://storage.courtlistener.com/recap/gov.uscourts.mnd.211721/gov.uscourts.mnd.211721.1.0.pdf
- OWASP Top 10 for LLM Applications 2025 — OWASP — https://genai.owasp.org/resource/owasp-top-10-for-llm-applications-2025/
- The 2026 AI Index Report — Stanford HAI — https://hai.stanford.edu/ai-index/2026-ai-index-report
- Generative AI Profile (NIST AI 600-1) — NIST — https://www.nist.gov/publications/artificial-intelligence-risk-management-framework-generative-artificial-intelligence
