Chapter 07 · The Atlas

The Six Failure Modes.

AI failure is use-case-specific, not technology-specific.

The workstream documented six canonical AI failure modes. Each is operationally distinct. Each carries a different mechanism, a different cautionary cohort, and a different mitigation. The deepest source of enterprise AI program failure in the workstream is the conflation of these modes under a generic AI risk framing. The integrator's calibration discipline is to read the specific failure pattern that applies to the specific use case.

Failure Mode 01

False-Positive Saturation

Domain · KYC and AML

Production AML stacks routinely generate false-positive rates between ninety and ninety-eight percent. Compliance teams drown in alert backlogs. Regulator enforcement follows. The mechanism is not model performance in isolation. It is the absence of a calibrated alert threshold negotiated with the regulator, plus the absence of an alert-triage and investigation workflow that absorbs the high false-positive rate without breaking. Tier-1 AML stacks consistently report alert-to-SAR conversion (the share of alerts that end in a suspicious activity report) below five percent. The cost cohort is the second-line compliance function, which carries the regulator-facing burden when the backlog builds.

Canonical case: Multiple US and UK enforcement actions across 2022 to 2024 flagging AML-tuning inadequacy; the 90 to 98 percent false-positive operating norm reported in ACAMS and BPI surveys. Implication: AML AI is a threshold-and-workflow problem, not a model-accuracy problem.
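The threshold-and-workflow framing can be made concrete with a little arithmetic. The sketch below is illustrative only: the `Alert` record, the `threshold_report` function, and the `daily_capacity` parameter are hypothetical names, not part of any AML product, and "false-positive share" is assumed here to mean the fraction of raised alerts that do not end in a SAR.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Alert:
    score: float      # hypothetical model risk score in [0, 1]
    led_to_sar: bool  # True if the investigation ended in a SAR filing


def threshold_report(alerts, threshold, daily_capacity):
    """Summarise what a given alert threshold does to the investigation workflow."""
    raised = [a for a in alerts if a.score >= threshold]
    n = len(raised)
    sars = sum(a.led_to_sar for a in raised)
    conversion = sars / n if n else 0.0
    return {
        "threshold": threshold,
        "alerts_raised": n,
        "alert_to_sar": conversion,
        # Assumed definition: share of raised alerts that were not SAR-worthy.
        "false_positive_share": (1.0 - conversion) if n else 0.0,
        # Alerts the triage team cannot absorb in the period.
        "backlog_growth": max(0, n - daily_capacity),
    }


if __name__ == "__main__":
    sample = [
        Alert(0.9, True),
        Alert(0.8, False),
        Alert(0.7, False),
        Alert(0.6, False),
        Alert(0.2, False),
    ]
    for t in (0.5, 0.85):
        print(threshold_report(sample, threshold=t, daily_capacity=3))
```

The point of the sketch is that the same model produces very different backlog and conversion figures depending on the threshold and the triage capacity, which is why both are calibration decisions rather than model properties.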

Failure Mode 02

Overforecasting and Overstock Writedown

Domain · Inventory Management

AI-driven demand forecasting models trained on stable-demand history fail when the demand environment moves. The forecast continues to project the prior regime for one to three quarters before management overrides. The result is overstock, inventory writedown, multi-quarter earnings impact, and frequently a management change. Six named operators are documented in the cohort: Target Canada (2014), Beyond Meat (2022-23), Allbirds, Peloton (2022), Walmart general-merchandise (Q4 2023), and Kroger (2024). The pattern is consistent. The mitigation is not better forecasting algorithms in isolation. It is a regime-shift detection layer underneath the forecasting layer.

Canonical case: Target Canada 2014; Peloton 2022; Walmart Q4 2023. Implication: AI demand-forecast is a regime-detection problem at the macro layer, not an accuracy problem at the micro layer.
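One way to picture a regime-shift detection layer is a monitor on the forecast errors themselves. The sketch below uses a two-sided CUSUM on standardised errors. This is a swapped-in, textbook technique chosen for illustration; the workstream does not name a specific detector. The function name and the parameters `sigma`, `k`, and `h` are assumptions.

```python
def cusum_regime_shift(errors, sigma, k=0.5, h=4.0):
    """Return the index of the first period where a two-sided CUSUM on
    standardised forecast errors exceeds h, or None if it never does.

    errors: forecast minus actual, one value per period
            (persistently positive values mean overforecasting).
    sigma:  assumed standard deviation of errors in the stable regime.
    k:      slack, in sigmas, tolerated before evidence accumulates.
    h:      decision threshold, in sigmas.
    """
    pos = 0.0  # accumulated evidence of overforecasting
    neg = 0.0  # accumulated evidence of underforecasting
    for i, e in enumerate(errors):
        z = e / sigma
        pos = max(0.0, pos + z - k)
        neg = max(0.0, neg - z - k)
        if pos > h or neg > h:
            return i
    return None


if __name__ == "__main__":
    stable = [0.1, -0.2, 0.0, 0.3, -0.1]
    shifted = [0.1, -0.2, 0.0, 2.0, 2.0, 2.0, 2.0]
    print(cusum_regime_shift(stable, sigma=1.0))   # no alarm expected
    print(cusum_regime_shift(shifted, sigma=1.0))  # alarm a few periods after the shift
```

A detector of this kind does not improve the forecast. It shortens the interval between the demand environment moving and someone being told to override the forecast, which is the gap the cohort above fell into.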

Failure Mode 03

Industrial-Platform Overinvestment

Domain · Predictive Maintenance and Industrial AI

The platform thesis — a horizontal industrial-AI platform that bypasses specialist vendors and runs every PdM use case from a single product layer — has failed at scale. GE Power Predix is the canonical case. The horizontal-PdM-platform bet contributed materially to the Power-segment goodwill impairments across 2018 and 2019 and never landed at the customer site at the scale the strategy required. The lesson the workstream extracts is that PdM is a specialist surface owned by domain-specific vendors — AspenTech, AVEVA, Senseye (Siemens), OSIsoft (AVEVA PI), and the analogous rotating-equipment, refinery process-control, and condition-monitoring stacks — not a horizontal IIoT platform play.

Canonical case: GE Power Predix horizontal-IIoT platform bet (2018-19 Power-segment impairment cycle). Implication: Industrial AI is a specialist-vendor procurement problem, not a platform problem.

Failure Mode 04

AI-Attack Outpacing AI-Detection

Domain · Fraud Detection

Deepfake-enabled social engineering is moving faster than fraud-detection models can adapt. The February 2024 Hong Kong twenty-five-million-dollar CFO-deepfake case is the workstream's anchor incident: a video call with multiple convincingly synthesized participants triggered a multi-million-dollar wire transfer that was authorized in good faith. Detection models trained on pre-2024 baselines are visibly behind the attack surface. The mitigation surface is not a single model. It is a multi-channel verification protocol (voice biometric callback, second-channel authorization, hardware-token confirmation) sitting underneath the AI detection layer.

Canonical case: Hong Kong $25M CFO deepfake (February 2024); broader 2024 deepfake fraud cohort. Implication: Fraud AI is a protocol-design problem at the workflow layer, not a detection-accuracy problem at the model layer.
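The protocol-design point can be sketched as a release gate that treats the detection model as one input and never as sufficient on its own. Everything below is hypothetical: the `Channel` names mirror the three channels listed above, while `may_release`, the `high_value_limit`, and the two-channel requirement are illustrative assumptions rather than a recommended policy.

```python
from enum import Enum


class Channel(Enum):
    VOICE_CALLBACK = "voice_biometric_callback"
    SECOND_CHANNEL = "second_channel_authorization"
    HARDWARE_TOKEN = "hardware_token_confirmation"


def may_release(amount, confirmed, detector_flagged,
                high_value_limit=100_000, required=2):
    """Decide whether a transfer may be released.

    confirmed:        iterable of Channel values independently confirmed.
    detector_flagged: True if the AI detection layer raised the request.
    A clean detector result never releases a high-value transfer by itself.
    """
    if detector_flagged:
        return False
    if amount < high_value_limit:
        return True
    # Count distinct channels; the originating call is not one of them.
    return len(set(confirmed)) >= required


if __name__ == "__main__":
    # A convincing video call with no out-of-band confirmation is held.
    print(may_release(25_000_000, [], detector_flagged=False))
    # The same request with two independent confirmations is released.
    print(may_release(25_000_000,
                      [Channel.VOICE_CALLBACK, Channel.HARDWARE_TOKEN],
                      detector_flagged=False))
```

In this arrangement a detector that misses a deepfake causes a delay, not a loss, because the transfer still waits on channels the attacker would have to compromise separately.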

Failure Mode 05

Generative AI Hallucination Retreat

Domain · Customer Service and Knowledge-Worker GenAI

GenAI pilots reach production, generate brand-damage incidents, and walk back to narrower scopes or hybrid human-AI architectures. The Klarna February-to-May 2024 walk-back is the workstream's canonical reference: a public claim that GenAI had done the equivalent work of seven hundred agents was retracted within months and the operator publicly returned to a human-AI hybrid. Air Canada lost a tribunal ruling over its chatbot's bereavement-fare statement in February 2024. McDonald's and IBM terminated the three-year drive-thru voice partnership in June 2024. NYC MyCity delivered citizen advice that was illegal under municipal regulation. DPD disabled its UK chatbot after a viral incident. The pattern is consistent. The retreat is the result.

Canonical case: Klarna walk-back (Feb-May 2024); Air Canada Moffatt; McDonald's-IBM; NYC MyCity. Implication: GenAI customer service is a scope-and-handover-design problem at the conversation-architecture layer, not a model-quality problem.

Failure Mode 06

AI-Pricing Antitrust and Consumer Backlash

Domain · Pricing Optimization

AI-coordinated pricing has now drawn formal antitrust action; AI-driven surge pricing has hit consumer-acceptance limits. RealPage was named in a US Department of Justice antitrust action in August 2024 over algorithmic rent-pricing coordination across landlords. Wendy's announced AI-driven surge pricing in February 2024 and walked it back within days after consumer backlash. Both signals point at the regulatory and consumer-acceptance frontier on AI-coordinated and AI-personalized pricing. The mitigation is not better pricing-model accuracy. It is antitrust-and-consumer-protection design at the pricing-policy layer.

Canonical case: RealPage US DOJ antitrust (August 2024); Wendy's surge-pricing walk-back (February 2024). Implication: AI pricing is a regulatory-and-consumer-acceptance design problem, not an optimization problem.

The argument the six modes carry

The six modes are not a generic AI risk taxonomy. They are six structurally distinct failure mechanisms, each with a distinct cohort, a distinct regulator response surface, a distinct cost cohort, and a distinct mitigation. False-positive saturation is a workflow problem at the threshold-and-investigation layer. Overforecasting is a regime-detection problem at the macro layer. Platform overinvestment is a procurement problem at the vendor-selection layer. AI-attack outpacing AI-detection is a protocol-design problem at the verification layer. GenAI hallucination retreat is a scope-and-handover problem at the conversation-architecture layer. AI-pricing antitrust is a regulatory-design problem at the policy layer.

The integrator's calibration discipline reads which mode applies to which use case before it reads which technology to deploy. The reverse reading, selecting a technology first and applying a generic AI risk overlay second, produces the failure cohort. The technology-first reading is the dominant failure pattern in the workstream's cautionary cohort.
