Executive Summary
At a recent ACAMS New York Chapter event in New York City, hosted by Matrix USA and Haynes and Boone, regulators, law enforcement, bank leaders, and AI specialists came together to discuss one central question: how do we modernize AI and model risk review processes without losing control of risk, fairness, or accountability?
The financial services sector is currently navigating an unprecedented epistemological shift. For decades, Model Risk Management (MRM) was predicated on a deterministic understanding of the world where input data processed by static formulas yielded predictable outputs. However, as we approach the 2026 horizon, the integration of Generative AI (GenAI), Large Language Models (LLMs), and increasingly autonomous “Agentic AI” systems has fundamentally altered this landscape. This shift is driving a new category of model risk: autonomous execution risk and emergent behavior, where models do not just get predictions wrong, they can act on those errors at scale before humans can intervene.
At the same time, the regulatory landscape is converging around a few core themes:
- Risk-based AI regulation (EU AI Act, OSFI E-23, AIDA, DORA, MAS FEAT, UK principles-based regime)
- Framework-driven governance (especially NIST AI RMF as a de facto standard of reasonable care in the U.S.)
- Greater expectations for continuous monitoring, fairness metrics, explainability, and human oversight
With that context, here are the key insights from the panel and from Matrix USA’s own research:
1. Model risk has moved from math risk to behavior risk
For decades, MRM focused on whether the math was correct and the data was clean. With GenAI and agentic AI, the risk shifts to how systems behave over time in complex environments, including feedback loops, emergent behaviors, and autonomous actions across multiple systems. Validation must now move from “Is the model accurate?” to “Is the system’s behavior safe, stable, and aligned with our risk appetite?”
2. The Shift from Predictive to Agentic AI
The most profound risk shift is the transition from Predictive AI (which scores risks) to Agentic AI (which perceives, reasons, and acts). Unlike passive models that flag alerts for human review, Agentic AI can autonomously investigate entities, freeze accounts, and draft SARs. This introduces “autonomous execution risk” where the risk is not just a prediction error, but an irreversible action taken based on that error.
3. NIST AI RMF as the Legal “De Facto” Standard
In the absence of a U.S. federal AI law, adherence to NIST’s AI Risk Management Framework (Govern, Map, Measure, Manage) is quickly becoming the de facto standard of care. Institutions that cannot evidence NIST-aligned AI governance may struggle to defend themselves in enforcement actions or litigation. For banks, this means:
- Mapping critical AI systems to trustworthiness characteristics (reliability, robustness, transparency, fairness, security)
- Embedding these principles into MRM policies, validation templates, and model lifecycles
4. Regulation is shifting from principles to prescriptive enforcement
Globally, regulators are moving from “high-level guidance” to specific expectations:
- EU AI Act risk tiers and governance obligations
- OSFI’s Guideline E-23 on Model Risk Management
- MAS FEAT principles in Singapore
- Sectoral guidance from supervisors like the FCA and others
Expect to be asked for concrete evidence: model inventory completeness, outcome fairness metrics, hallucination testing for GenAI, prompt and RAG change control, and documented circuit breakers for high-risk AI.
5. Existing MRM policies are misaligned with GenAI and LLMs
Most MRM policies were written for static, deterministic models under frameworks like SR 11-7. They break down when applied to GenAI and LLMs. Key gaps include:
- Non-determinism: LLMs can produce different outputs to the same prompt; policies assuming “Input A → Output B” are obsolete.
- Vendor foundational models: Banks often cannot see or validate pre-training data, so lineage-based controls need to shift to vendor transparency reviews and fine-tuning data governance.
- Prompts-as-code: A simple prompt change or RAG update can dramatically change behavior yet often sits outside formal change management today.
- Point-in-time validation: Annual validations are too slow for models that drift weekly; continuous monitoring and conditional approvals will become standard.
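To make the continuous-monitoring point concrete, the drift check that replaces annual point-in-time validation can be as simple as comparing a model’s current score distribution against its validation baseline. The sketch below uses the Population Stability Index (PSI), a widely used drift metric; the bin proportions and the 0.25 alert threshold are illustrative assumptions, not a regulatory requirement.

```python
# Minimal drift-monitoring sketch using the Population Stability Index (PSI).
# Thresholds and example distributions are hypothetical.
import math

def psi(baseline: list[float], current: list[float]) -> float:
    """Compare two score distributions expressed as bin proportions."""
    eps = 1e-6  # avoid log(0) when a bin is empty
    return sum(
        (c - b) * math.log((c + eps) / (b + eps))
        for b, c in zip(baseline, current)
    )

# Score distribution at validation time vs. this week's production run.
baseline = [0.10, 0.20, 0.40, 0.20, 0.10]
current  = [0.05, 0.15, 0.35, 0.25, 0.20]

drift = psi(baseline, current)
if drift > 0.25:  # common rule of thumb: PSI > 0.25 signals a major shift
    print(f"ALERT: significant drift (PSI={drift:.3f}); escalate to MRM")
else:
    print(f"PSI={drift:.3f}: within tolerance; log and continue monitoring")
```

Run weekly against production scores, a check like this turns “annual validation” into a standing control, with breaches feeding the conditional-approval and escalation workflow rather than waiting for the next review cycle.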
6. Fairness and bias are shifting to outcome-based monitoring
Rather than just checking input data for bias, leading institutions are moving to outcome fairness testing:
- Are alerts, SAR filings, account closures, or de-risking actions disproportionately concentrated in certain nationalities, geographies, or customer types?
- Do complex models diverge significantly from transparent challenger models for specific demographics?
- Outcome-based fairness testing, plus challenger-model comparisons, is becoming a key MRM control for AML, fraud, and sanctions models especially where deep learning and LLMs are involved.
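The outcome-based fairness tests described above reduce to a simple comparison: is the alert (or closure, or SAR) rate for any customer segment disproportionate to the overall rate? A minimal sketch follows; the segment labels, sample counts, and the 1.5x concentration threshold are illustrative assumptions, not a published standard.

```python
# Outcome fairness check: compare per-segment alert rates to the overall
# rate and flag disproportionate concentration. All numbers are hypothetical.
from collections import Counter

def alert_rates(outcomes: list[tuple[str, bool]]) -> dict[str, float]:
    """outcomes: (segment, was_alerted) pairs -> alert rate per segment."""
    totals, alerted = Counter(), Counter()
    for segment, was_alerted in outcomes:
        totals[segment] += 1
        alerted[segment] += was_alerted
    return {s: alerted[s] / totals[s] for s in totals}

# Synthetic example: 100 customers in each of two segments.
outcomes = (
    [("segment_A", True)] * 8  + [("segment_A", False)] * 92 +
    [("segment_B", True)] * 25 + [("segment_B", False)] * 75
)
rates = alert_rates(outcomes)
overall = sum(a for _, a in outcomes) / len(outcomes)

for segment, rate in sorted(rates.items()):
    ratio = rate / overall
    flag = "REVIEW" if ratio > 1.5 else "ok"  # illustrative threshold
    print(f"{segment}: rate={rate:.2f}, ratio to overall={ratio:.2f} -> {flag}")
```

The same structure extends to challenger-model comparisons: compute the rates under both the production model and a transparent challenger, and investigate segments where the two diverge materially.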
7. Human reviewers still own conceptual soundness and accountability
Automation and AI agents can handle much of the math, code checking, and documentation. What cannot be automated is judgment:
- Does this pattern make sense in the real world, or is it a spurious correlation (e.g., “font size” as a driver of credit risk)?
- Does a proposed strategy (e.g., blocking all transactions from a high-risk jurisdiction) align with the institution’s values, risk appetite, and legal obligations?
- Is it ethically acceptable, reputationally defensible, and consistent with financial inclusion goals?
- Human reviewers also remain the accountable parties that regulators and courts will ultimately look to. You can automate testing, but you cannot automate accountability.
8. AI governance must evolve from compliance tax to strategic advantage
The panel’s closing message was clear: if AI governance is treated purely as a check-the-box burden, institutions will always lag behind threat actors and regulatory change.
Firms that win will treat AI governance and MRM as a strategic capability that:
- Enables faster, safer deployment of new AI use cases
- Improves regulatory confidence and reduces exam friction
- Supports better business outcomes (e.g., lower false positives, better customer experience, targeted investigations)
Implications for Financial Crime, AML, Fraud and Sanctions
For financial crime programs, these shifts are not theoretical; they are live issues today:
- AML & Transaction Monitoring: Agentic AI can orchestrate investigations end-to-end, but must operate under strict guardrails, fairness tests, and human oversight.
- Sanctions & Name Screening: LLMs can boost name matching, narrative analysis, and network discovery, but hallucination risk and explainability obligations must be explicitly managed.
- Fraud & Cyber-Enabled Crime: AI not only detects fraud but is used by criminals to generate synthetic identities, deepfakes, and hyper-realistic documentation, raising the bar for KYC and EDD controls.
- MRM, Compliance, and Business leaders must operate from a shared playbook, aligning AI ambitions with regulatory defensibility and ethical constraints.
Conclusion
The ACAMS New York Chapter discussion underscored a simple reality: modernizing AI and model risk review processes is no longer optional. The question is not whether institutions will use AI in financial crime and risk management, but whether they can do so safely, fairly, and defensibly. By 2026, the banks that stand out will not be those with the flashiest models, but those that have built resilient, adaptive governance frameworks that understand and control agentic behavior, industrialize validation, and treat human judgment as a non-negotiable asset rather than a bottleneck.
Matrix USA helps institutions get there by translating cutting-edge AI and regulatory developments into practical operating models, policies, controls, and tools that work in the real world of AML, fraud, and sanctions. If you would like to explore how to modernize your AI and MRM frameworks, or turn AI governance from a compliance tax into a competitive edge, Matrix USA is ready to help.