Date: 2025-09-29

COMMENTARY: Generative AI has the potential to deliver up to $340 billion a year in productivity gains to the banking sector, but only if banks can build Gen AI solutions that scale.

Right now, banks are struggling with this, largely because each Gen AI model used in an institution must be validated to comply with government regulations. Validation can be a challenging process, and as the number of Gen AI use cases for banks grows, internal model risk management (MRM) teams are facing overload. As a result, the validation stage is becoming a bottleneck that impedes scalability.

In this article, we’ll examine how integrating compliance considerations into the design of Gen AI models can accelerate model validation and enable banks to realize productivity improvements and revenue gains.

What makes financial Gen AI model validation so difficult now?

Three factors complicate the Gen AI model validation process.

First, thanks to drag-and-drop tools in platforms like Microsoft’s Power Apps, anyone working at a bank or fintech can create a Gen AI model, even if they’re not part of the IT team. To avoid the risk of shadow AI, and to comply with risk management rules, every one of these models must be validated by the MRM team. The validation process is the same for each model regardless of complexity, whether it’s a marketing personalization model or a model for pricing derivatives on the market.

Second, the U.S. currently has no specific guidelines for Gen AI model validation. In the EU, the AI Act provides some general guidelines on how to test a Gen AI model that can serve as inspiration for a validation checklist, although the guidance is not written specifically for financial services.

Third, when banks use third-party platforms for their generative AI models, they often lack visibility into how these tools generate their output. This “black box” nature prevents MRM teams from using standard validation processes.

Why traditional model validation falls short with Gen AI

Standard validation models for risk management have three main steps:

1. Assess the quality and suitability of the input data used to train the model.
2. Evaluate the output quality, relevance, fairness, and robustness of model deliverables.
3. Verify the context the model operates in to ensure it’s used for its intended purpose.

This process works well for traditional models like derivative pricing, because the inputs are transparent, the outcome is well-defined, and it’s comparatively easy to check the model’s outcomes against historical market data to see if the model is working properly. With Gen AI models, the inputs are often a black box: the MRM team doesn’t know how the model has been trained or what code it uses, which makes quality difficult to assess. This is where the bottleneck happens.

Compliance by design for Gen AI in banking and finance

Clearing this roadblock requires adapting the end-to-end validation process to the unique characteristics of Gen AI models. Instead of focusing on black-box inputs, the focus is on controls, documentation, and tests. Ideally, the testing and documenting of Gen AI systems starts when they are designed and built and continues through the final validation by the MRM team. This approach allows the MRM team to see how inputs and outputs are managed, what the context is, whether the models’ results are explainable, and how quality is monitored.

For example, let’s say a sales manager wants to use a third-party generative AI tool to guide the team on sales calls using internal policy and procedure documents. The first step to making this model compliant by design is ongoing documentation of:

- What documents the model ingested
- Why those specific documents were used
- What prompt was used and why
- How the model has been tested
- What results the model generates
- How those results are to be used
- What guardrails are in place to keep the model within its use case
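One way to keep this documentation current is to hold it in a structured record that travels with the model from design through validation. The Python sketch below is illustrative only: the `ComplianceRecord` class, its field names, and the sample file names are assumptions made for this example, not part of any regulatory template or vendor tool.

```python
from dataclasses import dataclass, field, asdict
import json


@dataclass
class ComplianceRecord:
    """Hypothetical design-time documentation for one Gen AI use case."""
    use_case: str
    documents_ingested: list[str]
    document_rationale: str
    prompt: str
    prompt_rationale: str
    tests_performed: list[str] = field(default_factory=list)
    sample_results: list[str] = field(default_factory=list)
    intended_use_of_results: str = ""
    guardrails: list[str] = field(default_factory=list)

    def missing_items(self) -> list[str]:
        """Return the names of documentation fields that are still empty."""
        return [name for name, value in asdict(self).items() if not value]


# Assumed example values for the sales-call scenario.
record = ComplianceRecord(
    use_case="Sales call guidance",
    documents_ingested=["sales_policy.pdf", "call_procedures.docx"],
    document_rationale="Current, approved internal policy and procedure documents",
    prompt="Answer only from the supplied policy documents.",
    prompt_rationale="Limits answers to approved internal sources",
    guardrails=["Decline questions outside sales policy"],
)

# Show which items still need to be documented before validation.
print(record.missing_items())

# Serialize the record so it can be handed to the MRM team.
print(json.dumps(asdict(record), indent=2))
```

A record like this gives the MRM team a single artifact to review, and the `missing_items` check makes gaps visible while the model is still being built.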

The next step is testing, with ongoing documentation. Testing Gen AI models is more difficult than testing traditional models whose output can be compared to historical data. With Gen AI models, testing may require several months to accumulate enough user feedback or model outputs to assess the results. During this waiting period, documentation of testing and monitoring strategies is another critical element in designing for compliance.

Unlocking Gen AI value through compliance-focused design

Although Gen AI model risk management presents new challenges, the good news is that banks and financial services already have a very strong risk framework. With this risk framework as a foundation, it’s possible to manage Gen AI properly and in a way that enables scale. At a high level, you can start by categorizing risks associated with each model. For example, if you have a Gen AI system in place to handle compliance report writing, what will happen if that system goes down or starts to hallucinate outputs? Does your organization have the expertise to take over the system and manage the process if that happens? What skills might your people need in order to work successfully alongside this Gen AI model and to step in if there’s a problem?
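As a rough illustration of categorizing risk per model, the questions above can be recorded as simple yes/no answers and mapped to a category. The sketch below is a simplification under stated assumptions: the `ModelRiskProfile` fields, the three category labels, and the rules in `risk_category` are hypothetical and would need to be replaced by the bank's own risk framework.

```python
from dataclasses import dataclass


@dataclass
class ModelRiskProfile:
    """Hypothetical answers to the risk questions for one Gen AI system."""
    name: str
    business_critical: bool           # would an outage or hallucination disrupt a key process?
    manual_fallback_exists: bool      # can the organization take over the process?
    staff_trained_to_intervene: bool  # do people have the skills to step in?


def risk_category(profile: ModelRiskProfile) -> str:
    """Assign an illustrative category; the labels and rules are assumptions."""
    if not profile.business_critical:
        return "low"
    if profile.manual_fallback_exists and profile.staff_trained_to_intervene:
        return "medium"
    return "high"


# Assumed example: the compliance report writing system mentioned above.
report_writer = ModelRiskProfile(
    name="Compliance report writer",
    business_critical=True,
    manual_fallback_exists=True,
    staff_trained_to_intervene=False,
)
print(report_writer.name, "->", risk_category(report_writer))
```

In this example the system lands in the highest category because staff are not yet trained to step in, which points to a skills gap rather than a problem with the model itself.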

Approaching Gen AI model-building with a compliance mindset from the beginning can give MRM teams the information they need to assess and validate models faster and more accurately. The compliance by design approach may be a challenge for employees to adopt, but once they’re in the habit of documenting their model-building processes, it will be easier to build, test, and scale Gen AI use cases. Once a bank reaches that stage, it will be easier to realize the productivity gains that offer so much potential value to the banking industry, while also appropriately managing this new type of risk.