How financial institutions can improve their governance of gen AI

Gen AI is reshaping the financial-services industry, from how banks serve customers to how executives make decisions. For all the benefits the new technology offers, including workflow automation, software enhancement, and productivity gains, gen AI also poses significant risks. It can expose a financial institution to legal and reputational risks and increase its vulnerability to cyberattacks, fraud, and more.

Trying to harness the benefits of this technology while warding off the risks can feel like a tightwire act. The heightened concerns stem from how gen AI works. Traditional AI systems are built to manage tasks that are narrow in scope by using proprietary business data. By contrast, gen AI can create new content—often by using public, unstructured, and multimodal data—through a series of complex, multistep processes that can create more opportunities for misuse and error. Traditional AI-risk-governance systems aren’t designed to oversee these additional layers of complexity.

Financial institutions will need to update their AI governance frameworks to account for this increased complexity and the greater points of exposure. This will mean incorporating model risk management (MRM) and new technology, data, and legal risks into their enterprise risk model. They will need to review their oversight of AI and then assess how best to manage gen-AI-specific models going forward.

In this article, we explain how financial institutions can update and continually monitor their AI governance frameworks using a gen-AI-risk scorecard and a mix of controls. In this way, they can better identify and mitigate potential risks from gen AI and other technologies long before those risks can cause substantial financial or ethical problems.

Upgrade gen AI governance

To account for gen AI and its potential effects on business, leaders will need to systematically review all risk areas touched by the technology. They should take stock of their oversight systems, gen AI models, and intellectual property (IP) and data use, plus a range of legal and ethical factors.

Oversight systems

In most current arrangements, a single group (such as an MRM committee) oversees all gen AI applications. This approach typically isn’t a good fit for gen AI systems, because they often comprise a blend of different models and software-like components, each of which may need specialized oversight. For example, a gen-AI-powered chatbot that provides financial advice to customers may expose companies to a range of technological, legal, and data-related risks. Accordingly, financial institutions need to decide which gen AI components require only model risk scrutiny and which require a joint review with other risk cells. Close coordination across risk committees can ensure thorough oversight.

Gen AI models

Risk leaders at financial institutions will need new models to manage gen AI risk across their companies. In the past, AI models were built primarily to do one specific task at a time, such as making predictions based on structured data and sorting data based on labels. Such tools might mine past loan data, for instance, to forecast the likelihood that an applicant might default on their loan or to identify optimal loan pricing.

With new multitasking gen AI models, banks can do more than just predict and categorize. They can devise and deliver personalized service, improve customer engagement, and enhance operational efficiency in ways that they couldn’t with traditional AI. For example, gen AI models can automatically create new loan term sheets based on their analysis of similar, previously executed loans. This not only reduces manual work but also can speed up the closing process and improve the borrower’s experience.

However, because gen AI models are trained on both public and private data, they can produce information or responses that are factually incorrect, misleading, or even fabricated—generating, for example, inflated income totals or an imagined history of bankruptcy for a customer querying a gen AI application about loan qualifications. These issues can be reduced with retrieval-augmented-generation (RAG) applications, which retrieve verified external and internal data and supply it to the model as context, grounding its responses. RAG applications can include legally reviewed language about lending rules and can enforce strict conversation guidelines to help banks manage customers’ interactions with gen AI tools.
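To make the pattern concrete, here is a minimal RAG sketch in Python. The retriever is a toy keyword scorer, and call_llm is a hypothetical stand-in for the institution’s model endpoint; a production system would use a vector store and a hosted model behind the same structure.

```python
# Minimal retrieval-augmented-generation (RAG) sketch: ground a lending
# chatbot's answers in legally reviewed policy text. The retriever is a toy
# keyword scorer; a real system would use a vector store. call_llm is a
# hypothetical placeholder for the institution's model endpoint.

# Snippets a compliance team has reviewed and approved for reuse.
APPROVED_POLICY_SNIPPETS = [
    "Loan eligibility requires a verified income history of at least two years.",
    "A prior bankruptcy does not automatically disqualify an applicant.",
    "All quoted rates are indicative until underwriting is complete.",
]

def retrieve(question: str, snippets: list[str], top_k: int = 2) -> list[str]:
    """Rank snippets by naive keyword overlap with the question."""
    q_words = set(question.lower().split())
    scored = sorted(snippets, key=lambda s: -len(q_words & set(s.lower().split())))
    return scored[:top_k]

def call_llm(prompt: str) -> str:
    """Placeholder for the model call; swap in a real client here."""
    return f"[model response grounded in a prompt of {len(prompt)} characters]"

def answer(question: str) -> str:
    context = "\n".join(retrieve(question, APPROVED_POLICY_SNIPPETS))
    # Strict conversation guidelines live in the instructions, so the model
    # answers only from the approved context instead of improvising.
    prompt = (
        "Answer using ONLY the approved policy text below. "
        "If the question is not covered, say so.\n\n"
        f"Approved policy:\n{context}\n\nCustomer question: {question}"
    )
    return call_llm(prompt)

print(answer("Does a past bankruptcy disqualify me from a loan?"))
```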

IP and data use

Gen AI tools can introduce liabilities involving inbound and outbound IP, including its oversharing. For instance, a gen AI coding assistant might suggest that a bank use computing code that has licensing issues or that may inadvertently expose the bank’s proprietary algorithms. Some gen AI applications operating in real time, such as those used in customer service, require a mix of automated and human oversight to catch issues promptly.

Many financial institutions’ data governance controls don’t sufficiently address gen AI, which relies heavily on combining public and private data. This raises concerns about who is responsible for what data and how it’s used. For example, when developers use gen AI coding assistants, questions and pieces of code from open integrated development environments can be included in prompts sent to external gen AI providers. Those prompts might not be retained, leaving no audit trail, and their influence on the resulting code recommendations could have legal implications.

Financial institutions should develop systems to track where data originates, how it’s used, and whether it adheres to privacy regulations. Failing to link credit decisions to their source data could result in regulatory fines, lawsuits, and even the loss of a license for noncompliance. Companies also need to keep records of AI-generated content, which can vary based on what users enter.
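As an illustration of such record-keeping, the sketch below logs the lineage of each AI-generated output: the exact prompt, the model version, and the source documents the response drew on. The field names are hypothetical, not a regulatory standard.

```python
# Illustrative lineage record for AI-generated content: every output is
# stored alongside the inputs and source data that produced it, so a credit
# decision can later be traced back for auditors or regulators.
import hashlib
import json
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone

@dataclass
class GenerationRecord:
    prompt: str                     # exact text sent to the model
    response: str                   # exact text returned
    model_version: str              # pin the model version, not just the vendor
    source_document_ids: list[str]  # internal data the response drew on
    timestamp: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )

    def record_id(self) -> str:
        """Content hash, so tampering with a stored record is detectable."""
        payload = json.dumps(asdict(self), sort_keys=True).encode()
        return hashlib.sha256(payload).hexdigest()

record = GenerationRecord(
    prompt="Summarize applicant 1123's verified income history.",
    response="Verified income for 2022-2024 averages $84,000 per year.",
    model_version="internal-llm-2025-03",
    source_document_ids=["payroll-2022", "payroll-2023", "tax-return-2024"],
)
print(record.record_id())
```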

Legal and ethical factors

Headlines abound about gen AI systems that have run afoul of regulations. That’s mostly because these models blur the lines between newly generated content and existing content protected by IP laws, creating confusion about who owns the content and how it can be licensed. Additionally, when gen AI models are trained on sensitive data, such as customer information, privacy and compliance require extra attention. These models need careful monitoring so that they don’t expose confidential information or perpetuate biases.

Transparency and “explainability” (the ability to understand how an AI model works and why it makes specific decisions) are also crucial, as the outputs of gen AI systems can sometimes be difficult to trace back to their origins. Financial institutions must establish safeguards to manage these risks throughout the model life cycle to ensure compliance with changing regulations and ethical standards.

Use a scorecard to manage gen AI risk

As financial institutions systematically review customer exposure; financial impact; the complexity of gen AI models, technologies, and data; and the legal and ethical implications, they can use a risk scorecard to determine which elements of their gen AI governance require updates and how urgent the need is. Teams can use the scorecard to evaluate the risks for all gen AI use cases and applications across the company (exhibit).

The scale used (scores of 5, 3, and 1, with 5 meaning high risk and 1 meaning low risk) reflects the degree of customer exposure and the level of human expert oversight of the inner workings of the gen AI application. It also reflects the expected financial impact, the stage of gen-AI-application development, and more. Across these categories, oversight by human experts—particularly for high-stakes applications—is still the most effective way to ensure that gen AI systems don’t make critical errors.
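As a simple illustration, the sketch below tallies one use case on that 5/3/1 scale. The categories and escalation thresholds are assumptions for the example, not the exhibit’s actual weighting.

```python
# Illustrative gen-AI-risk scorecard on the 5/3/1 scale (5 = high risk,
# 3 = medium, 1 = low). Categories and thresholds are assumptions.

VALID_SCORES = {1, 3, 5}

def score_use_case(scores: dict[str, int]) -> dict:
    assert all(v in VALID_SCORES for v in scores.values()), "use only 1, 3, or 5"
    total = sum(scores.values())
    worst = max(scores.values())
    # A single high-risk category can dominate, so any 5 triggers escalation.
    if worst == 5:
        tier = "escalate to joint risk review"
    elif total >= 3 * len(scores):
        tier = "standard MRM review"
    else:
        tier = "lightweight review"
    return {"total": total, "worst_category": worst, "recommended_oversight": tier}

chatbot = {
    "customer_exposure": 5,       # customer-facing financial advice
    "human_expert_oversight": 3,  # spot checks only
    "financial_impact": 3,
    "development_stage": 1,       # still in pilot
}
print(score_use_case(chatbot))
```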

The scorecard can also be helpful to procurement teams in financial institutions that purchase rather than build gen AI applications; they can use it to assess their potential exposure to third-party risk and their comfort with the data and modeling techniques used by sellers of gen AI applications. While some factors may not be totally transparent to buyers, procurement teams can use a mix of vendor due diligence, technical reviews of underlying models, and contractual safeguards to assign risk scores to third-party software and make more informed purchasing decisions.

Introduce a mix of controls to govern gen AI risk

Using a risk scorecard can help financial institutions prioritize gen AI use cases based on the business need and risk/return profile of each case. Scorecards can also signal when problems arise. In both cases, the scorecard must also be supported by a risk management framework, or set of controls, for managing gen AI. Each type of control—business, procedural, manual, and automated—plays a critical role in ensuring the safe and efficient use of gen AI.

Business controls: Don’t block; adjust

Financial institutions will need to design a structure that oversees gen AI risk without slowing down innovation. For example, an organization could use a centralized AI oversight committee in the early stages of adopting a chatbot or other gen AI application. Later, control could shift to a subcommittee or multiple committees. The point is to build in flexibility.

Companies will need to decide how risks fit into their operational models (whether centralized, federated, or decentralized) to better address new challenges posed by gen AI systems. Most financial institutions start with a centralized organizational model for gen AI risk and shift toward a partially centralized or fully decentralized model as their risk management capabilities mature. To move faster, some establish gen AI accelerators to create consistent approaches across departments.

Procedural controls: Stay nimble

For procedures such as handling credit applications, most financial institutions should update their MRM standards. The standards should reflect gen-AI-specific risks, such as how models handle changing inputs and multistep interactions. For instance, if a bank simulates a wide range of customer responses to a virtual assistant, the MRM process will need to adapt continuously. Similarly, technology review processes should be streamlined to safely integrate gen AI systems into operations. All updates should include methods for monitoring how gen AI applications adapt over time to ensure that they remain accurate and compliant as they process new prompts and new data.
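One hedged sketch of what that ongoing monitoring could look like: track a rolling quality signal, here the share of responses that human reviewers flag, against the rate observed at validation, and escalate when it drifts. The window and tolerance values are arbitrary.

```python
# Sketch of drift monitoring for a deployed gen AI application: compare a
# rolling reviewer-flag rate against the baseline measured at validation
# and escalate when the gap exceeds a tolerance. All numbers are illustrative.
import random
from collections import deque

class DriftMonitor:
    def __init__(self, baseline_rate: float, window: int = 200,
                 tolerance: float = 0.05):
        self.baseline = baseline_rate       # flag rate observed at validation
        self.recent = deque(maxlen=window)  # rolling window of review outcomes
        self.tolerance = tolerance          # allowed rise before escalation

    def record(self, flagged: bool) -> None:
        self.recent.append(1 if flagged else 0)

    def drifting(self) -> bool:
        if len(self.recent) < self.recent.maxlen:
            return False                    # wait for a full window
        current_rate = sum(self.recent) / len(self.recent)
        return current_rate - self.baseline > self.tolerance

random.seed(7)  # reproducible simulation
monitor = DriftMonitor(baseline_rate=0.02)
# Simulated feed: reviewer flags become far more frequent partway through,
# as might happen once the model starts seeing new kinds of prompts.
for i in range(600):
    monitor.record(random.random() < (0.02 if i < 300 else 0.15))
    if monitor.drifting():
        print(f"drift detected at response {i}; escalate to the MRM team")
        break
```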

Manual controls: Keep an eye on the machine

Human oversight is essential for checking sensitive outputs and ensuring the ethical use of gen AI. For example, reviewers need to redact sensitive data before models process it. When it comes to the quality of gen-AI-generated responses, financial institutions should create “golden lists” of questions for testing the models.
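A minimal sketch of golden-list testing follows. The ask_chatbot function is a stub for the deployed application, and the similarity check is naive token overlap; real test programs would use stronger semantic comparisons and route failures to human reviewers.

```python
# Sketch of "golden list" testing: run a fixed set of reviewed questions
# through the application and compare its answers with approved references.

GOLDEN_LIST = [
    ("What credit score do I need for a mortgage?",
     "Minimum scores vary by product; an advisor can confirm eligibility."),
    ("Can you guarantee my loan approval?",
     "No. Approval depends on underwriting and cannot be guaranteed."),
]

def ask_chatbot(question: str) -> str:
    """Stub standing in for the deployed gen AI application."""
    return "Approval depends on underwriting and cannot be guaranteed."

def overlap(a: str, b: str) -> float:
    """Naive Jaccard similarity over lowercased tokens."""
    words_a, words_b = set(a.lower().split()), set(b.lower().split())
    return len(words_a & words_b) / max(len(words_a | words_b), 1)

def run_golden_tests(threshold: float = 0.5) -> None:
    for question, reference in GOLDEN_LIST:
        score = overlap(ask_chatbot(question), reference)
        status = "PASS" if score >= threshold else "NEEDS REVIEW"
        print(f"{status}: {question}")

run_golden_tests()
```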

They should also solicit extensive feedback from customers and employees, which systems can learn from. The feedback can help teams assess the accuracy and appropriateness of outputs—for instance, whether the way a virtual assistant “speaks” to a customer aligns with institutional values and goals. The outputs should be reviewed regularly and updated as needed to bolster the models’ learning capabilities.

Automated controls: Consider third-party tools

One of the benefits of technology is that it can, in some cases, manage itself. Automated tools can sanitize data at scale, flag unusual use, and start fixes in real time. For instance, many third-party applications can remove sensitive information from documents before processing. Other third-party tools can automate vulnerability testing for gen AI systems, which helps financial institutions quickly identify and address weaknesses. Gen AI models themselves can use a combination of traditional AI and newer technologies to check their own outputs—that is, models checking models—to ensure quality control at high speeds.
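The sketch below combines two such controls: regex-based redaction of common PII patterns before processing, and a simplified “models checking models” pass in which a second checker vets the first model’s output. Both are deliberately basic stand-ins for production-grade tools.

```python
# Two automated controls in miniature: sanitize inputs by redacting common
# PII patterns, then have a second checker vet the output before release.
# Production tools are far more thorough; this checker only re-applies the
# same PII patterns as a last line of defense.
import re

PII_PATTERNS = {
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "account": re.compile(r"\b\d{10,16}\b"),
    "email": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"),
}

def sanitize(text: str) -> str:
    """Replace detected PII with typed placeholders before model input."""
    for label, pattern in PII_PATTERNS.items():
        text = pattern.sub(f"[REDACTED-{label.upper()}]", text)
    return text

def checker_passes(output: str) -> bool:
    """Stand-in for a second model that vets the first model's output."""
    return not any(p.search(output) for p in PII_PATTERNS.values())

raw = "Customer 4421007389221 (SSN 123-45-6789) asked about refinancing."
clean = sanitize(raw)
print(clean)
print("output approved" if checker_passes(clean) else "blocked for human review")
```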


As gen AI becomes an even bigger part of financial institutions, risk leaders will need to rethink how they manage the related systems. They will need to move beyond traditional AI risk practices and include real-time monitoring, robust transparency, and stronger safeguards for data privacy and ethics. A comprehensive risk scorecard and a focus on four key sets of controls can help companies find the right balance between pursuing innovation and mitigating risk. More than that, taking a systematic approach to updating gen AI risk governance can help financial institutions unlock the transformative power of this new technology to improve decision-making, customer service, and operational efficiency—and do so responsibly.
