Artificial intelligence is reshaping how financial institutions operate—but how far along is that transformation really? And what new risks emerge when AI systems begin making financial decisions alongside humans?
In this Q&A, Pietro Bini, Assistant Professor of Finance at Boston University Questrom School of Business, explores where AI is already delivering value in the financial markets—and where its impact remains early or overstated. The conversation also examines how AI-driven decision-making introduces new forms of bias, risk, and even potential collusion, raising critical questions for investors, institutions, and regulators alike as markets become increasingly shaped by intelligent systems.
How is AI already changing the way financial institutions make decisions around trading, portfolio construction, and risk management?
Financial institutions, in particular hedge funds, have been using forms of AI and machine learning to generate trading signals for some time. Before the popularization of AI through large language models (LLMs), machine learning was already used to extract information from unstructured, alternative data sources. A classic example is hedge funds using image recognition to count cars in retailers' parking lots from satellite images and using that count to predict sales in real time.
Large language models have pushed that shift further by making textual information much easier to use at scale. Earnings calls, regulatory filings, broker reports, and news flows can now be processed far more efficiently. In trading, that improves signal extraction. In portfolio construction and risk management, it helps incorporate softer information into existing risk metrics such as credit ratings. And today AI can go beyond extracting factual content, as it can pick up subtler dimensions of the data such as changes in tone, disclosure, or narrative.
These applications started from a smaller subset of financial intermediaries but are quickly expanding across institutions.
Where do you see AI delivering the most immediate value in financial markets today—and where is its impact still more limited or overstated?
Beyond the mid-office applications I just mentioned, AI is expanding to both the front office and back office. In the front office, we are already starting to see AI in client-facing interactions — robo-advisors are a clear example, offering a cost-efficient alternative to traditional advisory services. In the back office, AI applications can support regulatory compliance and fraud detection.
The largest overstatement, however, is about where we stand on the adoption curve. We are still at the early stages: mainly pilot programs, not yet implementations at scale. In a Q4 2025 survey from KPMG [link here], the majority of asset managers confirmed they were still using AI in pilot programs. Scaling a new technology requires two factors: improved technical capabilities and the business capabilities to deploy them effectively. Today, the technical side is evolving much faster than the business side. Scaling AI requires a clear action plan, a strategic vision, a skilled workforce, and the right incentives. You cannot scale if any one of them is missing.
Your research explores AI agents’ beliefs and preferences—how do these “model behaviors” translate into real-world market dynamics?
Traditional models in economics and finance assume that investors are rational — they update beliefs using Bayes’ rule and have standard preferences over risk. Behavioral finance studies how humans deviate from this benchmark and how these deviations can explain specific dynamics in the financial markets. My research applies the same framework to AI agents.
A good illustration is prospect theory and the disposition effect. Under prospect theory, individuals evaluate outcomes relative to a reference point — gains versus losses — rather than in terms of total wealth. They are risk-averse over gains but risk-seeking over losses. In real-world markets, prospect theory is one of the micro-foundations used to explain the disposition effect: investors sell winning positions too soon and hold losing ones too long. This behavior is consistent with investors anchoring on the purchase price and becoming more risk-averse when a stock appreciates and more risk-seeking when it declines.
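To make the reference-point idea concrete, here is a minimal sketch of the Tversky-Kahneman value function; the parameter values are their published 1992 estimates, not figures from this conversation.

```python
import numpy as np

# Tversky-Kahneman (1992) prospect theory value function: outcomes are
# gains and losses measured from a reference point (e.g., the purchase price).
ALPHA, BETA, LAMBDA = 0.88, 0.88, 2.25  # their published parameter estimates

def prospect_value(x):
    """Subjective value of a gain or loss x relative to the reference point."""
    x = np.asarray(x, dtype=float)
    return np.where(x >= 0, x ** ALPHA, -LAMBDA * (-x) ** BETA)

# Risk aversion over gains: a sure $50 gain beats a 50/50 shot at $100.
print(prospect_value(50), 0.5 * prospect_value(100))    # ~31.3 vs ~28.8
# Risk seeking over losses: a sure $50 loss is worse than gambling on -$100.
print(prospect_value(-50), 0.5 * prospect_value(-100))  # ~-70.4 vs ~-64.7
```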
In my research, I observe similar biases across four prominent families of LLMs. When tested for prospect theory, newer and larger models respond more like humans than older and smaller ones: they evaluate outcomes relative to a reference point and are risk-averse over gains but risk-seeking over losses.
What new types of risk emerge when financial decisions are increasingly shaped by AI systems rather than human judgment?
One challenge is that as AI systems evolve, they show a combination of behaviors that do not move uniformly toward rationality or toward human behavior. In my research, I compare different generations and sizes of large language models and observe that as models become more advanced (newer generations) and larger (more parameters), their beliefs become increasingly rational but their preferences become increasingly irrational and human-like. This trend is quite different from what we observe in the human population. To put this into perspective, a recent study from the TIAA Institute and Stanford’s Global Financial Literacy Excellence Center [link here] shows that financial literacy in the United States has been stagnant for nearly a decade, with the average adult correctly answering only about half of basic personal finance questions — and younger generations scoring even lower. So as AI and human decision-makers increasingly diverge, it becomes harder to anticipate how their interactions will play out in financial markets.
Another important risk is that of collusion among AI agents. Recent research [Dou, Goldstein, Ji (2025)] shows that AI traders can reach collusive outcomes without explicit agreement or direct communication. This type of tacit behavior is also much harder for regulators to detect than traditional forms of collusion.
How should investors and institutions think about bias in AI-driven models, especially when those models are trained on historical market behavior?
The first question is what the correct benchmark should be. There is a common misconception that AI agents in finance should be aligned to human behavior, likely because in many fields the goal is to have AI mimic humans, often referred to as human-AI alignment. But in the finance setting, behavioral biases lead to investment mistakes, so the rational benchmark is preferable — we want AI agents that are less biased than humans, not aligned to human biases. The same TIAA and Stanford study shows that greater financial literacy is strongly linked to better financial outcomes. This reinforces the point: aligning AI to human financial behavior would mean aligning to the wrong benchmark.
Regarding historical market data specifically, two risks stand out. The first is look-ahead bias: back-testing an AI agent on past data that overlaps with its training set. Large language models whose training data extends into the recent past may appear to predict events they have already seen, leading to overestimated performance. Developing chronologically consistent AI models is an active area of research. The second is extrapolative beliefs — forming expectations about future returns based on past returns. Researchers have shown that, for humans, extrapolation is one of the most important biases in explaining market dynamics such as financial bubbles. My research shows that larger, more advanced models exhibit more rational beliefs than smaller or earlier-generation models, including lower degrees of extrapolation. That said, given that this bias varies across model generations and sizes, it must be monitored as newer models are released.
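As a simple illustration of guarding against look-ahead bias, a sketch along these lines restricts the back-test window to dates the model could not have seen; the training cutoff date below is an assumed example, not any real model's cutoff.

```python
import pandas as pd

# Assumed training-data cutoff for the model being back-tested (illustrative).
TRAINING_CUTOFF = pd.Timestamp("2023-12-31")

def out_of_sample_window(returns: pd.DataFrame) -> pd.DataFrame:
    """Keep only observations dated after the model's training cutoff,
    so the back-test cannot reward the model for 'predicting' events
    that were already in its training data."""
    return returns.loc[returns.index > TRAINING_CUTOFF]

# Usage: with a date-indexed DataFrame of returns, measure performance only
# on out_of_sample_window(returns); anything on or before the cutoff is dropped.
```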
Do you see AI as amplifying existing market inefficiencies, or helping to reduce them over time?
In finance, market efficiency refers to the degree to which markets incorporate new information into prices. There are two contrasting forces at play when it comes to AI. On one hand, AI allows both buyers and sellers to process new information faster. As they interact, that information is quickly incorporated into prices, which should improve market efficiency. On the other hand, the risk of AI collusion that we mentioned earlier undermines competition and market efficiency.
What role should human oversight play in financial decision-making as AI becomes more deeply embedded in investment processes?
Human oversight remains central because getting AI right does not mean automating everything. Human oversight should focus on three areas: identifying the best use cases for AI, evaluating AI models rigorously, and managing edge cases.
On use cases, this connects to what we discussed earlier: combining technical capabilities with business judgment to determine where AI creates the most value relative to existing approaches. On evaluation, there is still significant work to be done to evaluate AI models in finance settings. Anecdotally, LLM model cards typically report performance on skills such as coding and mathematical reasoning, but there are very few benchmarks for social sciences, and especially for finance. This is one focus of my research. On edge cases, the key challenge is domain shift — when the data a model encounters during deployment differs significantly from its training data. In these situations, the quality of AI predictions can deteriorate significantly, and human judgment should step in. I am actively working on this area as well.
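On the domain-shift point, one simple way to flag it is a two-sample test comparing a feature's distribution at deployment with its distribution in the training data. The sketch below uses a Kolmogorov-Smirnov test and synthetic data purely for illustration; it is not a description of any production system.

```python
import numpy as np
from scipy.stats import ks_2samp

def flag_domain_shift(train_feature, live_feature, alpha=0.01):
    """Return (shift_detected, KS distance) comparing the two samples."""
    stat, p_value = ks_2samp(train_feature, live_feature)
    return p_value < alpha, stat

# Synthetic example: the feature's level shifts at deployment (a regime change),
# so the test flags it and human judgment should review the model's output.
rng = np.random.default_rng(0)
train = rng.normal(0.0, 1.0, 5_000)   # distribution seen during training
live = rng.normal(1.5, 1.0, 500)      # distribution encountered in deployment
shifted, distance = flag_domain_shift(train, live)
print(shifted, round(float(distance), 3))  # True, with a large KS distance
```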
Looking ahead, what are the most important questions regulators and financial leaders should be asking about AI in capital markets today?
The financial system is built on trust. Trust between consumers, institutions, and regulators. AI challenges that foundation in several ways: data privacy, black-box opacity, and reliability. The central question for financial institutions today is how to build responsible AI systems. This goes beyond traditional risk management; it requires defining what level of AI-driven autonomy institutions are comfortable assuming while maintaining trust, and what guardrails to enforce.
For regulators specifically, one question that I think will emerge is that of possible systemic risk from correlated AI behavior. If investors adopt similar AI models trained on similar data, their behavior may become highly correlated. We already know, for example, that in the corporate bond market, when large institutional investors hold similar portfolios and are exposed to similar shocks, they tend to sell similar assets during adverse periods, which increases systemic risk. How this dynamic extends to broader markets and investors as AI adoption grows is an open question. The problem of AI collusion I mentioned earlier is another area that I think will demand regulatory attention.
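As a rough illustration of the correlated-holdings concern, pairwise overlap between portfolios can be measured with a simple similarity score; the weights below are hypothetical, and values close to one signal crowding into the same assets.

```python
import numpy as np

def portfolio_overlap(w_a: np.ndarray, w_b: np.ndarray) -> float:
    """Cosine similarity between two weight vectors (1.0 = identical holdings)."""
    return float(w_a @ w_b / (np.linalg.norm(w_a) * np.linalg.norm(w_b)))

# Hypothetical weights over the same five assets for three AI-driven investors.
portfolios = np.array([
    [0.30, 0.25, 0.20, 0.15, 0.10],
    [0.28, 0.27, 0.18, 0.17, 0.10],
    [0.32, 0.24, 0.22, 0.12, 0.10],
])
pairs = [(i, j) for i in range(3) for j in range(i + 1, 3)]
print([round(portfolio_overlap(portfolios[i], portfolios[j]), 3) for i, j in pairs])
# Overlaps near 1 mean a common shock pushes these investors to sell the same assets.
```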