An analyst at a Wall Street bank in 2022 would spend half a morning reading a 200-page 10-K to answer a single portfolio manager’s question. The same analyst in 2025 drops the document into a large language model trained on financial disclosures and gets a structured summary in 40 seconds. The practical change is that big, and the regulatory and market implications have arrived just as quickly. Bloomberg announced BloombergGPT, a 50-billion-parameter model trained on 708 billion tokens of mixed financial and general text, on March 30, 2023. By 2025, the global generative AI in financial services market stood at $1.95 billion, according to Precedence Research’s industry report on generative AI in financial services, with North America accounting for more than 42% of that spend and cloud-based deployments for more than 58% of revenue. Regulators in the EU have already moved from observation to fieldwork: the ESMA / Alan Turing Institute working paper reported that around 85% of surveyed financial firms were using LLMs in some form, and that fewer than 20% of them were working with fully customized models.
How LLMs moved from research into production
Large language models are not new to financial services research. The earlier generation of NLP tools — word embeddings, BERT-style encoders, named-entity extractors — had been running inside research desks and compliance teams since 2019. What changed in 2023 was the capability jump at the generative layer. Models could produce structured outputs, follow complex instructions, and read documents they had never seen before. That shifted the cost curve.
BloombergGPT was the first broadly publicized financial-specific LLM. The paper released on March 30, 2023 detailed a 50-billion-parameter decoder-only model trained on 363 billion tokens of Bloomberg’s financial data and 345 billion tokens of general text — 708 billion tokens in total. The headline result was that a finance-tuned model could outperform general-purpose models on financial benchmarks without losing ground on the general NLP tasks every bank also needs. It established the blueprint that every major financial institution has either copied or bought access to.
The second shift was infrastructure. Cloud-based LLM deployment — the OpenAI API, Azure OpenAI Service, AWS Bedrock, Google Vertex AI — became the default for experimentation, and the default for most production workloads that did not require the model weights to sit inside the institution’s own network. Cloud-based deployments accounted for more than 58% of the market by revenue share in 2025, according to Precedence Research. On-premises and private-cloud deployments held the rest, concentrated at institutions handling the most sensitive data or with the strictest regulatory obligations.
What the LLM-in-finance market looks like in 2025
The market is larger than the research narrative suggests because generative AI in finance is not a single product. It is a layer that has been added to almost every text-heavy workflow inside a bank, asset manager, insurer, or fintech. The table below consolidates the most-cited figures.
| Metric | Value | Source |
|---|---|---|
| Generative AI in financial services, 2025 | $1.95 billion | Precedence Research |
| Projected market size, 2035 | $17.88 billion | Precedence Research |
| Forecast CAGR, 2026–2035 | 24.81% | Precedence Research |
| North America revenue share, 2025 | 42%+ | Precedence Research |
| Cloud deployment share, 2025 | 58%+ | Precedence Research |
| BloombergGPT parameters | 50 billion | Bloomberg / arXiv 2303.17564 |
| BloombergGPT training corpus | 708 billion tokens | Bloomberg / arXiv 2303.17564 |
| Surveyed firms using LLMs (EU) | ~85% | ESMA / Alan Turing |
The Precedence Research segmentation also shows that risk management is the application projected to grow fastest over the forecast period, ahead of customer service and document processing. That lines up with how internal budgets have been reallocating: the first wave of LLM spend went into customer chatbots, but the second wave — the one that is actually driving revenue impact — is going into risk, compliance, and research.
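The market-size figures quoted above can be cross-checked with the standard compound-growth formula. The sketch below is only an arithmetic sanity check on the three Precedence Research numbers cited in this article; it assumes ten compounding periods between the 2025 base and the 2035 projection, which is an assumption about how the forecast was constructed, and the function names are illustrative.

```python
def cagr(start_value: float, end_value: float, years: int) -> float:
    """Compound annual growth rate between two values over `years` periods."""
    return (end_value / start_value) ** (1 / years) - 1


def project(start_value: float, rate: float, years: int) -> float:
    """Value after compounding `start_value` at `rate` for `years` periods."""
    return start_value * (1 + rate) ** years


# Figures as quoted in this article (USD billions), attributed to Precedence Research.
MARKET_2025 = 1.95
MARKET_2035 = 17.88
QUOTED_CAGR = 0.2481

# Assumption: ten compounding periods separate the 2025 base from 2035.
implied_rate = cagr(MARKET_2025, MARKET_2035, years=10)
projected_2035 = project(MARKET_2025, QUOTED_CAGR, years=10)

print(f"Implied CAGR, 2025 to 2035: {implied_rate:.2%}")
print(f"2035 market size at the quoted CAGR: ${projected_2035:.2f}B")
```

Under that assumption the implied growth rate and the quoted one agree to within a small rounding difference, so the three figures are internally consistent.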
What LLMs actually do inside a financial institution
Strip away the product-marketing layer and LLM use cases inside finance cluster into five distinct jobs. The first is document review and extraction. Credit analysts, underwriters, and M&A teams spend their time reading legal contracts, regulatory filings, and deal documents. A properly prompted LLM can extract covenants, flag unusual clauses, and produce summaries. The speed improvement over manual review is 10x to 20x, and the quality on well-structured documents is competitive with a mid-level human analyst.
The second is compliance and policy retrieval. Compliance officers need to answer “what does the bank’s policy say about X” against a repository of hundreds of internal documents. Retrieval-augmented generation (RAG) has become the default architecture for this, because it lets the model cite policy passages rather than generate plausible-sounding but wrong rules. Every tier-one US bank now runs at least one production RAG system over its compliance library.
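The retrieve-then-cite pattern described above can be sketched in a few lines. This is a minimal illustration, not any bank's implementation: the policy IDs, passages, and function names are hypothetical, the bag-of-words cosine similarity stands in for the embedding search a production system would more likely use, and the final call to a model is left out entirely.

```python
import math
import re
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Passage:
    doc_id: str   # hypothetical policy identifier
    section: str  # hypothetical section number
    text: str


def tokenize(text: str) -> Counter:
    """Lowercase bag-of-words term counts."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question: str, library: list[Passage], k: int = 2) -> list[Passage]:
    """Return up to k passages with non-zero similarity, best match first."""
    q = tokenize(question)
    scored = [(cosine(q, tokenize(p.text)), p) for p in library]
    relevant = [(score, p) for score, p in scored if score > 0]
    relevant.sort(key=lambda pair: pair[0], reverse=True)
    return [p for _, p in relevant[:k]]


def build_prompt(question: str, passages: list[Passage]) -> str:
    """Assemble a prompt that forces the answer to cite retrieved passages."""
    if not passages:
        return "No relevant policy passage was found; escalate to a compliance officer."
    context = "\n".join(f"[{p.doc_id} sec. {p.section}] {p.text}" for p in passages)
    return (
        "Answer using only the policy passages below and cite the bracketed "
        "reference for every statement.\n\n"
        f"{context}\n\nQuestion: {question}"
    )


# Hypothetical policy library, invented for illustration only.
LIBRARY = [
    Passage("GIFT-01", "2.1", "Employees must not accept gifts from clients valued "
            "above 100 USD without written approval from compliance."),
    Passage("TRADE-07", "4.3", "Personal trading in securities on the restricted "
            "list is prohibited for all employees."),
    Passage("DATA-03", "1.2", "Customer data must not be sent to external model "
            "providers without a signed data processing agreement."),
]

question = "Can an employee accept gifts from clients?"
prompt = build_prompt(question, retrieve(question, LIBRARY))
print(prompt)
```

The design point is that the model only ever sees passages that were actually retrieved, each tagged with a reference it can cite, and that an empty retrieval produces an escalation message rather than a free-form answer.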
The third is customer-facing dialogue: chatbots, virtual assistants, and agent-handoff systems. This was the first use case most banks shipped in 2023-2024, and it is also where most banks discovered the limits of generative AI. Hallucinations, liability exposure on incorrect answers, and the cost of live monitoring all pushed these deployments toward tight guardrails and toward workflows that mostly deflect queries rather than resolve them. The systems that have worked best are the ones that treat the LLM as a routing layer, not an answer generator.
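Treating the model as a routing layer rather than an answer generator can be sketched as follows. This is an illustrative sketch under stated assumptions: the route names, the confidence threshold, and the keyword classifier are all hypothetical, and the classifier is a stand-in for whatever LLM intent call a real deployment would make. The guardrail itself is the part that matters: the model's output is only ever used to pick from a fixed set of destinations, and anything unrecognised or low-confidence goes to a person.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical destinations; a real deployment would define its own.
ALLOWED_ROUTES = {"card_services", "payments", "account_access"}
FALLBACK_ROUTE = "human_agent"

Classifier = Callable[[str], tuple[str, float]]


@dataclass(frozen=True)
class RoutingDecision:
    route: str
    reason: str


def keyword_classifier(message: str) -> tuple[str, float]:
    """Stand-in for an LLM intent classifier: returns (label, confidence)."""
    rules = {
        "card_services": ("card", "pin"),
        "payments": ("transfer", "payment"),
        "account_access": ("password", "login"),
    }
    text = message.lower()
    matched = [label for label, words in rules.items() if any(w in text for w in words)]
    if len(matched) == 1:
        return matched[0], 0.9
    if matched:
        return matched[0], 0.5  # ambiguous: more than one intent matched
    return "unknown", 0.0


def route_message(
    message: str,
    classify: Classifier = keyword_classifier,
    min_confidence: float = 0.7,
) -> RoutingDecision:
    """Pick a destination; never let the classifier's text reach the customer."""
    label, confidence = classify(message)
    if label not in ALLOWED_ROUTES:
        return RoutingDecision(FALLBACK_ROUTE, f"label {label!r} is not an allowed route")
    if confidence < min_confidence:
        return RoutingDecision(FALLBACK_ROUTE, f"confidence {confidence:.2f} below threshold")
    return RoutingDecision(label, f"confidence {confidence:.2f}")


for text in ("I lost my card", "Should I buy this stock?", "My card payment failed"):
    print(text, "->", route_message(text))
```

Because the classifier is injected as a parameter, the same guardrail logic applies unchanged whether the label comes from keywords or from a model call.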
The fourth is research and summarization. Sell-side research, buy-side investment committees, and internal strategy teams use LLMs to summarize earnings calls, synthesize news, and draft initial research notes. This is where the overlap is clearest with the sentiment analysis systems that US traders and fintechs rely on to turn text into signal: an LLM can produce a sentiment score, a thesis summary, and a draft narrative in one pass.
The fifth is fraud and AML investigation. An investigator reviewing a suspicious-activity alert historically had to piece together transaction history, customer communications, and external news manually. LLMs handle the text-heavy part of that stitching. Anti-money-laundering teams in US fintech have been among the earliest adopters, because the case-narrative writing step that used to take investigators 30 minutes can now be drafted in a minute, freeing the analyst to focus on judgment calls.
The vendor and deployment map
The LLM vendor map inside finance divides into three layers. At the foundation-model layer, banks and asset managers are overwhelmingly using OpenAI (via Azure OpenAI Service), Anthropic (via Bedrock or direct API), and Google’s Gemini family. Meta’s Llama 3 and 4 and Mistral’s models are common for on-premises or private-cloud deployments where the weights need to be hosted inside the institution. A smaller number of firms — led by Bloomberg with BloombergGPT, and some internal research groups at JPMorgan, Morgan Stanley, and Goldman Sachs — have trained their own domain-specific models from scratch.
At the application layer, the market has fragmented. Every big vendor already serving financial institutions (NICE Actimize, BlackRock Aladdin, Bloomberg Terminal, FactSet, S&P Capital IQ) has added LLM features to its existing workflows. A new wave of AI-native vendors (Hebbia, Harvey, Arkifi, Writer) sells horizontal LLM agents into the same buyers. The buying pattern that has settled out is that incumbents keep the workflow, and new entrants win on speed and quality for specific verticals.
At the infrastructure layer, the cloud hyperscalers dominate because the model weights and inference infrastructure live there. AWS Bedrock, Azure OpenAI Service, and Google Vertex AI account for most of the production volume, and the cloud segment is more than 58% of market revenue in 2025. On-premises LLMs sit mostly inside institutions with the most sensitive data or the tightest data-residency rules — tier-one banks in regulated markets, sovereign wealth funds, and some insurance carriers.
What the regulators are watching
Financial regulators have moved faster on LLMs than they did on earlier waves of machine learning, because the risks show up in areas they already police. The ESMA / Alan Turing Institute working paper on leveraging large language models in finance, which draws on a June 2024 workshop with 38 technology and finance experts, is the most detailed public framing of the supervisory stance in the EU. The paper’s central finding — that around 85% of surveyed financial organizations already use LLMs, but fewer than 20% are working with fully customized models — is the reason regulators see a short window to shape governance standards before deployment patterns become entrenched.
The specific risks regulators call out fall into a tight cluster: hallucinations producing misleading customer communications, model drift affecting compliance decisions over time, data leakage to third-party model providers, and the concentration risk that comes from every major bank running the same handful of foundation models. The US supervisory picture sits on the same axis, with the OCC and the Federal Reserve emphasizing that existing model-risk management rules apply to LLMs without exception.
This is one reason the category overlaps heavily with the broader work US banks are doing on machine learning in finance, where firms are already getting measurable value: the same governance frameworks that cover credit-scoring models have been extended to cover LLM-based applications.
What it means for fintechs and operators
For fintech founders, the opportunity is narrow but real. The foundation-model layer is closed — nobody new is going to raise enough capital to train a competitor to GPT-4-class models for generic use. The application layer is where new entrants win, specifically the ones that pick a narrow enterprise workflow and deliver measurably better accuracy on it than a horizontal chatbot can. Document review for asset-backed lending, credit memo generation for middle-market banks, and research summarization for long-short hedge funds are all workflows where focused startups have already built defensible businesses.
For lenders, the operational change is that the model risk function has to take on LLMs as a distinct category. Governance frameworks written for credit models do not map cleanly onto a generative system whose output space is open-ended. Most mid-sized US lenders have been building new controls — prompt libraries, output filtering, human-review sampling, hallucination-rate metrics — throughout 2024 and 2025.
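Two of the controls named above, human-review sampling and a hallucination-rate metric, are simple enough to sketch. This is a minimal illustration, not a description of any lender's actual control: the record fields, ID format, and 10% sampling rate are hypothetical, and it assumes each reviewed output gets a simple yes/no verdict from a human reviewer. Hashing the output ID makes the sampling decision reproducible, so an auditor can re-derive which outputs should have been reviewed.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class ReviewedOutput:
    output_id: str      # hypothetical identifier for one model response
    hallucinated: bool  # the human reviewer's verdict


def selected_for_review(output_id: str, sample_rate: float) -> bool:
    """Deterministically select roughly `sample_rate` of outputs by hashing the ID."""
    if not 0.0 <= sample_rate <= 1.0:
        raise ValueError("sample_rate must be between 0 and 1")
    digest = hashlib.sha256(output_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big")  # uniform over [0, 2**64)
    return bucket < int(sample_rate * 2**64)


def hallucination_rate(reviews: list[ReviewedOutput]) -> float:
    """Share of reviewed outputs that the reviewer flagged as hallucinated."""
    if not reviews:
        raise ValueError("no reviewed outputs to compute a rate from")
    return sum(r.hallucinated for r in reviews) / len(reviews)


# Illustrative run: sample about 10% of 10,000 hypothetical output IDs.
output_ids = [f"resp-{i:05d}" for i in range(10_000)]
review_queue = [oid for oid in output_ids if selected_for_review(oid, 0.10)]
print(f"{len(review_queue)} of {len(output_ids)} outputs queued for human review")

# Illustrative verdicts: one flagged output out of four reviewed.
verdicts = [
    ReviewedOutput("resp-00001", False),
    ReviewedOutput("resp-00002", True),
    ReviewedOutput("resp-00003", False),
    ReviewedOutput("resp-00004", False),
]
print(f"Hallucination rate in reviewed sample: {hallucination_rate(verdicts):.1%}")
```

The rate is only an estimate for the sampled population, so a small review sample gives a noisy number; a real control would likely track it over time and alongside the sample size.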
For operators inside incumbent financial institutions, the question has become whether to buy, build, or partner. Most have landed on a layered strategy: foundation model from a cloud provider, retrieval infrastructure built in-house, specialist agents bought from vendors for high-value workflows. The institutions that have treated LLMs as infrastructure rather than as a single project have been able to ship multiple applications off the same governance scaffolding. The ones that started with isolated pilots are now spending 2026 consolidating.
The bottom line
Large language models are no longer a research topic inside finance. They are a $1.95 billion market forecast to grow at 24.81% per year through 2035, a regulatory category with formal supervisory attention, and a standard tool at most tier-one US and EU banks. BloombergGPT’s release in March 2023 was the signal. The 85% adoption figure ESMA reported two years later was the confirmation. The firms that have treated the technology as a layer to be governed, rather than a product to be demoed, are the ones still shipping. The rest are catching up to controls they should have written before they deployed.