In 2019, the Federal Reserve Bank of Philadelphia published a study comparing machine learning credit models against traditional logistic regression models used by banks. The study found that machine learning models reduced default rates by 10 to 25 percent at the same approval rate, or alternatively, approved 10 to 15 percent more borrowers at the same default rate. The improvement was consistent across credit categories and borrower demographics. The finding confirmed what fintech lenders had been claiming for years: machine learning makes measurably better credit decisions than the scoring methods the industry has relied on since the 1980s.
Credit risk assessment is the function where machine learning has had its most thoroughly documented impact on financial services. According to MarketsandMarkets, the global AI in finance market reached $38.36 billion in 2024 and is projected to grow to $190.33 billion by 2030. Credit decisioning accounts for a large share of that investment because better credit models directly improve two metrics that drive profitability: loss rates (lower defaults) and volume (more approved borrowers).
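The comparison behind those two metrics, holding the approval rate fixed and measuring defaults among the approved, can be sketched in a few lines. This is a minimal illustration on made-up data: the ten applicants, both score lists, and the function name `default_rate_at_approval_rate` are assumptions for the example and are not taken from the study.

```python
def default_rate_at_approval_rate(scores, defaulted, approval_rate):
    """Approve the top-scoring share of applicants and return their default rate.

    scores: higher means the model considers the applicant safer.
    defaulted: 1 if the applicant later defaulted, else 0 (same order as scores).
    """
    n_approve = round(len(scores) * approval_rate)
    if n_approve == 0:
        raise ValueError("approval_rate is too low to approve anyone")
    # Rank applicants from safest to riskiest according to the model.
    ranked = sorted(zip(scores, defaulted), key=lambda pair: pair[0], reverse=True)
    approved = ranked[:n_approve]
    return sum(outcome for _, outcome in approved) / n_approve


# Made-up outcomes for ten applicants (1 = defaulted).
defaulted = [0, 0, 0, 1, 0, 0, 1, 0, 1, 1]
# Hypothetical scores from a baseline model and from a better-ranking model.
baseline_scores = [90, 85, 80, 75, 70, 65, 60, 55, 50, 45]
improved_scores = [90, 85, 80, 40, 70, 65, 35, 55, 60, 25]

baseline_at_50 = default_rate_at_approval_rate(baseline_scores, defaulted, 0.5)
improved_at_50 = default_rate_at_approval_rate(improved_scores, defaulted, 0.5)
improved_at_70 = default_rate_at_approval_rate(improved_scores, defaulted, 0.7)

print(f"baseline, 50% approved: {baseline_at_50:.1%} default")
print(f"improved, 50% approved: {improved_at_50:.1%} default")
print(f"improved, 70% approved: {improved_at_70:.1%} default")
```

On this toy data the better-ranking model has fewer defaults at the same approval rate, and it stays below the baseline's default rate even when it approves more applicants.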
How Traditional Credit Risk Assessment Works
The dominant credit scoring model in the United States is the FICO score, introduced in 1989 by Fair, Isaac and Company. The FICO model uses a logistic regression algorithm with approximately 20 input variables derived from credit bureau data: payment history, amounts owed, length of credit history, types of credit in use, and recent credit inquiries.
According to Mordor Intelligence, the AI in fintech market is projected to grow at a compound annual growth rate exceeding 20 percent through 2029, driven by demand for automated fraud detection, credit scoring, and customer service applications.
A 2024 McKinsey analysis indicates that organisations deploying AI at scale report efficiency improvements of 15 to 25 percent within the first 18 months of production implementation.
The FICO model produces a score between 300 and 850. Lenders set a threshold (typically around 660 for prime lending) and approve applicants above it. The approach is standardised, well-understood, and regulated. It has served the industry effectively for over three decades.
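The score-then-threshold flow can be sketched as follows. FICO's actual variables, coefficients, and scaling are proprietary, so everything numeric here is hypothetical: the three inputs, their weights, the intercept, and the scaling parameters are assumptions for illustration. The scaling uses the "points to double the odds" convention common in scorecard building, which may or may not match what any particular vendor does.

```python
import math

# Hypothetical coefficients for three illustrative inputs
# (not FICO's actual variables or weights).
WEIGHTS = {"late_payments_24m": 0.9, "utilisation": 2.0, "years_of_history": -0.08}
INTERCEPT = -3.5


def default_probability(applicant):
    """Logistic regression: a weighted sum passed through the logistic function."""
    z = INTERCEPT + sum(WEIGHTS[name] * applicant[name] for name in WEIGHTS)
    return 1 / (1 + math.exp(-z))


def to_score(p_default, base_score=660, base_odds=20, pdo=40, low=300, high=850):
    """Map a default probability (strictly between 0 and 1) to a bounded score.

    base_score is assigned at good:bad odds of base_odds, and every pdo
    points doubles those odds. All three parameters are assumptions.
    """
    odds_good = (1 - p_default) / p_default
    factor = pdo / math.log(2)
    raw = base_score + factor * math.log(odds_good / base_odds)
    return int(min(high, max(low, round(raw))))


def approve(score, cutoff=660):
    """Approve applicants whose score is at or above the lender's cutoff."""
    return score >= cutoff


strong = {"late_payments_24m": 0, "utilisation": 0.1, "years_of_history": 15}
weak = {"late_payments_24m": 2, "utilisation": 0.9, "years_of_history": 2}

for label, applicant in (("strong", strong), ("weak", weak)):
    score = to_score(default_probability(applicant))
    print(label, score, "approved" if approve(score) else "declined")
```

Every applicant who lands on the same score receives the same decision, whatever combination of inputs produced it.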
But FICO has structural limitations that machine learning addresses. The model uses only credit bureau data, which means it cannot evaluate borrowers who have limited credit history (thin-file borrowers). An estimated 45 million Americans are credit invisible or have thin files, according to the Consumer Financial Protection Bureau. These individuals may be perfectly capable of repaying a loan, but the traditional system cannot assess them because they lack the historical data the model requires.
The model also treats all borrowers within a score band identically. Two borrowers with a FICO score of 720 receive the same risk assessment even if one has a stable government job with 20 years of tenure and the other recently started freelancing in a volatile industry. The variables that distinguish these borrowers (employment stability, income trajectory, spending behaviour) are not in the credit bureau data and therefore not in the FICO model.
What Machine Learning Changes
Machine learning credit models differ from traditional models in three fundamental ways: they use more data, they find non-linear relationships, and they improve over time.
More data. Machine learning models can incorporate hundreds or thousands of variables. Upstart’s model uses over 1,500 variables including education (degree type, institution, field of study), employment (job title, employer, tenure, industry growth rate), bank transaction data (income patterns, savings behaviour, spending consistency), and geographic factors (cost of living, local economic conditions). Each additional variable with genuine predictive power gives the model more signal to work with. Traditional scorecards are typically kept to roughly 20 variables, in part because logistic regression coefficients become unstable and hard to interpret when many correlated inputs are included. Machine learning algorithms, particularly gradient-boosted trees and neural networks, handle high-dimensional data more gracefully.
Non-linear relationships. Traditional credit models assume that each variable has a linear, additive effect on the log-odds of default: more debt is always proportionally worse, longer credit history is always proportionally better. Machine learning models discover non-linear patterns and interactions. A borrower with moderate debt who makes consistent minimum payments may be lower risk than a borrower with very low debt who occasionally misses payments. The way debt level and payment consistency combine is an interaction effect that a traditional additive model cannot capture unless an analyst engineers the interaction term by hand, but that a gradient-boosted decision tree model finds naturally.
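One way to see what an interaction between debt level and payment consistency means is to look at log-odds on a two-by-two table. A logistic regression with no interaction term adds one fixed amount of log-odds for debt and another for payment behaviour, which forces the contrast computed below to be exactly zero. The default rates here are hypothetical numbers chosen for illustration, and `interaction_contrast` is a name invented for this sketch.

```python
import math


def logit(p):
    """Log-odds of a probability strictly between 0 and 1."""
    return math.log(p / (1 - p))


# Hypothetical default rates by (debt level, payment behaviour).
default_rate = {
    ("low", "consistent"): 0.02,
    ("low", "misses"): 0.12,
    ("moderate", "consistent"): 0.03,
    ("moderate", "misses"): 0.30,
}


def interaction_contrast(rates):
    """Difference-in-differences of log-odds across the 2x2 table.

    Zero means the two factors combine additively on the log-odds scale;
    anything else is an interaction an additive model cannot represent.
    """
    effect_of_misses_at_moderate = logit(rates[("moderate", "misses")]) - logit(
        rates[("moderate", "consistent")]
    )
    effect_of_misses_at_low = logit(rates[("low", "misses")]) - logit(
        rates[("low", "consistent")]
    )
    return effect_of_misses_at_moderate - effect_of_misses_at_low


contrast = interaction_contrast(default_rate)
print(f"interaction contrast in log-odds: {contrast:.2f}")
```

In this toy table the moderate-debt, consistent payer is lower risk than the low-debt borrower who misses payments, and the contrast is clearly non-zero: missed payments matter more when debt is higher. A tree-based model can represent that pattern by splitting on one variable and then the other.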
Continuous improvement. A FICO model is recalibrated only periodically, with years between major versions. A machine learning model can be retrained monthly or even weekly on the latest loan performance data. This means the model adapts to changing economic conditions faster. During the COVID-19 pandemic, credit models trained on pre-2020 data performed poorly because borrower behaviour changed dramatically. Machine learning models that were retrained on data collected after the pandemic began recovered their predictive accuracy within months. Traditional models took longer to recalibrate.
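The text does not say how lenders decide when a retrain is due. One common monitoring heuristic in credit risk, offered here as an assumption rather than as any named lender's practice, is the population stability index (PSI), which compares the distribution of scores at training time with the distribution seen recently. The bucket shares and the 0.1 trigger below are illustrative; thresholds of this kind are rules of thumb, not standards.

```python
import math


def population_stability_index(expected_shares, actual_shares):
    """PSI between two distributions given as bucket shares that each sum to 1.

    Every share must be greater than zero, otherwise the logarithm is undefined.
    """
    return sum(
        (actual - expected) * math.log(actual / expected)
        for expected, actual in zip(expected_shares, actual_shares)
    )


def needs_retraining(psi, threshold=0.1):
    """Flag a retrain when drift exceeds a rule-of-thumb threshold (an assumption)."""
    return psi > threshold


# Share of applicants in each of four score buckets (hypothetical).
training_shares = [0.25, 0.25, 0.25, 0.25]
recent_shares = [0.10, 0.20, 0.30, 0.40]

psi = population_stability_index(training_shares, recent_shares)
print(f"PSI = {psi:.3f}, retrain: {needs_retraining(psi)}")
```

A check like this can run on every new batch of applications, so a shift in borrower behaviour shows up well before loan performance data matures.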
Real-World Impact: The Numbers
The performance advantage of machine learning credit models is documented across multiple companies and studies.
Upstart, which went public in 2020, reports that its machine learning model approves 27% more borrowers at the same loss rate compared to traditional models. Among approved borrowers, the average APR is 16% lower because the model’s greater accuracy allows it to identify low-risk borrowers within pools that traditional models would classify as higher risk. This matters for consumers: a borrower who would have paid 15% APR under a traditional model might pay 12.6% through Upstart’s model, a relative reduction of 16%, or 2.4 percentage points.
Zest AI provides machine learning credit models to banks and credit unions. The company reports that its models reduce default rates by up to 30% while increasing approval rates. One of its clients, a top-10 US auto lender, increased approvals by 14% while reducing losses by 25% after deploying Zest AI’s models.
Pagaya, an AI-driven lending network, partners with banks and fintech lenders to identify creditworthy borrowers that the lender’s own models would reject. Pagaya’s AI evaluates applications that fall below the lender’s approval threshold, identifies those that are actually likely to repay, and funds those loans through its own balance sheet. The model processes millions of applications and has funded over $25 billion in loans since its founding.
Square Loans uses a fundamentally different data source for credit assessment: the transaction data flowing through each merchant’s Square point-of-sale terminal. The system analyses daily revenue, transaction frequency, average ticket size, and seasonal patterns to determine lending offers. Square has originated over $17 billion in small business loans. The default rate remains low because the model sees the merchant’s business health in real time, not through quarterly financial statements filed months after the fact.
The Fairness Question
Machine learning credit models introduce fairness concerns that regulators are actively addressing. The core issue is that models trained on historical data can perpetuate historical bias. If past lending decisions unfairly disadvantaged certain demographic groups, a model trained on that data may learn to replicate those patterns.
The Equal Credit Opportunity Act in the United States prohibits discrimination in lending based on race, colour, religion, national origin, sex, marital status, age, or receipt of public assistance. The challenge is that machine learning models can use proxy variables that correlate with protected characteristics without explicitly using those characteristics. A model that considers zip code as a variable is, in some cases, using a proxy for race because residential segregation means zip codes correlate with racial demographics.
Zest AI has built its business partly around addressing this challenge. The company’s platform includes tools that test credit models for disparate impact across protected classes and adjust them to reduce bias while maintaining predictive accuracy. The approach involves identifying which model features contribute most to disparate outcomes and either removing or constraining those features. According to the company, the result is a model that is both more accurate than traditional scoring and less likely to produce discriminatory outcomes.
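The testing step can be illustrated with a simple adverse impact ratio: the approval rate of one group divided by the approval rate of a reference group. This is a generic sketch, not Zest AI's method. The group labels, the counts, and the 0.8 screening level (the "four-fifths" heuristic, which originates in US employment guidance and is sometimes borrowed for fair-lending screening) are all assumptions for illustration.

```python
from collections import defaultdict


def approval_rates(decisions):
    """Approval rate per group from an iterable of (group, approved) pairs."""
    totals = defaultdict(int)
    approvals = defaultdict(int)
    for group, approved in decisions:
        totals[group] += 1
        approvals[group] += int(approved)
    return {group: approvals[group] / totals[group] for group in totals}


def adverse_impact_ratio(decisions, protected_group, reference_group):
    """Approval rate of the protected group relative to the reference group."""
    rates = approval_rates(decisions)
    return rates[protected_group] / rates[reference_group]


# Hypothetical decisions: 60 of 100 approved in group_a, 42 of 100 in group_b.
decisions = (
    [("group_a", True)] * 60
    + [("group_a", False)] * 40
    + [("group_b", True)] * 42
    + [("group_b", False)] * 58
)

ratio = adverse_impact_ratio(decisions, "group_b", "group_a")
flagged = ratio < 0.8  # screening heuristic, not a legal test
print(f"adverse impact ratio: {ratio:.2f}, flagged: {flagged}")
```

A ratio like this only signals that outcomes differ. Finding which features drive the gap, and constraining them without losing accuracy, is the harder step the paragraph above describes.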
Under the Equal Credit Opportunity Act and its implementing Regulation B, which the Consumer Financial Protection Bureau enforces, lenders must provide adverse action notices stating the principal reasons a credit application was denied. This requirement creates a tension with complex machine learning models whose decisions are difficult to explain. A model that considers 1,500 variables and uses non-linear relationships between them cannot easily produce a simple explanation like “insufficient credit history.” Fintech companies are investing in explainability techniques (SHAP values, partial dependence plots, surrogate models) that translate complex model decisions into human-readable explanations that satisfy regulatory requirements.
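For a simple additive model, the idea behind these techniques can be shown without any library: each feature's contribution is its weight times how far the applicant sits from the population average, and the largest risk-increasing contributions become the stated reasons. SHAP values generalise this kind of per-feature attribution to non-linear models. The weights, averages, feature names, and reason wording below are hypothetical.

```python
# Hypothetical additive risk model: positive weights increase default risk.
WEIGHTS = {"utilisation": 2.0, "late_payments_24m": 0.9, "years_of_history": -0.08}
# Hypothetical population averages used as the comparison baseline.
POPULATION_MEAN = {"utilisation": 0.3, "late_payments_24m": 0.2, "years_of_history": 10}
# Hypothetical wording for the notice.
REASON_TEXT = {
    "utilisation": "Proportion of credit limits in use is too high",
    "late_payments_24m": "Recent late payments",
    "years_of_history": "Insufficient credit history",
}


def contributions(applicant):
    """Per-feature contribution to the risk score relative to the average applicant."""
    return {
        name: WEIGHTS[name] * (applicant[name] - POPULATION_MEAN[name])
        for name in WEIGHTS
    }


def top_reasons(applicant, n=2):
    """Return wording for the n features that raised this applicant's risk the most."""
    contrib = contributions(applicant)
    ranked = sorted(contrib, key=contrib.get, reverse=True)
    # Only features that actually increased risk qualify as denial reasons.
    return [REASON_TEXT[name] for name in ranked[:n] if contrib[name] > 0]


declined_applicant = {"utilisation": 0.9, "late_payments_24m": 2, "years_of_history": 2}
for reason in top_reasons(declined_applicant):
    print(reason)
```

With 1,500 interacting variables the attribution step is far more involved, but the output a lender needs is the same: a short, ranked list of reasons in plain language.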
The Road Ahead
Grand View Research estimates that generative AI in financial services will grow from $2.21 billion in 2024 to $25.71 billion by 2033. A significant portion of that growth will flow into credit risk applications as generative AI enables new capabilities: processing unstructured data like bank statement PDFs, generating explanations for credit decisions in natural language, and creating synthetic training data to address class imbalance problems in default prediction.
The most significant shift will be in emerging markets. Billions of people worldwide lack formal credit histories. Traditional credit scoring cannot serve them. Machine learning models that use alternative data, including mobile phone activity, utility payment records, social commerce transaction data, and digital wallet behaviour, are extending credit access to populations that traditional finance has never reached.
Tala has disbursed over $4 billion in loans across Africa, South Asia, and Latin America using mobile data-based credit scoring. M-Shwari, a mobile lending product in Kenya operated by Safaricom and NCBA Bank, uses M-Pesa transaction data to assess creditworthiness and has served over 30 million borrowers. These products exist only because machine learning can evaluate creditworthiness from data sources that traditional models were never designed to process.
Credit risk assessment is not moving toward machine learning. It has already moved. The remaining question is how quickly regulators, traditional lenders, and credit bureaus adapt their frameworks to accommodate models that are demonstrably more accurate, more inclusive, and more adaptive than the scoring systems they are replacing.