How AI Can Be Used to Improve Loan Underwriting

On September 29th, 2008, the US stock market crashed, as the Dow Jones Industrial Average fell 777.68 points in intraday trading. That month, 159,000 jobs were lost. In spite of the Federal Reserve’s attempt to prop up banks with $540 billion in money market funds, the Dow plummeted another 15% in October, and another 240,000 jobs were lost.

There was no pandemic, no shutdown, no quarantines, and yet, America suffered a terrible recession. The cause ultimately boils down to poor loan underwriting practices. Financial experts agree:

“The stock market crash of 2008 was as a result of defaults on consolidated mortgage-backed securities. Subprime housing loans comprised most MBS. Banks offered these loans to almost everyone, even those who weren’t creditworthy.”

AI Loan Underwriting

Loan underwriting is tricky. Everyone has unique financial circumstances, and there’s no way you can be 100% certain that someone will pay back a loan.

When loan underwriting practices are bad enough, the result can be a global economic meltdown. Some groups, particularly African Americans, had not fully recovered more than a decade later, only to be pummeled by yet another recession.

Artificial intelligence can come to the rescue by interpreting millions of rows of data in an instant, a volume that no human underwriter could ever review manually.

COVID-19 has created a number of shifts that can impact loan underwriting, including decreased earnings, less disposable income, and an overall sense of lower financial security. All this additional data can be analyzed with AI.

Unfortunately, building AI models from scratch is a headache, requiring a lot of work in the weeds of programming languages like Python or R. Fortunately, automated machine learning solutions are finally taking off, with tools like Obviously.AI enabling companies to build models without code.

Predicting loan payment

In practical terms, AI for loan underwriting is about predicting whether a loan will default or be paid. We can use machine learning classification models, like logistic regression, that learn from historical financial data to predict an output of “default” or “paid” for a new loan. To do so, securely upload your data to Obviously.AI, and select the data column that describes whether a given loan was paid or defaulted on.
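For readers curious about what a classification model like logistic regression does under the hood, here is a minimal sketch in plain Python. It is not how Obviously.AI works internally; the toy data, the two features (a debt-to-income ratio and a low-income flag), and every function name are assumptions made up for illustration.

```python
import math

def sigmoid(z):
    # Numerically stable logistic function: maps any score to (0, 1).
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def fit_logistic(rows, labels, lr=0.5, epochs=2000):
    # Plain batch gradient descent on the logistic (log-loss) objective.
    n, n_features = len(rows), len(rows[0])
    weights, bias = [0.0] * n_features, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n_features, 0.0
        for x, y in zip(rows, labels):
            p = sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))
            err = p - y  # gradient of the log-loss w.r.t. the score
            for j, xi in enumerate(x):
                grad_w[j] += err * xi
            grad_b += err
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias

def default_probability(weights, bias, x):
    # Estimated probability that a loan with features x defaults.
    return sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))

# Made-up historical loans: [debt_to_income, low_income_flag] -> 1 = default, 0 = paid
history = [
    ([0.10, 0], 0), ([0.15, 0], 0), ([0.20, 1], 0), ([0.25, 0], 0),
    ([0.30, 0], 0), ([0.45, 1], 1), ([0.50, 0], 1), ([0.55, 1], 1),
    ([0.60, 1], 1), ([0.70, 0], 1),
]
rows = [features for features, _ in history]
labels = [label for _, label in history]

weights, bias = fit_logistic(rows, labels)

p_risky = default_probability(weights, bias, [0.65, 1])
p_safe = default_probability(weights, bias, [0.12, 0])
print("default" if p_risky >= 0.5 else "paid")
print("default" if p_safe >= 0.5 else "paid")
```

The model outputs a probability, and the “default” or “paid” label comes from comparing that probability with a threshold (0.5 here, which is itself a choice). In real projects, a tested library implementation such as scikit-learn's LogisticRegression would normally replace the hand-written training loop.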

The historical data should include attributes that are potentially predictive of default probability. We can find relevant research papers by searching for "loan default" on Google Scholar, which informs us of some meaningful attributes. A meta-analysis of 41 studies on student loan default rates shows that these are some of the most important attributes:

  • Whether a student attends a less-than-two-year, proprietary, community college, or less-than-four-year institution.
  • Whether a student comes from a low-income family.
  • Level of institutional investment.
  • Level of instructional support.
  • Number of dependents claimed.
  • Parental education.
  • Debt and income.
  • Academic enrollment and intensity.
  • ... and many more.

We can find innumerable studies for any kind of loan, informing what data we should use to build a predictive model. Another paper uses attributes like loan-to-value, term of mortgage, borrower occupation, and GDP growth to predict residential mortgage default rates.
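Attributes like these have to be turned into numbers before a model can use them. The sketch below shows one common way to do that, with one-hot encoding for the categorical attribute. The field names, the occupation categories, and the scaling constants are all hypothetical and are not taken from the paper mentioned above.

```python
# Hypothetical occupation categories, chosen only for illustration.
OCCUPATIONS = ["salaried", "self_employed", "retired"]

def one_hot(value, categories):
    # Return a 0/1 list marking which category `value` matches.
    return [1.0 if value == c else 0.0 for c in categories]

def encode_mortgage(record):
    # Turn one mortgage record (a dict) into a flat list of numbers.
    return [
        record["loan_to_value"],           # already a ratio, e.g. 0.8
        record["term_years"] / 30.0,       # scaled against an assumed 30-year maximum
        record["gdp_growth_pct"] / 100.0,  # percent converted to a fraction
    ] + one_hot(record["occupation"], OCCUPATIONS)

example = {
    "loan_to_value": 0.8,
    "term_years": 30,
    "occupation": "self_employed",
    "gdp_growth_pct": 2.0,
}
features = encode_mortgage(example)
print(features)
```

Each record becomes a list of six numbers: three numeric attributes followed by three occupation flags. No-code tools typically handle this kind of encoding automatically.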

Better default prediction has important ramifications for the loan underwriting industry, which processes an enormous volume of applications.

Who knows: armed with state-of-the-art AI, loan underwriters may help prevent the next Great Recession.
