Power BI: Exploring PPP Loan Data From The Small Business Administration

CARES Act Signed Into Law March 27, 2020

As a part of the Coronavirus Aid, Relief, and Economic Security Act (CARES Act), the $669 billion loan program known as the Paycheck Protection Program (PPP) was established to help businesses keep jobs.

The Small Business Administration (SBA) released loan data about the Paycheck Protection Program on 7/7/2020.

The data published includes csv files with loan level details such as:

  • Business names (for loans less than $150,000)
  • Loan amounts (for loans less than $150,000)
  • Loan ranges (for loans greater than $150,000)
  • Number of jobs retained per loan
  • NAICS numbers to analyze loans by sector/industry

Visualizing $521 Billion of Loans

I wanted to analyze and compare figures across states and industries to gain insights (as opposed to only reading the media).

But first I needed to work through some data collection problems.

Some of the issues with the data:

  • The number of loans without a NAICS number
  • The number of loans without a business name (withheld for privacy for loan <$150k)
  • 75% of all PPP loans did not include any demographic information because they were not provided by the borrowers (race, gender, veteran)
  • Incorrect states (ex: “FI” instead of “FL” for Florida, states listing “XX” or “AE”)

Another challenge was creating a column that contained loan amounts instead of loan ranges.

I used Power Query to create columns for high, mid, and low amounts (e.g., loan range a $2-5M became $5M, $3.5M, and $2M).

From there, I generated random numbers based on the loan range. This would allow me to create unique loan amounts instead of having thousands of loans with the same midpoint amount.

Below is the first version – I’ll continue to update the dashboard over the next week.

Here is a link to open it in a new tab: https://app.powerbi.com/view?r=eyJrIjoiZjc4MjYzZGQtZTNhNS00ZDk3LTgwZTAtMDlmOTZjOTAyMzU3IiwidCI6IjcyN2VmOWY5LTc1YWQtNGFiMC1iNzg0LTdlNThiZDM5YjA5MSJ9