I wanted to take a look at COVID-19 data because of the increased coverage in the media over the past month.
Many states began reopening (restaurants, bars, retail) around the middle of May. Below is trend data for “COVID” that shows an increasing trend in searches starting around the first and second weeks of June.
Gathering COVID-19 Data
First, I wanted to create a dashboard that included more data about testing since there has been a lot of coverage in the media about a surge in cases.
The two main sources I’ve seen for COVID-19 data are 1) Johns Hopkins and 2) the New York Times.
Most people are probably familiar with the Johns Hopkins dashboard since it was featured very early on throughout the media.
And most people are probably familiar with the New York Times COVID-19 page that shows a nice set of charts for:
- States where new cases are increasing
- States where new cases are mostly the same
- States where new cases are decreasing
However, the COVID-19 data from both Johns Hopkins and the New York Times do not include details such as testing and hospitalization data. The most detail I’ve seen is the number of cases and deaths.
This lack of detail led me to find The COVID Tracking Project, which was launched by The Atlantic.
Testing & Hospitalization Data from The COVID Tracking Project
The team at The COVID Tracking Project have really done a great job in gathering more detailed data. They describe challenges faced early on with collecting accurate data:
At the same time, federal public health authorities have elected not to publish complete testing data. From March through mid-May, the CDC published a case count for identified cases of lab-confirmed COVID-19 confirmed by testing. However, it significantly lagged behind other sources of this data, like the gold-standard Johns Hopkins University tracker.
The week of May 9, 2020, the CDC began publishing case counts, deaths, and basic testing data in a new dashboard. Our team compared the CDC’s data with what we were compiling from state sources. We found that although case and death counts were very similar in the two datasets, there were substantial mismatches in the testing data. The same week, an investigation by two of our founders at The Atlantic revealed that not only were several states mixing viral (diagnostic) and antibody (past infection indicator) test numbers in their public data reporting, the CDC was also mixing these test numbers, while labeling the data “viral tests.”
From The COVID Tracking Project “Why It Matters” Page: https://covidtracking.com/about/why-it-matters
Gathering hospitalization data posed similar challenges:
Hospitalization data for US COVID-19 cases remains another area in which no other public source appears to be compiling information from the states and territories. For a total of 99 counties in 14 states, the CDC provides detailed hospitalization (and demographic) data through COVID-NET. However, the CDC does not publicly report state- or county-level hospitalization data for the rest of the United States.
From The COVID Tracking Project “Why It Matters” Page: https://covidtracking.com/about/why-it-matters
The COVID Tracking Project has the most complete data that I’ve seen. It provides more opportunities to analyze testing and hospitalization data on a daily basis.
COVID-19 Dashboard
I used The COVID Tracking Project’s API to pull daily historical US values.
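Here is a minimal sketch of what pulling those daily values could look like in Python. It is not the exact code behind the dashboard. The endpoint URL and the field names (`date`, `totalTestResultsIncrease`, `hospitalizedCurrently`, `deathIncrease`) are assumptions based on the project's public v1 JSON API and should be checked against its documentation. The helper names `parse_daily` and `fetch_us_daily` are my own.

```python
import json
from datetime import datetime
from urllib.request import urlopen

# Assumed endpoint for daily US totals; verify against the API docs.
US_DAILY_URL = "https://api.covidtracking.com/v1/us/daily.json"


def parse_daily(records):
    """Turn raw API records into rows sorted oldest-first.

    Field names are assumptions; missing fields become None.
    """
    rows = []
    for rec in records:
        rows.append({
            # Dates are assumed to arrive as integers such as 20200601.
            "date": datetime.strptime(str(rec["date"]), "%Y%m%d").date(),
            "tests": rec.get("totalTestResultsIncrease"),
            "hospitalized": rec.get("hospitalizedCurrently"),
            "deaths": rec.get("deathIncrease"),
        })
    return sorted(rows, key=lambda row: row["date"])


def fetch_us_daily(url=US_DAILY_URL):
    """Download the daily US series and parse it (needs network access)."""
    with urlopen(url) as resp:
        return parse_daily(json.load(resp))
```

Keeping the parsing separate from the download makes it easy to test the parsing on a couple of hand-written records without hitting the network.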
For grouping data by voting patterns, I collected data from:
- Population by state: https://simple.wikipedia.org/wiki/List_of_U.S._states_by_population
- List of swing states from the 2020 presidential election: https://www.marieclaire.com/politics/a31142244/swing-states-2020-election/
- Number of electoral votes by state: https://en.wikipedia.org/wiki/United_States_Electoral_College
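I haven't spelled out exactly how these sources get combined, so treat the following as one plausible sketch rather than the dashboard's actual logic: tag each state with a group (for example "swing" or "other") and compare groups on a per-100,000-people basis so that large and small states are comparable. The function name `per_100k_by_group` and the shape of its inputs are hypothetical.

```python
from collections import defaultdict


def per_100k_by_group(counts, population, group_of):
    """Sum a state-level metric by group and express it per 100,000 people.

    counts:     {state: metric value}, e.g. new cases
    population: {state: population}
    group_of:   {state: group label}; unlisted states fall into "other"
    """
    totals = defaultdict(lambda: [0, 0])  # group -> [metric sum, population sum]
    for state, value in counts.items():
        group = group_of.get(state, "other")
        totals[group][0] += value
        totals[group][1] += population[state]
    return {g: metric * 100_000 / pop for g, (metric, pop) in totals.items()}
```

Summing the metric and the population within each group before dividing gives a population-weighted rate, which is usually what you want when the states in a group differ a lot in size.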
I created some simple 5- and 7-day rolling averages for testing, hospitalizations, and deaths.
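For anyone who wants to reproduce the smoothing, a trailing rolling average can be sketched in a few lines of plain Python. This assumes the daily values are already sorted oldest-first and that days without a full window are left blank. The dashboard tool may handle those edge cases differently, and `rolling_mean` is just an illustrative name.

```python
def rolling_mean(values, window):
    """Trailing rolling average; None until a full window is available."""
    out = []
    for i in range(len(values)):
        if i + 1 < window:
            out.append(None)  # not enough history yet
        else:
            chunk = values[i + 1 - window : i + 1]
            out.append(sum(chunk) / window)
    return out


daily_deaths = [1, 2, 3, 4, 5, 6, 7]  # made-up numbers for illustration
five_day = rolling_mean(daily_deaths, 5)   # [None, None, None, None, 3.0, 4.0, 5.0]
seven_day = rolling_mean(daily_deaths, 7)  # [None] * 6 + [4.0]
```

The 7-day window is handy because it spans one full week and so evens out the weekday/weekend reporting dips. The 5-day version reacts a little faster but keeps more of that weekly wobble.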
Check out the dashboard below. I’ll be updating it over time as this is just a first draft.