UPDATED: It's not just Johns Hopkins University that's shutting down their COVID data project next month...
One of the big public health stories last week was that Johns Hopkins University, which has been operating one of the best COVID-19 tracking projects in the country since the pandemic hit U.S. shores, has announced that they're shutting it down next month:
When the pandemic hit, the federal government struggled to publish snapshots of the virus’ spread.
So, academics and journalists quickly filled the void, creating new tools with near real-time estimates of the unfolding pandemic.
(as an aside, that sounds awfully familiar to me...)
Since January 2020, Johns Hopkins University has operated one of the most prominent resources for tracking covid-19 case counts and deaths across the world.
After more than three years, the university will stop updating its tracker on March 10 as the country has moved into a different stage of the pandemic with a different data flow. But the story of the online dashboard — a $13 million project that’s been viewed over 2.5 billion times — is more than just about a tool to track the pandemic. It underscores the country’s fragmented public health systems and its decentralized and underfunded reporting system, which hobbled the U.S. pandemic response.
“The thing that became the biggest surprise was the importance and reliance on it by everyone: by the general public and decision-makers and everyone in between,” said Lauren Gardner, the director of the Johns Hopkins Center for Systems Science and Engineering who started the global tracker with one of her PhD students.
“Hopkins filled a gap that nobody else was able to do,” said Ali Mokdad, a professor of health metrics sciences at University of Washington’s IHME. “So all of us, reporters, us in academia, we went to Hopkins to get the data.”
The article cites two reasons why JHU is shutting down the project: First, supposedly the CDC has stepped up their COVID data collection/reporting operations, although if that's the case it's certainly news to me.
The second is far more depressing:
The quality of the tracker depends on the quality of publicly available data. At the beginning of the pandemic, states, counties and even cities were providing daily covid updates, which scrapers for the tracker could collect data from. But that’s not the case anymore, and that has a direct impact on what the dashboard can do.
...Beth Blauer, an associate vice provost at Johns Hopkins University who helped run the Coronavirus Resource Center, said that many states closed up their covid reporting with little information on any next steps.
This is absolutely the case, and it makes my job (along with that of so many other healthcare data analysts) far more difficult. For instance, here's the latest COVID data reporting frequency of every U.S. state & territory according to the JHU COVID Tracker GitHub repository. It's important to note that while this doesn't look too bad, there was a point when nearly every cell was filled (Mon - Fri, anyway). Plus, not every state is reporting county-level data anymore (ahem...Florida & Nebraska), and some may now be reporting only cases & hospitalizations, not deaths, at the county level:
So, what will I do once JHU shuts down their tracking project? Well, there are still other data sources...but some of those also rely on the JHU project and/or may be shutting down themselves in the near future. For instance, there's the Community Profile Report published by the White House COVID-19 Team's Data Strategy and Execution Workgroup in the Joint Coordination Cell (that's a mouthful!). This report is published weekly on Fridays, and has been my go-to source for county-level death data from Florida in particular (for some reason, JHU's GitHub always seems to be a week or two behind the Community Profile Report for Florida).
After over two years of public reporting, the Community Profile Report will no longer be produced and distributed after February 2023. The final release will be on February 23, 2023. We want to thank everyone who contributed to the design, production, and review of this report and we hope that it provided insight into the data trends throughout the COVID-19 pandemic. Data about COVID-19 will continue to be updated at CDC’s COVID Data Tracker.
UPDATE: Sure enough, the last update to this was indeed on February 23rd.
(sigh) OK, how about the NY Times COVID data GitHub, which I use for Utah's county-level data (JHU has tracked Utah data accurately but they use "regions" of the state instead of counties for some reason, while the NY Times uses the actual counties)?
In the coming weeks, most likely in March, the data in this repo for daily cases and deaths will no longer be updated. The Times plans to change its Covid tracking pages to use data from the federal government. This GitHub repo will serve as an archive of the daily case and death reporting that The Times has done since early 2020.
As case and death reporting at the local level has become less frequent and comprehensive, the daily data we have been able to gather has become less useful for indicating real-time trends about the virus. The Times will continue to publish the latest data from federal sources on its website.
Here are links to the data published by the federal government if you would like to switch sources.
- The C.D.C.’s primary Covid data dashboard contains links to many different sources of data.
- For data on cases and deaths, the C.D.C. publishes a time series of weekly cases and deaths by state as reported by state health departments.
- For county-level data, the White House publishes weekly Community Profile Reports with new cases and deaths and many other metrics.
- The C.D.C. also publishes a version of their case surveillance database in line-list format, with cases at the county-level by month. This data is updated roughly once a month.
- For deaths data, the C.D.C.’s National Center for Health Statistics maintains a dashboard with a variety of different data files on Covid mortality based on reconciled death certificates. This deaths data is collected with a different methodology than the other cases and deaths data, which is drawn from the case surveillance database, and would be our recommendation for any analyses of Covid mortality.
Of course, you can scratch the third bullet off the list above since the Community Profile Report is also being discontinued next week.
UPDATE: Sure enough, the NY Times GitHub project was discontinued as of March 24th.
There are other potential sources as well, such as:
- COVID Act Now
- CNN (relies on JHU data)
- Washington Post
- WorldOMeters (state/country level)
- World Health Organization (country level)
...and so on...but again, most of these don't drill down to the county level, and some of those which do rely on JHU data themselves and/or don't make it easy to export the data. Right now COVID Act Now seems to be my best bet, but we'll see.
In any event, I plan on posting one final set of analyses/death rate graphs sometime in early March. After that...well, it depends on whether I'm able to continue gathering up-to-date, comprehensive, accurate county-level COVID data.