Just how accurate is WorldoMeter anyway?

As regular readers know, for the past month or so I've been devoting way too much time to tracking COVID-19 cases & fatalities at the state and county level. For my sources, it's been a combination of state health department websites, the New York Times daily GitHub data archive, the Johns Hopkins University daily GitHub data archive and the WorldoMeter website...which in turn gets their data from other sources. The testing data on my state-level spreadsheet, meanwhile, comes from the COVID Tracking Project website.

For the most part, however, I've settled on WorldoMeter for the state-level data and Johns Hopkins U for the county-level data, as each source formats their data in the most convenient manner for my purposes in porting it to my spreadsheets.

Of course, there are always going to be some discrepancies between different sources for a variety of reasons (methodology, timing of data entry, etc.), but a Twitter follower of mine sent me a link to this CNN article which delves into the somewhat murky history of the WorldoMeter data site:

Before the pandemic, Worldometer was best known for its “counters,” which provided live estimates of numbers like the world’s population or the number of cars produced this year. Its website indicates that revenue comes from advertising and licensing its counters. The Covid-19 crisis has undoubtedly boosted the website’s popularity. It’s one of the top-ranking Google search results for coronavirus stats. In the past six months, Worldometer’s pages have been shared about 2.5 million times — up from just 65 shares in the first six months of 2019, according to statistics provided by BuzzSumo, a company that tracks social media engagement and provides insights into content.

The website claims to be “run by an international team of developers, researchers, and volunteers” and “published by a small and independent digital media company based in the United States.”

But public records show little evidence of a company that employs a multilingual team of analysts and researchers. It’s not clear whether the company has paid staff vetting its data for accuracy or whether it relies solely on automation and crowdsourcing. The site does have at least one job posting, from October, seeking a volunteer web developer.

It's an interesting article and well worth a read, but for my purposes, the main question is just how reliable and accurate their daily data updates actually are, since I (and many others) rely on them for ongoing COVID-19 data tracking.

I decided, therefore, to compare yesterday's total case counts at the U.S. state & territory levels from both WorldoMeter and the official Johns Hopkins University Coronavirus Resource Center GitHub data archive to see how closely they match...and was relieved to see that for the most part, they're pretty damned close:

As you can see, their daily positive COVID-19 case totals are identical in 36 states/territories, and are less than 1% apart in 11 more (which is almost certainly due simply to one data entry team entering their daily updates earlier than the other for those states/territories). That leaves 6 states with a discrepancy greater than 1% (plus the U.S. Virgin Islands, where there's a difference of exactly 1 case...but USVI's total is less than 100, thus making it more than 1%).
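For anyone who wants to run the same kind of check, here's a rough sketch of the comparison in Python. To be clear, this isn't my actual spreadsheet: the function names are made up for illustration, the counts are placeholder numbers rather than real state data, and measuring the gap against the larger of the two totals is just one reasonable way to define "percent apart."

```python
def percent_gap(count_a: int, count_b: int) -> float:
    """Difference between two case counts as a percentage of the larger one."""
    larger = max(count_a, count_b)
    if larger == 0:
        return 0.0
    return abs(count_a - count_b) / larger * 100


def classify(count_a: int, count_b: int, threshold: float = 1.0) -> str:
    """Sort a pair of counts into identical / under 1% / 1% or more."""
    if count_a == count_b:
        return "identical"
    if percent_gap(count_a, count_b) < threshold:
        return "under 1%"
    return "over 1%"


# Placeholder (source A, source B) totals -- NOT real state data.
sample_counts = {
    "State A": (12000, 12000),
    "State B": (50000, 50300),
    "State C": (8000, 8400),
    "Territory D": (70, 71),  # a 1-case gap on a total under 100
}

for name, (a, b) in sample_counts.items():
    print(f"{name}: {classify(a, b)} ({percent_gap(a, b):.2f}%)")
```

The last row shows why a tiny territory can land in the "over 1%" bucket even when the two sources differ by a single case.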

Of those 6 states, the only ones which really leap out are New Jersey, Texas, Washington State and the big one, New York, which holds 77% of the national gap between the two sources. I'm assuming that nearly all of the 21,210 cases "missing" from the Johns Hopkins data are explained by New York City counting thousands of probable cases which New York State isn't including in its official statewide count.

In any event, nationally, the two sources are within 1.3% of each other through yesterday, so while I'll continue to keep an eye on any unusual discrepancies going forward, it looks like WorldoMeter is fairly close to the mark. Of course, there have been a lot of claims--with a decent amount of evidence--that certain states are under-reporting their COVID-19 cases, hospitalizations and/or deaths, either deliberately or (in some cases) unintentionally, so there's a very good chance that both the WorldoMeter and Johns Hopkins numbers are lower than the true totals...but I can only work with the data available to me.