My last post looked at data from FL & OH in the pre-Covid and post-vaccine eras and used other health indicators to show that differences in outcomes could just as easily be correlated with factors other than vaccination rates or (more pathetically) political affiliation. In other words, the paper completely ignores the healthy-user bias of vaccine recipients, which certainly accounts for some, if not all, of the apparent benefits of the vaccine.
Truth be told, I had not looked deeply at the original paper. I had read responses to it from Berenson and Prasad and felt confident I got the main argument it was making. A comment from reader of the month Bart Miller on my last post, however, spurred me to look at the original paper more closely. Having done so, I discovered a problem with the paper that is at best extremely sloppy & at worst fraudulent. In years past I would have been baffled that 3 members of the Yale(!) faculty put their names on this, and that JAMA published it, but by now I am almost completely disabused of any faith in the integrity of these institutions when there is a political point to be made.
You can’t do this…..
When looking at an academic paper, it's common to skim through the text & look at the charts and graphs to understand the main results (this blog makes frequent use of visuals). So, with the authors' thesis that Republicans started having more excess deaths than Democrats after Covid started & particularly after the vaccine was available, one might find the following very compelling:
In the above, we see that excess deaths were essentially equal across the parties in 2018-19, then started to show some differences in 2020 & further into 2021. When I looked at this more closely, however, I thought to myself, “Gee, 2018-19 is so flat and uniform… there are basically no large deviations (up or down) in either party for any week in 2 years. That’s weird (and statistically improbable).” In other words, something smells fishy. Then I looked at the paper’s supplemental materials on their methods, which explain how they calculated Excess Deaths. In order to calculate Excess Deaths, you need a baseline death rate. Here is how they calculated that:
So, the expected baseline deaths come from a statistical model fit to the observed deaths from 2018-19. Then, to calculate Excess Deaths, they compare actual deaths to the deaths predicted by their model. What this means is that when they calculate “Excess Deaths” in 2018-19, they are comparing actual deaths to predictions from a model built on the very same 2018-19 data. That is, as I said before, at best extremely sloppy (wrong). The portion of the graph above for 2018-19 does not show Excess Deaths; what it actually shows is how well the model they built fits the 2018-19 data. Unless their model stunk, that calculation will necessarily hover close to zero. If they really wanted to show what Excess Deaths looked like in 2018-19, they would need to compare deaths in those years to a baseline created from some other period of time, say 2013-17. What the authors have essentially said is, “Using 2018-19 as our baseline death rate, we found no excess death in 2018-19.” Well, duh.
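To make the circularity concrete, here is a minimal sketch. The weekly death counts are simulated and the seasonal regression is my own stand-in, not the authors' actual model or data, but the point holds for any reasonable fitting procedure: "excess deaths" computed against a baseline fit to the same period will hover around zero by construction.

```python
# Toy illustration (simulated data, NOT the authors' model): fit a simple
# baseline to simulated 2018-19 weekly deaths, then compute "excess deaths"
# for those same weeks. The result hovers near zero by construction, because
# a fitted model's in-sample residuals are roughly centered on zero.
import numpy as np

rng = np.random.default_rng(0)
weeks = np.arange(104)                        # 2 years of weekly data ("2018-19")
seasonal = 10 * np.cos(2 * np.pi * weeks / 52)
observed = 500 + seasonal + rng.normal(0, 15, size=weeks.size)

# Baseline model: ordinary least squares on an intercept plus seasonal terms
X = np.column_stack([
    np.ones_like(weeks, dtype=float),
    np.cos(2 * np.pi * weeks / 52),
    np.sin(2 * np.pi * weeks / 52),
])
coef, *_ = np.linalg.lstsq(X, observed, rcond=None)
expected = X @ coef

# "Excess deaths" in the training period = observed minus the model's own fit
excess_2018_19 = observed - expected
print(f"mean weekly 'excess' in 2018-19: {excess_2018_19.mean():+.2f}")   # ~0
print(f"largest single-week deviation:   {np.abs(excess_2018_19).max():.1f}")
```

With an intercept in the model, the in-sample residuals average to zero no matter what the real mortality pattern was, which is exactly the eerily flat 2018-19 stretch in their figure.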
But wait, doesn’t the data still show that the differences between R’s & D’s increased after the vaccine was available…..
Yes, it does seem to show this; however, I have huge doubts about the validity of that claim (even ignoring the healthy-user bias confounders). For 2020 and beyond, the authors’ calculation of Excess Deaths takes actual deaths and divides them by “expected deaths in a Covid-free world”. The latter value is generated from their model. The model is based on 2018-19 data and projects forward what they think deaths would have been if not for Covid. Seems reasonable, yes? On its face, I agree.
Here’s the problem… Anyone who has done forecasting will verify that, almost universally, the further into the future you try to forecast something, the less accurate the forecast will be. Also, the more granular the forecast is, the worse it is likely to be. It’s possible for Procter & Gamble to forecast pretty well how many diapers they will sell in the US next month. Now ask them to forecast how many diapers they will sell in week 12 of 2025 in Fulton County, GA, among 65-74 year old Republicans. Think it will be as accurate? These authors are essentially doing the latter. They are saying that, based on the data in 2018-19, they have created a model to predict the (Covid-free world) deaths for a specific county, in a specific week, for a specific age group and political party, 2 years into the future. PUH-LEEZE!
So, the apparent divergence of outcomes between R’s & D’s could just as easily be caused by the baseline death rate predicted by the model getting less and less accurate the further into the future we go. If their model had a bias to predict (Covid-free world) deaths for R’s too low, and D’s too high1, then the further you go into the future, the higher excess deaths would appear for R’s and the lower for D’s. Or it could be the opposite; maybe the gap is even wider than they show. Without knowing how accurate their forecasting model is, we really have no way of knowing. The results shown could be entirely an artifact of the inherent inaccuracy of such a forecasting model, regardless of what was actually happening.
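Here is a minimal sketch of that failure mode, with purely hypothetical numbers (not the paper's data or its actual model): two groups with identical true death counts, each forecast by a trend line fit to a short, noisy two-year window. The small slope errors you inevitably pick up from noise compound over a two-year forecast horizon and can manufacture an apparent partisan gap in "excess deaths" where none exists.

```python
# Toy illustration (hypothetical numbers): two groups with IDENTICAL true weekly
# deaths, forecast by trend lines fit to a short, noisy 2-year training window.
# Tiny differences in the fitted slopes compound over a 2-year forecast horizon
# and can produce an apparent gap in "excess deaths" out of thin air.
import numpy as np

rng = np.random.default_rng(1)
train_weeks = np.arange(104)            # "2018-19" training window
fcst_weeks = np.arange(104, 208)        # "2020-21" forecast horizon

true_level = 500.0                      # same flat weekly deaths for both groups
groups = {}
for name in ("R", "D"):
    train_obs = true_level + rng.normal(0, 15, size=train_weeks.size)
    slope, intercept = np.polyfit(train_weeks, train_obs, 1)   # simple trend model
    groups[name] = {"slope": slope, "baseline": intercept + slope * fcst_weeks}

# Actual deaths in the forecast period are identical for both groups (no real
# partisan difference), yet "excess = actual - forecast" can still diverge.
actual = np.full(fcst_weeks.size, true_level)
for name, g in groups.items():
    excess = actual - g["baseline"]
    print(f"{name}: fitted slope {g['slope']:+.3f}/wk, "
          f"apparent excess in final 4 weeks {excess[-4:].mean():+.1f}/wk")
```

The errors here come purely from fitting noise in a two-year window and extrapolating two more years; nothing about the simulated "deaths" differs between the groups.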
Political bias and blinders
The 1st error pointed out above is honestly so bad that I again almost can’t believe Yale professors and JAMA reviewers let it go. In my view, this shows how blind people can be when a conclusion aligns with what they want to be true.
As a technical note, I have yet to see any statistics or evidence of the accuracy of their forecasting model, but I would be astonished if it were anywhere near accurate enough to support the conclusions of the paper.
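For what it's worth, the standard way to produce that evidence would be an out-of-sample backtest: fit the baseline model on an earlier window, forecast a later pre-Covid window where the true deaths are known, and report the error at the same county/week/age/party granularity the conclusions depend on. A rough sketch of the idea, with placeholder arrays and a simple trend model standing in for whatever the authors actually used:

```python
# Sketch of the kind of accuracy evidence that seems to be missing: backtest the
# baseline model out of sample. Train on an earlier window, forecast a later
# pre-Covid window where true deaths are known, and report the error.
# (Hypothetical arrays stand in for real county/week/age/party death counts.)
import numpy as np

def backtest(train_y: np.ndarray, test_y: np.ndarray) -> dict:
    """Fit a linear trend to the training window, forecast the test window,
    and report mean absolute percentage error plus the worst weekly miss."""
    train_x = np.arange(train_y.size)
    test_x = np.arange(train_y.size, train_y.size + test_y.size)
    slope, intercept = np.polyfit(train_x, train_y, 1)
    forecast = intercept + slope * test_x
    ape = np.abs(test_y - forecast) / test_y
    return {"mape": ape.mean(), "worst_week": ape.max()}

# Hypothetical usage: these would be real weekly counts for one
# county/age/party cell; here they are simulated placeholders.
rng = np.random.default_rng(2)
deaths_2013_17 = 480 + 0.1 * np.arange(260) + rng.normal(0, 15, 260)
deaths_2018_19 = 506 + 0.1 * np.arange(104) + rng.normal(0, 15, 104)
print(backtest(deaths_2013_17, deaths_2018_19))
```

If the per-cell error from a check like this is anywhere near the size of the partisan gap they report, the headline result is indistinguishable from forecast noise.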
@Bart Miller - regarding your comment on the previous post: it may be that there are truly differences in outcomes across the age groups, or, as I discuss in the 2nd section above, the differences might just reflect the varying inaccuracies & biases of their forecasting model. Honestly, this paper deserves to line a bird cage.
This sounds eerily similar to the Canadian study that said the unvaxxed were more likely to be in car accidents because of “Distrust of the government, a belief in freedom, misconceptions of daily risks, faith in natural protection, misinformation and personal beliefs”.