Yesterday I tweeted out "Obesity Rate for Young Children Plummets 43% in a Decade." This is a big deal, and many people retweeted it. Here’s the summary in The New York Times:
But the figures on Tuesday showed a sharp fall in obesity rates among all 2- to 5-year-olds, offering the first clear evidence that America’s youngest children have turned a corner in the obesity epidemic. About 8 percent of 2- to 5-year-olds were obese in 2012, down from 14 percent in 2004.
They helpfully link to the paper in The Journal of the American Medical Association, "Prevalence of Childhood and Adult Obesity in the United States, 2011-2012." But if you actually read the paper, the authors themselves seem very unsure about the robustness of this specific result. I quote from the paper:
…Tests for differences by age in children were evaluated with the following comparisons: aged 2 to 5 vs 6 to 11 years, 2 to 5 vs 12 to 19 years, and 6 to 11 vs 12 to 19 years. Similarly, in adults comparisons were made between aged 20 to 39 and 40 to 59 years, 20 to 39 and 60 years or older, and 40 to 59 and 60 years or older. P values for test results are shown in the text but not the tables. Adjustments were not made for multiple comparisons.
…Similarly, there was no significant change in obesity prevalence among adults between 2003-2004 and 2011-2012. In subgroup analyses, the prevalence of obesity among children aged 2 to 5 years decreased from 14% in 2003-2004 to just over 8% in 2011-2012, and the prevalence increased in women aged 60 years and older, from 31.5% to more than 38%. Because these age subgroup analyses and tests for significance did not adjust for multiple comparisons, these results should be interpreted with caution.
In the current analysis, trend tests were conducted on different age groups. When multiple statistical tests are undertaken, by chance some tests will be statistically significant (eg, 5% of the time using α of .05). In some cases, adjustments are made to account for these multiple comparisons, and a P value lower than .05 is used to determine statistical significance. In the current analysis, adjustments were not made for multiple comparisons, but the P value is presented.
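To make the "by chance some tests will be statistically significant" point concrete, here is a minimal sketch of how quickly the odds of at least one false positive grow with the number of tests. It assumes the tests are independent, which subgroup tests on the same survey probably are not, and the test counts in the loop are my own illustrative picks, not numbers taken from the paper.

```python
def familywise_error_rate(num_tests, alpha=0.05):
    """Probability of at least one false positive across num_tests tests.

    Assumes the tests are independent and every null hypothesis is true.
    """
    return 1.0 - (1.0 - alpha) ** num_tests


# Hypothetical numbers of tests, chosen only for illustration.
for num_tests in (1, 3, 6, 10):
    rate = familywise_error_rate(num_tests)
    print(f"{num_tests:2d} tests at alpha = 0.05: "
          f"chance of at least one false positive = {rate:.3f}")
```

With a single test the rate is the familiar 5%, but it climbs steadily as more subgroups are tested, which is why a lone significant subgroup deserves the caution the authors ask for.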
The p-value here is 0.03 for the difference in question. That passes the conventional threshold of significance (0.05), but it is close enough to the border that I’m quite suspicious. Here is the full conclusion of the paper:
Overall, there have been no significant changes in obesity prevalence in youth or adults between 2003-2004 and 2011-2012. Obesity prevalence remains high and thus it is important to continue surveillance.
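As a rough check on that p-value of 0.03, here is a sketch of a Bonferroni correction, which simply divides the significance threshold by the number of tests. To be clear, this is not something the authors did, Bonferroni is on the conservative side, and I don't know exactly how many tests belong in the family, so the counts below are assumptions.

```python
def bonferroni_threshold(alpha, num_tests):
    """Per-test significance threshold after a Bonferroni correction."""
    return alpha / num_tests


def survives_bonferroni(p_value, num_tests, alpha=0.05):
    """True if p_value is still significant once corrected for num_tests tests."""
    return p_value <= bonferroni_threshold(alpha, num_tests)


P_VALUE = 0.03  # the p-value for the decline among 2- to 5-year-olds

# Hypothetical sizes for the family of tests; the true count is an assumption.
for num_tests in (1, 2, 3, 6):
    threshold = bonferroni_threshold(0.05, num_tests)
    verdict = "significant" if survives_bonferroni(P_VALUE, num_tests) else "not significant"
    print(f"{num_tests} tests: threshold = {threshold:.4f}, p = {P_VALUE} is {verdict}")
```

The thing to notice is that the threshold drops to 0.025 with just two tests, so under this particular correction a p-value of 0.03 stops being significant as soon as it is one of two or more comparisons.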
Granted, these may turn out to be real results. And the age class that showed a decline in obesity is definitely one we should focus on. But public health is a serious matter, and therefore we shouldn’t get ahead of ourselves.
One hypothesis that presents itself with regard to this paper is that a reviewer asked explicitly about the multiple comparisons problem. The authors acknowledged the problem without actually checking whether the results hold after a correction, and then the editor let the paper through. Of course, this is just a model. I haven’t tested it, so I can’t even offer up a p-value, even if I were a frequentist.
Note: The raw data is here.