Since yesterday, the following image from an article by liberal journalist Evgenya Albats has been making the rounds on the Internet. It shows that whereas Putin’s official tally was 65%, independent observers put it close to or below the 50% threshold that would necessitate a second round: Golos had him at 51% and Citizen Observer at 45%. Predictably, these figures were seized upon by the liberals to question the legitimacy of the elections. As Putin ended up getting 63.6%, while the average of all observers was 50.2%, one could conclude that the level of fraud was 13% or more.
However, as pointed out by Kireev, this is a gross misuse of statistics for political ends, because of the severe sampling problems: Golos observers were concentrated in Moscow, St.-Petersburg, and a few other large cities where Putin is less popular, while Citizen Observer is almost entirely confined to the capital. The website http://sms.golos.org/ collates the results from all the big Russian observer projects, and from the regional data, we can see that about half the election protocols compiled to create these figures were from Moscow; almost another quarter were from Moscow oblast and St.-Petersburg.
Nonetheless, while looking through the regional data, I realized that if it were adjusted for its pro-Moscow (anti-Putin) sampling bias, we could get a fairly good estimate for the level of fraud in this election; or at least, an upper limit for it. And so that’s what I proceeded to do.
After assembling the data, I came up with the following table. The first column lists the different provinces. The second column is Putin’s vote according to the observer protocols for that region. The third column is Putin’s vote according to the Central Election Commission. The fourth column is the difference between the two. It may represent fraud, but it may also reflect (1) sampling bias – more on this later, or (2) natural margins of error, which are especially high in regions where there were few observers. The fifth column is the total number of ballots (both valid and spoiled) cast in this region.
|Region||Putin (observers), %||Putin (CEC), %||Difference, pts||Ballots cast|
|Republic of Adygea||59.72||64.07||4.35||220481|
|Republic of Mari El||57.11||59.98||2.87||381148|
|Republic of Tatarstan||72.21||82.70||10.49||2378904|
|Chuvash Republic – Chuvashia||59.01||62.32||3.31||702957|
|Khanty-Mansi Autonomous Okrug||65.12||66.41||1.29||707504|
|Chukotka Autonomous Okrug||44.84||72.64||27.80||29337|
|Territory outside the Russian Federation||65.17||73.19||8.02||441931|
|Total in regions with observers||56.11||61.97||5.86||63151581|
As you can see, the figures are more or less what we would expect from the analysis already published on this blog. In Moscow, fraud is minimal, the difference between observer protocols and the official result being less than 2%. We can be fairly certain about this: the protocols analyzed cover 1,021,810 votes out of a total of 4,247,438 cast; at about 24%, this is excellent coverage. Furthermore, the real fraud figure may be smaller than the 1.84% given above, because the observers made sure to cover all the stations with the most suspicious 2011 results.
Coverage in St.-Petersburg is far smaller at 5%, but the fraud figure of 8.44% can still be treated as very reliable. It is backed up by other statistical evidence.
To get a figure for the regions in the SMS-ЦИК dataset, which accounted for 88% of Russia’s total votes, I took the regional observer protocols’ figures for Putin and weighted them by the total number of ballots in each region. My final fraud figure using this method came out to 5.86%.
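The weighting step can be sketched roughly as follows. This is a minimal illustration, not the actual spreadsheet behind the 5.86% figure: it uses only the handful of regions reproduced in the table above, so its output will differ from the full-dataset result, and the short region labels and the helper name `weighted_share` are made up for the example.

```python
# Rows copied from the table above:
# (region label, Putin % per observers, Putin % per CEC, ballots cast).
# Only the regions shown in this post are included, so this will NOT
# reproduce the 5.86% figure, which rests on the full dataset.
rows = [
    ("Adygea", 59.72, 64.07, 220481),
    ("Mari El", 57.11, 59.98, 381148),
    ("Tatarstan", 72.21, 82.70, 2378904),
    ("Chuvashia", 59.01, 62.32, 702957),
    ("Khanty-Mansi AO", 65.12, 66.41, 707504),
    ("Chukotka AO", 44.84, 72.64, 29337),
    ("Outside Russia", 65.17, 73.19, 441931),
]


def weighted_share(rows, column):
    """Ballot-weighted average of the percentage stored in `column`."""
    total_ballots = sum(r[3] for r in rows)
    return sum(r[column] * r[3] for r in rows) / total_ballots


observed = weighted_share(rows, 1)  # Putin % according to observers
official = weighted_share(rows, 2)  # Putin % according to the CEC
gap = official - observed           # crude upper bound on fraud, in points

print(f"observers: {observed:.2f}%  official: {official:.2f}%  gap: {gap:.2f} pts")
```

Weighting by ballots cast means that a large region like Tatarstan moves the aggregate far more than Chukotka does, even though Chukotka has by far the biggest gap.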
This is not a conclusive fraud figure, of course, since there are at least five factors that would further influence it. Two of them are negative, one is probably neutral, and two are positive.
(1) This is a negative factor, but one that is very hard to quantify. The pro-Putin votes are weighted according to turnout; however, regions with greater turnout also tend to have more fraud, because one of the most common methods of fraud is inflating turnout, which almost invariably benefits Putin. But it is important to stress that this relationship does not necessarily imply fraud, for there are also subgroups of the Russian population – primarily, rural dwellers – among whom turnout is naturally higher. So we can expect turnout to be higher in some of the more rural provinces without fraud being responsible. Separating out the two is extremely tricky and is closely tied to a related problem: to what extent is fraud, as opposed to subgroups with specific voting patterns, responsible for Putin’s and United Russia’s long tails?
(2) The neutral factor (more or less) is the margin of error that comes from having only a very limited number of observers in the more remote regions. For instance, it seems pretty unlikely there was 5% fraud AGAINST Putin in the Komi republic. I am assuming that since these margins can be either positive or negative, they will largely cancel out by the time we calculate the aggregate total.
(3) This is a negative factor. Some regions, accounting for 12% of the total votes, are missing from the SMS-ЦИК dataset: Altai Republic, Buryatia, Daghestan, Ingushetia, Kabardino-Balkaria, Kalmykia, Karachay-Cherkess, Mordovia, Sakha Republic, North Ossetia, Tyva, Khakassia, Chechnya, Kamchatka krai, Kirov oblast, Kostroma oblast, Magadan oblast, Smolensk oblast, Tambov oblast, Jewish autonomous oblast, Nenets autonomous oblast, Yamalo-Nenets autonomous oblast, and Baikonur (Kazakhstan).
The FOM exit poll data showed that although the North Caucasus was the region most wracked by fraud, it also had the highest genuine support for Putin, at 68.4%. The election in Stavropol krai appears to have been fair – the official figure there was actually higher than the observers’ – so let’s leave its result as is. Assuming that turnout in the ethnic minority republics of the North Caucasus was only 50% or so, which on anecdotal evidence seems more likely than the official turnout of around 90%, the real average Putin vote across those areas would be about 71% – still above the Russian national average, but only moderately so – as opposed to the official 89%. This would raise Putin’s real average score by a bit, but by less than he would lose from the large amount of fraud embodied in them.
Similar things can be said, albeit to a smaller extent, for the other ethnic republics (a few of which, like Buryatia, seem to be quite fair, while others, like Mordovia, are as fraudulent as anything observed in the North Caucasus). The official average Putin vote in all the non-North Caucasus, non-observed regions is 68%; in the ethnic Russian majority ones, only about 62%. The latter are already almost or entirely consistent with the national average, so they will have only the most insignificant impact.
Including all the other regions will up the official score to 63.6% (by definition), but will also increase both the level of fraud and Putin’s real score. So perhaps Putin will go up to 57.0% (thanks to the genuine North Caucasus votes), but fraud will also increase to maybe 6.6%.
(4) Now this is already looking very bad, as bad as the 2011 elections, but fortunately there are two major mitigating factors. First, just as nationwide observers are biased towards Moscow, at the regional level they would logically be biased towards major urban areas. If a crew of observers volunteers in some Russian backwater province after hearing Navalny’s call over the Internet, chances are they hail from the big local urban center. And there are significant voting differences between town and country in Russia, with rural voters consistently both turning out in greater numbers and giving the Kremlin candidate around 10% or even 15% more votes (e.g., in FOM’s last pre-election poll, only 43% of Muscovites and 47% of people living in cities of more than one million said they’d vote for Putin, compared to 51% of small towners and 58% of rural folks).
Now some 25% of Russia’s population is rural, and another significant part lives in small towns; the observer presence there is minimal at best in any one region. As such, the observer protocol figures would systematically understate Putin’s vote. To what extent? This is a crude back of the envelope calculation, but I think it’s valid: 25% of the population giving Putin 10% more would add 2.5% to his score, and they are very much underrepresented in the observer sample; add another 0.5% for the small town people. Putin’s real score rises to 60.0%, while his fraud score is compressed to 3.6%; the total remains, by definition, at 63.6%.
(5) It is also known that observers concentrated most on polling stations that had a legacy of suspicious results from the 2011 elections. Since those stations are likely to be bad apples again this time round, relative to others, the focus on them means that the measured level of fraud may be artificially inflated further.
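For what it’s worth, the envelope arithmetic above can be written out as a tiny sketch. The function name `adjust_for_rural_bias` and its parameters are invented for this illustration; the inputs are the figures quoted in the text, and the calculation takes the simplifying view that rural voters are essentially absent from the observer sample, so their full premium gets added back.

```python
OFFICIAL = 63.6  # official national result for Putin, %


def adjust_for_rural_bias(base_real, rural_share, rural_premium,
                          small_town_bonus, official=OFFICIAL):
    """Return (adjusted real score, implied fraud), both in percentage points.

    base_real        -- estimated real score before the urban-bias correction
    rural_share      -- fraction of the population that is rural
    rural_premium    -- extra points rural voters give Putin vs. the sample
    small_town_bonus -- flat allowance for small-town voters, in points
    """
    rural_adjustment = rural_share * rural_premium
    adjusted_real = base_real + rural_adjustment + small_town_bonus
    implied_fraud = official - adjusted_real
    return adjusted_real, implied_fraud


real, fraud = adjust_for_rural_bias(57.0, 0.25, 10.0, 0.5)
print(round(real, 1), round(fraud, 1))  # 60.0 3.6
```

Plugging in the higher 15-point rural premium mentioned earlier instead of 10 would push the real score up further and the implied fraud down, which gives a feel for how sensitive the result is to that one guess.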
There are many ways one can interpret these results.
One can cite the 5.86% figure as the most precise one, albeit one that doesn’t take into account a number of complicating factors. Alternatively, one could argue for a significantly lower figure, like 3.6%, or even lower once you adjust for the last factor. Or one can argue that the positive factors cancel out the first factor, which is unknown in magnitude but surely significant, and so return to a fraud estimate of 4%-6%. This range would be consistent with the two most comprehensive exit polls: FOM, which gives Putin 59.3% (possible fraud: 4.3%), and VCIOM, which gives him 58.3% (possible fraud: 5.3%).
Either way, one thing is absolutely clear: A proper analysis of the observer protocol statistics can in no way support the theory that Putin got less than 50%.
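As a quick sanity check on the exit poll comparison, here is a short sketch that subtracts each poll’s figure from the official 63.6% and tests whether the gap falls inside the 4%-6% range. The variable names are just for illustration; the numbers are the ones quoted above.

```python
OFFICIAL = 63.6  # official national result for Putin, %

# Putin's share according to the two exit polls cited in the text, %
exit_polls = {"FOM": 59.3, "VCIOM": 58.3}

# Gap between the official result and each exit poll, rounded to 0.1 pts
implied_fraud = {name: round(OFFICIAL - share, 1)
                 for name, share in exit_polls.items()}

for name, gap in implied_fraud.items():
    in_range = 4.0 <= gap <= 6.0
    print(f"{name}: possible fraud {gap} pts, within 4-6 range: {in_range}")
```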