How to logically and statistically explain the alleged IEBC Algorithm
My last two articles on this site are about statistics, first explaining why Amollo Otiende can be excused for not being able to differentiate between x and x, and secondly going into details on the formula Y=1.2045x + 183546 and how that formula indicates a possible manipulation of votes by use of an IEBC Algorithm. The same formula has also been mentioned as the basis for the constant difference of 11% between Uhuru Kenyatta and Raila Odinga.
According to Raila Odinga, the alleged IEBC Algorithm was used by IEBC to ensure that the difference between him and his opponent in the 2017 Presidential Elections, President Uhuru Kenyatta, always remained at 11%. Similarly, the IEBC Algorithm is said to have ensured that Mike Sonko of Nairobi, Joyce Laboso of Bomet, Alfred Mutua of Machakos, Ali Kurane of Garissa, Fahim Yasin of Lamu, and Ole Lenku of Kajiado all won with 54% of the vote, giving rise to what is now famously known as computer generated leaders or vifaranga vya computer.
After penning the article on the details of the Amollo Otiende formula, I retreated into my thoughts to figure out whether there is a logical way to explain the IEBC Algorithm that has been accused of manipulating the Presidential votes during transmission, and it dawned on me that there is actually a statistical explanation for such a straight line, and it all goes back to the methodologies of opinion polls.
Opinion pollsters are in the business of estimating public opinion on specific socio-economic issues at a given point in time, and how the public is likely to vote is one of the issues they routinely estimate. Although over the years the leading pollsters have failed to accurately predict the outcome of elections not only in Kenya but also in France, the United Kingdom and the United States, the tools and methodologies they employ for their predictions have been proven throughout history to be both accurate and reliable.
Developing a statistical explanation for the alleged IEBC Algorithm
Assume that none of the shortcomings that prevent pollsters from properly estimating public opinion apply: limited finances, inadequate sampling, improper weighting due to a lack of skills or weighting parameters, too little time for thorough analysis, and so forth. We would then expect a sampling methodology that accounts for all the diversity within the population to accurately capture the overall feeling of the population on any given matter, including how the population intends to vote.
When I was thinking about the implications of the paragraph above, I realized that if the results were streaming randomly from random polling stations across the country, then at some point in time the results that had been received across the country represented an adequate sample that an opinion pollster could have used to accurately predict the final results.
A number of statisticians use the scenario above to conduct what they call exit polls. Exit polls, which have proven to be more accurate than opinion polls, are normally conducted on about 2,000 to 5,000 voters immediately after they have voted. Thus, if we want to deduce the point in time from which the reported results start to indicate how people voted, it is the point at which about 2,000 to 5,000 total votes cast have been reported.
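To give a feel for why a sample of 2,000 to 5,000 voters is usually considered enough, here is a minimal sketch of the textbook margin of error for a proportion. It assumes a simple random sample and a 95% confidence level (z = 1.96), which real exit polls only approximate; the function name `margin_of_error` is my own and is purely illustrative.

```python
import math


def margin_of_error(n, p=0.5, z=1.96):
    """Approximate margin of error for a proportion p estimated from
    a simple random sample of size n (z = 1.96 for ~95% confidence).

    p = 0.5 is the worst case, i.e. it gives the widest margin.
    """
    return z * math.sqrt(p * (1 - p) / n)


# Sample sizes quoted in the text for exit polls.
for n in (2000, 5000):
    print(f"n = {n}: margin of error is about {100 * margin_of_error(n):.2f} percentage points")
```

Under these assumptions the margin shrinks only with the square root of the sample size, which is why going far beyond a few thousand respondents adds relatively little precision.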
Now let us assume at least 2,000 votes have streamed in from truly random polling stations across the country. This batch of truly random results should give an indication of what the voting pattern is going to be. As further polling stations stream in their votes, we would expect these later batches to make the voting pattern clearer, not fuzzier, as they add onto the previous batch. It is like an opinion pollster collecting larger and larger samples for analysis.
For example, if the leading candidate had 54% of the vote and the second candidate had 44% at the time a total of 2,000 votes cast had been reported, and the final result were going to turn out as 53.5% and 45.5% respectively, then as the total votes reported continue to increase, the candidates' percentages should slowly shift towards the 53.5% and 45.5% figures.
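The example above can be sketched as a small simulation. This is an illustration only: it assumes every reported vote is an independent random draw from the final national shares (53.5%, 45.5%, and the remaining 1% lumped together as "others"), which is a much stronger randomness assumption than real results streaming would satisfy. The names `running_shares` and `TRUE_SHARES` and the checkpoint values are hypothetical.

```python
import random

# Hypothetical final shares taken from the example in the text.
TRUE_SHARES = {"leader": 0.535, "runner_up": 0.455, "others": 0.010}


def running_shares(true_shares, checkpoints, seed=1):
    """Draw votes one by one at random and return the cumulative
    share of each candidate at every checkpoint (total votes reported)."""
    rng = random.Random(seed)
    names = list(true_shares)
    weights = [true_shares[name] for name in names]
    counts = dict.fromkeys(names, 0)
    total = 0
    history = []
    for target in checkpoints:
        # Add only the votes reported since the previous checkpoint.
        for name in rng.choices(names, weights=weights, k=target - total):
            counts[name] += 1
        total = target
        history.append((total, {name: counts[name] / total for name in names}))
    return history


history = running_shares(TRUE_SHARES, checkpoints=[2_000, 20_000, 200_000])
for total, shares in history:
    print(total, {name: f"{100 * share:.1f}%" for name, share in shares.items()})
```

Running it with different seeds shows the early percentages wobbling by a point or two and then settling close to the final shares as the total grows.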
As can be seen, the above is a logical explanation that should dispel the suspicion that some IEBC Algorithm was used to doctor the results during transmission. To put those many words into figures, I have come up with a simple mathematical model that explains the concept much more simply. The model is presented in the Table and Figure below.
The Table above contains hypothetical Presidential results between Raila Odinga and Uhuru Kenyatta that would have been collected every four hours over a period of four days presumably on 8th, 9th, 10th, and 11th August 2017. The hypothetical Uhuru and Raila votes were generated such that the percentage results for Uhuru and Raila were not significantly different from the initial percentages shown as 43.8% for Raila and 54.5% for Uhuru in the second row of the table.
Having generated data that closely mimics results streaming in randomly from random polling stations over time, I plotted it as a scatter plot to see what regression line would result if this hypothetical data were subjected to the same treatment as the real-life data reported in the previous article. The regression graph is shown below.
The figure above shows that when results stream in randomly over time, and each random batch of results accurately approximates the overall voting pattern, the line of best fit obtained by plotting two seemingly unrelated variables matches the line of best fit that would otherwise have been generated from correlated variables.
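For readers who want to reproduce the idea, the following is a rough sketch and not the data behind my table. It assumes 24 reporting periods (every four hours over four days), a made-up batch size of 600,000 votes per period, and batch shares that wobble by up to one percentage point around 54.5% for Uhuru and 43.8% for Raila. It then fits an ordinary least squares line to the cumulative totals. All function names and parameter values are hypothetical.

```python
import random


def simulate_cumulative(periods=24, batch_size=600_000,
                        p_uhuru=0.545, p_raila=0.438,
                        noise=0.01, seed=7):
    """Return cumulative Raila and Uhuru totals after each reporting period.

    Each batch keeps roughly the same shares; `noise` is the largest
    random shift (as a fraction) moved from one candidate to the other.
    """
    rng = random.Random(seed)
    raila, uhuru = [], []
    r_total = u_total = 0
    for _ in range(periods):
        shift = rng.uniform(-noise, noise)
        u_total += round(batch_size * (p_uhuru + shift))
        r_total += round(batch_size * (p_raila - shift))
        uhuru.append(u_total)
        raila.append(r_total)
    return raila, uhuru


def fit_line(x, y):
    """Ordinary least squares fit y = slope * x + intercept, plus R squared."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = sxy ** 2 / (sxx * syy)
    return slope, intercept, r_squared


raila, uhuru = simulate_cumulative()
slope, intercept, r_squared = fit_line(raila, uhuru)
print(f"y = {slope:.4f}x + {intercept:.0f}, R^2 = {r_squared:.6f}")
```

Because both cumulative totals grow batch after batch in nearly the same proportion, the fitted slope lands close to the ratio of the two shares (0.545 / 0.438, about 1.24) and R squared comes out very close to 1, even though nothing in the simulation ties one candidate's votes to the other's beyond the shared voting pattern.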
Conclusion
It has been understood, and probably accepted, that the Amollo Otiende formula is a direct indication that the presidential results were manipulated in the course of transmission. However, the science of opinion polling and sampling clearly demonstrates that a similar formula could still be generated if results were streaming in in a purely random manner. The formula presented by Amollo Otiende should therefore not be taken as sure proof that the transmission of results was interfered with through an alleged IEBC Algorithm.