At this point, Davao City Mayor Rodrigo Duterte’s victory is already accepted as fact—with the numbers as close to indisputable as you can get. It gets tricky, though, when we talk about the VP race. The pre-election surveys predicted how close the fight between Camarines Sur Representative Leni Robredo and Senator Bongbong Marcos would be. And sure enough, actual election returns now separate the two with the slimmest of margins.
The closeness of this race has inspired a lot of mudslinging, name-calling, and just general hate towards the other camp. But what caught our attention recently was the sudden influx of math- and statistics-related analyses on our Facebook feeds that, depending on what you read, either prove electoral fraud (i.e. that BBM is being cheated out of the Vice-Presidency) or inspire confidence in Robredo's eventual victory. Let’s take a look at how the numbers are being crunched and try to see the logic behind them.
The afterMATH of the elections
One way of looking at the data was presented by David Yap (a former Ateneo professor) and Antonio Contreras (a professor at DLSU) on Facebook. The most basic of their arguments states that at a certain point during the partial and unofficial count, there was a sudden reversal in the trend—that after a particular point in time, BBM’s upward trend suddenly nosedived in favor of Leni Robredo. Assuming that election returns are received randomly, a sharp change in trend can indeed be taken as an anomaly.
Let’s skip the complicated jargon and talk about this in ways we can all understand—using basketball and Stephen Curry. Imagine Steph in a free throw shooting contest against your barkada’s resident geek who has never played a competitive game of basketball in his life. It’s reasonable to assume that Steph will make nine shots out of 10 (because he shot 91 percent from the FT line this season) and that your friend will miss nine shots out of 10 (because, well, he sucks).
The rules are simple:
1) Each one takes free throw shots in a completely random order (meaning Steph can shoot twice, then your friend thrice, then Steph once, your friend twice, etc.)
2) They do this until they have a total of 20 FT attempts between them
3) And the clincher: Every made shot is a vote for BBM and every missed shot is a vote for Robredo.
For every shot attempt they take, track the makes and misses. That is, at the end of every turn, plot how many more makes there are over misses. It doesn’t matter who makes the shot or who misses it, you’re just tracking the total shots made (votes for BBM) vs. the shots missed (votes for Leni). So if the first two shots are good, that’s a two-point lead for BBM. If the next three shots miss the mark, then Leni now has a one-point lead. Plot it out on a graph and it should look something like this:
*Total number of made shots minus the total number of missed shots after every turn
In our scenario, the NBA's back-to-back MVP is assumed to make most of his shots (mostly votes for BBM) while your friend is presumed to miss most of his (mostly votes for Leni). But since they shoot in a random order, the gap between makes and misses never rises or falls sharply because, to some extent, every time Steph makes a shot, your friend will probably miss one as well.
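If you'd rather let a computer take the shots, here is a minimal sketch of the random-order contest. It assumes Steph makes 90 percent of his free throws, your friend makes 10 percent, and each of them takes exactly 10 of the 20 attempts (the rules above only fix the total, so the even split is our assumption). The function names and the seed are made up for illustration.

```python
import random

def shoot(make_prob, rng):
    """Return +1 for a made shot (a BBM vote) or -1 for a miss (a Leni vote)."""
    return 1 if rng.random() < make_prob else -1

def running_lead(make_probs, rng):
    """Makes minus misses after every shot, following the given shooter order."""
    lead, trail = 0, []
    for make_prob in make_probs:
        lead += shoot(make_prob, rng)
        trail.append(lead)
    return trail

STEPH, FRIEND = 0.9, 0.1        # assumed make probabilities from the analogy

rng = random.Random(2016)       # arbitrary seed so the run can be repeated
order = [STEPH] * 10 + [FRIEND] * 10
rng.shuffle(order)              # rule 1: completely random shooting order

trail = running_lead(order, rng)
print(trail)                    # the numbers you would plot, one per shot
```

Run it with a few different seeds and the line tends to wobble around zero instead of climbing steadily and then collapsing.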
Now compare it to the graph that Yap and Contreras released that shows the trend of Marcos’ lead over Robredo:
The difference is noticeable. So if the election returns are indeed received by the servers at random (remember, we don’t know which regions will transmit first as there is no pre-determined order for transmission), then there shouldn’t be a change this drastic.
So what’s up, LP?
But as the Robredo camp has pointed out, through Ateneo de Manila University professor Alyson Yap, the transmission of election returns isn’t really random in the strictest sense of the term. We actually have an idea which precincts will transmit their votes first. Less populated regions will finish voting first. Precincts with faster internet speeds will transmit faster. On the flip side, areas with military conflict or shoddy web service will most likely be the last to transmit their election returns.
Now let’s go back to our free throw contest scenario. Given this new knowledge that things aren’t as random as they seem, let’s change the rules a bit. This time, let’s make Steph take 10 free throws first and let your friend take the last 10 shots. The graph of cumulative makes and misses will now look like this:
Looks familiar, doesn’t it? Since Steph (or again, those presumed to vote for BBM) took his free throws first, there were more makes than misses for the first ten shots. But when your friend (or the one we assume will most likely vote for Leni) started taking his shots after Steph, the misses started piling up.
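The same idea can be sketched in code. The snippet below assumes the 90 percent and 10 percent make rates from the analogy, lets Steph shoot first, and averages the running lead over many simulated contests so that the overall shape stands out from the noise. The function names, the seed, and the 5,000-contest count are our own choices for illustration.

```python
import random

def running_lead(make_probs, rng):
    """Makes minus misses after every shot, following the given shooter order."""
    lead, trail = 0, []
    for p in make_probs:
        lead += 1 if rng.random() < p else -1
        trail.append(lead)
    return trail

def average_trail(make_probs, trials, rng):
    """Average the running lead, shot by shot, over many simulated contests."""
    totals = [0] * len(make_probs)
    for _ in range(trials):
        for i, lead in enumerate(running_lead(make_probs, rng)):
            totals[i] += lead
    return [t / trials for t in totals]

STEPH, FRIEND = 0.9, 0.1                        # assumed make probabilities
steph_first = [STEPH] * 10 + [FRIEND] * 10      # Steph takes the first 10 shots

rng = random.Random(2016)                       # arbitrary seed
avg = average_trail(steph_first, 5000, rng)
print([round(x, 1) for x in avg])
```

On average, each Steph shot adds 0.9 - 0.1 = 0.8 to the lead and each of your friend's shots takes 0.8 away, so the line climbs to about 8 after the tenth shot and slides back to about zero by the twentieth. The turnaround comes from the shooting order alone.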
As it turned out, the regions that transmitted first (represented by Steph shooting first), such as NCR and the so-called “Solid North,” are known to be Marcos strongholds, while the election returns that came in later (represented by your friend taking his shots after Steph) are reportedly from Robredo’s balwartes (strongholds).
Given this scenario, it’s perfectly normal to see a sharp change from BBM leading in the VP race to Leni overtaking him. This, though, does not prove that there is no electoral fraud.
Then again, the initial argument of Yap and Contreras does little to prove that there was fraud, either. New evidence has surfaced to suggest cheating via altered computer code, but that’s an entirely different argument.
Another way of looking at it
But that’s not saying that we can’t use the data to project (not conclude) who the winner will be. Jose Ernie Lope (UP Diliman; team leader of the Philippine Team in last year’s International Mathematical Olympiad) predicts a possible Robredo victory (by 205,790 votes, to be exact) by a simple method using ratio and proportion.
Returning to our basketball analogy, if Steph Curry shoots 50 percent during the first quarter, we can assume that he will also make half of his shots during the second, third, and fourth quarters. Yes, this is a poor way of estimating his end-game percentage because we know he probably won’t shoot 50 percent for the entire game. But if we take his shooting percentage at the end of the third quarter, then that will be closer to the shooting percentage he’ll have at the end of the game.
As of publishing this article, the data shows that 16 out of 18 regions have transmitted at least 94 percent of their total votes to the Comelec servers, with only Region 13, ARMM, and overseas absentee votes being below 90-percent transmission. This is like getting the shooting percentage in the last few minutes of the fourth quarter: definitely not a foolproof way of determining the final numbers, but simple, elegant, and definitely not biased.
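For the curious, here is a rough sketch of what a ratio-and-proportion projection could look like. We are not reproducing Lope's actual computation, which we have not seen in detail. This version simply assumes that the untransmitted returns in each region will split the same way as the returns already counted. The region names, vote counts, and transmission shares below are hypothetical placeholders, not Comelec data.

```python
def project_final(votes_so_far, share_transmitted):
    """Scale a partial tally up to 100 percent, assuming the remaining
    returns split the same way as the ones already counted."""
    if not 0 < share_transmitted <= 1:
        raise ValueError("share_transmitted must be greater than 0 and at most 1")
    return votes_so_far / share_transmitted

# Hypothetical tallies: (votes for candidate A, votes for candidate B, share transmitted)
regions = {
    "Region X": (600_000, 400_000, 0.96),
    "Region Y": (300_000, 500_000, 0.80),
}

projected_margin = 0.0
for name, (votes_a, votes_b, share) in regions.items():
    projected_margin += project_final(votes_a, share) - project_final(votes_b, share)

# Positive means candidate A is projected to finish ahead, negative means candidate B.
print(round(projected_margin))
```

The closer each region's share gets to 100 percent, the less the projection leans on that same-split assumption, which is the fourth-quarter point made above.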
At the end of the day, the election boils down to simple addition: it’s one vote plus one vote plus one vote. And no matter where those “one votes” come from, and no matter what order you add them in, the sum will always be the same.
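A tiny sketch makes the point concrete. The ballots below are hypothetical (12 votes for one candidate, 8 for the other), and the function name is ours. The same ballots are counted in three different orders.

```python
import random

# Hypothetical ballots: +1 is a vote for one candidate, -1 a vote for the other.
ballots = [1] * 12 + [-1] * 8

def running_margin(votes):
    """Margin after each ballot is added, in the order given."""
    margin, trail = 0, []
    for v in votes:
        margin += v
        trail.append(margin)
    return trail

plus_first = running_margin(ballots)             # all +1 ballots counted first
minus_first = running_margin(ballots[::-1])      # all -1 ballots counted first

shuffled = ballots[:]
random.Random(7).shuffle(shuffled)               # arbitrary seed
mixed = running_margin(shuffled)

# The paths differ along the way, but every count ends at the same margin.
print(plus_first[-1], minus_first[-1], mixed[-1])
```

The first count peaks at a 12-vote lead, the second dips to an 8-vote deficit, and all three finish at exactly the same four-vote margin.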
The way we see it, the only way to definitively check for fraud is to wait for the final count and compare it with the actual ballots. While we encourage keeping an eye out for anomalies, trying to prove or disprove anything conclusively with predictive models, incomplete data, and little to no actual evidence is moot and unnecessary. In the meantime, why don’t we grab a beer, let the electoral process run its course, and hope that the Philippines gets the Vice-President it deserves?
And besides, we’re getting a headache from all this math.