Research problem: “The 60-30-10 phenomenon in senatorial elections: Is this simply an artifact of the law of large numbers?”


Dr. Felix Muga III showed that the total votes of Team PNOY (12 candidates), UNA (9 candidates), and the other candidates (12) follow the 60-30-10 pattern at all canvass times. COMELEC explains this phenomenon as simply the result of the law of large numbers. Our aim is to test COMELEC's claim by changing which candidates are placed in each group of the 12-9-12 grouping and checking whether a similar constant ratio a:b:c still holds at every canvass.


A. Listing the Combinations

You have three bins: _ _ _. The first bin holds 12 candidates, the second bin holds 9 candidates, and the last bin holds 12 candidates. The number of ways to fill the first bin, ignoring the order of candidates within the bin, is C_1 = 33!/((33-12)!12!) = 33!/(21!12!). Once the first bin is filled, the number of ways to fill the second bin from the remaining 21 candidates, again ignoring order, is C_2 = 21!/((21-9)!9!) = 21!/(12!9!). Once the second bin is filled, the remaining 12 candidates must all go to the third bin, so C_3 = 12!/((12-12)!12!) = 12!/(0!12!) = 1. Thus, the total number of ways to place 33 senatorial candidates in the 12-9-12 bins is

C = C_1 C_2 C_3 = [33!/(21!12!)][21!/(12!9!)][1] = 33!21!/(21!12!12!9!) = 33!/(12!12!9!) ≈ 1.0429×10^14.
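The count can be checked numerically. The following is a minimal Python sketch using only the standard library; the variable names (c1, c2, c3, total) are mine and simply mirror C_1, C_2, C_3, and C.

```python
from math import comb, factorial

c1 = comb(33, 12)   # ways to fill the 12-candidate first bin
c2 = comb(21, 9)    # ways to fill the 9-candidate second bin from the remaining 21
c3 = comb(12, 12)   # the last 12 candidates are forced into the third bin

total = c1 * c2 * c3

# Closed form 33!/(12! 12! 9!) for comparison
closed_form = factorial(33) // (factorial(12) * factorial(12) * factorial(9))

print(total)                 # 104291454867600, about 1.0429e14
print(total == closed_form)  # True
```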

This listing can no longer be done by hand; it requires a computer.

B. Computing the Ratios

For each combination of candidates in the 12-9-12 bins, and for each canvass, compute the total number of votes B_1 in bin 1, the total number of votes B_2 in bin 2, and the total number of votes B_3 in bin 3. Define the vote vector

V = [B_1, B_2, B_3]/(B_1 + B_2 + B_3) = (b_1, b_2, b_3),

where b_1, b_2, and b_3 each lie between 0 and 1. Only b_1 and b_2 are independent, because b_3 = 1 - b_1 - b_2. Plot b_1 on the x axis and b_2 on the y axis, and trace the path of the point (b_1, b_2) as a function of the integer canvass time t. If the ratio b_1:b_2:b_3 is fairly constant, the points will cluster into a fuzzy ball of small radius. Measure the radius of the smallest ball that contains all the points. Alternatively, compute the root-mean-square (RMS) distance of the points from their centroid and use this RMS value as the radius.
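The computation for a single grouping might look like the following Python sketch. It is only an illustration under assumptions: the data layout (one list of cumulative vote counts per canvass), the function names share_path and centroid_and_rms, and the toy numbers for 5 candidates are all hypothetical and are not real canvass data. It computes the RMS radius; the smallest enclosing ball would need a separate algorithm.

```python
import math

def share_path(votes, bin1, bin2):
    """Return the (b_1, b_2) point for each canvass.

    votes: list of canvasses; each canvass is a list of cumulative vote
           counts, one entry per candidate (assumed layout).
    bin1, bin2: candidate indices placed in bins 1 and 2; all remaining
           candidates fall in bin 3.
    """
    path = []
    for canvass in votes:
        total = sum(canvass)                    # B_1 + B_2 + B_3
        B1 = sum(canvass[i] for i in bin1)
        B2 = sum(canvass[i] for i in bin2)
        path.append((B1 / total, B2 / total))   # (b_1, b_2)
    return path

def centroid_and_rms(path):
    """Centroid of the points and their RMS distance from it."""
    n = len(path)
    cx = sum(x for x, _ in path) / n
    cy = sum(y for _, y in path) / n
    rms = math.sqrt(sum((x - cx) ** 2 + (y - cy) ** 2 for x, y in path) / n)
    return (cx, cy), rms

# Toy input: 3 canvasses, 5 candidates (made-up numbers for illustration).
toy_votes = [
    [10, 20, 30, 25, 15],
    [22, 39, 61, 49, 29],
    [30, 60, 90, 75, 45],
]
path = share_path(toy_votes, bin1=[0, 1], bin2=[2, 3])
centroid, radius = centroid_and_rms(path)
print(centroid, radius)
```

With real data, votes would have 33 entries per canvass, bin1 would hold 12 indices, and bin2 would hold 9.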

C. The Bubble Chart

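We now have a table with columns (combination in bin 1, combination in bin 2, b_01, b_02, R_0), where (b_01, b_02) is the centroid, that is, the average over all canvasses of the percentage values of bins 1 and 2, and R_0 is the radius of the ball computed in Section B. We plot (b_01, b_02, R_0) in a bubble chart, with the bubble centered at (b_01, b_02) and sized by R_0.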

D. Clustering

We cluster the bubbles according to bubble radius. As the unit of measurement, we use the 2D standard deviation (or RMS value) of the percentages for the actual Team PNOY-UNA-Others grouping, so that this grouping has a normalized bubble radius of 1. We classify the bubbles according to size and make a histogram. We compute the probability that a normalized bubble radius is between 0 and 1, between 1 and 2, between 2 and 3, and so on. If the probability for the normalized bubble radius peaks at 1, then we have reason to believe that what COMELEC says is true: it is just the law of large numbers. But if the peak is elsewhere and far from 1, then we have reason to doubt COMELEC’s statement.
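A rough Python sketch of this step follows. Because about 10^14 groupings cannot be enumerated in practice, the sketch swaps full enumeration for random sampling of groupings; that substitution is my assumption, not part of the procedure described above. The vote table is synthetic placeholder data produced by a random number generator, the reference grouping is arbitrarily taken to be candidates 0-11 and 12-20, and all function names are hypothetical.

```python
import math
import random
from collections import Counter

def rms_radius(votes, bin1, bin2):
    """RMS distance of the (b_1, b_2) points from their centroid."""
    pts = []
    for canvass in votes:
        total = sum(canvass)
        pts.append((sum(canvass[i] for i in bin1) / total,
                    sum(canvass[i] for i in bin2) / total))
    cx = sum(x for x, _ in pts) / len(pts)
    cy = sum(y for _, y in pts) / len(pts)
    return math.sqrt(sum((x - cx) ** 2 + (y - cy) ** 2 for x, y in pts) / len(pts))

def radius_histogram(votes, ref_bin1, ref_bin2, n_samples, rng):
    """Probability that the normalized radius falls in [k, k+1), per k."""
    n = len(votes[0])
    r_ref = rms_radius(votes, ref_bin1, ref_bin2)   # unit of measurement
    if r_ref == 0:
        raise ValueError("reference radius is zero; cannot normalize")
    n1, n2 = len(ref_bin1), len(ref_bin2)
    counts = Counter()
    for _ in range(n_samples):
        order = rng.sample(range(n), n)             # random permutation
        bin1, bin2 = order[:n1], order[n1:n1 + n2]  # rest go to bin 3
        r = rms_radius(votes, bin1, bin2) / r_ref   # normalized radius
        counts[int(r)] += 1                         # bucket k: k <= r < k+1
    return {k: c / n_samples for k, c in sorted(counts.items())}

# Synthetic placeholder data: 33 candidates, 10 cumulative canvasses.
rng = random.Random(0)
popularity = [rng.uniform(0.5, 1.5) for _ in range(33)]
votes, running = [], [0.0] * 33
for _ in range(10):
    running = [v + p * rng.uniform(900, 1100) for v, p in zip(running, popularity)]
    votes.append(running)

hist = radius_histogram(votes, list(range(12)), list(range(12, 21)), 2000, rng)
for k, prob in hist.items():
    print(f"{k} <= r < {k + 1}: {prob:.3f}")
```

The printed probabilities here say nothing about the real election; they only show the shape of the output once actual canvass data replaces the synthetic table.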


I do not have the data for each canvass. What is needed is simply the number of votes counted for each candidate at each canvass.


I need the help of a programmer.