Wednesday, November 11, 2020

Benford’s Law and the US 2020 Presidential Election Votes

Benford’s law states that in many real-world data sets whose values span several orders of magnitude, the leading digits are not evenly distributed: 1 is the first digit about 30% of the time, and each higher digit appears less often, down to about 5% for 9.
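The expected frequencies follow from the standard formula P(d) = log10(1 + 1/d) for a leading digit d. As a minimal sketch (the function name is made up for this illustration), the probabilities can be computed like this:

```python
import math

def benford_first_digit_probabilities():
    """Return {digit: probability} for leading digits 1..9 under Benford's law."""
    return {d: math.log10(1 + 1 / d) for d in range(1, 10)}

probs = benford_first_digit_probabilities()
for digit, p in probs.items():
    # Print each leading digit with its expected share as a percentage.
    print(f"{digit}: {p:.1%}")
```

The nine probabilities sum to 1, because the terms telescope to log10(10).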

One application of Benford’s law is fraud detection in accounting. There, analysts typically look at the first two digits of each value and plot how often each pair occurs in order to detect anomalies. An anomaly can have different explanations, though.
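A first-two-digits check of this kind could be sketched as follows. The expected share of a digit pair n (10 to 99) is log10(1 + 1/n). The helper names are hypothetical, and the sample data are simply powers of 2, a sequence known to follow Benford's law closely, not real accounting figures.

```python
import math
from collections import Counter

def first_two_digits(n):
    """Leading two digits of an integer, or None if it has fewer than two digits."""
    s = str(abs(int(n)))
    return int(s[:2]) if len(s) >= 2 else None

def expected_share(pair):
    """Benford's expected share for a leading digit pair 10..99."""
    return math.log10(1 + 1 / pair)

def observed_shares(values):
    """Observed share of each leading digit pair among values with 2+ digits."""
    pairs = [p for p in map(first_two_digits, values) if p is not None]
    if not pairs:
        raise ValueError("need at least one value with two or more digits")
    counts = Counter(pairs)
    return {pair: counts.get(pair, 0) / len(pairs) for pair in range(10, 100)}

# Illustrative data only: powers of 2, not real accounting figures.
sample = [2 ** k for k in range(4, 304)]
observed = observed_shares(sample)

# Compare observed and expected shares for the first few digit pairs.
for pair in range(10, 15):
    print(pair, f"observed={observed[pair]:.3f}", f"expected={expected_share(pair):.3f}")
```

Pairs whose observed share differs strongly from the expected one are the anomalies an analyst would then investigate further.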

For example, in the 2020 US presidential election, the digits 1 and 2 appear as the first digit of the vote counts for Mr. Biden less often than expected, while for Mr. Trump they appear slightly more often than expected.


In a video on the topic, Matt Parker analyzes the situation and shows that the more densely populated areas in the US, where a majority of Mr. Biden's votes come from, have precincts of roughly the same size. The condition that the data span several orders of magnitude is therefore not fulfilled, so the distribution of first digits deviates from the prediction of Benford’s law.
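The effect of the missing spread can be illustrated with a small simulation. This is only a sketch with made-up numbers, not real election data: one sample is spread evenly on a logarithmic scale over five orders of magnitude, the other is confined to a narrow band (400 to 900, an arbitrary assumption standing in for similarly sized precincts).

```python
import random
from collections import Counter

def first_digit_shares(values):
    """Share of each leading digit 1..9 among positive integers."""
    counts = Counter(int(str(v)[0]) for v in values)
    return {d: counts.get(d, 0) / len(values) for d in range(1, 10)}

rng = random.Random(2020)  # fixed seed so the run is repeatable

# Hypothetical counts spanning several orders of magnitude (log-uniform, 1 to 10^5).
wide = [int(10 ** rng.uniform(0, 5)) for _ in range(20000)]
# Hypothetical counts from units of similar size (all between 400 and 900).
narrow = [rng.randint(400, 900) for _ in range(20000)]

wide_shares = first_digit_shares(wide)
narrow_shares = first_digit_shares(narrow)

for d in range(1, 10):
    print(d, f"wide={wide_shares[d]:.3f}", f"narrow={narrow_shares[d]:.3f}")
```

In the wide sample the share of leading 1s lands close to Benford's 30%, while in the narrow sample the digits 1, 2 and 3 cannot occur as a leading digit at all, without any manipulation of the numbers.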

When looking at the frequency of the last digits, there is an anomaly in the vote counts for Mr. Trump. Instead of all last digits occurring roughly equally often, the lower digits occur much more frequently than the higher ones. This is because a majority of the votes for Mr. Trump come from smaller precincts, which favors the smaller numbers.
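One way to see why small counts skew the last digit is a toy simulation. The distributions below are assumptions chosen purely for illustration (small counts drawn from an exponential distribution with mean 8, large counts uniform between 1000 and 9999); they are not fitted to any election data.

```python
import random
from collections import Counter

def last_digit_shares(values):
    """Share of each last digit 0..9 among non-negative integers."""
    counts = Counter(v % 10 for v in values)
    return {d: counts.get(d, 0) / len(values) for d in range(10)}

rng = random.Random(2020)  # fixed seed so the run is repeatable

# Hypothetical small counts: mostly one or two digits, with smaller values more likely.
small = [int(rng.expovariate(1 / 8)) for _ in range(20000)]
# Hypothetical large counts: four digits, spread evenly.
large = [rng.randint(1000, 9999) for _ in range(20000)]

small_shares = last_digit_shares(small)
large_shares = last_digit_shares(large)

for d in range(10):
    print(d, f"small={small_shares[d]:.3f}", f"large={large_shares[d]:.3f}")
```

When the counts are small and smaller values are more likely than larger ones, the last digit is nearly the whole number, so low last digits dominate. For large counts the last digits even out to about 10% each.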

 

Thus, the deviation of vote counts (from precincts of a standardized size) from Benford’s law is not an indication of voter fraud but rather a phenomenon to be expected.

Further reading:
Deckert, J., Myagkov, M., & Ordeshook, P. (2011). Benford's Law and the Detection of Election Fraud. Political Analysis, 19(3), 245-268. doi:10.1093/pan/mpr014