# Response to Alexander Shen's note

#### Abstract

Response to Alexander Shen's note "On the likelihood for finite mixture models and Kirill Kalinin’s paper “Validation of the Finite Mixture Model Using Quasi-Experimental Data and Geography”".

Alexander Shen writes, "the expression being maximized, considered as a function of $$f_0, f_i, f_e$$, is a linear function on the triangle $$f_0 + f_i + f_e = 1,\ f_0, f_i, f_e \geq 0$$." The expression being maximized (from [5]) is not a function of $$f_0$$, $$f_i$$ and $$f_e$$. These "probabilities" are functions of the likelihood and so depend on all the other parameter estimates. For example, when both "incremental" and "extreme" frauds are included, the R code that implements the method, which follows [2, 1, 6, 4], iteratively evaluates the following until stable, maximizing values for the likelihood are found:

$$F = (1-f_{\mathrm{i}}-f_{\mathrm{e}})F_0 + f_{\mathrm{i}}F_I + f_{\mathrm{e}}F_E$$

$$h_0 = (1-f_{\mathrm{i}}-f_{\mathrm{e}})F_0/F$$

$$h_I = f_{\mathrm{i}}F_I/F$$

$$h_E = f_{\mathrm{e}}F_E/F$$

$$f_{\mathrm{i}} = \text{mean}(h_I)$$

$$f_{\mathrm{e}} = \text{mean}(h_E)$$

where $$F_0$$, $$F_I$$ and $$F_E$$ are vectors of length $$n$$ (the number of observations) that have the observation-specific likelihoods as elements. $$h_0$$, $$h_I$$ and $$h_E$$ are also vectors of length $$n$$, and $$F_0/F$$, $$F_I/F$$ and $$F_E/F$$ are evaluated elementwise. The likelihood value actually maximized is $$\sum_{i=1}^n(\log(h_0F_0 + h_IF_I + h_EF_E))$$ where $$h_0F_0$$, $$h_IF_I$$ and $$h_EF_E$$ are elementwise products. Shen's "triangle" argument does not apply.
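As a minimal illustration of the iteration above, the following Python sketch evaluates the six displayed equations repeatedly until $$f_{\mathrm{i}}$$ and $$f_{\mathrm{e}}$$ stop changing. It is not the R code referred to in the text. The function names, the starting values, the tolerance and the numbers in `F0`, `FI` and `FE` are all assumptions invented for the example, and the sketch holds the observation-specific likelihoods fixed, which is a simplification: in the method they depend on the other parameter estimates. The sketch shows only that the updated $$f_{\mathrm{i}}$$ and $$f_{\mathrm{e}}$$ are computed from $$F_0$$, $$F_I$$ and $$F_E$$.

```python
import math


def em_mixing_step(F0, FI, FE, f_i, f_e):
    """Evaluate the six displayed equations once, with F0, FI, FE held fixed."""
    n = len(F0)
    f_0 = 1.0 - f_i - f_e
    # F = (1 - f_i - f_e) F_0 + f_i F_I + f_e F_E, elementwise
    F = [f_0 * a + f_i * b + f_e * c for a, b, c in zip(F0, FI, FE)]
    # h_0, h_I, h_E: elementwise ratios, each a vector of length n
    h_0 = [f_0 * a / m for a, m in zip(F0, F)]
    h_I = [f_i * b / m for b, m in zip(FI, F)]
    h_E = [f_e * c / m for c, m in zip(FE, F)]
    # f_i = mean(h_I), f_e = mean(h_E)
    return sum(h_I) / n, sum(h_E) / n, (h_0, h_I, h_E)


def iterate_until_stable(F0, FI, FE, f_i=0.1, f_e=0.1, tol=1e-12, max_iter=10000):
    """Repeat the step until f_i and f_e change by less than tol (assumed rule)."""
    h = None
    for _ in range(max_iter):
        new_i, new_e, h = em_mixing_step(F0, FI, FE, f_i, f_e)
        change = abs(new_i - f_i) + abs(new_e - f_e)
        f_i, f_e = new_i, new_e
        if change < tol:
            break
    return f_i, f_e, h


def objective_as_stated(F0, FI, FE, h):
    """Sum over observations of log(h_0 F_0 + h_I F_I + h_E F_E), as in the text."""
    h_0, h_I, h_E = h
    return sum(
        math.log(p * a + q * b + r * c)
        for p, q, r, a, b, c in zip(h_0, h_I, h_E, F0, FI, FE)
    )


# Invented observation-specific likelihood values (illustration only).
F0 = [0.90, 0.80, 0.70, 0.20, 0.10, 0.05]
FI = [0.10, 0.20, 0.30, 0.90, 0.80, 0.10]
FE = [0.05, 0.05, 0.10, 0.10, 0.20, 0.90]

f_i, f_e, h = iterate_until_stable(F0, FI, FE)
print(f_i, f_e, objective_as_stated(F0, FI, FE, h))
```

Changing any entry of `F0`, `FI` or `FE` changes the values of $$f_{\mathrm{i}}$$ and $$f_{\mathrm{e}}$$ at which the iteration settles.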

Results from the model of [5] differ from results produced by the algorithm of [3] in part because [3] describes a Monte Carlo simulation method, not a statistical estimation method based on any kind of likelihood or probability specification.

#### References

1. Dempster A.P., Laird N.M., Rubin D.B. Maximum likelihood from incomplete data via the EM algorithm. – Journal of the Royal Statistical Society. Series B (Methodological). 1977. V. 39. No. 1. P. 1–38.
2. Hasselblad V. Estimation of finite mixtures of distributions from the exponential family. – Journal of the American Statistical Association. 1969. V. 64. No. 328. P. 1459–1471.
3. Klimek P., Yegorov Y., Hanel R., Thurner S. Statistical detection of systematic election irregularities. – Proceedings of the National Academy of Sciences. 2012. V. 109. No. 41. P. 16469–16473.
4. McLachlan G., Peel D. Finite Mixture Models. New York: Wiley, 2000.
5. Mebane W.R., Jr. Election forensics: Frauds tests and observation-level frauds probabilities. – Paper presented at the 2015 Annual Meeting of the Midwest Political Science Association, Chicago, April 7–10, 2016.
6. Wu C.F.J. On the convergence properties of the EM algorithm. – Annals of Statistics. 1983. V. 11. No. 1. P. 95–103.