Preview

Scientific and Technical Journal of Information Technologies, Mechanics and Optics

Advanced search

Multilevel splitting for rare events estimation in permutation tests

https://doi.org/10.17586/2226-1494-2024-24-4-654-660

Abstract

Permutation tests are widely employed in statistical analysis, especially when the assumptions of parametric tests are violated, or the data distribution is unknown. However, classical permutation tests encounter challenges when attempting to estimate the probabilities of rare events with high relative accuracy, leading to difficulties in applying corrections for multiple hypothesis testing. In this study, we propose an original method for estimating arbitrarily small P-values in permutation tests, which is based on multilevel splitting for Monte Carlo method. The proposed method involves splitting the original permutation space into non-overlapping levels based on the statistic values. This approach allows the problem of estimating the original probability of a rare event to be reduced to estimating ordinary conditional probabilities for each level. Utilizing such an approach enables efficient estimation of the desired P-values while maintaining a balance between computation time and the level of relative error. The efficacy of the method is demonstrated in its application to the task of estimating arbitrary P-values in the two-sample Kolmogorov-Smirnov test. Comparing the method results with true P-values has shown practical convergence of the method. Moreover, examples of the superiority of the proposed method over alternative asymptotic approaches have been provided. Thus, the proposed method shows significant potential for application across a broad spectrum of scientific fields, such as systems biology, immunology, and others. Furthermore, the method can be adapted for use in various statistical analysis scenarios that require handling probabilities of rare events in permutation tests.

About the Authors

V. D. Sukhov
Washington University in St. Louis
United States

Vladimir D. Sukhov — Researcher

Saint Louis, 63110



G. V. Korotkevich
ITMO University
Russian Federation

Gennady V. Korotkevich — Assistant

Saint Petersburg, 197101



A. A. Sergushichev
Washington University in St. Louis; ITMO University
United States

Alexey A. Sergushichev — PhD, Assistant Professor, Professor

Saint Louis, 63110

Saint Petersburg, 197101



References

1. Good P. Permutation Tests: A Practical Guide to Resampling Methods for Testing Hypotheses. Springer Science & Business Media, 2013.

2. Pesarin F., Salmaso L. Permutation Tests for Complex Data: Theory, Applications and Software. John Wiley & Sons, 2010, 448 p.

3. Hammersley J. Monte Carlo Methods. Springer Science & Business Media, 2013, 178 p.

4. Kalos M.H., Whitlock P.A. Monte Carlo Methods. John Wiley & Sons, 2009, 215 p.

5. Trendelkamp-Schroer B., Noé F. Efficient estimation of rare-event kinetics. Physical Review X, 2016, vol. 6, no. 1, pp. 011009. https://doi.org/10.1103/physrevx.6.011009

6. Lestang T., Ragone F., Bréhier C.-E., Herbert C., Bouchet F. Computing return times or return periods with rare event algorithms. Journal of Statistical Mechanics: Theory and Experiment, 2018, vol. 2018, no. 4, pp. 043213. https://doi.org/10.1088/1742-5468/aab856

7. Caron V., Guyader A., Zuniga M.M., Tuffin B. Some recent results in rare event estimation. ESAIM: Proceedings, 2014, vol. 44, pp. 239– 259. https://doi.org/10.1051/proc/201444015

8. L’Ecuyer P., Demers V., Tuffin B. Splitting for rare-event simulation. Proc. of the 2006 Winter Simulation Conference, 2006, pp. 137–148. https://doi.org/10.1109/wsc.2006.323046

9. Glasserman P., Heidelberger P., Shahabuddin P., Zajic T. Multilevel splitting for estimating rare event probabilities. Operations Research, 1999, vol. 47, no. 4, pp. 585–600. https://doi.org/10.1287/opre.47.4.585

10. Botev Z.I., Kroese D.P. An efficient algorithm for rare-event probability estimation, combinatorial optimization, and counting. Methodology and Computing in Applied Probability, 2008, vol. 10, no. 4, pp. 471–505. https://doi.org/10.1007/s11009-008-9073-7

11. Metropolis N., Rosenbluth A.W., Rosenbluth M.N., Teller A.H., Teller E. Equation of state calculations by fast computing machines. The Journal of Chemical Physics, 1953, vol. 21, no. 6, pp. 1087– 1092. https://doi.org/10.1063/1.1699114

12. Hastings W.K. Monte Carlo sampling methods using Markov chains and their applications. Biometrika, 1970, vol. 57, no. 1, pp. 97–109. https://doi.org/10.2307/2334940

13. Kolmogorov A. Sulla determinazione empirica di una legge di distribuzione. Giornale dell’Istituto Italiano degli Attuari, 1933, vol. 4, pp. 83–91.

14. Smirnoff N. Sur les Écarts de la courbe de distribution empirique. Matematicheskii Sbornik, 1939, vol. 48, no. 1, pp. 3–26.

15. Virtanen P., Gommers R., Oliphant T.E., Haberland M., Reddy T., Cournapeau D., Burovski E., Peterson P., Weckesser W., Bright J. et. al. SciPy 1.0: fundamental algorithms for scientific computing in Python. Nature Methods, 2020, vol. 17, no. 3, pp. 261–272. https://doi.org/10.1038/s41592-019-0686-2


Review

For citations:


Sukhov V.D., Korotkevich G.V., Sergushichev A.A. Multilevel splitting for rare events estimation in permutation tests. Scientific and Technical Journal of Information Technologies, Mechanics and Optics. 2024;24(4):654-660. (In Russ.) https://doi.org/10.17586/2226-1494-2024-24-4-654-660

Views: 12


Creative Commons License
This work is licensed under a Creative Commons Attribution 4.0 License.


ISSN 2226-1494 (Print)
ISSN 2500-0373 (Online)