Scientific and Technical Journal of Information Technologies, Mechanics and Optics

Compound quality model for recommender system evaluation

https://doi.org/10.17586/2226-1494-2025-25-6-1117-1124

Abstract

The study examines approaches to quantifying various effects in recommender systems, such as Position Bias, Popularity Bias, and others. A new quality model for recommendation algorithms is proposed that reduces the selected metrics to a single unit of measurement and determines the impact of each effect on the system. The resulting scores allow a deeper comparative analysis of different algorithms as well as investigation of an algorithm's behavior across user segments. Within the model, two conditional marginal distribution densities are built for each metric: one based on relevant recommendations and one based on irrelevant recommendations. By comparing these densities, the set of possible metric values is divided into a normal and a critical region. The model evaluates the impact of each effect on the system by the frequency with which the values of the corresponding metric fall into its critical region. To demonstrate how the model works, four recommendation algorithms were analyzed on the MovieLens-100K academic dataset. The testing assessed Popularity Bias, the lack of novelty in recommendations, and the tendency of algorithms to recommend items solely on the basis of user demographic data. For each effect, an assessment of its impact on the system is constructed, and an example is given of predicting an upper bound on system quality if the corresponding effect were eliminated. The study showed that the distributions of absolute values of metrics for effects such as Popularity Bias or Position Bias can vary from system to system. The proposed quality model offers a more reliable way to compare different recommendation algorithms. The model is suitable for evaluating personal recommendations regardless of the application domain and the algorithm used to build them.
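The density-comparison procedure outlined in the abstract can be illustrated with a short, self-contained sketch. The snippet below is not the authors' implementation: the function name, the KDE-based density estimates, and the synthetic popularity data are assumptions made purely to show how a per-metric critical region and the frequency of hitting it could be computed.

import numpy as np
from scipy.stats import gaussian_kde

def critical_region_share(values_relevant, values_irrelevant, values_observed, grid_size=512):
    # Illustrative sketch (hypothetical helper, not from the paper):
    # estimate two conditional marginal densities of a metric, mark as
    # "critical" the metric values where the density built on irrelevant
    # recommendations dominates, and return the share of observed values
    # that fall into that critical region.
    f_rel = gaussian_kde(values_relevant)      # density given relevant recommendations
    f_irr = gaussian_kde(values_irrelevant)    # density given irrelevant recommendations

    lo = min(np.min(values_relevant), np.min(values_irrelevant))
    hi = max(np.max(values_relevant), np.max(values_irrelevant))
    grid = np.linspace(lo, hi, grid_size)
    critical = f_irr(grid) > f_rel(grid)       # critical part of the metric's value set

    idx = np.clip(np.searchsorted(grid, values_observed), 0, grid_size - 1)
    return float(np.mean(critical[idx]))

# Toy usage with synthetic item-popularity values split by relevance feedback.
rng = np.random.default_rng(0)
rel = rng.beta(2, 5, 1000)
irr = rng.beta(5, 2, 1000)
print(critical_region_share(rel, irr, np.concatenate([rel, irr])))

The returned share lies in [0, 1]; under these assumptions, a larger value indicates that the corresponding effect (for example, Popularity Bias) more often pushes recommendations into the critical region of the metric.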

About the Authors

A. M. Tsyplov
ITMO University
Russian Federation

Aleksei M. Tsyplov, PhD Student

197101; Saint Petersburg



A. V. Boukhanovsky
ITMO University
Russian Federation

Alexander V. Boukhanovsky, D.Sc., Professor, Head of the School of Translational Information Technologies

197101; Saint Petersburg

Scopus Author ID: 6603474810




For citations:


Tsyplov A.M., Boukhanovsky A.V. Compound quality model for recommender system evaluation. Scientific and Technical Journal of Information Technologies, Mechanics and Optics. 2025;25(6):1117-1124. (In Russ.) https://doi.org/10.17586/2226-1494-2025-25-6-1117-1124


This work is licensed under a Creative Commons Attribution 4.0 License.


ISSN 2226-1494 (Print)
ISSN 2500-0373 (Online)