Predicting the results of the 16-factor R. Cattell test based on the analysis of text posts of social network users
https://doi.org/10.17586/2226-1494-2023-23-2-279-288
Abstract
We investigated whether the traits measured by R. Cattell's 16-factor personality test can be predicted automatically from the text posts of social media users. The proposed method for automating the evaluation of these traits combines language models and neural networks and proceeds in several steps. In the first step, text posts are extracted from users' social media accounts and processed with the RuBERT language model and a previously trained fully connected neural network. The result of this step is, for each user, a normalized empirical distribution of posts over previously introduced classes. In the next step, the expression of the user's psychological traits is evaluated from this distribution using a support vector machine, a random forest, and a naive Bayes classifier. The final data set used to build the models and test their performance comprised 183 respondents who took the R. Cattell test, together with links to their public social media accounts. Classifiers were constructed that predict the results for six factors (A, B, F, I, N, Q1) of R. Cattell's 16-factor personality test. The results can be used to create a prototype of an automated system for predicting the expression of psychological traits of social media users. They are useful in applied and research systems related to marketing, psychology, and sociology, as well as in protecting users from social engineering attacks.
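To make the intermediate representation concrete, the following minimal Python sketch shows how a normalized empirical distribution of posts over classes could be computed for one user. It is an illustration under stated assumptions, not the authors' implementation: the class names in `POST_CLASSES` and the function `post_distribution` are hypothetical, and each post is assumed to carry exactly one class label.

```python
from collections import Counter

# Hypothetical post classes; the abstract does not list the actual ones.
POST_CLASSES = ["class_a", "class_b", "class_c"]


def post_distribution(post_labels, classes=POST_CLASSES):
    """Return the share of a user's posts that falls into each class.

    Assumes one class label per post (an assumption of this sketch).
    """
    counts = Counter(post_labels)
    total = sum(counts[c] for c in classes)
    if total == 0:
        # No classified posts: return zeros instead of dividing by zero.
        return [0.0 for _ in classes]
    return [counts[c] / total for c in classes]


# Labels as they might come out of the post classifier for one user.
user_posts = ["class_a", "class_c", "class_a", "class_b"]
features = post_distribution(user_posts)
print(features)
```

A vector of this kind, one per user, is the sort of input that a support vector machine, random forest, or naive Bayes classifier would then receive to predict each factor.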
About the Authors
V. D. Oliseenko
Russian Federation
Valerii D. Oliseenko — Junior Researcher
Saint Petersburg, 199178
sc 57219554703
M. V. Abramov
Russian Federation
Maxim V. Abramov — PhD, Head of Laboratory, Senior Researcher
Saint Petersburg, 199178
sc 56938320500
For citations:
Oliseenko V.D., Abramov M.V. Predicting the results of the 16-factor R. Cattell test based on the analysis of text posts of social network users. Scientific and Technical Journal of Information Technologies, Mechanics and Optics. 2023;23(2):279-288. (In Russ.) https://doi.org/10.17586/2226-1494-2023-23-2-279-288