The use of anthropometric points to introduce restrictions into the synthesis of a 3D model of the human body using SMPL
https://doi.org/10.17586/2226-1494-2023-23-5-935-945
Abstract
Generating a realistic three-dimensional model of the human body is very time-consuming. Even with the necessary computing resources, generation errors occur for people whose physique differs from the average. This paper proposes an experimental algorithm that reads anthropometric data from only two photographs, one full-face and one in profile. The proposed approach selects anthropometric points and uses them to set the constraints of the SMPL (Skinned Multi-Person Linear) model. Based on empirical studies, a modification of the Fully Convolutional Network (FCN) ResNet101, trained on the COCO Segmentation 2017 dataset, was used to segment the human body. Its output serves as the basis for detecting anthropometric points in the full-face and profile photos. The error in determining anthropometric points ranges from 2 to 5 %, depending on their location. The constraints for the SMPL rendering model are calculated using the Levenberg-Marquardt algorithm. For it to work correctly, a special cost function is proposed that takes the features of this task into account. On the dataset collected by the authors (117 people of different physiques and heights), the proposed method achieves a small mean absolute error (MAE = 0.0395 m) and a high coefficient of determination (R² = 0.913). The graph of anthropometric points sets stricter conditions for generating a figure, and any deviation from the graph is a consequence of a large generation error. The proposed solution generates an accurate model of the human body while keeping the requirements for computing resources and for the quality of users' source photos low.
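As an illustration of the fitting step mentioned in the abstract, the following is a minimal Levenberg-Marquardt sketch in pure Python. It is not the authors' implementation: the `predict` function is a hypothetical two-parameter stand-in for a body model with invented coefficients, and the cost is a plain sum of squared residuals rather than the special cost function proposed in the paper, which is not given here.

```python
def predict(beta):
    """Hypothetical stand-in for a body model: two shape parameters -> three
    measurements in metres. All coefficients are invented for illustration."""
    b0, b1 = beta
    return [
        1.70 + 0.08 * b0 + 0.01 * b1,                   # height
        0.40 + 0.01 * b0 + 0.03 * b1,                   # shoulder width
        0.80 + 0.02 * b0 + 0.10 * b1 + 0.01 * b1 ** 2,  # waist girth
    ]


def solve_2x2(a, b):
    """Solve the 2x2 linear system a @ x = b with Cramer's rule."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return [
        (b[0] * a[1][1] - a[0][1] * b[1]) / det,
        (a[0][0] * b[1] - a[1][0] * b[0]) / det,
    ]


def levenberg_marquardt(residual_fn, beta0, iters=50, lam=1e-3, h=1e-6):
    """Minimise the sum of squared residuals over two parameters."""
    beta = list(beta0)
    r = residual_fn(beta)
    cost = sum(v * v for v in r)
    for _ in range(iters):
        if cost < 1e-20:
            break
        # Forward-difference Jacobian, stored as one column per parameter.
        cols = []
        for j in range(2):
            shifted = list(beta)
            shifted[j] += h
            cols.append([(s - v) / h for s, v in zip(residual_fn(shifted), r)])
        # Damped normal equations:
        # (J^T J + lam * diag(J^T J)) * step = -J^T r
        jtj = [[sum(p * q for p, q in zip(cols[i], cols[j])) for j in range(2)]
               for i in range(2)]
        jtr = [sum(p * v for p, v in zip(cols[i], r)) for i in range(2)]
        damped = [[jtj[i][j] * (1.0 + lam) if i == j else jtj[i][j]
                   for j in range(2)] for i in range(2)]
        step = solve_2x2(damped, [-g for g in jtr])
        trial = [b + s for b, s in zip(beta, step)]
        trial_r = residual_fn(trial)
        trial_cost = sum(v * v for v in trial_r)
        if trial_cost < cost:
            # Step accepted: keep it and trust the local model more.
            beta, r, cost = trial, trial_r, trial_cost
            lam /= 10.0
        else:
            # Step rejected: increase damping (closer to gradient descent).
            lam *= 10.0
    return beta, cost


# Synthetic target measurements generated from known parameters.
true_beta = [1.0, -0.5]
targets = predict(true_beta)


def residuals(beta):
    return [p - t for p, t in zip(predict(beta), targets)]


fitted, cost = levenberg_marquardt(residuals, [0.0, 0.0])
print(fitted, cost)
```

The damping factor `lam` shrinks after an accepted step and grows after a rejected one, which is the mechanism that lets the method move between Gauss-Newton and gradient-descent behaviour. A real SMPL fit would have more parameters and would need a general linear solver instead of the 2x2 helper used here.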
The proposed solution can be used in online fitting rooms. This application makes the task harder, because the figure must be restored from only two pictures and the features of male and female figures must be reproduced accurately.
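For reference, the two quality metrics quoted in the abstract (MAE and R²) can be computed as in the sketch below. The sample values are invented heights in metres used only to exercise the formulas; they are not the authors' data, and the abstract does not state which measured quantities the reported figures cover.

```python
def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between reference and estimated values."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Invented example values (metres), for illustration only.
measured = [1.60, 1.70, 1.80, 1.90]
estimated = [1.62, 1.68, 1.83, 1.89]

mae = mean_absolute_error(measured, estimated)
r2 = r2_score(measured, estimated)
print(f"MAE = {mae:.4f} m, R2 = {r2:.3f}")
```

MAE is expressed in the units of the measurement itself, while R² is dimensionless and compares the residual error with the spread of the reference values.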
About the Authors
A. V. Kugaevskikh
Russian Federation
Alexander V. Kugaevskikh — PhD, Associate Professor
Scopus Author ID: 56442745400
Saint Petersburg, 197101
M. A. Bolshim
Russian Federation
Maksim A. Bolshim — Student
Saint Petersburg, 197101
I. F. Sattarov
Russian Federation
Ildar F. Sattarov — Development Director
Tyumen, 625007
For citations:
Kugaevskikh A.V., Bolshim M.A., Sattarov I.F. The use of anthropometric points to introduce restrictions into the synthesis of a 3D model of the human body using SMPL. Scientific and Technical Journal of Information Technologies, Mechanics and Optics. 2023;23(5):935-945. (In Russ.) https://doi.org/10.17586/2226-1494-2023-23-5-935-945