
Scientific and Technical Journal of Information Technologies, Mechanics and Optics


Font generation based on style and character structure analysis using diffusion models

https://doi.org/10.17586/2226-1494-2025-25-4-676-683

Abstract

   The article examines the role of generative neural networks in the development and optimization of fonts, which play a key part in creating aesthetically appealing and functional designs. Particular attention is given to licensing restrictions and the limited availability of fonts for many of the world's languages, which complicates the work of designers and typographers preparing text materials. The novelty of the approach lies in using a diffusion model as a generative neural network for automatic font creation, including the generation of glyphs missing from fonts that do not support a given language. To this end, a diffusion model was developed that generates fonts by analyzing patterns in the structure of characters and the logic of their construction. The model is integrated into an application that automates the creation of font layouts, allowing users to generate new glyphs and fonts tailored to specific language needs. The technique comprises preliminary data preparation, network training, and subsequent generation of characters that mimic the style and composition of the original fonts. In the experiments, the diffusion model demonstrated a high ability to generate quality font characters visually similar to the original samples. Fonts with a limited character set served as source data, which made it possible to evaluate the model's capacity to create missing glyphs for various languages. The results show that the developed model successfully reproduces the stylistic features of the original font, confirming its potential for font solutions intended for global use. The proposed font generation method is of interest to specialists working in design, typography, and the preparation of text materials for diverse language audiences.
The results can be useful for creating fonts intended for multilingual projects that require otherwise missing characters.
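The generation pipeline described above rests on the standard denoising-diffusion formulation (see references [13, 16, 17]). As a minimal sketch, assuming a conventional DDPM-style forward process with a linear noise schedule (the paper's exact schedule and hyperparameters are not given here, so `num_steps`, the beta range, and the glyph size are illustrative assumptions), the progressive noising of a glyph bitmap during training could look like this:

```python
import numpy as np

# Illustrative linear noise schedule (assumed, not the paper's exact values).
num_steps = 1000
betas = np.linspace(1e-4, 0.02, num_steps)
alphas_cumprod = np.cumprod(1.0 - betas)  # cumulative product \bar{alpha}_t

def forward_diffuse(glyph, t, rng):
    """Sample x_t ~ q(x_t | x_0): mix the clean glyph with Gaussian noise
    according to the closed-form DDPM forward process."""
    noise = rng.standard_normal(glyph.shape)
    a_bar = alphas_cumprod[t]
    return np.sqrt(a_bar) * glyph + np.sqrt(1.0 - a_bar) * noise

rng = np.random.default_rng(0)
# Stand-in 64x64 glyph image with pixel values scaled to [-1, 1].
glyph = rng.uniform(-1.0, 1.0, size=(64, 64))
x_t = forward_diffuse(glyph, t=500, rng=rng)
```

During training, a denoising network (typically a U-Net, as in reference [1]) is optimized to predict the added noise from `x_t` and `t`; generation then runs the learned reverse process from pure noise, conditioned on the desired character and font style.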

About the Authors

M. I. Maslov
LLC “Nanosoft Razrabotka”; ITMO University
Russian Federation

Maksim I. Maslov, Software Developer, Student

108811, Moscow; 197101, Saint Petersburg



A. E. Avdyushina
ITMO University; JSC “Research and development center”
Russian Federation

Anna E. Avdyushina, Analyst, Assistant

197101, Saint Petersburg; 101000, Moscow

Scopus Author ID: 57221719751



M. A. Solodkaya
ITMO University
Russian Federation

Maria A. Solodkaya, Assistant

197101, Saint Petersburg



A. V. Kugaevskikh
ITMO University
Russian Federation

Alexander V. Kugaevskikh, PhD, Associate Professor

197101, Saint Petersburg

Scopus Author ID: 56442745400



References

1. Ronneberger O., Fischer P., Brox T. U-net: Convolutional networks for biomedical image segmentation. Lecture Notes in Computer Science, 2015, vol. 9352, pp. 234–241. doi: 10.1007/978-3-319-24574-4_28

2. Wang Y., Lian Z. DeepVecFont: synthesizing high-quality vector fonts via dual-modality learning. ACM Transactions on Graphics (TOG), 2021, vol. 40, no. 6, pp. 1–15. doi: 10.1145/3478513.3480488

3. Wang Y., Wang Y., Yu L., Zhu Y., Lian Z. DeepVecFont-v2: Exploiting Transformers to Synthesize Vector Fonts with Higher Quality. Proc. of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2023. pp. 18320–18328. doi: 10.1109/CVPR52729.2023.01757

4. Yang Z., Peng D., Kong Y., Zhang Y., Yao C., Jin L. FontDiffuser: One-shot font generation via denoising diffusion with multi-scale content aggregation and style contrastive learning. Proc. of the AAAI Conference on Artificial Intelligence, 2024, vol. 38, no. 7, pp. 6603–6611. doi: 10.1609/aaai.v38i7.28482

5. Huang Q., Fu B., Zhang A., Qiao Y. GenText: Unsupervised artistic text generation via decoupled font and texture manipulation. arXiv, 2022, arXiv:2207.09649. doi: 10.48550/arXiv.2207.09649

6. Zeng J., Chen Q., Liu Y., Wang M., Yao Y. StrokeGAN: Reducing mode collapse in Chinese font generation via stroke encoding. arXiv, 2020, arXiv:2012.08687. doi: 10.48550/arXiv.2012.08687

7. Park S., Chun S., Cha J., Lee B., Shim H. Few-shot font generation with localized style representations and factorization. Proc. of the AAAI Conference on Artificial Intelligence, 2021, vol. 35, no. 3, pp. 2393–2402. doi: 10.1609/aaai.v35i3.16340

8. Yao M., Zhang Y., Lin X., Li X., Zuo W. VQ-Font: Few-shot font generation with structure-aware enhancement and quantization. Proc. of the AAAI Conference on Artificial Intelligence, 2024, vol. 38, no. 15, pp. 16407–16415. doi: 10.1609/aaai.v38i15.29577

9. Ding M. An edge-directed diffusion equation-based image restoration approach for font generation. IEEE Access, 2023, vol. 11, pp. 141435–141444. doi: 10.1109/ACCESS.2023.3342026

10. Jeong J., Shin J. Multi-scale diffusion denoised smoothing. Proc. of the 37th International Conference on Neural Information Processing Systems, 2023, pp. 67374–67397.

11. Vaswani A., Shazeer N., Parmar N., Uszkoreit J., Jones L., Gomez A.N., Kaiser Ł., Polosukhin I. Attention is all you need. Advances in Neural Information Processing Systems, 2017, vol. 30, pp. 1–11.

12. Voronov G., Lightheart R., Davison J., Krettler C.A., Healey D., Butler T. Multi-scale sinusoidal embeddings enable learning on high resolution mass spectrometry data. arXiv, 2022, arXiv:2207.02980. doi: 10.48550/arXiv.2207.02980

13. Dhariwal P., Nichol A. Diffusion models beat GANs on image synthesis. Advances in Neural Information Processing Systems, 2021, vol. 34, pp. 8780–8794.

14. Convolutional Layer – Building Block of CNNs. Towards Data Science, 2024. Available at: https://towardsdatascience.com/convolutional-layer-building-block-of-cnns-501b5b643e7b (accessed: 30.01.2024).

15. Xu M., Du X., Wang D. Super-resolution restoration of single vehicle image based on ESPCN-VISR model. IOP Conference Series: Materials Science and Engineering, 2020, vol. 790, no. 1, p. 012107. doi: 10.1088/1757-899X/790/1/012107

16. Ho J., Jain A., Abbeel P. Denoising diffusion probabilistic models. Proc. of the 34th International Conference on Neural Information Processing Systems, 2020, pp. 6840–6851.

17. Nichol A.Q., Dhariwal P. Improved denoising diffusion probabilistic models. Proc. of the 38th International Conference on Machine Learning, 2021, vol. 139, pp. 8162–8171.

18. Lin S., Yang X. Diffusion model with perceptual loss. arXiv, 2023, arXiv:2401.00110. doi: 10.48550/arXiv.2401.00110



For citations:


Maslov M.I., Avdyushina A.E., Solodkaya M.A., Kugaevskikh A.V. Font generation based on style and character structure analysis using diffusion models. Scientific and Technical Journal of Information Technologies, Mechanics and Optics. 2025;25(4):676-683. (In Russ.) https://doi.org/10.17586/2226-1494-2025-25-4-676-683




This work is licensed under a Creative Commons Attribution 4.0 License.


ISSN 2226-1494 (Print)
ISSN 2500-0373 (Online)