K-sparse encoder for efficient information retrieval
https://doi.org/10.17586/2226-1494-2025-25-4-710-717
Abstract
Modern industrial search engines typically employ a two-stage pipeline: fast candidate retrieval followed by reranking. This approach inevitably loses some relevant documents because of the simplicity of the algorithms used in the first stage. This work proposes a single-stage approach that combines the advantages of dense semantic search models with the efficiency of inverted indices. The key component of the solution is a K-sparse encoder that converts dense vectors into sparse ones compatible with the inverted indices of the Lucene library. In contrast to the previously studied identifiable variational autoencoder, the proposed model is based on an autoencoder with a TopK activation function, which explicitly enforces a fixed number of non-zero coordinates during training. This activation function makes the sparse vector generation process differentiable, eliminates the need for post-processing, and simplifies the loss function to a sum of the reconstruction error and a component preserving relative distances between dense and sparse representations. The model was trained on a 300,000-document subset of the MS MARCO dataset using PyTorch and an NVIDIA L4 GPU. On the SciFact dataset, at 80 % sparsity, the proposed model achieves 96.6 % of the quality of the original dense model in terms of the NDCG@10 metric (0.57 vs. 0.59). It is also shown that further increasing sparsity reduces index size and improves retrieval speed while maintaining acceptable search quality. In terms of memory usage, the approach outperforms the Hierarchical Navigable Small World (HNSW) graph-based algorithm, and at high sparsity levels its speed approaches that of HNSW. The results confirm the applicability of the proposed approach to unstructured data retrieval. Direct control over sparsity enables balancing between search quality, latency, and memory requirements. Thanks to the use of an inverted index based on the Lucene library, the proposed solution is well suited for industrial-scale search systems. Future research directions include the interpretability of the extracted features and improving retrieval quality under high sparsity conditions.
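The abstract's core mechanism, a TopK activation plus a two-term loss, can be illustrated with a short PyTorch sketch. This is a minimal illustration under stated assumptions, not the paper's implementation: the layer sizes (768-dimensional dense input, 3072-dimensional sparse code), the value of k (about 20 % of the sparse dimension, matching the 80 % sparsity figure above), and the weight alpha on the distance-preservation term are all hypothetical choices for the example.

import torch
import torch.nn as nn
import torch.nn.functional as F

class TopKSparseEncoder(nn.Module):
    # Assumed sizes: 768-dim dense embeddings, 3072-dim sparse codes,
    # k = 614 non-zeros (roughly 20 % of 3072, i.e. 80 % sparsity).
    def __init__(self, dense_dim=768, sparse_dim=3072, k=614):
        super().__init__()
        self.k = k
        self.encoder = nn.Linear(dense_dim, sparse_dim)
        self.decoder = nn.Linear(sparse_dim, dense_dim)

    def forward(self, x):
        z = F.relu(self.encoder(x))
        # TopK activation: keep the k largest activations per vector and
        # zero out the rest. Gradients flow through the kept coordinates,
        # so sparse code generation stays differentiable end to end.
        values, indices = torch.topk(z, self.k, dim=-1)
        s = torch.zeros_like(z).scatter_(-1, indices, values)
        x_hat = self.decoder(s)
        return s, x_hat

def loss_fn(x, x_hat, s, alpha=1.0):
    # Reconstruction error plus a term that keeps pairwise distances
    # among sparse codes close to those among the dense inputs.
    recon = F.mse_loss(x_hat, x)
    dist = F.mse_loss(torch.cdist(s, s), torch.cdist(x, x))
    return recon + alpha * dist

# Toy usage: a batch of 32 dense embeddings.
model = TopKSparseEncoder()
x = torch.randn(32, 768)
s, x_hat = model(x)
print(loss_fn(x, x_hat, s))

The non-zero coordinates of each code s can then serve as (term id, weight) postings, which is what makes the representation compatible with a Lucene-style inverted index as described above.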
About the Author
V. Yu. Dobrynin
Russian Federation
Viacheslav Yu. Dobrynin, PhD Student
Saint Petersburg, 197101
Scopus Author ID: 57223099701
References
1. Chen R., Gallagher L., Blanco R., Culpepper J.S. Efficient cost-aware cascade ranking in multi-stage retrieval. Proc. of the 40th International ACM SIGIR Conference on Research and Development in Information Retrieval, 2017, pp. 445–454. doi: 10.1145/3077136.3080819
2. Liu S., Xiao F., Ou W., Si L. Cascade ranking for operational E-commerce search. Proc. of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2017, pp. 1557–1565. doi: 10.1145/3097983.3098011
3. Furnas G.W., Landauer T.K., Gomez L.M., Dumais S.T. The vocabulary problem in human-system communication. Communications of the ACM, 1987, vol. 30, no. 11, pp. 964–971. doi: 10.1145/32206.32212
4. Zhao L., Callan J. Term necessity prediction. Proc. of the 19th ACM International Conference on Information and Knowledge Management, 2010, pp. 259–268. doi: 10.1145/1871437.1871474
5. Zamani H., Dehghani M., Croft W.B., Learned-Miller E., Kamps J. From neural re-ranking to neural ranking: Learning a sparse representation for inverted indexing. Proc. of the 27th ACM International Conference on Information and Knowledge Management, 2018, pp. 497–506. doi: 10.1145/3269206.3271800
6. Formal T., Piwowarski B., Clinchant S. SPLADE: Sparse lexical and expansion model for first stage ranking. Proc. of the 44th International ACM SIGIR Conference on Research and Development in Information Retrieval, 2021, pp. 2288–2292. doi: 10.1145/3404835.3463098
7. Dobrynin V., Sherman M., Abramovich R., Platonov A. A sparsifier model for efficient information retrieval. IEEE 18th International Conference on Application of Information and Communication Technologies (AICT), 2024, pp. 1–4. doi: 10.1109/aict61888.2024.10740301
8. Dobrynin V.Yu., Abramovich R.K., Platonov A.V. Efficient sparse retrieval through embedding-based inverted index construction. Scientific and Technical Journal of Information Technologies, Mechanics and Optics, 2025, vol. 25, no. 1, pp. 61–67. doi: 10.17586/2226-1494-2025-25-1-61-67
9. Khemakhem I., Kingma D.P., Monti R.P., Hyvarinen A. Variational autoencoders and nonlinear ICA: A unifying framework. Proc. of the 23rd International Conference on Artificial Intelligence and Statistics (AISTATS), 2020, vol. 108, pp. 2207–2216.
10. Louizos C., Welling M., Kingma D.P. Learning sparse neural networks through L0 regularization. arXiv, 2017, arXiv:1712.01312. doi: 10.48550/arXiv.1712.01312
11. Makhzani A., Frey B. k-Sparse autoencoders. arXiv, 2013, arXiv:1312.5663. doi: 10.48550/arXiv.1312.5663
12. Gao L., Tour T.D., Tillman H., Goh G., Troll R., Radford A., Sutskever I., Leike J., Wu J. Scaling and evaluating sparse autoencoders. arXiv, 2024, arXiv:2406.04093. doi: 10.48550/arXiv.2406.04093
13. Bricken T., Templeton A., Batson J., Chen B., Jermyn A., Conerly T., et al. Towards monosemanticity: decomposing language models with dictionary learning. Transformer Circuits Thread, 2023.
14. Paszke A., Gross S., Massa F., Lerer A., Bradbury J., Chanan G., et al. PyTorch: An imperative style, high-performance deep learning library. arXiv, 2019, arXiv:1912.01703. doi: 10.48550/arXiv.1912.01703
15. Bajaj P., Campos D., Craswell N., Deng L., Gao J., Liu X., et al. MS MARCO: A human generated MAchine Reading COmprehension dataset. arXiv, 2016, arXiv:1611.09268. doi: 10.48550/arXiv.1611.09268
16. Wadden D., Lin S., Lo K., Wang L.L., van Zuylen M., Cohan A., Hajishirzi H. Fact or fiction: verifying scientific claims. Proc. of the Conference on Empirical Methods in Natural Language Processing (EMNLP), 2020, pp. 7534–7550. doi: 10.18653/v1/2020.emnlp-main.609
17. Malkov Y., Yashunin D.A. Efficient and robust approximate nearest neighbor search using hierarchical navigable small world graphs. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2020, vol. 42, no. 4, pp. 824–836. doi: 10.1109/TPAMI.2018.2889473
For citations:
Dobrynin V.Yu. K-sparse encoder for efficient information retrieval. Scientific and Technical Journal of Information Technologies, Mechanics and Optics. 2025;25(4):710-717. (In Russ.) https://doi.org/10.17586/2226-1494-2025-25-4-710-717