K-sparse encoder for efficient information retrieval
https://doi.org/10.17586/2226-1494-2025-25-4-710-717
Abstract
Modern industrial search engines typically employ a two-stage pipeline: fast candidate retrieval followed by reranking. This approach inevitably loses some relevant documents because of the simplicity of the algorithms used in the first stage. This work proposes a single-stage approach that combines the advantages of dense semantic search models with the efficiency of inverted indices. The key component of the solution is a K-sparse encoder that converts dense vectors into sparse ones compatible with the inverted indices of the Lucene library. In contrast to the previously studied identifiable variational autoencoder, the proposed model is based on an autoencoder with a TopK activation function, which explicitly enforces a fixed number of non-zero coordinates during training. This activation function makes the sparse vector generation process differentiable, eliminates the need for post-processing, and simplifies the loss function to a sum of the reconstruction error and a component preserving relative distances between dense and sparse representations. The model was trained on a 300,000-document subset of the MS MARCO dataset using PyTorch and an NVIDIA L4 GPU. On the SciFact dataset, at 80 % sparsity, the proposed model achieves 96.6 % of the quality of the original dense model in terms of the NDCG@10 metric (0.57 vs. 0.59). It is also shown that further increasing sparsity reduces index size and improves retrieval speed while maintaining acceptable search quality. In terms of memory usage, the approach outperforms the Hierarchical Navigable Small World (HNSW) graph-based algorithm, and at high sparsity levels its speed approaches that of HNSW. The results confirm the applicability of the proposed approach to unstructured data retrieval. Direct control over sparsity enables balancing between search quality, latency, and memory requirements. Thanks to the use of an inverted index based on the Lucene library, the proposed solution is well suited for industrial-scale search systems. Future research directions include the interpretability of the extracted features and improving retrieval quality under high sparsity conditions.
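The abstract's core mechanism, a TopK activation plus a two-term loss, can be illustrated with a short PyTorch sketch. This is a minimal illustration under stated assumptions, not the paper's implementation: the layer sizes (768-dimensional dense input, 3072-dimensional sparse code), the value of k (about 20 % of the sparse dimension, matching the 80 % sparsity figure above), and the weight alpha on the distance-preservation term are all hypothetical choices for the example.

import torch
import torch.nn as nn
import torch.nn.functional as F

class TopKSparseEncoder(nn.Module):
    # Assumed sizes: 768-dim dense embeddings, 3072-dim sparse codes,
    # k = 614 non-zeros (roughly 20 % of 3072, i.e. 80 % sparsity).
    def __init__(self, dense_dim=768, sparse_dim=3072, k=614):
        super().__init__()
        self.k = k
        self.encoder = nn.Linear(dense_dim, sparse_dim)
        self.decoder = nn.Linear(sparse_dim, dense_dim)

    def forward(self, x):
        z = F.relu(self.encoder(x))
        # TopK activation: keep the k largest activations per vector and
        # zero out the rest. Gradients flow through the kept coordinates,
        # so sparse code generation stays differentiable end to end.
        values, indices = torch.topk(z, self.k, dim=-1)
        s = torch.zeros_like(z).scatter_(-1, indices, values)
        x_hat = self.decoder(s)
        return s, x_hat

def loss_fn(x, x_hat, s, alpha=1.0):
    # Reconstruction error plus a term that keeps pairwise distances
    # among sparse codes close to those among the dense inputs.
    recon = F.mse_loss(x_hat, x)
    dist = F.mse_loss(torch.cdist(s, s), torch.cdist(x, x))
    return recon + alpha * dist

# Toy usage: a batch of 32 dense embeddings.
model = TopKSparseEncoder()
x = torch.randn(32, 768)
s, x_hat = model(x)
print(loss_fn(x, x_hat, s))

The non-zero coordinates of each code s can then serve as (term id, weight) postings, which is what makes the representation compatible with a Lucene-style inverted index as described above.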
About the Author
V. Yu. Dobrynin
Russian Federation
Viacheslav Yu. Dobrynin, PhD Student
Saint Petersburg, 197101
Scopus Author ID: 57223099701
References
1. Chen R., Gallagher L., Blanco R., Culpepper J.S. Efficient cost-aware cascade ranking in multi-stage retrieval. Proc. of the 40th International ACM SIGIR Conference on Research and Development in Information Retrieval, 2017, pp. 445–454. doi: 10.1145/3077136.3080819
2. Liu S., Xiao F., Ou W., Si L. Cascade ranking for operational E-commerce search. Proc. of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2017, pp. 1557–1565. doi: 10.1145/3097983.3098011
3. Furnas G.W., Landauer T.K., Gomez L.M., Dumais S.T. The vocabulary problem in human-system communication. Communications of the ACM, 1987, vol. 30, no. 11, pp. 964–971. doi: 10.1145/32206.32212
4. Zhao L., Callan J. Term necessity prediction. Proc. of the 19th ACM International Conference on Information and Knowledge Management, 2010, pp. 259–268. doi: 10.1145/1871437.1871474
5. Zamani H., Dehghani M., Croft W.B., Learned-Miller E., Kamps J. From neural re-ranking to neural ranking: Learning a sparse representation for inverted indexing. Proc. of the 27th ACM International Conference on Information and Knowledge Management, 2018, pp. 497–506. doi: 10.1145/3269206.3271800
6. Formal T., Piwowarski B., Clinchant S. SPLADE: Sparse lexical and expansion model for first stage ranking. Proc. of the 44th International ACM SIGIR Conference on Research and Development in Information Retrieval, 2021, pp. 2288–2292. doi: 10.1145/3404835.3463098
7. Dobrynin V., Sherman M., Abramovich R., Platonov A. A sparsifier model for efficient information retrieval. IEEE 18th International Conference on Application of Information and Communication Technologies (AICT), 2024, pp. 1–4. doi: 10.1109/aict61888.2024.10740301
8. Dobrynin V.Yu., Abramovich R.K., Platonov A.V. Efficient sparse retrieval through embedding-based inverted index construction. Scientific and Technical Journal of Information Technologies, Mechanics and Optics, 2025, vol. 25, no. 1, pp. 61–67. doi: 10.17586/2226-1494-2025-25-1-61-67
9. Khemakhem I., Kingma D.P., Monti R.P., Hyvarinen A. Variational autoencoders and nonlinear ICA: A unifying framework. Proc. of the 23rd International Conference on Artificial Intelligence and Statistics (AISTATS), 2020, vol. 108, pp. 2207–2216.
10. Louizos C., Welling M., Kingma D.P. Learning sparse neural networks through L0 regularization. arXiv, 2017, arXiv:1712.01312. doi: 10.48550/arXiv.1712.01312
11. Makhzani A., Frey B. k-Sparse autoencoders. arXiv, 2013, arXiv:1312.5663. doi: 10.48550/arXiv.1312.5663
12. Gao L., Tour T.D., Tillman H., Goh G., Troll R., Radford A., Sutskever I., Leike J., Wu J. Scaling and evaluating sparse autoencoders. arXiv, 2024, arXiv:2406.04093. doi: 10.48550/arXiv.2406.04093
13. Bricken T., Templeton A., Batson J., Chen B., Jermyn A., Conerly T., et al. Towards monosemanticity: decomposing language models with dictionary learning. Transformer Circuits Thread, 2023.
14. Paszke A., Gross S., Massa F., Lerer A., Bradbury J., Chanan G., et al. PyTorch: An imperative style, high-performance deep learning library. arXiv, 2019, arXiv:1912.01703. doi: 10.48550/arXiv.1912.01703
15. Bajaj P., Campos D., Craswell N., Deng L., Gao J., Liu X., et al. MS MARCO: A human generated MAchine Reading COmprehension dataset. arXiv, 2016, arXiv:1611.09268. doi: 10.48550/arXiv.1611.09268
16. Wadden D., Lin S., Lo K., Wang L.L., van Zuylen M., Cohan A., Hajishirzi H. Fact or fiction: verifying scientific claims. Proc. of the Conference on Empirical Methods in Natural Language Processing (EMNLP), 2020, pp. 7534–7550. doi: 10.18653/v1/2020.emnlp-main.609
17. Malkov Y., Yashunin D.A. Efficient and robust approximate nearest neighbor search using hierarchical navigable small world graphs. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2020, vol. 42, no. 4, pp. 824–836. doi: 10.1109/TPAMI.2018.2889473
For citations:
Dobrynin V.Yu. K-sparse encoder for efficient information retrieval. Scientific and Technical Journal of Information Technologies, Mechanics and Optics. 2025;25(4):710-717. (In Russ.) https://doi.org/10.17586/2226-1494-2025-25-4-710-717