Donghyun Son

Hi! I’m an undergraduate student majoring in Computer Science and Engineering at Seoul National University. I’m currently working on efficient LLM inference algorithms as an undergraduate researcher at CMA Lab, under the supervision of Sungjoo Yoo. Previously, I worked as a machine learning engineer at Hyperconnect (acquired by Match Group), where I built an ML-based content moderation system for Match Group brands.
I’m broadly interested in efficient algorithms for model training and inference. My previous work focuses on data-efficient methods, including a multiple-subtask approach, domain generalization, and few-shot personalization.
Recently, I’ve been working on efficient long-context inference for LLMs. In particular, I’m interested in KV cache compression, sparse attention, and their effects on reasoning abilities.
Aside from research, I enjoy algorithmic problem solving and have competed in various competitive programming contests such as ICPC, Google Hash Code, and SCPC. You can find me on Codeforces and BOJ.
Links: GitHub / CV / Google Scholar / X
selected publications
- Preprint. NSNQuant: A Double Normalization Approach for Calibration-Free Low-Bit Vector Quantization of KV Cache. 2025.
- WSDM23 (Oral). Reliable decision from multiple subtasks through threshold optimization: Content moderation in the wild. In Proceedings of the Sixteenth ACM International Conference on Web Search and Data Mining, 2023.
- OOD-CV@ICCV23. Gradient estimation for unseen domain risk minimization with pre-trained models. In Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023.