Hi! I am a Ph.D. candidate in the Department of Electrical and Computer Engineering (ECE) at the University of Wisconsin-Madison, advised by Prof. Dimitris Papailiopoulos and Prof. Kangwook Lee. I received my M.S. in 2018 from Seoul National University, where I was advised by Prof. Jungwoo Lee and studied communication systems. I also received my B.S. in ECE from Seoul National University. I am a recipient of the Korean Government Scholarship Program for Study Overseas.
I am open to research collaboration and internship opportunities.
I am interested in machine learning, with a focus on efficient, robust, and scalable machine learning algorithms. Recently, I've been interested in understanding the algorithmic capabilities of large language models and ways to improve their performance through data, prompting, inference-time methods, and verifiers. Previously, I worked on neural network pruning.
Can Mamba Learn How to Learn? A Comparative Study on In-Context Learning Tasks, ICML 2024
J. Park, J. Park, Z. Xiong, N. Lee, J. Cho, S. Oymak, K. Lee, D. Papailiopoulos
Teaching Arithmetic to Small Transformers, ICLR 2024
N. Lee, K. Sreenivasan, J. D. Lee, K. Lee, D. Papailiopoulos
Super Seeds: Extreme Model Compression by Trading Off Storage with Computation, ICML 2022 Workshop (spotlight)
N. Lee, S. Rajput, J. Sohn, H. Wang, A. Nagle, E. Xing, K. Lee, D. Papailiopoulos
On the Design of Tailored Neural Networks for Energy Harvesting Broadcast Channels: A Reinforcement Learning Approach, IEEE Access 2020
H. Kim, J. Kim, W. Shin, H. Yang, N. Lee, S. Kim, J. Lee
Rate Maximization with Reinforcement Learning for Time-Varying Energy Harvesting Broadcast Channels, IEEE Globecom 2019
H. Kim, W. Shin, H. Yang, N. Lee, J. Lee
Analog Network Coding Using Differential and Double-Differential Modulation with Relay Selection, ICT Express 2019
S. Heo, C. Kim, N. Lee, J. Lee
LLMs trained on vast amounts of data eventually learn basic arithmetic, even though these tasks are not explicitly encoded in the next-token prediction objective. To untangle the various factors at play, we investigate the arithmetic capabilities of small Transformer models. We find that data sampling, formatting, and prompting are crucial for eliciting arithmetic capabilities.
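As a rough illustration of what "formatting" can mean for arithmetic training data (a toy sketch, not the exact formats used in the paper), the snippet below renders addition examples either plainly or with the output digits reversed, so that the answer is written in the same order in which carries are computed:

```python
import random

def format_addition(a: int, b: int, reverse_output: bool = False) -> str:
    """Render an addition example as a training string.

    Plain format writes the sum most-significant digit first; reversed
    format writes it least-significant digit first.
    """
    s = str(a + b)
    out = s[::-1] if reverse_output else s
    return f"{a}+{b}={out}"

# A few synthetic 3-digit addition examples in both formats.
random.seed(0)
for _ in range(5):
    a, b = random.randint(0, 999), random.randint(0, 999)
    print(format_addition(a, b), "|", format_addition(a, b, reverse_output=True))
```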
We explore the trade-off between extreme model compression and the computation needed to recover the model. We aim to answer the following question: for a given level of model compression, what is the minimum computation required to recover a high-accuracy model? We discover that one can trade off some model accuracy for significant gains in storage cost, at a relatively small decompression cost.
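The storage-versus-computation idea can be illustrated with a toy sketch (a hypothetical example, not the method from the paper): a weight matrix is stored as a random seed plus a small sparse correction, and reconstruction at load time spends extra computation to regenerate the base weights from the seed:

```python
import numpy as np

def compress(weights: np.ndarray, seed: int, k: int):
    """Store only a seed and the k largest corrections to the seeded base."""
    rng = np.random.default_rng(seed)
    base = rng.standard_normal(weights.shape).astype(weights.dtype)
    residual = weights - base
    # Keep the k entries with the largest-magnitude residuals.
    idx = np.argpartition(np.abs(residual).ravel(), -k)[-k:]
    return seed, idx, residual.ravel()[idx]

def decompress(shape, dtype, seed: int, idx, vals) -> np.ndarray:
    """Regenerate the base from the seed, then apply the sparse correction."""
    rng = np.random.default_rng(seed)
    base = rng.standard_normal(shape).astype(dtype)
    flat = base.ravel()
    flat[idx] += vals
    return flat.reshape(shape)

w = np.random.randn(64, 64).astype(np.float32)
seed, idx, vals = compress(w, seed=42, k=256)
w_hat = decompress(w.shape, w.dtype, seed, idx, vals)
print("relative reconstruction error:",
      float(np.linalg.norm(w - w_hat) / np.linalg.norm(w)))
```

Here the stored payload is a single integer seed plus 256 index/value pairs instead of 4096 dense weights, and the price is regenerating the base matrix at load time.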
Graduate Teaching Assistant
Undergraduate Teaching Assistant
(08.18.2023) Short Talk at the Simons Institute LLM Workshop - Video | Slides