Hi! I am a Ph.D. student in the Department of Electrical and Computer Engineering (ECE) at the University of Wisconsin-Madison, advised by Prof. Dimitris Papailiopoulos and Prof. Kangwook Lee. I received my M.S. in 2018 from Seoul National University, where I was advised by Prof. Jungwoo Lee and studied communication systems. I also received my B.S. in ECE from Seoul National University. I am a recipient of the Korean Government Scholarship Program for Study Overseas.
I am open to research collaboration and internship opportunities.
I am interested in machine learning, with a focus on efficient, robust, and scalable learning algorithms. Most recently, I have been interested in understanding Transformer models: their capabilities and how to elicit them through data and prompting. Previously, I worked on neural network pruning.
Teaching Arithmetic to Small Transformers, ICLR 2024
N. Lee, K. Sreenivasan, J. D. Lee, K. Lee, D. Papailiopoulos
Super Seeds: extreme model compression by trading off storage with computation, ICML 2022 Workshop (spotlight)
N. Lee, S. Rajput, J. Sohn, H. Wang, A. Nagle, E. Xing, K. Lee, D. Papailiopoulos
On the Design of Tailored Neural Networks for Energy Harvesting Broadcast Channels: A Reinforcement Learning Approach, IEEE Access 2020
H. Kim, J. Kim, W. Shin, H. Yang, N. Lee, S. Kim, J. Lee
Rate Maximization with Reinforcement Learning for Time-Varying Energy Harvesting Broadcast Channels, IEEE Globecom 2019
H. Kim, W. Shin, H. Yang, N. Lee, J. Lee
Analog network coding using differential and double-differential modulation with relay selection, ICT Express 2019
S. Heo, C. Kim, N. Lee, J. Lee
LLMs trained on vast amounts of data eventually learn basic arithmetic, even though these tasks are not explicitly encoded in the next-token prediction objective. To untangle the various factors at play, we investigate the arithmetic capabilities of small Transformer models. We find that data sampling, formatting, and prompting are crucial for eliciting arithmetic capabilities.
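As a minimal illustration of what "formatting" means here (the exact schemes and helper names below are my own, not the paper's code), one can emit addition examples with the answer's digits reversed, so that the model generates the least-significant digit first, matching the order in which carries are computed:

```python
import random

def format_addition(a: int, b: int, reverse: bool = True) -> str:
    """Format one addition example as a next-token training string.

    With reverse=True the answer digits are emitted least-significant
    first, e.g. 12+19=31 becomes "12+19=13".
    """
    result = str(a + b)
    if reverse:
        result = result[::-1]
    return f"{a}+{b}={result}"

# Sample a few training strings (illustrative, not the paper's sampler).
random.seed(0)
samples = [format_addition(random.randint(0, 99), random.randint(0, 99))
           for _ in range(3)]
```

The design intuition is that standard (most-significant-first) formatting forces the model to "plan ahead" for carries, whereas the reversed format lets each output digit depend only on digits already seen.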
We explore the trade-off between extreme model compression and computation in neural networks. We aim to answer the following question: for a given level of model compression, what is the minimum computation required to recover a high-accuracy model? We find that one can trade a small amount of accuracy for significant savings in storage cost, at a relatively small decompression cost.
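A toy sketch of this storage-for-computation trade-off (the function names and the linear-combination scheme are illustrative assumptions, not the method from the paper): instead of storing a dense weight matrix, store only a PRNG seed plus a handful of coefficients, and regenerate the matrix at load time.

```python
import numpy as np

def compress(rank: int, shape: tuple, seed: int = 42):
    """Return the compact representation: a seed and `rank` coefficients.

    In a real system the coefficients would be learned; here they are a
    uniform stand-in to keep the sketch self-contained.
    """
    coeffs = np.ones(rank) / rank
    return seed, coeffs  # only these need to be stored on disk

def decompress(seed: int, coeffs: np.ndarray, shape: tuple) -> np.ndarray:
    """Regenerate the weight matrix from the seed (pays in computation)."""
    rng = np.random.default_rng(seed)
    basis = rng.standard_normal((len(coeffs), *shape))  # regenerated, not stored
    return np.tensordot(coeffs, basis, axes=1)          # weighted sum of basis
```

Storage drops from `prod(shape)` floats to `rank + 1` values, while every load now pays the cost of regenerating the basis, which is exactly the trade-off the question above asks about.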