Hi! I am a postdoc at Stanford CS working with Profs. Tatsu Hashimoto, Percy Liang, and James Zou.
I received my PhD from MIT, where I was advised by Prof. Aleksander Mądry.
I’m currently interested in understanding and improving machine learning (ML) methodology
through the lens of data. Some questions I think about include:
- How do we attribute model predictions back to training data?
- How do we select the right data for a given task?
- Can we derive insights about ML phenomena (e.g., scaling laws, emergence, in-context learning) through this lens?

I’m also more broadly interested in the science of machine learning/deep learning.
[News] (July ’24) I co-presented a tutorial at ICML ’24 on Data Attribution at Scale: [video] [notes]
Bio
Previously at MIT, I worked on understanding statistical-computational tradeoffs in high-dimensional statistics with Prof. Guy Bresler for my SM thesis. Earlier during my PhD, I was supported by the MIT Akamai Presidential Fellowship and the Samsung Scholarship.
From 2016 to 2018, I served in the Republic of Korea Army as a researcher in the top signals intelligence unit.
Prior to grad school, I received a BS in Computer Science from Cornell University (2011-14), where I was fortunate to work with Prof. Ramin Zabih and Prof. Bobby Kleinberg.
I have interned at Waymo, Dropbox, and Google.
Research
**Attribute-to-Delete: Machine Unlearning via Datamodel Matching**\
Kristian Georgiev\*, Roy Rinberg\*, Sung Min Park\*, Shivam Garg\*, Andrew Ilyas, Aleksander Mądry, Seth Neel \
ICLR 2025\
[[arxiv]](https://arxiv.org/abs/2410.23232) [[blog](https://t.co/QVgG2FlNmB)]
**The Journey, Not the Destination: How Data Guides Diffusion Models**\
Kristian Georgiev\*, Josh Vendrow\*, Hadi Salman, Sung Min Park, Aleksander Mądry \
[[arxiv]](https://arxiv.org/abs/2312.06205)
**TRAK: Attributing Model Behavior at Scale**\
Sung Min Park\*, Kristian Georgiev\*, Andrew Ilyas\*, Guillaume Leclerc, Aleksander Mądry \
ICML 2023 (**Oral presentation**)\
[[arxiv]](https://arxiv.org/abs/2303.14186) [[blog](https://gradientscience.org/trak/)][[code](https://github.com/MadryLab/trak)]
[[website]](https://trak.csail.mit.edu/)[[talk](https://icml.cc/virtual/2023/oral/25526)]
**ModelDiff: A Framework for Comparing Learning Algorithms**\
Harshay Shah\*, Sung Min Park\*, Andrew Ilyas\*, Aleksander Mądry \
ICML 2023\
[[arxiv]](https://arxiv.org/abs/2211.12491) [[blog](https://gradientscience.org/modeldiff/)][[code](https://github.com/MadryLab/modeldiff)]
**FFCV: Accelerating Training by Removing Data Bottlenecks**\
Guillaume Leclerc, Andrew Ilyas, Logan Engstrom, Sung Min Park, Hadi Salman, Aleksander Mądry \
CVPR 2023\
[[code](https://github.com/libffcv/ffcv)]
**A Data-Based Perspective on Transfer Learning**\
Saachi Jain\*, Hadi Salman\*, Alaa Khaddaj\*, Eric Wong, Sung Min Park, Aleksander Mądry\
CVPR 2023\
[[arxiv]](https://arxiv.org/abs/2207.05739) [[blog](https://gradientscience.org/data-transfer/)]
**Datamodels: Predicting Predictions from Training Data**\
Andrew Ilyas\*, Sung Min Park\*, Logan Engstrom\*, Guillaume Leclerc, Aleksander Mądry\
ICML 2022\
[[arxiv]](https://arxiv.org/abs/2202.00622) [blog [part 1](https://gradientscience.org/datamodels-1/) [part 2](https://gradientscience.org/datamodels-2/)] [[code](https://github.com/MadryLab/datamodels)][[data]](https://github.com/MadryLab/datamodels-data)
**On Distinctive Properties of Universal Perturbations**\
Sung Min Park, Kuo-An Wei, Kai Xiao, Jerry Li, Aleksander Mądry\
2021\
[[arxiv]](https://arxiv.org/abs/2112.15329)
**Sparse PCA from Sparse Linear Regression**\
(α-β order) Guy Bresler, Sung Min Park, Madalina Persu\
NeurIPS 2018\
[[arxiv]](https://arxiv.org/abs/1811.10106) [[poster]](/assets/files/neurips_2018_poster.pdf) [[code]](https://github.com/sung-max/SPCAvSLR)
**On the Equivalence of Sparse Statistical Problems**\
Sung Min Park\
SM thesis 2016\
[[pdf]](/assets/files/sm_thesis.pdf)
**Structured learning of sum-of-submodular higher order energy functions**\
Alexander Fix, Thorsten Joachims, Sung Min Park, Ramin Zabih\
ICCV 2013\
[[pdf]](/assets/files/submodular.pdf)