Publications
Computer Architecture and Systems
2024
Jiwon Lee, Yunjae Lee, Youngeun Kwon, and Minsoo Rhu, "Characterization and Analysis of the 3D Gaussian Splatting Rendering Pipeline," IEEE Computer Architecture Letters, 2024
Eunyeong Cho, Jehyeon Bang, and Minsoo Rhu, "Characterization and Analysis of Text-to-Image Diffusion Models," IEEE Computer Architecture Letters, 2024
Zhixian Jin, Christopher Rocca, Jiho Kim, Hans Kasan, Minsoo Rhu, Ali Bakhoda, Tor Aamodt, and John Kim, "Uncovering Real GPU NoC Characteristics: Implications on Interconnect Architecture," The 57th IEEE/ACM International Symposium on Microarchitecture (MICRO-57), Austin, TX, Nov. 2024
Acceptance Rate: 22% (113 among 497)
[Paper]
Dongjae Lee, Bongjoon Hyun, Taehun Kim, and Minsoo Rhu, "PIM-MMU: A Memory Management Unit for Accelerating Data Transfers in Commercial PIM Systems," The 57th IEEE/ACM International Symposium on Microarchitecture (MICRO-57), Austin, TX, Nov. 2024
Acceptance Rate: 22% (113 among 497)
[Paper]
Jehyeon Bang, Yujeong Choi, Myeongwoo Kim, Yongdeok Kim, and Minsoo Rhu, "vTrain: A Simulation Framework for Evaluating Cost-effective and Compute-optimal Large Language Model Training," The 57th IEEE/ACM International Symposium on Microarchitecture (MICRO-57), Austin, TX, Nov. 2024
Dongho Yoon*, Taehun Kim*, Jae W. Lee, and Minsoo Rhu, "A Quantitative Analysis of State Space Model-based Large Language Model: Study of Hungry Hungry Hippos," IEEE Computer Architecture Letters, 2024
Dongho Yoon and Taehun Kim are co-first authors of this work*
Dongjae Lee, Bongjoon Hyun, Taehun Kim, and Minsoo Rhu, "Analysis of Data Transfer Bottlenecks in Commercial PIM Systems: A Study with UPMEM-PIM," IEEE Computer Architecture Letters, 2024
Ranggi Hwang*, Jianyu Wei*, Shijie Cao, Changho Hwang, Xiaohu Tang, Ting Cao, and Mao Yang, "Pre-gated MoE: An Algorithm-System Co-Design for Fast and Scalable Mixture-of-Expert Inference," The 51st IEEE/ACM International Symposium on Computer Architecture (ISCA-51), Buenos Aires, Argentina, Jun. 2024
Ranggi Hwang and Jianyu Wei are co-first authors of this work*
Acceptance Rate: 19% (83 among 423)
Yunjae Lee*, Hyeseong Kim*, and Minsoo Rhu, "PreSto: An In-Storage Data Preprocessing System for Training Recommendation Models," The 51st IEEE/ACM International Symposium on Computer Architecture (ISCA-51), Buenos Aires, Argentina, Jun. 2024
Yunjae Lee and Hyeseong Kim are co-first authors of this work*
Acceptance Rate: 19% (83 among 423)
Yujeong Choi, Jiin Kim, and Minsoo Rhu, "ElasticRec: A Microservice-based Model Serving Architecture Enabling Elastic Resource Scaling for Recommendation Models," The 51st IEEE/ACM International Symposium on Computer Architecture (ISCA-51), Buenos Aires, Argentina, Jun. 2024
Acceptance Rate: 19% (83 among 423)
Juntaek Lim, Youngeun Kwon, Ranggi Hwang, Kiwan Maeng, Edward Suh, and Minsoo Rhu, "LazyDP: Co-Designing Algorithm-Software for Scalable Training of Differentially Private Recommendation Models," The 29th ACM International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS-29), San Diego, CA, Apr. 2024
Acceptance Rate: 20% (193 among 921)
Maximilian Lam, Jeff Johnson, Wenjie Xiong, Kiwan Maeng, Udit Gupta, Yang Li, Liangzhen Lai, Minsoo Rhu, Hsien-Hsin S. Lee, Vijay Janapa Reddi, Gu-Yeon Wei, David Brooks, and Edward Suh, "GPU-based Private Information Retrieval for On-Device Machine Learning Inference," The 29th ACM International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS-29), San Diego, CA, Apr. 2024
Acceptance Rate: 20% (193 among 921)
[Paper]
Bongjoon Hyun, Taehun Kim, Dongjae Lee, and Minsoo Rhu, "Pathfinding Future PIM Architectures by Demystifying a Commercial PIM Technology," The 30th IEEE International Symposium on High-Performance Computer Architecture (HPCA-30), Edinburgh, UK, Feb. 2024
Best Paper Award
Acceptance Rate: 18% (75 among 410)
[Paper] [Code] [Slides] [Presentation]
Hyeseong Kim*, Yunjae Lee*, and Minsoo Rhu, "FPGA-Accelerated Data Preprocessing for Personalized Recommendation Systems," IEEE Computer Architecture Letters, Jan. 2024
Hyeseong Kim and Yunjae Lee are co-first authors of this work*
2023
Ranggi Hwang*, Minhoo Kang*, Jiwon Lee, Dongyun Kam, Youngjoo Lee, and Minsoo Rhu, "GROW: A Row-Stationary Sparse-Dense GEMM Accelerator for Memory-Efficient Graph Convolutional Neural Networks," The 29th IEEE International Symposium on High-Performance Computer Architecture (HPCA-29), Montreal, Canada, Feb. 2023
Ranggi Hwang and Minhoo Kang are co-first authors of this work*
Acceptance Rate: 25% (91 among 364)
[Paper]
Seonho Lee, Ranggi Hwang, Jongse Park, and Minsoo Rhu, "HAMMER: Hardware-friendly Approximate Computing for Self-attention with Mean-redistribution and Linearization," IEEE Computer Architecture Letters, Jan. 2023
[Paper]
2022
Beomsik Park*, Ranggi Hwang*, Dongho Yoon, Yoonhyuk Choi, and Minsoo Rhu, "DiVa: An Accelerator for Differentially Private Machine Learning," The 55th IEEE/ACM International Symposium on Microarchitecture (MICRO-55), Chicago, IL, Oct. 2022
Beomsik Park and Ranggi Hwang are co-first authors of this work*
Acceptance Rate: 22% (83 among 369)
[Paper]
Jongmin Kim, Gwangho Lee, Sangpyo Kim, Gina Sohn, John Kim, Minsoo Rhu, and Jung Ho Ahn, "ARK: Fully Homomorphic Encryption Accelerator with Runtime Data Generation and Inter-Operation Key Reuse," The 55th IEEE/ACM International Symposium on Microarchitecture (MICRO-55), Chicago, IL, Oct. 2022
Acceptance Rate: 22% (83 among 369)
[Paper]
Yunseong Kim, Yujeong Choi, and Minsoo Rhu, "PARIS and ELSA: An Elastic Scheduling Algorithm for Reconfigurable Multi-GPU Inference Servers," The 59th ACM/ESDA/IEEE Design Automation Conference (DAC), San Francisco, CA, Jul. 2022
Acceptance Rate: 23%
[Paper] [Extended version] [Slides] [Presentation]
Yunjae Lee, Jinha Chung, and Minsoo Rhu, "SmartSAGE: Training Large-scale Graph Neural Networks using In-Storage Processing Architectures," The 49th IEEE/ACM International Symposium on Computer Architecture (ISCA-49), New York, NY, Jun. 2022
Acceptance Rate: 16% (67 among 400)
[Paper]
Youngeun Kwon and Minsoo Rhu, "Training Personalized Recommendation Systems from (GPU) Scratch: Look Forward not Backwards," The 49th IEEE/ACM International Symposium on Computer Architecture (ISCA-49), New York, NY, Jun. 2022
Acceptance Rate: 16% (67 among 400)
[Paper]
Sangpyo Kim, Jongmin Kim, Michael Jaemin Kim, Wonkyung Jung, John Kim, Minsoo Rhu, and Jung Ho Ahn, "BTS: An Accelerator for Bootstrappable Fully Homomorphic Encryption," The 49th IEEE/ACM International Symposium on Computer Architecture (ISCA-49), New York, NY, Jun. 2022
Acceptance Rate: 16% (67 among 400)
[Paper]
2021
Jaehyun Park, Byeongho Kim, Sungmin Yun, Eojin Lee, Minsoo Rhu, and Jung Ho Ahn, "TRiM: Enhancing Processor-Memory Interfaces with Scalable Tensor Reduction in Memory," The 54th IEEE/ACM International Symposium on Microarchitecture (MICRO-54), Athens, Greece, Oct. 2021
Acceptance Rate: 22% (94 among 430)
[Paper]
Bongjoon Hyun, Jiwon Lee, and Minsoo Rhu, "Characterization and Analysis of Deep Learning for 3D Point Cloud Analytics," IEEE Computer Architecture Letters, Jul. 2021
[Paper]
Yunjae Lee, Youngeun Kwon, and Minsoo Rhu, "Understanding the Implication of Non-Volatile Memory for Large-Scale Graph Neural Network Training," IEEE Computer Architecture Letters, Jul. 2021
[Paper]
Youngeun Kwon, Yunjae Lee, and Minsoo Rhu, "Tensor Casting: Co-Designing Algorithm-Architecture for Personalized Recommendation Training," The 27th IEEE International Symposium on High-Performance Computer Architecture (HPCA-27), Seoul, South Korea, Feb. 2021
Acceptance Rate: 24% (63 among 258)
[Paper] [Presentation]
Yujeong Choi, Yunseong Kim, and Minsoo Rhu, "LazyBatching: An SLA-aware Batching System for Cloud Machine Learning Inference," The 27th IEEE International Symposium on High-Performance Computer Architecture (HPCA-27), Seoul, South Korea, Feb. 2021
Acceptance Rate: 24% (63 among 258)
[Paper] [Presentation]
Jaeguk Ahn, Cheolgyu Jin, Jiho Kim, Minsoo Rhu, Yunsi Fei, David Kaeli, and John Kim, "Trident: A Hybrid Correlation-Collision GPU Cache Timing Attack for AES Key Recovery," The 27th IEEE International Symposium on High-Performance Computer Architecture (HPCA-27), Seoul, South Korea, Feb. 2021
Acceptance Rate: 24% (63 among 258)
[Paper]
2020
Byeongho Kim, Jaehyun Park, Eojin Lee, Minsoo Rhu, and Jung Ho Ahn, "TRiM: Tensor Reduction in Memory," IEEE Computer Architecture Letters, Dec. 2020
[Paper]
Ranggi Hwang, Taehun Kim, Youngeun Kwon, and Minsoo Rhu, "Centaur: A Chiplet-based, Hybrid Sparse-Dense Accelerator for Personalized Recommendations," The 47th IEEE/ACM International Symposium on Computer Architecture (ISCA-47), Valencia, Spain, Jun. 2020
Acceptance Rate: 18% (77 among 421)
[Paper] [Presentation]
Bongjoon Hyun, Youngeun Kwon, Yujeong Choi, John Kim, and Minsoo Rhu, "NeuMMU: Architectural Support for Efficient Address Translations in NPUs," The 25th ACM International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS-25), Lausanne, Switzerland, Mar. 2020
Selected for IEEE Micro Top Picks - Honorable Mention ("IEEE Micro - Special Issue on Top Picks from the 2020 Computer Architecture Conferences")
Acceptance Rate: 18% (86 among 476)
[Paper] [Presentation]
Yujeong Choi and Minsoo Rhu, "PREMA: A Predictive Multi-task Scheduling Algorithm For Preemptible Neural Processing Units," The 26th IEEE International Symposium on High-Performance Computer Architecture (HPCA-26), San Diego, CA, Feb. 2020
Acceptance Rate: 19% (48 among 248)
[Paper]
2019
Youngeun Kwon, Yunjae Lee, and Minsoo Rhu, "TensorDIMM: A Practical Near-Memory Processing Architecture for Embeddings and Tensor Operations in Deep Learning," The 52nd IEEE/ACM International Symposium on Microarchitecture (MICRO-52), Columbus, OH, Oct. 2019
Selected for IEEE Micro Top Picks - Honorable Mention ("IEEE Micro - Special Issue on Top Picks from the 2019 Computer Architecture Conferences")
Acceptance Rate: 22% (79 among 344)
[Paper]
Youngeun Kwon and Minsoo Rhu, "A Disaggregated Memory System for Deep Learning," IEEE Micro, Special Issue on Machine Learning Acceleration, Volume 39, Issue 5, Sep./Oct. 2019
[Paper]
2018
Youngeun Kwon and Minsoo Rhu, "Beyond the Memory Wall: A Case for Memory-centric HPC System for Deep Learning," The 51st IEEE/ACM International Symposium on Microarchitecture (MICRO-51), Fukuoka, Japan, Oct. 2018
Acceptance Rate: 21% (74 among 351)
[Paper] [Presentation]
Youngeun Kwon and Minsoo Rhu, "A Case for Memory-Centric HPC System Architecture for Training Deep Neural Networks," IEEE Computer Architecture Letters, Jul. 2018
[Paper]
Maohua Zhu, Jason Clemons, Jeff Pool, Minsoo Rhu, Stephen W. Keckler, and Yuan Xie, "Structurally Sparsified Backward Propagation for Faster Long Short-Term Memory Training," arXiv.org
Minsoo Rhu, Mike O'Connor, Niladrish Chatterjee, Jeff Pool, Youngeun Kwon, and Stephen W. Keckler, "Compressing DMA Engine: Leveraging Activation Sparsity for Training Deep Neural Networks," The 24th IEEE International Symposium on High-Performance Computer Architecture (HPCA-24), Vienna, Austria, Feb. 2018
Minsoo Rhu, "Accelerator-centric Deep Learning Systems for Enhanced Scalability, Energy-Efficiency, and Programmability," (Invited Paper) The 23rd Asia and South Pacific Design Automation Conference (ASP-DAC), Jeju, South Korea, Feb. 2018
2017
Youngsok Kim, Jae-Eon Jo, Hanhwi Jang, Minsoo Rhu, Hanjun Kim, and Jangwoo Kim, "GPUpd: A Fast and Scalable Multi-GPU Architecture Using Cooperative Projection and Distribution," The 50th IEEE/ACM International Symposium on Microarchitecture (MICRO-50), Boston, MA, Oct. 2017
Acceptance Rate: 19% (61 among 327)
[Paper]
Angshuman Parashar, Minsoo Rhu, Anurag Mukkara, Antonio Puglielli, Rangharajan Venkatesan, Brucek Khailany, Joel Emer, Stephen W. Keckler, and William J. Dally, "SCNN: An Accelerator for Compressed-sparse Convolutional Neural Networks," The 44th IEEE/ACM International Symposium on Computer Architecture (ISCA-44), Toronto, ON, Canada, Jun. 2017
Niladrish Chatterjee, Mike O'Connor, Donghyuk Lee, Daniel R. Johnson, Stephen W. Keckler, Minsoo Rhu, and William J. Dally, "Architecting an Energy-Efficient DRAM System For GPUs," The 23rd IEEE International Symposium on High-Performance Computer Architecture (HPCA-23), Austin, TX, Feb. 2017
Acceptance Rate:
2016
Minsoo Rhu, Natalia Gimelshein, Jason Clemons, Arslan Zulfiqar, and Stephen W. Keckler, "vDNN: Virtualized Deep Neural Networks for Scalable, Memory-Efficient Neural Network Design," The 49th IEEE/ACM International Symposium on Microarchitecture (MICRO-49), Taipei, Taiwan, Oct. 2016
2015
Seong-Lyong Gong, Minsoo Rhu, Jungrae Kim, Jinsuk Chung, and Mattan Erez, "CLEAN-ECC: High Reliability ECC for Adaptive Granularity Memory System," The 48th IEEE/ACM International Symposium on Microarchitecture (MICRO-48), Waikiki, HI, Dec. 2015
Acceptance Rate: 22% (61 among 283)
Dong Li*, Minsoo Rhu, Daniel R. Johnson, Mike O'Connor, Mattan Erez, Doug Burger, Donald S. Fussell, and Stephen W. Keckler, "Priority-Based Cache Allocation for Throughput Processors," The 21st IEEE International Symposium on High-Performance Computer Architecture (HPCA-21), San Francisco, CA, Feb. 2015
2014
Jingwen Leng, Yazhou Zu, Minsoo Rhu, Meeta Sharma Gupta, and Vijay Janapa Reddi, "GPUVolt: Characterizing and Mitigating Voltage Noise in GPUs," The IEEE/ACM International Symposium on Low Power Electronics and Design (ISLPED-2014), La Jolla, CA, Aug. 2014
Acceptance Rate: 23%
2013
Minsoo Rhu, Michael Sullivan, Jingwen Leng, and Mattan Erez, "A Locality-Aware Memory Hierarchy for Energy-Efficient GPU Architectures," The 46th IEEE/ACM International Symposium on Microarchitecture (MICRO-46), Davis, CA, Dec. 2013
Minsoo Rhu and Mattan Erez, "Maximizing SIMD Resource Utilization in GPGPUs with SIMD Lane Permutation," The 40th IEEE/ACM International Symposium on Computer Architecture (ISCA-40), Tel-Aviv, Israel, Jun. 2013
Minsoo Rhu and Mattan Erez, "The Dual-Path Execution Model for Efficient GPU Control Flow," The 19th IEEE International Symposium on High-Performance Computer Architecture (HPCA-19), Shenzhen, China, Feb. 2013
2012
Minsoo Rhu and Mattan Erez, "CAPRI: Prediction of Compaction-Adequacy for Handling Control-Divergence in GPGPU Architectures," The 39th IEEE/ACM International Symposium on Computer Architecture (ISCA-39), Portland, OR, Jun. 2012
ASIC Design
2010
Minsoo Rhu and In-Cheol Park, "Optimization of Arithmetic Coding for JPEG2000," IEEE Transactions on Circuits and Systems for Video Technology, Vol. 20, No. 3, pp. 446-451, Mar. 2010
2009
Minsoo Rhu and In-Cheol Park, "Memory-less Bit-Plane Coder Architecture for JPEG2000 with Concurrent Column-Stripe Coding," IEEE International Conference on Image Processing 2009 (ICIP 2009), Cairo, Egypt, pp. 2673-2676, Nov. 2009
Minsoo Rhu and In-Cheol Park, "Architecture Design of a High-Performance Dual-Symbol Binary Arithmetic Coder for JPEG2000," IEEE International Conference on Image Processing 2009 (ICIP 2009), Cairo, Egypt, pp. 2665-2668, Nov. 2009
Minsoo Rhu and In-Cheol Park, "A Novel Trace-Pipelined Binary Arithmetic Coder Architecture for JPEG2000," IEEE Workshop on Signal Processing Systems 2009 (SiPS 2009), Tampere, Finland, pp. 243-248, Oct. 2009