Oltron: Algorithm-Hardware Co-design for Outlier-Aware Quantization of LLMs with Inter-/Intra-Layer Adaptation

Chenhao Xue, Chen Zhang, Xun Jiang, Zhutianya Gao, Yibo Lin, Guangyu Sun. Oltron: Algorithm-Hardware Co-design for Outlier-Aware Quantization of LLMs with Inter-/Intra-Layer Adaptation. In Vivek De, editor, Proceedings of the 61st ACM/IEEE Design Automation Conference, DAC 2024, San Francisco, CA, USA, June 23-27, 2024. ACM, 2024.

Abstract

Abstract is missing.
