Lian Liu, Long Cheng, Haimeng Ren, Zhaohui Xu, Yudong Pan, Mengdi Wang, Xiaowei Li, Yinhe Han, Ying Wang. COMET: Towards Practical W4A4KV4 LLMs Serving. Proc. ACM International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), 2025.
Ying Xu, Long Cheng, Xuyi Cai, Xiaohan Ma, Weiwei Chen, Lei Zhang, Ying Wang. Efficient Supernet Training Using Path Parallelism. Proc. 29th IEEE International Symposium on High-Performance Computer Architecture (HPCA 2023), Montreal, Canada, Feb 2023.