HAL: Hardware-assisted Load Balancing for Energy-efficient SNIC-Host Cooperative Computing
Huang, Jinghan, Lou, Jiaqi, Vanavasam, Srikar, Kong, Xinhao, Ji, Houxiang, Jeong, Ipoom, Zhuo, Danyang, Lee, Eun Kyung, Kim, Nam Sung
Published in 2024 ACM/IEEE 51st Annual International Symposium on Computer Architecture (ISCA) (29.06.2024)
Conference Proceeding
VcLLM: Video Codecs are Secretly Tensor Codecs
Xu, Ceyu, Wu, Yongji, Yang, Xinyu, Chen, Beidi, Lentz, Matthew, Zhuo, Danyang, Wills, Lisa Wu
Year of Publication 29.06.2024
Journal Article
Adaptive and Dynamic Multi-Resolution Hashing for Pairwise Summations
Qin, Lianke, Reddy, Aravind, Song, Zhao, Xu, Zhaozhuo, Zhuo, Danyang
Published in 2022 IEEE International Conference on Big Data (Big Data) (17.12.2022)
Conference Proceeding
Curator: Efficient Indexing for Multi-Tenant Vector Databases
Jin, Yicheng, Wu, Yongji, Hu, Wenjun, Maggs, Bruce M, Zhang, Xiao, Zhuo, Danyang
Year of Publication 13.01.2024
Journal Article
Punica: Multi-Tenant LoRA Serving
Chen, Lequn, Ye, Zihao, Wu, Yongji, Zhuo, Danyang, Ceze, Luis, Krishnamurthy, Arvind
Year of Publication 27.10.2023
Journal Article
Lazarus: Resilient and Elastic Training of Mixture-of-Experts Models with Adaptive Expert Placement
Wu, Yongji, Qu, Wenjie, Tao, Tianyang, Wang, Zhuang, Bai, Wei, Li, Zhuohao, Tian, Yuan, Zhang, Jiaheng, Lentz, Matthew, Zhuo, Danyang
Year of Publication 05.07.2024
Journal Article
Symphony: Optimized DNN Model Serving using Deferred Batch Scheduling
Chen, Lequn, Deng, Weixin, Canumalla, Anirudh, Xin, Yu, Zhuo, Danyang, Philipose, Matthai, Krishnamurthy, Arvind
Year of Publication 14.08.2023
Journal Article
Agile Development of Linux Schedulers with Ekiben
Miller, Samantha, Kumar, Anirudh, Vakharia, Tanay, Anderson, Tom, Chen, Ang, Zhuo, Danyang
Year of Publication 26.06.2023
Journal Article
Collie: Finding Performance Anomalies in RDMA Subsystems
Kong, Xinhao, Zhu, Yibo, Zhou, Huaping, Jiang, Zhuo, Ye, Jianxi, Guo, Chuanxiong, Zhuo, Danyang
Year of Publication 22.04.2023
Journal Article
Adaptive Skeleton Graph Decoding
Jin, Shuowei, Wu, Yongji, Zheng, Haizhong, Zhang, Qingzhao, Lentz, Matthew, Mao, Z. Morley, Prakash, Atul, Qian, Feng, Zhuo, Danyang
Year of Publication 19.02.2024
Journal Article
Fairness in Serving Large Language Models
Sheng, Ying, Cao, Shiyi, Li, Dacheng, Zhu, Banghua, Li, Zhuohan, Zhuo, Danyang, Gonzalez, Joseph E, Stoica, Ion
Year of Publication 31.12.2023
Journal Article