CompressPoints: An Evaluation Methodology for Compressed Memory Systems
Choukse, Esha, Erez, Mattan, Alameldeen, Alaa
Published in IEEE Computer Architecture Letters (01.07.2018)
Journal Article
Towards Improved Power Management in Cloud GPUs
Patel, Pratyush, Gong, Zibo, Rizvi, Syeda, Choukse, Esha, Misra, Pulkit, Anderson, Tom, Sriraman, Akshitha
Published in IEEE Computer Architecture Letters (01.07.2023)
Journal Article
Compresso: pragmatic main memory compression
Choukse, Esha, Erez, Mattan, Alameldeen, Alaa R.
Published in 2018 51st Annual IEEE/ACM International Symposium on Microarchitecture (MICRO) (20.10.2018)
Conference Proceeding
Buddy Compression: Enabling Larger Memory for Deep Learning and HPC Workloads on GPUs
Choukse, Esha, Sullivan, Michael B., O'Connor, Mike, Erez, Mattan, Pool, Jeff, Nellans, David, Keckler, Stephen W.
Published in 2020 ACM/IEEE 47th Annual International Symposium on Computer Architecture (ISCA) (01.05.2020)
Conference Proceeding
Overclocking in Immersion-Cooled Datacenters
Misra, Pulkit A., Manousakis, Ioannis, Choukse, Esha, Jalili, Majid, Goiri, Inigo, Raniwala, Ashish, Warrier, Brijesh, Alissa, Husam, Ramakrishnan, Bharath, Tuma, Phillip, Belady, Christian, Fontoura, Marcus, Bianchini, Ricardo
Published in IEEE Micro (01.07.2022)
Journal Article
Translation-optimized Memory Compression for Capacity
Panwar, Gagandeep, Laghari, Muhammad, Bears, David, Liu, Yuqing, Jearls, Chandler, Choukse, Esha, Cameron, Kirk W., Butt, Ali R., Jian, Xun
Published in 2022 55th IEEE/ACM International Symposium on Microarchitecture (MICRO) (01.10.2022)
Conference Proceeding
DynamoLLM: Designing LLM Inference Clusters for Performance and Energy Efficiency
Stojkovic, Jovan, Zhang, Chaojie, Goiri, Íñigo, Torrellas, Josep, Choukse, Esha
Year of Publication 01.08.2024
Journal Article
Towards Greener LLMs: Bringing Energy-Efficiency to the Forefront of LLM Inference
Stojkovic, Jovan, Choukse, Esha, Zhang, Chaojie, Goiri, Inigo, Torrellas, Josep
Year of Publication 29.03.2024
Journal Article
Mnemosyne: Parallelization Strategies for Efficiently Serving Multi-Million Context Length LLM Inference Requests Without Approximations
Agrawal, Amey, Chen, Junda, Goiri, Íñigo, Ramjee, Ramachandran, Zhang, Chaojie, Tumanov, Alexey, Choukse, Esha
Year of Publication 25.09.2024
Journal Article
Splitwise: Efficient Generative LLM Inference Using Phase Splitting
Patel, Pratyush, Choukse, Esha, Zhang, Chaojie, Shah, Aashaka, Goiri, Inigo, Maleki, Saeed, Bianchini, Ricardo
Published in 2024 ACM/IEEE 51st Annual International Symposium on Computer Architecture (ISCA) (29.06.2024)
Conference Proceeding
Splitwise: Efficient generative LLM inference using phase splitting
Patel, Pratyush, Choukse, Esha, Zhang, Chaojie, Shah, Aashaka, Goiri, Íñigo, Maleki, Saeed, Bianchini, Ricardo
Year of Publication 30.11.2023
Journal Article
POLCA: Power Oversubscription in LLM Cloud Providers
Patel, Pratyush, Choukse, Esha, Zhang, Chaojie, Goiri, Íñigo, Warrier, Brijesh, Mahalingam, Nithish, Bianchini, Ricardo
Year of Publication 24.08.2023
Journal Article
Junctiond: Extending FaaS Runtimes with Kernel-Bypass
Saurez, Enrique, Fried, Joshua, Chaudhry, Gohar Irfan, Choukse, Esha, Goiri, Íñigo, Elnikety, Sameh, Belay, Adam, Fonseca, Rodrigo
Year of Publication 05.03.2024
Journal Article
Intelligent Router for LLM Workloads: Improving Performance Through Workload-Aware Scheduling
Jain, Kunal, Parayil, Anjaly, Mallick, Ankur, Choukse, Esha, Qin, Xiaoting, Zhang, Jue, Goiri, Íñigo, Wang, Rujia, Bansal, Chetan, Rühle, Victor, Kulkarni, Anoop, Kofsky, Steve, Rajmohan, Saravan
Year of Publication 24.08.2024
Journal Article
Designing Cloud Servers for Lower Carbon
Wang, Jaylen, Berger, Daniel S., Kazhamiaka, Fiodar, Irvene, Celine, Zhang, Chaojie, Choukse, Esha, Frost, Kali, Fonseca, Rodrigo, Warrier, Brijesh, Bansal, Chetan, Stern, Jonathan, Bianchini, Ricardo, Sriraman, Akshitha
Published in 2024 ACM/IEEE 51st Annual International Symposium on Computer Architecture (ISCA) (29.06.2024)
Conference Proceeding