Efficient automatic parallelization of a single GPU program for a multiple GPU system
Matam, Kiran Kumar, Abdel-Majeed, Mohammad Rajab, Annavaram, Murali
Published in Integration (Amsterdam) (01.05.2019)
Journal Article
Accelerating Sparse Matrix Vector Multiplication in Iterative Methods Using GPU
Matam, K. K., Kothapalli, K.
Published in 2011 International Conference on Parallel Processing (01.09.2011)
Conference Proceeding
GraphSSD: Graph Semantics Aware SSD
Matam, Kiran Kumar, Koo, Gunjae, Zha, Haipeng, Tseng, Hung-Wei, Annavaram, Murali
Published in 2019 ACM/IEEE 46th Annual International Symposium on Computer Architecture (ISCA) (01.06.2019)
Conference Proceeding
Summarizer: trading communication with computing near storage
Koo, Gunjae, Matam, Kiran Kumar, I, Te, Narra, H. V. Krishna Giri, Li, Jing, Tseng, Hung-Wei, Swanson, Steven, Annavaram, Murali
Published in 2017 50th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO) (14.10.2017)
Conference Proceeding
Efficient Discrete Range Searching primitives on the GPU with applications
Soman, J, Kumar, M K, Kothapalli, K, Narayanan, P J
Published in 2010 International Conference on High Performance Computing (01.12.2010)
Conference Proceeding
Check-N-Run: A Checkpointing System for Training Deep Learning Recommendation Models
Eisenman, Assaf, Matam, Kiran Kumar, Ingram, Steven, Mudigere, Dheevatsa, Krishnamoorthi, Raghuraman, Nair, Krishnakumar, Smelyanskiy, Misha, Annavaram, Murali
Year of Publication 16.10.2020
Journal Article
Energy-efficient large-scale matrix multiplication on FPGAs
Matam, Kiran Kumar, Prasanna, Viktor K.
Published in 2013 International Conference on Reconfigurable Computing and FPGAs (ReConFig) (01.12.2013)
Conference Proceeding
Evaluating energy efficiency of floating point matrix multiplication on FPGAs
Matam, Kiran Kumar, Hoang Le, Prasanna, Viktor K.
Published in 2013 IEEE High Performance Extreme Computing Conference (HPEC) (01.09.2013)
Conference Proceeding
Software-Hardware Co-design for Fast and Scalable Training of Deep Learning Recommendation Models
Mudigere, Dheevatsa, Hao, Yuchen, Huang, Jianyu, Jia, Zhihao, Tulloch, Andrew, Sridharan, Srinivas, Liu, Xing, Ozdal, Mustafa, Nie, Jade, Park, Jongsoo, Luo, Liang, Yang, Jie Amy, Gao, Leon, Ivchenko, Dmytro, Basant, Aarti, Hu, Yuxi, Yang, Jiyan, Ardestani, Ehsan K, Wang, Xiaodong, Komuravelli, Rakesh, Chu, Ching-Hsiang, Yilmaz, Serhat, Li, Huayu, Qian, Jiyuan, Feng, Zhuobo, Ma, Yinbin, Yang, Junjie, Wen, Ellie, Li, Hong, Yang, Lin, Sun, Chonglin, Zhao, Whitney, Melts, Dimitry, Dhulipala, Krishna, Kishore, KR, Graf, Tyler, Eisenman, Assaf, Matam, Kiran Kumar, Gangidi, Adi, Chen, Guoqiang Jerry, Krishnan, Manoj, Nayak, Avinash, Nair, Krishnakumar, Muthiah, Bharath, Khorashadi, Mahmoud, Bhattacharya, Pallab, Lapukhov, Petr, Naumov, Maxim, Mathews, Ajit, Qiao, Lin, Smelyanskiy, Mikhail, Jia, Bill, Rao, Vijay
Year of Publication 11.04.2021
Journal Article
Check-N-Run: A Checkpointing System for Training Deep Learning Recommendation Models
Eisenman, Assaf, Matam, Kiran Kumar, Ingram, Steven, Mudigere, Dheevatsa, Krishnamoorthi, Raghuraman, Nair, Krishnakumar, Smelyanskiy, Misha, Annavaram, Murali
Published in arXiv.org (04.05.2021)
Paper
Energy efficient architecture for matrix multiplication on FPGAs
Matam, Kiran Kumar, Hoang Le, Prasanna, Viktor K.
Published in 2013 23rd International Conference on Field programmable Logic and Applications (01.09.2013)
Conference Proceeding
Software-Hardware Co-design for Fast and Scalable Training of Deep Learning Recommendation Models
Mudigere, Dheevatsa, Hao, Yuchen, Huang, Jianyu, Jia, Zhihao, Tulloch, Andrew, Sridharan, Srinivas, Liu, Xing, Ozdal, Mustafa, Nie, Jade, Park, Jongsoo, Luo, Liang, Yang, Jie Amy, Gao, Leon, Ivchenko, Dmytro, Basant, Aarti, Hu, Yuxi, Yang, Jiyan, Ardestani, Ehsan K, Wang, Xiaodong, Komuravelli, Rakesh, Chu, Ching-Hsiang, Yilmaz, Serhat, Li, Huayu, Qian, Jiyuan, Feng, Zhuobo, Ma, Yinbin, Yang, Junjie, Wen, Ellie, Li, Hong, Yang, Lin, Sun, Chonglin, Zhao, Whitney, Melts, Dimitry, Dhulipala, Krishna, Kishore, K R, Graf, Tyler, Eisenman, Assaf, Matam, Kiran Kumar, Gangidi, Adi, Chen, Guoqiang Jerry, Krishnan, Manoj, Nayak, Avinash, Nair, Krishnakumar, Muthiah, Bharath, Khorashadi, Mahmoud, Bhattacharya, Pallab, Lapukhov, Petr, Naumov, Maxim, Mathews, Ajit, Qiao, Lin, Smelyanskiy, Mikhail, Jia, Bill, Rao, Vijay
Published in arXiv.org (27.02.2023)
Paper
GPU Accelerated Lanczos Algorithm with Applications
Matam, K K, Kothapalli, K
Published in 2011 IEEE Workshops of International Conference on Advanced Information Networking and Applications (01.03.2011)
Conference Proceeding