Automatic Cross-Replica Sharding of Weight Update in Data-Parallel Training
Xu, Yuanzhong, Lee, HyoukJoong, Chen, Dehao, Choi, Hongjun, Hechtman, Blake, Wang, Shibo
Year of Publication 28.04.2020
Journal Article
GShard: Scaling Giant Models with Conditional Computation and Automatic Sharding
Lepikhin, Dmitry, Lee, HyoukJoong, Xu, Yuanzhong, Chen, Dehao, Firat, Orhan, Huang, Yanping, Krikun, Maxim, Shazeer, Noam, Chen, Zhifeng
Year of Publication 30.06.2020
Journal Article
GSPMD: General and Scalable Parallelization for ML Computation Graphs
Xu, Yuanzhong, Lee, HyoukJoong, Chen, Dehao, Hechtman, Blake, Huang, Yanping, Joshi, Rahul, Krikun, Maxim, Lepikhin, Dmitry, Ly, Andy, Maggioni, Marcello, Pang, Ruoming, Shazeer, Noam, Wang, Shibo, Wang, Tao, Wu, Yonghui, Chen, Zhifeng
Year of Publication 10.05.2021
Journal Article
Exploring the limits of Concurrency in ML Training on Google TPUs
Kumar, Sameer, Bradbury, James, Young, Cliff, Wang, Yu Emma, Levskaya, Anselm, Hechtman, Blake, Chen, Dehao, Lee, HyoukJoong, Deveci, Mehmet, Kumar, Naveen, Kanwar, Pankaj, Wang, Shibo, Wanderman-Milne, Skye, Lacy, Steve, Wang, Tao, Oguntebi, Tayo, Zu, Yazhou, Xu, Yuanzhong, Swing, Andy
Year of Publication 06.11.2020
Journal Article
Scale MLPerf-0.6 models on Google TPU-v3 Pods
Kumar, Sameer, Bitorff, Victor, Chen, Dehao, Chou, Chiachen, Hechtman, Blake, Lee, HyoukJoong, Kumar, Naveen, Mattson, Peter, Wang, Shibo, Wang, Tao, Xu, Yuanzhong, Zhou, Zongwei
Year of Publication 20.09.2019
Journal Article
GPipe: Efficient Training of Giant Neural Networks using Pipeline Parallelism
Huang, Yanping, Cheng, Youlong, Bapna, Ankur, Firat, Orhan, Chen, Mia Xu, Chen, Dehao, Lee, HyoukJoong, Ngiam, Jiquan, Le, Quoc V, Wu, Yonghui, Chen, Zhifeng
Year of Publication 16.11.2018
Journal Article
LaMDA: Language Models for Dialog Applications
Thoppilan, Romal, De Freitas, Daniel, Hall, Jamie, Shazeer, Noam, Kulshreshtha, Apoorv, Cheng, Heng-Tze, Jin, Alicia, Bos, Taylor, Baker, Leslie, Du, Yu, Li, YaGuang, Lee, Hongrae, Zheng, Huaixiu Steven, Ghafouri, Amin, Menegali, Marcelo, Huang, Yanping, Krikun, Maxim, Lepikhin, Dmitry, Qin, James, Chen, Dehao, Xu, Yuanzhong, Chen, Zhifeng, Roberts, Adam, Bosma, Maarten, Zhao, Vincent, Zhou, Yanqi, Chang, Chung-Ching, Krivokon, Igor, Rusch, Will, Pickett, Marc, Srinivasan, Pranesh, Man, Laichee, Meier-Hellstern, Kathleen, Morris, Meredith Ringel, Doshi, Tulsee, Delos Santos, Renelito, Duke, Toju, Soraker, Johnny, Zevenbergen, Ben, Prabhakaran, Vinodkumar, Diaz, Mark, Hutchinson, Ben, Olson, Kristen, Molina, Alejandra, Hoffman-John, Erin, Lee, Josh, Aroyo, Lora, Rajakumar, Ravi, Butryna, Alena, Lamm, Matthew, Kuzmina, Viktoriya, Fenton, Joe, Cohen, Aaron, Bernstein, Rachel, Kurzweil, Ray, Aguera-Arcas, Blaise, Cui, Claire, Croak, Marian, Chi, Ed, Le, Quoc
Year of Publication 20.01.2022
Journal Article
MLPerf Training Benchmark
Mattson, Peter, Cheng, Christine, Coleman, Cody, Diamos, Greg, Micikevicius, Paulius, Patterson, David, Tang, Hanlin, Wei, Gu-Yeon, Bailis, Peter, Bittorf, Victor, Brooks, David, Chen, Dehao, Dutta, Debojyoti, Gupta, Udit, Hazelwood, Kim, Hock, Andrew, Huang, Xinyuan, Ike, Atsushi, Jia, Bill, Kang, Daniel, Kanter, David, Kumar, Naveen, Liao, Jeffery, Ma, Guokai, Narayanan, Deepak, Oguntebi, Tayo, Pekhimenko, Gennady, Pentecost, Lillian, Reddi, Vijay Janapa, Robie, Taylor, St. John, Tom, Tabaru, Tsuguchika, Wu, Carole-Jean, Xu, Lingjie, Yamazaki, Masafumi, Young, Cliff, Zaharia, Matei
Year of Publication 02.10.2019
Journal Article
Lingvo: a Modular and Scalable Framework for Sequence-to-Sequence Modeling
Shen, Jonathan, Nguyen, Patrick, Wu, Yonghui, Chen, Zhifeng, Chen, Mia X, Jia, Ye, Kannan, Anjuli, Sainath, Tara, Cao, Yuan, Chiu, Chung-Cheng, He, Yanzhang, Chorowski, Jan, Hinsu, Smit, Laurenzo, Stella, Qin, James, Firat, Orhan, Macherey, Wolfgang, Gupta, Suyog, Bapna, Ankur, Zhang, Shuyuan, Pang, Ruoming, Weiss, Ron J, Prabhavalkar, Rohit, Liang, Qiao, Jacob, Benoit, Liang, Bowen, Lee, HyoukJoong, Chelba, Ciprian, Jean, Sébastien, Li, Bo, Johnson, Melvin, Anil, Rohan, Tibrewal, Rajat, Liu, Xiaobing, Eriguchi, Akiko, Jaitly, Navdeep, Ari, Naveen, Cherry, Colin, Haghani, Parisa, Good, Otavio, Cheng, Youlong, Alvarez, Raziel, Caswell, Isaac, Hsu, Wei-Ning, Yang, Zongheng, Wang, Kuan-Chieh, Gonina, Ekaterina, Tomanek, Katrin, Vanik, Ben, Wu, Zelin, Jones, Llion, Schuster, Mike, Huang, Yanping, Chen, Dehao, Irie, Kazuki, Foster, George, Richardson, John, Macherey, Klaus, Bruguier, Antoine, Zen, Heiga, Raffel, Colin, Kumar, Shankar, Rao, Kanishka, Rybach, David, Murray, Matthew, Peddinti, Vijayaditya, Krikun, Maxim, Bacchiani, Michiel A. U, Jablin, Thomas B, Suderman, Rob, Williams, Ian, Lee, Benjamin, Bhatia, Deepti, Carlson, Justin, Yavuz, Semih, Zhang, Yu, McGraw, Ian, Galkin, Max, Ge, Qi, Pundak, Golan, Whipkey, Chad, Wang, Todd, Alon, Uri, Lepikhin, Dmitry, Tian, Ye, Sabour, Sara, Chan, William, Toshniwal, Shubham, Liao, Baohua, Nirschl, Michael, Rondon, Pat
Year of Publication 21.02.2019
Journal Article