A Theory on Adam Instability in Large-Scale Machine Learning
Molybog, Igor, Albert, Peter, Chen, Moya, DeVito, Zachary, Esiobu, David, Goyal, Naman, Koura, Punit Singh, Narang, Sharan, Poulton, Andrew, Silva, Ruan, Tang, Binh, Liskovich, Diana, Xu, Puxin, Zhang, Yuchen, Kambadur, Melanie, Roller, Stephen, Zhang, Susan
Year of Publication 19.04.2023
Journal Article
The Llama 3 Herd of Models
Hartshorn, Anthony, Yang, Aobo, Gregerson, Austen, Roziere, Baptiste, McConnell, Chris, Wong, Corinne, Song, Daniel, Dinan, Emily, Synnaeve, Gabriel, Nail, Graeme, Touvron, Hugo, Evtimov, Ivan, Liu, Jiawen, Park, Jongsoo, Alwala, Kalyan Vasuden, Plawiak, Kate, Stone, Kevin, van der Maaten, Laurens, Tan, Liang, Pavlova, Maya, Kambadur, Melanie, Chatterji, Niladri, Vasic, Petar, Weng, Peter, Bhargava, Prajjwal, Krishnan, Praveen, He, Qing, Srinivasan, Ragavan, Ganapathy, Raj, Raileanu, Roberta, Taylor, Ross, Hosseini, Saghar, Kim, Seohyun Sonia, Zhang, Shun, Whitman, Spencer, Sheasha, Tarek, Xiong, Wenhan, Fu, Wenyin, Zhang, Yuchen, Yan, Zheng, Grattafiori, Aaron, Jain, Abha, Sharma, Ajay, Boesenberg, Alex, Vaughan, Alex, Baevski, Alexei, Sangani, Amit, Leonhardi, Benjamin, Wu, Bo, Hancock, Braden, Chu, Ching-Hsiang, Civin, Damon, Wyatt, Danny, Foss, Didem, Montgomery, Elaine, Kreuk, Felix, Guzmán, Francisco, Swee, Georgia, Damlaj, Ibrahim, Molybog, Igor, Gat, Itai, Kohli, James, Asher, Japhet, Marcus, Jeff, Jin, Jian, Cummings, Joe, Carvill, Jon, Ginsburg, Josh, Wang, Junjie, Huang, Kyle, Silva, Leandro, Zhang, Lei, Yu, Licheng, Mankus, Martynas, Valko, Michal, Patel, Mihir, Samvelyan, Mikayel, Metanat, Mo, Bansal, Munish, White, Natasha, Singhal, Nayan, Egebo, Nick, Cheng, Norman, Ayub, Rafi, Bondu, Sai Jayesh, Verma, Saurabh, Wang, Sinong, Chen, Stephen, Virk, Sunny, Remez, Tal, Albiero, Vítor, Ionescu, Vlad, Poenaru, Vlad, Ivanov, Vladimir, Wu, Xilun, Hu, Ye, Adi, Yossi, Hao, Yuchen, Rosnbrick, Zef, Wen, Zhaoduo
Year of Publication 31.07.2024
Journal Article
Llama 2: Open Foundation and Fine-Tuned Chat Models
Touvron, Hugo, Martin, Louis, Stone, Kevin, Albert, Peter, Almahairi, Amjad, Babaei, Yasmine, Bashlykov, Nikolay, Batra, Soumya, Bhargava, Prajjwal, Bhosale, Shruti, Bikel, Dan, Blecher, Lukas, Ferrer, Cristian Canton, Chen, Moya, Cucurull, Guillem, Esiobu, David, Fernandes, Jude, Fu, Jeremy, Fu, Wenyin, Fuller, Brian, Gao, Cynthia, Goswami, Vedanuj, Goyal, Naman, Hartshorn, Anthony, Hosseini, Saghar, Hou, Rui, Inan, Hakan, Kardas, Marcin, Kerkez, Viktor, Khabsa, Madian, Kloumann, Isabel, Korenev, Artem, Koura, Punit Singh, Lachaux, Marie-Anne, Lavril, Thibaut, Lee, Jenya, Liskovich, Diana, Lu, Yinghai, Mao, Yuning, Martinet, Xavier, Mihaylov, Todor, Mishra, Pushkar, Molybog, Igor, Nie, Yixin, Poulton, Andrew, Reizenstein, Jeremy, Rungta, Rashi, Saladi, Kalyan, Schelten, Alan, Silva, Ruan, Smith, Eric Michael, Subramanian, Ranjan, Tan, Xiaoqing Ellen, Tang, Binh, Taylor, Ross, Williams, Adina, Kuan, Jian Xiang, Xu, Puxin, Yan, Zheng, Zarov, Iliyan, Zhang, Yuchen, Fan, Angela, Kambadur, Melanie, Narang, Sharan, Rodriguez, Aurelien, Stojnic, Robert, Edunov, Sergey, Scialom, Thomas
Year of Publication 18.07.2023
Journal Article
A Theory on Adam Instability in Large-Scale Machine Learning
Molybog, Igor, Albert, Peter, Chen, Moya, DeVito, Zachary, Esiobu, David, Goyal, Naman, Koura, Punit Singh, Narang, Sharan, Poulton, Andrew, Silva, Ruan, Tang, Binh, Liskovich, Diana, Xu, Puxin, Zhang, Yuchen, Kambadur, Melanie, Roller, Stephen, Zhang, Susan
Published in arXiv.org (25.04.2023)
Paper
The Llama 3 Herd of Models
Dubey, Abhimanyu, Letman, Aiesha, Yang, Aobo, Mitra, Archi, Bi, Chloe, Touret, Christophe, Song, Daniel, Perino, Diego, Mialon, Gregoire, Pang, Guan, Nguyen, Hailey, Korevaar, Hannah, Imanol Arrieta Ibarra, Kloumann, Isabel, Shah, Jeet, Fu, Jeremy, Spisak, Joe, Jia, Junteng, Upasani, Kartikeya, Heafield, Kenneth, Stone, Kevin, El-Arini, Khalid, Iyer, Krithika, Chiu, Kuenley, Martin, Louis, Malo, Lubo, Duchenne, Olivier, Weng, Peter, Bhargava, Prajjwal, Dong, Qingxiao, Patel, Rohit, Sauvestre, Romain, Taylor, Ross, Wan, Shengye, Bhosale, Shruti, Vandenhende, Simon, Whitman, Spencer, Sootla, Sten, Speckbacher, Tobias, Karn, Ujjwal, Wang, Xuewei, Goldschlag, Yaelle, Wen, Yi, Song, Yiwen, Chen, Zhengxing, Jain, Abha, Kelsey, Adam, Victoria, Adolfo, Saraf, Aparajita, Eisenman, Assaf, Hancock, Braden, Spence, Brandon, Hu, Chester, Beaty, Dana, Xu, David, Dowling, Edward, Sun, Fei, Tian, Feng, Seide, Frank, Gabriela Medina Florez, Schwarz, Gabriella, Zhang, Zou, Han, Habeeb, Haroun, Goldman, Hunter, Kohli, James, Tang, Jeff, Zhong, Jessica, Yang, Jingyi, Kam, Hou U, Lakhotia, Kushal, Moshkovich, Liron, Khabsa, Madian, Bhatt, Manish, Mankus, Martynas, Keneally, Meghan, Clark, Mike, Laptev, Nikolay Pavlovich, Parkin, Kent, Balaji, Pavan, Dollar, Piotr, Yuvraj, Pritish, Hogan, Rebekkah, Maheswari, Rohan, Bondu, Sai Jayesh, Datta, Samyak, Chugh, Sara, Dhillon, Sargun, Virk, Sunny, Remez, Tal, Glaser, Tamar, Robinson, Thomas, Li, Tianhe, Matthews, Tim, Mangla, Vishal, Mihailescu, Vlad Tiberiu, Wu, Xilun, Hu, Ye, Qian, Yundi, Rait, Zach
Published in arXiv.org (15.08.2024)
Paper
Llama 2: Open Foundation and Fine-Tuned Chat Models
Touvron, Hugo, Martin, Louis, Stone, Kevin, Albert, Peter, Almahairi, Amjad, Babaei, Yasmine, Bashlykov, Nikolay, Batra, Soumya, Bhargava, Prajjwal, Bhosale, Shruti, Bikel, Dan, Blecher, Lukas, Ferrer, Cristian Canton, Chen, Moya, Cucurull, Guillem, Esiobu, David, Fernandes, Jude, Fu, Jeremy, Fu, Wenyin, Fuller, Brian, Gao, Cynthia, Goswami, Vedanuj, Goyal, Naman, Hartshorn, Anthony, Hosseini, Saghar, Hou, Rui, Inan, Hakan, Kardas, Marcin, Kerkez, Viktor, Khabsa, Madian, Kloumann, Isabel, Korenev, Artem, Koura, Punit Singh, Lachaux, Marie-Anne, Lavril, Thibaut, Lee, Jenya, Liskovich, Diana, Lu, Yinghai, Mao, Yuning, Martinet, Xavier, Mihaylov, Todor, Mishra, Pushkar, Molybog, Igor, Nie, Yixin, Poulton, Andrew, Reizenstein, Jeremy, Rungta, Rashi, Saladi, Kalyan, Schelten, Alan, Silva, Ruan, Smith, Eric Michael, Subramanian, Ranjan, Tan, Xiaoqing Ellen, Tang, Binh, Taylor, Ross, Williams, Adina, Kuan, Jian Xiang, Xu, Puxin, Yan, Zheng, Zarov, Iliyan, Zhang, Yuchen, Fan, Angela, Kambadur, Melanie, Narang, Sharan, Rodriguez, Aurelien, Stojnic, Robert, Edunov, Sergey, Scialom, Thomas
Published in arXiv.org (19.07.2023)
Paper