SoK: On Finding Common Ground in Loss Landscapes Using Deep Model Merging Techniques
Khan, Arham, Nief, Todd, Hudson, Nathaniel, Sakarvadia, Mansi, Grzenda, Daniel, Ajith, Aswathy, Pettyjohn, Jordan, Chard, Kyle, Foster, Ian
Date of Publication 16.10.2024
Journal Article
Attention Lens: A Tool for Mechanistically Interpreting the Attention Head Information Retrieval Mechanism
Sakarvadia, Mansi, Khan, Arham, Ajith, Aswathy, Grzenda, Daniel, Hudson, Nathaniel, Bauer, André, Chard, Kyle, Foster, Ian
Date of Publication 24.10.2023
Journal Article
Memory Injections: Correcting Multi-Hop Reasoning Failures during Inference in Transformer-Based Language Models
Sakarvadia, Mansi, Ajith, Aswathy, Khan, Arham, Grzenda, Daniel, Hudson, Nathaniel, Bauer, André, Chard, Kyle, Foster, Ian
Date of Publication 11.09.2023
Journal Article
NuGraph2: A Graph Neural Network for Neutrino Physics Event Reconstruction
Hewes, V, Aurisano, Adam, Cerati, Giuseppe, Kowalkowski, Jim, Lee, Claire, Liao, Wei-keng, Grzenda, Daniel, Gumpula, Kaushal, Zhang, Xiaohe
Date of Publication 18.03.2024
Journal Article
An Empirical Investigation of Container Building Strategies and Warm Times to Reduce Cold Starts in Scientific Computing Serverless Functions
Bauer, Andre, Gonthier, Maxime, Pan, Haochen, Chard, Ryan, Grzenda, Daniel, Straesser, Martin, Pauloski, J. Gregory, Kamatar, Alok, Baughman, Matt, Hudson, Nathaniel, Foster, Ian, Chard, Kyle
Published in 2024 IEEE 20th International Conference on e-Science (e-Science) (16.09.2024)
Conference Proceeding