Understanding Robustness of Transformers for Image Classification
Bhojanapalli, Srinadh, Chakrabarti, Ayan, Glasner, Daniel, Li, Daliang, Unterthiner, Thomas, Veit, Andreas
Published in 2021 IEEE/CVF International Conference on Computer Vision (ICCV) (01.10.2021)
Conference Proceeding
Provable compressed sensing quantum state tomography via non-convex methods
Kyrillidis, Anastasios, Kalev, Amir, Park, Dohyung, Bhojanapalli, Srinadh, Caramanis, Constantine, Sanghavi, Sujay
Published in npj quantum information (01.08.2018)
Journal Article
Implicit Regularization in Matrix Factorization
Gunasekar, Suriya, Woodworth, Blake, Bhojanapalli, Srinadh, Neyshabur, Behnam, Srebro, Nathan
Published in 2018 Information Theory and Applications Workshop (ITA) (01.02.2018)
Conference Proceeding
Mimetic Initialization Helps State Space Models Learn to Recall
Trockman, Asher, Harutyunyan, Hrayr, Kolter, J. Zico, Kumar, Sanjiv, Bhojanapalli, Srinadh
Year of Publication 14.10.2024
Journal Article
On the Adversarial Robustness of Mixture of Experts
Puigcerver, Joan, Jenatton, Rodolphe, Riquelme, Carlos, Awasthi, Pranjal, Bhojanapalli, Srinadh
Year of Publication 18.10.2022
Journal Article
Position Coupling: Improving Length Generalization of Arithmetic Transformers Using Task Structure
Cho, Hanseul, Cha, Jaeyoung, Awasthi, Pranjal, Bhojanapalli, Srinadh, Gupta, Anupam, Yun, Chulhee
Year of Publication 31.05.2024
Journal Article
Efficient Language Model Architectures for Differentially Private Federated Learning
Ro, Jae Hun, Bhojanapalli, Srinadh, Xu, Zheng, Zhang, Yanxiang, Suresh, Ananda Theertha
Year of Publication 12.03.2024
Journal Article
HiRE: High Recall Approximate Top-$k$ Estimation for Efficient LLM Inference
L, Yashas Samaga B, Yerram, Varun, You, Chong, Bhojanapalli, Srinadh, Kumar, Sanjiv, Jain, Prateek, Netrapalli, Praneeth
Year of Publication 14.02.2024
Journal Article
Dual-Encoders for Extreme Multi-Label Classification
Gupta, Nilesh, Khatri, Devvrit, Rawat, Ankit S, Bhojanapalli, Srinadh, Jain, Prateek, Dhillon, Inderjit
Year of Publication 16.10.2023
Journal Article
Depth Dependence of $\mu$P Learning Rates in ReLU MLPs
Jelassi, Samy, Hanin, Boris, Ji, Ziwei, Reddi, Sashank J, Bhojanapalli, Srinadh, Kumar, Sanjiv
Year of Publication 12.05.2023
Journal Article
Teacher's pet: understanding and mitigating biases in distillation
Lukasik, Michal, Bhojanapalli, Srinadh, Menon, Aditya Krishna, Kumar, Sanjiv
Year of Publication 19.06.2021
Journal Article
On student-teacher deviations in distillation: does it pay to disobey?
Nagarajan, Vaishnavh, Menon, Aditya Krishna, Bhojanapalli, Srinadh, Mobahi, Hossein, Kumar, Sanjiv
Year of Publication 30.01.2023
Journal Article
Functional Interpolation for Relative Positions Improves Long Context Transformers
Li, Shanda, You, Chong, Guruganesh, Guru, Ainslie, Joshua, Ontanon, Santiago, Zaheer, Manzil, Sanghai, Sumit, Yang, Yiming, Kumar, Sanjiv, Bhojanapalli, Srinadh
Year of Publication 06.10.2023
Journal Article
An efficient nonconvex reformulation of stagewise convex optimization problems
Bunel, Rudy, Hinder, Oliver, Bhojanapalli, Srinadh, Krishnamurthy, Dvijotham
Year of Publication 27.10.2020
Journal Article
Arithmetic Transformers Can Length-Generalize in Both Operand Length and Count
Cho, Hanseul, Cha, Jaeyoung, Bhojanapalli, Srinadh, Yun, Chulhee
Published in arXiv.org (21.10.2024)
Paper
Robust Training of Neural Networks Using Scale Invariant Architectures
Li, Zhiyuan, Bhojanapalli, Srinadh, Zaheer, Manzil, Reddi, Sashank J, Kumar, Sanjiv
Year of Publication 02.02.2022
Journal Article
Does label smoothing mitigate label noise?
Lukasik, Michal, Bhojanapalli, Srinadh, Menon, Aditya Krishna, Kumar, Sanjiv
Year of Publication 05.03.2020
Journal Article