Modeling Causal Mechanisms with Diffusion Models for Interventional and Counterfactual Queries
Chao, Patrick, Blöbaum, Patrick, Patel, Sapan, Kasiviswanathan, Shiva Prasad
Year of Publication 01.02.2023
Journal Article
Jailbreaking Black Box Large Language Models in Twenty Queries
Chao, Patrick, Robey, Alexander, Dobriban, Edgar, Hassani, Hamed, Pappas, George J, Wong, Eric
Year of Publication 12.10.2023
Journal Article
JailbreakBench: An Open Robustness Benchmark for Jailbreaking Large Language Models
Chao, Patrick, Debenedetti, Edoardo, Robey, Alexander, Andriushchenko, Maksym, Croce, Francesco, Sehwag, Vikash, Dobriban, Edgar, Flammarion, Nicolas, Pappas, George J, Tramer, Florian, Hassani, Hamed, Wong, Eric
Year of Publication 27.03.2024
Journal Article
A Safe Harbor for AI Evaluation and Red Teaming
Longpre, Shayne, Kapoor, Sayash, Klyman, Kevin, Ramaswami, Ashwin, Bommasani, Rishi, Blili-Hamelin, Borhane, Huang, Yangsibo, Skowron, Aviya, Yong, Zheng-Xin, Kotha, Suhas, Zeng, Yi, Shi, Weiyan, Yang, Xianjun, Southen, Reid, Robey, Alexander, Chao, Patrick, Yang, Diyi, Jia, Ruoxi, Kang, Daniel, Pentland, Sandy, Narayanan, Arvind, Liang, Percy, Henderson, Peter
Year of Publication 07.03.2024
Journal Article
Jailbreaking Black Box Large Language Models in Twenty Queries
Chao, Patrick, Robey, Alexander, Dobriban, Edgar, Hassani, Hamed, Pappas, George J, Wong, Eric
Published in arXiv.org (18.07.2024)
Paper
JailbreakBench: An Open Robustness Benchmark for Jailbreaking Large Language Models
Chao, Patrick, Debenedetti, Edoardo, Robey, Alexander, Andriushchenko, Maksym, Croce, Francesco, Sehwag, Vikash, Dobriban, Edgar, Flammarion, Nicolas, Pappas, George J, Tramer, Florian, Hassani, Hamed, Wong, Eric
Published in arXiv.org (31.10.2024)
Paper
The Stochastic Replica Approach to Machine Learning: Stability and Parameter Optimization
Chao, Patrick, Mazaheri, Tahereh, Sun, Bo, Weingartner, Nicholas B, Nussinov, Zohar
Year of Publication 18.08.2017
Journal Article