Imperceptible Content Poisoning in LLM-Powered Applications

Large Language Models (LLMs) have shown their superior capability in natural language processing, promoting extensive LLM-powered applications to be the new portals for people to access various content on the Internet. However, LLM-powered applications do not have sufficient security considerations...

Full description

Saved in:

Bibliographic Details
Published in	IEEE/ACM International Conference on Automated Software Engineering : [proceedings] pp. 242 - 254
Main Authors	Zhang, Quan, Zhou, Chijin, Go, Gwihwan, Zeng, Binqi, Shi, Heyuan, Xu, Zichen, Jiang, Yu
Format	Conference Proceeding
Language	English
Published	ACM 27.10.2024
Subjects	Content Poisoning Focusing Internet Large language models LLM Applications Machine learning Natural language processing Portals Prevention and mitigation Privacy Security Software engineering
Online Access	Get full text
ISSN	2643-1572
DOI	10.1145/3691620.3695001

Cover

Abstract	Large Language Models (LLMs) have shown their superior capability in natural language processing, promoting extensive LLM-powered applications to be the new portals for people to access various content on the Internet. However, LLM-powered applications do not have sufficient security considerations on untrusted content, leading to potential threats. In this paper, we reveal content poisoning, where attackers can tailor attack content that appears benign to humans but causes LLM-powered applications to generate malicious responses. To highlight the impact of content poisoning and inspire the development of effective defenses, we systematically analyze the attack, focusing on the attack modes in various content, exploitable design features of LLM application frameworks, and the generation of attack content. We carry out a comprehensive evaluation on five LLMs, where content poisoning achieves an average attack success rate of 89.60%. Additionally, we assess content poisoning on four popular LLM-powered applications, achieving the attack on 72.00% of the content. Our experimental results also show that existing defenses are ineffective against content poisoning. Finally, we discuss potential mitigations for LLM application frameworks to counter content poisoning.CCS CONCEPTS* Computing methodologies → Machine learning; * Security and privacy;
AbstractList	Large Language Models (LLMs) have shown their superior capability in natural language processing, promoting extensive LLM-powered applications to be the new portals for people to access various content on the Internet. However, LLM-powered applications do not have sufficient security considerations on untrusted content, leading to potential threats. In this paper, we reveal content poisoning, where attackers can tailor attack content that appears benign to humans but causes LLM-powered applications to generate malicious responses. To highlight the impact of content poisoning and inspire the development of effective defenses, we systematically analyze the attack, focusing on the attack modes in various content, exploitable design features of LLM application frameworks, and the generation of attack content. We carry out a comprehensive evaluation on five LLMs, where content poisoning achieves an average attack success rate of 89.60%. Additionally, we assess content poisoning on four popular LLM-powered applications, achieving the attack on 72.00% of the content. Our experimental results also show that existing defenses are ineffective against content poisoning. Finally, we discuss potential mitigations for LLM application frameworks to counter content poisoning.CCS CONCEPTS* Computing methodologies → Machine learning; * Security and privacy;
Author	Shi, Heyuan Zhou, Chijin Xu, Zichen Zeng, Binqi Go, Gwihwan Zhang, Quan Jiang, Yu
Author_xml	– sequence: 1 givenname: Quan surname: Zhang fullname: Zhang, Quan organization: Tsinghua University,Beijing,China – sequence: 2 givenname: Chijin surname: Zhou fullname: Zhou, Chijin organization: Tsinghua University,Beijing,China – sequence: 3 givenname: Gwihwan surname: Go fullname: Go, Gwihwan organization: Tsinghua University,Beijing,China – sequence: 4 givenname: Binqi surname: Zeng fullname: Zeng, Binqi organization: Central South University,Changsha,China – sequence: 5 givenname: Heyuan surname: Shi fullname: Shi, Heyuan organization: Central South University,Changsha,China – sequence: 6 givenname: Zichen surname: Xu fullname: Xu, Zichen organization: Nanchang University,Nanchang,China – sequence: 7 givenname: Yu surname: Jiang fullname: Jiang, Yu organization: Tsinghua University,Beijing,China
BookMark	eNotjU9LwzAcQKMoOGfPXjz0C3Tml__B0yhOBxV30PNI0l8k0CWlLYjf3oGe3ju9d0uucslIyD3QDYCQj1xZUIxuzpSUwgWprLZGUKqBCaMvyYopwRuQmt2Qap6Tp2eVCkCtyNP-NOIUcFySH7BuS14wL_WhpLnklL_qlOuue2sO5Rsn7OvtOA4puCWVPN-R6-iGGat_rsnn7vmjfW2695d9u-0ad_4vjUVmBZMuSFA-xN4IERCt5iZEtN5HJ00PKIOTUdPYW2lYL7zkkXrhwfM1efjrJkQ8jlM6uennCFQrYS3nv5QmSr4
CODEN	IEEPAD
ContentType	Conference Proceeding
DBID	6IE 6IL CBEJK RIE RIL
DOI	10.1145/3691620.3695001
DatabaseName	IEEE Electronic Library (IEL) Conference Proceedings IEEE Xplore POP ALL IEEE Xplore All Conference Proceedings IEEE Electronic Library (IEL) IEEE Proceedings Order Plans (POP All) 1998-Present
DatabaseTitleList
Database_xml	– sequence: 1 dbid: RIE name: IEEE Electronic Library (IEL) url: https://proxy.k.utb.cz/login?url=https://ieeexplore.ieee.org/ sourceTypes: Publisher
DeliveryMethod	fulltext_linktorsrc
Discipline	Computer Science
EISBN	9798400712487
EISSN	2643-1572
EndPage	254
ExternalDocumentID	10764993
Genre	orig-research
GrantInformation_xml	– fundername: Research and Development funderid: 10.13039/100006190
GroupedDBID	6IE 6IF 6IH 6IK 6IL 6IM 6IN 6J9 AAJGR AAWTH ABLEC ACREN ADYOE ADZIZ AFYQB ALMA_UNASSIGNED_HOLDINGS AMTXH BEFXN BFFAM BGNUA BKEBE BPEOZ CBEJK CHZPO IEGSK IPLJI M43 OCL RIE RIL
ID	FETCH-LOGICAL-a248t-9e29425ac516bcfd844cee9738cfe9bbfa58d1e5ca5f70fd9582d4b53f0b4b1b3
IEDL.DBID	RIE
IngestDate	Wed Jan 15 06:20:43 EST 2025
IsPeerReviewed	false
IsScholarly	true
Language	English
LinkModel	DirectLink
MergedId	FETCHMERGED-LOGICAL-a248t-9e29425ac516bcfd844cee9738cfe9bbfa58d1e5ca5f70fd9582d4b53f0b4b1b3
PageCount	13
ParticipantIDs	ieee_primary_10764993
PublicationCentury	2000
PublicationDate	2024-Oct.-27
PublicationDateYYYYMMDD	2024-10-27
PublicationDate_xml	– month: 10 year: 2024 text: 2024-Oct.-27 day: 27
PublicationDecade	2020
PublicationTitle	IEEE/ACM International Conference on Automated Software Engineering : [proceedings]
PublicationTitleAbbrev	ASE
PublicationYear	2024
Publisher	ACM
Publisher_xml	– name: ACM
SSID	ssib057256116 ssj0051577
Score	2.2864428
Snippet	Large Language Models (LLMs) have shown their superior capability in natural language processing, promoting extensive LLM-powered applications to be the new...
SourceID	ieee
SourceType	Publisher
StartPage	242
SubjectTerms	Content Poisoning Focusing Internet Large language models LLM Applications Machine learning Natural language processing Portals Prevention and mitigation Privacy Security Software engineering
Title	Imperceptible Content Poisoning in LLM-Powered Applications
URI	https://ieeexplore.ieee.org/document/10764993
hasFullText	1
inHoldings	1
isFullTextHit
isPrint
link	http://utb.summon.serialssolutions.com/2.0.0/link/0/eLvHCXMwjV3LSgMxFA22K1f1UfFNFm5TZ_IOrkQsVdrShYXuSjK5AVGmItONX28yDxVBcFbDwAyXTO6ck8k99yB0peOSQpjMEkpBEy4YJZZZIJypEJwFk_ukHZ7N5WTJH1di1YrVay0MANTFZzBKp_Vevt8U2_SrLGa4kpGhsx7qxXnWiLW6ySNUBO88cZ3mMxxxWqm2l0_OxTWTkQjRuEaVRmTJAuaHmUqNJeMBmndRNCUkL6Nt5UbFx68Gjf8Ocw8Nv2V7ePEFSPtoB8oDNOh8G3Cbxofo5iFy5aaexb0CrhtUlRVebFJlUbwTP5d4Op2RRXJQA49vf2xyD9FyfP90NyGtiQKxlOuKGKAm5qUtRC5dEbzmPIZhFNNFAONcsEL7HERhRVBZ8EZo6rkTLGSOu9yxI9QvNyUcI5yoBsu85fHJ3CnrQLICtLTxMDK4EzRMg7F-a_pkrLtxOP3j-hnapZEiJCSg6hz1q_ctXESIr9xl_Wo_Aai1phQ
linkProvider	IEEE
linkToHtml	http://utb.summon.serialssolutions.com/2.0.0/link/0/eLvHCXMwjV1bS8MwFA46H_RpXibe7YOvmW0uTYJPIo5Nu7GHDfY2kuYEROlEuhd_vUkvOgTBPpVCyyHp6felOd_5ELqRfknBVawxISAx45RgTTVgRoVzRoNKbNAOjyfpcM6eFnzRiNUrLQwAVMVn0A-n1V6-XeXr8KvMZ7hIPUOn22jHAz_jtVyrfX248PCdBLZTf4g9UgvRdPNJGL-lqadCxK9SU8XjYAKzYadSocmgiyZtHHURyWt_XZp-_vmrReO_A91HvR_hXjT9hqQDtAXFIeq2zg1Rk8hH6G7k2XJd0WLeIKpaVBVlNF2F2iJ_Z_RSRFk2xtPgoQY2ut_Y5u6h-eBx9jDEjY0C1oTJEisgymemznmSmtxZyZgPQwkqcwfKGKe5tAnwXHMnYmcVl8Qyw6mLDTOJoceoU6wKOEFRIBs0tpr5JzMjtIGU5iBT7Q-VOnOKemEwlu91p4xlOw5nf1y_RrvD2ThbZqPJ8znaI54wBFwg4gJ1yo81XHrAL81VNc1fi7KpYQ
openUrl	ctx_ver=Z39.88-2004&ctx_enc=info%3Aofi%2Fenc%3AUTF-8&rfr_id=info%3Asid%2Fsummon.serialssolutions.com&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook&rft.genre=proceeding&rft.title=IEEE%2FACM+International+Conference+on+Automated+Software+Engineering+%3A+%5Bproceedings%5D&rft.atitle=Imperceptible+Content+Poisoning+in+LLM-Powered+Applications&rft.au=Zhang%2C+Quan&rft.au=Zhou%2C+Chijin&rft.au=Go%2C+Gwihwan&rft.au=Zeng%2C+Binqi&rft.date=2024-10-27&rft.pub=ACM&rft.eissn=2643-1572&rft.spage=242&rft.epage=254&rft_id=info:doi/10.1145%2F3691620.3695001&rft.externalDocID=10764993