StyleFool: Fooling Video Classification Systems via Style Transfer

Bibliographic Details
Main Authors: Cao, Yuxin; Xiao, Xi; Sun, Ruoxi; Wang, Derui; Xue, Minhui; Wen, Sheng
Format: Journal Article (preprint)
Language: English
Published: 29.03.2022
Subjects: Computer Science - Computer Vision and Pattern Recognition; Computer Science - Cryptography and Security
DOI: 10.48550/arxiv.2203.16000
Online Access: https://arxiv.org/abs/2203.16000

Abstract: Video classification systems are vulnerable to adversarial attacks, which can create severe security problems in video verification. Current black-box attacks require a large number of queries to succeed, resulting in high computational overhead during the attack. On the other hand, attacks with restricted perturbations are ineffective against defenses such as denoising or adversarial training. In this paper, we focus on unrestricted perturbations and propose StyleFool, a black-box video adversarial attack via style transfer that fools video classification systems. StyleFool first utilizes color theme proximity to select the best style image, which helps avoid unnatural details in the stylized videos. For targeted attacks, the target-class confidence is additionally considered to influence the output distribution of the classifier, moving the stylized video closer to, or even across, the decision boundary. A gradient-free method is then employed to further optimize the adversarial perturbations. We carry out extensive experiments to evaluate StyleFool on two standard datasets, UCF-101 and HMDB-51. The experimental results demonstrate that StyleFool outperforms state-of-the-art adversarial attacks in terms of both the number of queries and robustness against existing defenses. Moreover, 50% of the stylized videos in untargeted attacks require no queries at all, since they already fool the video classification model. Furthermore, we evaluate indistinguishability through a user study, which shows that the adversarial perturbations of StyleFool are imperceptible to human eyes, despite being unrestricted.
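
The style-selection step mentioned in the abstract can be illustrated with a short sketch. The record does not spell out the paper's exact color-theme metric, so the following assumes one common formulation: extract a k-means color palette from the video and from each candidate style image, then pick the style whose palette is closest in L2 distance. All function names and parameters here are illustrative, not taken from the authors' code.

```python
import numpy as np
from sklearn.cluster import KMeans

def color_theme(frames, k=4, seed=0):
    """Approximate a clip's color theme as its k dominant RGB colors.
    `frames` is an (N, H, W, 3) array; returns a (k, 3) palette."""
    pixels = frames.reshape(-1, 3).astype(np.float32)
    # Subsample pixels so clustering stays cheap on long clips.
    rng = np.random.default_rng(seed)
    idx = rng.choice(len(pixels), size=min(10_000, len(pixels)), replace=False)
    centers = KMeans(n_clusters=k, n_init=10).fit(pixels[idx]).cluster_centers_
    # Sort centroids by luminance so palettes are comparable across images.
    luminance = centers @ np.array([0.299, 0.587, 0.114])
    return centers[np.argsort(luminance)]

def select_style(video_frames, style_images, k=4):
    """Pick the style image whose color theme is closest to the video's,
    which the abstract credits with avoiding unnatural stylized details."""
    target = color_theme(video_frames, k)
    dists = [np.linalg.norm(color_theme(img[None], k) - target)
             for img in style_images]
    return style_images[int(np.argmin(dists))]
```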
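The abstract also mentions a gradient-free refinement stage but does not name the optimizer. The sketch below assumes an NES-style estimator, a standard choice for query-based black-box video attacks: the gradient of the classifier's loss is estimated from antithetic random perturbations, then a signed descent step is taken. `loss_fn` stands in for a black-box query returning a scalar adversarial loss; it and all hyperparameters are hypothetical.

```python
import numpy as np

def nes_step(video, loss_fn, sigma=1e-3, n_samples=20, lr=1e-2):
    """One gradient-free update on a video with pixel values in [0, 1]:
    estimate the loss gradient via NES antithetic sampling, then take a
    signed descent step. Every loss_fn call is one black-box query."""
    grad = np.zeros_like(video)
    for _ in range(n_samples):
        u = np.random.randn(*video.shape).astype(video.dtype)
        grad += u * (loss_fn(video + sigma * u) - loss_fn(video - sigma * u))
    grad /= 2.0 * sigma * n_samples
    return np.clip(video - lr * np.sign(grad), 0.0, 1.0)
```

Each call issues 2 * n_samples queries; iterating nes_step until the model's prediction flips is one way to realize the query-efficient optimization the abstract describes, starting from a stylized video that may already sit near (or across) the decision boundary.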
Copyright: http://arxiv.org/licenses/nonexclusive-distrib/1.0