Fuzzing MLIR Compilers with Custom Mutation Synthesis
Compiler technologies in deep learning and domain-specific hardware acceleration are increasingly adopting extensible compiler frameworks such as Multi-Level Intermediate Representation (MLIR) to facilitate more efficient development. With MLIR, compiler developers can easily define their own custom...
Published in | Proceedings / International Conference on Software Engineering pp. 217 - 229 |
---|---|
Main Authors | Limpanukorn, Ben; Wang, Jiyuan; Kang, Hong Jin; Zhou, Zitong; Kim, Miryung |
Format | Conference Proceeding |
Language | English |
Published | IEEE, 26.04.2025 |
Subjects | |
Abstract | Compiler technologies in deep learning and domain-specific hardware acceleration are increasingly adopting extensible compiler frameworks such as Multi-Level Intermediate Representation (MLIR) to facilitate more efficient development. With MLIR, compiler developers can easily define their own custom IRs in the form of MLIR dialects. However, the diversity and rapid evolution of such custom IRs make it impractical to manually write a custom test generator for each dialect. To address this problem, we design a new test generator called SynthFuzz that combines grammar-based fuzzing with custom mutation synthesis. The essence of SynthFuzz is twofold: (1) It automatically infers parameterized, context-dependent custom mutations from existing test cases. (2) It then concretizes each mutation's content depending on the target context and reduces the chance of inserting invalid edits by performing k-ancestor and prefix/postfix matching. This obviates the need to manually define custom mutation operators for each dialect. We compare SynthFuzz to three baselines: Grammarinator, a grammar-based fuzzer without custom mutations; MLIRSmith, a custom test generator for MLIR core dialects; and NeuRI, a custom test generator for ML models with parameterization of tensor shapes. We conduct this comparison on four different MLIR projects. Each project defines a new set of MLIR dialects for which manually writing a custom test generator would take weeks of effort. Our evaluation shows that SynthFuzz on average improves MLIR dialect pair coverage by 1.75×, which increases branch coverage by 1.22×. Further, we show that our context-dependent custom mutations increase the proportion of valid tests by up to 1.11×, indicating that SynthFuzz correctly concretizes its parameterized mutations with respect to the target context. Parameterization of the mutations reduces the fraction of tests violating the base MLIR constraints by 0.57×, increasing the time spent fuzzing dialect-specific code. |
---|---|
Author | Limpanukorn, Ben; Kim, Miryung; Zhou, Zitong; Kang, Hong Jin; Wang, Jiyuan |
Author_xml | 1. Limpanukorn, Ben (blimpan@cs.ucla.edu), University of California, Los Angeles; 2. Wang, Jiyuan (wangjiyuan@cs.ucla.edu), University of California, Los Angeles; 3. Kang, Hong Jin (hjkang@cs.ucla.edu), University of California, Los Angeles; 4. Zhou, Zitong (zitongzhou@cs.ucla.edu), University of California, Los Angeles; 5. Kim, Miryung (miryung@cs.ucla.edu), University of California, Los Angeles |
CODEN | IEEPAD |
ContentType | Conference Proceeding |
DOI | 10.1109/ICSE55347.2025.00037 |
DatabaseName | IEEE Electronic Library (IEL) Conference Proceedings IEEE Proceedings Order Plan (POP) 1998-present by volume IEEE Xplore All Conference Proceedings IEEE Electronic Library (IEL) IEEE Proceedings Order Plans (POP) 1998-present |
Discipline | Computer Science |
EISBN | 9798331505691 |
EISSN | 1558-1225 |
EndPage | 229 |
ExternalDocumentID | 11029941 |
Genre | orig-research |
GrantInformation_xml | – fundername: National Science Foundation grantid: 2106838,1764077,1956322,2106404 funderid: 10.13039/100000001 |
IsPeerReviewed | false |
IsScholarly | true |
Language | English |
PageCount | 13 |
PublicationCentury | 2000 |
PublicationDate | 2025-April-26 |
PublicationDateYYYYMMDD | 2025-04-26 |
PublicationDate_xml | – month: 04 year: 2025 text: 2025-April-26 day: 26 |
PublicationDecade | 2020 |
PublicationTitle | Proceedings / International Conference on Software Engineering |
PublicationTitleAbbrev | ICSE |
PublicationYear | 2025 |
Publisher | IEEE |
Publisher_xml | – name: IEEE |
SourceID | ieee |
SourceType | Publisher |
StartPage | 217 |
SubjectTerms | code patterns; Codes; compiler testing; Fuzzing; Generators; Grammar-based fuzzing; Hardware acceleration; MLIR; Program processors; program synthesis; program transformation; Shape; Software engineering; Tensors; Testing; Writing |
Title | Fuzzing MLIR Compilers with Custom Mutation Synthesis |
URI | https://ieeexplore.ieee.org/document/11029941 |