Preprocessing is What You Need: Understanding and Predicting the Complexity of SAT-based Uniform Random Sampling
Published in | 2024 IEEE/ACM 12th International Conference on Formal Methods in Software Engineering (FormaliSE) pp. 23 - 32 |
---|---|
Main Authors | Zeyen, Olivier; Cordy, Maxime; Perrouin, Gilles; Acher, Mathieu |
Format | Conference Proceeding |
Language | English |
Published | ACM, 14.04.2024 |
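Subjects | Complexity theory; Computational efficiency; Correlation; Measurement; Memory management; Model Counting; Predictive models; Preprocessing; SAT; Scalability; Uniform random sampling |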
Abstract | Despite its NP-completeness, the Boolean satisfiability problem has given rise to highly efficient tools that can find solutions to a Boolean formula and count them. Boolean formulae compactly encode huge, constrained search spaces for variability-intensive systems, e.g., the possible configurations of the Linux kernel. These search spaces are generally too big to explore exhaustively, leading most testing approaches to sample a few solutions before analysing them. A desirable property of such samples is uniformity: each solution should have the same probability of being selected. This property motivated the design of uniform random samplers, which rely on SAT solvers and counters and achieve different tradeoffs between uniformity and scalability. Though we can observe their performance in practice, understanding the complexity these tools face and accurately predicting it remain under-explored problems. Indeed, structural metrics such as the number of variables and clauses in a formula poorly predict the sampling complexity. More elaborate ones, such as minimal independent support (MIS), are intractable to compute on large formulae. We provide an efficient parallel algorithm to compute a related metric, the number of equivalence classes, and demonstrate that this metric is highly correlated with the time and memory usage of uniform random sampling and model counting tools. We explore the role of formula preprocessing on various metrics and show its positive influence on correlations. Relying on these correlations, we train an efficient classifier (F1-score 0.97) to predict whether uniformly sampling a given formula will exceed a specified budget. Our results allow us to characterise the similarities and differences between (uniform) sampling, solving and counting. |
---|---|
Author | Perrouin, Gilles; Zeyen, Olivier; Cordy, Maxime; Acher, Mathieu |
Author_xml | 1. Zeyen, Olivier (University of Luxembourg, SnT, Luxembourg); 2. Cordy, Maxime (University of Luxembourg, SnT, Luxembourg); 3. Perrouin, Gilles (PReCISE/NaDI, University of Namur, Belgium); 4. Acher, Mathieu (Univ Rennes, Inria, CNRS, IRISA, France) |
CODEN | IEEPAD |
ContentType | Conference Proceeding |
Discipline | Computer Science |
EISBN | 9798400705892 |
EISSN | 2575-5099 |
EndPage | 32 |
ExternalDocumentID | 10555673 |
Genre | orig-research |
IsOpenAccess | true |
IsPeerReviewed | false |
IsScholarly | false |
Language | English |
OpenAccessLink | https://proxy.k.utb.cz/login?url=https://ieeexplore.ieee.org/document/10555673 |
PublicationCentury | 2000 |
PublicationDate | 2024-April-14 |
PublicationDateYYYYMMDD | 2024-04-14 |
PublicationDate_xml | – month: 04 year: 2024 text: 2024-April-14 day: 14 |
PublicationDecade | 2020 |
PublicationTitle | 2024 IEEE/ACM 12th International Conference on Formal Methods in Software Engineering (FormaliSE) |
PublicationTitleAbbrev | FORMALISE |
PublicationYear | 2024 |
Publisher | ACM |
Publisher_xml | – name: ACM |
StartPage | 23 |
SubjectTerms | Complexity theory; Computational efficiency; Correlation; Measurement; Memory management; Model Counting; Predictive models; Preprocessing; SAT; Scalability; Uniform random sampling |
Title | Preprocessing is What You Need: Understanding and Predicting the Complexity of SAT-based Uniform Random Sampling |
URI | https://ieeexplore.ieee.org/document/10555673 |
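
The uniformity property described in the abstract (each solution of the formula is selected with the same probability) can be illustrated on a toy formula. The sketch below is not the authors' method or any tool from the paper: it enumerates all assignments by brute force, which is only feasible for a handful of variables, whereas practical samplers rely on SAT solvers and model counters. The formula, the names `CLAUSES`, `NUM_VARS`, `satisfies`, `enumerate_solutions` and `uniform_sample`, and the sample size are all assumptions made for illustration.

```python
import itertools
import random
from collections import Counter

# Hypothetical toy formula as DIMACS-style clauses:
# (x1 OR x2) AND (NOT x1 OR x3)
CLAUSES = [[1, 2], [-1, 3]]
NUM_VARS = 3


def satisfies(assignment, clauses):
    """Return True if the assignment satisfies every clause.

    assignment[i] holds the truth value of variable i + 1.
    A positive literal n requires variable n to be True,
    a negative literal -n requires it to be False.
    """
    return all(
        any(assignment[abs(lit) - 1] == (lit > 0) for lit in clause)
        for clause in clauses
    )


def enumerate_solutions(num_vars, clauses):
    """Brute-force model enumeration (exponential in num_vars)."""
    return [
        bits
        for bits in itertools.product([False, True], repeat=num_vars)
        if satisfies(bits, clauses)
    ]


def uniform_sample(solutions, k, rng):
    """Draw k solutions, each with probability 1 / len(solutions)."""
    return [rng.choice(solutions) for _ in range(k)]


solutions = enumerate_solutions(NUM_VARS, CLAUSES)
model_count = len(solutions)  # the quantity a model counter computes

rng = random.Random(0)  # fixed seed so the run is reproducible
samples = uniform_sample(solutions, 8000, rng)
frequencies = Counter(samples)

for solution, count in sorted(frequencies.items()):
    print(solution, count)
```

With a uniform sampler, the observed frequencies should be close to the sample size divided by the model count. The enumeration step is exactly what becomes infeasible on large formulae, which is why the cost of sampling is worth predicting in advance.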