Debugging a Policy: Automatic Action-Policy Testing in AI Planning
Published in | Proceedings of the International Conference on Automated Planning and Scheduling Vol. 32; pp. 353 - 361 |
---|---|
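Main Authors | Steinmetz, Marcel; Fišer, Daniel; Eniser, Hasan Ferit; Ferber, Patrick; Gros, Timo P.; Heim, Philippe; Höller, Daniel; Schuler, Xandra; Wüstholz, Valentin; Christakis, Maria; Hoffmann, Jörg |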
Format | Journal Article |
Language | English |
Published | 13.06.2022 |
Abstract | Testing is a promising way to gain trust in neural action policies π. Previous work on policy testing in sequential decision making targeted environment behavior leading to failure conditions. But if the failure is unavoidable given that behavior, then π is not actually to blame. For a situation to qualify as a "bug" in π, there must be an alternative policy π' that does better. We introduce a generic policy testing framework based on that intuition. This raises the bug confirmation problem, deciding whether or not a state is a bug. We analyze the use of optimistic and pessimistic bounds for the design of test oracles approximating that problem. We contribute an implementation of our framework in classical planning, experimenting with several test oracles and with random-walk methods generating test states biased to poor policy performance and/or state novelty. We evaluate these techniques on policies π learned with ASNets. We find that they are able to effectively identify bugs in these π, and that our random-walk biases improve over uninformed baselines. |
---|---|
Author | Ferber, Patrick; Fišer, Daniel; Eniser, Hasan Ferit; Höller, Daniel; Hoffmann, Jörg; Steinmetz, Marcel; Christakis, Maria; Gros, Timo P.; Schuler, Xandra; Wüstholz, Valentin; Heim, Philippe |
Author_xml | 1. Steinmetz, Marcel; 2. Fišer, Daniel; 3. Eniser, Hasan Ferit; 4. Ferber, Patrick; 5. Gros, Timo P.; 6. Heim, Philippe; 7. Höller, Daniel; 8. Schuler, Xandra; 9. Wüstholz, Valentin; 10. Christakis, Maria; 11. Hoffmann, Jörg |
ContentType | Journal Article |
DBID | AAYXX CITATION |
DOI | 10.1609/icaps.v32i1.19820 |
DatabaseName | CrossRef |
DatabaseTitle | CrossRef |
DatabaseTitleList | CrossRef |
EISSN | 2334-0843 |
EndPage | 361 |
ExternalDocumentID | 10_1609_icaps_v32i1_19820 |
ISSN | 2334-0835 |
IsPeerReviewed | false |
IsScholarly | false |
Language | English |
PageCount | 9 |
ParticipantIDs | crossref_primary_10_1609_icaps_v32i1_19820 |
PublicationCentury | 2000 |
PublicationDate | 2022-06-13 |
PublicationDateYYYYMMDD | 2022-06-13 |
PublicationDate_xml | – month: 06 year: 2022 text: 2022-06-13 day: 13 |
PublicationDecade | 2020 |
PublicationTitle | Proceedings of the International Conference on Automated Planning and Scheduling |
PublicationYear | 2022 |
SourceID | crossref |
SourceType | Aggregation Database |
StartPage | 353 |
Title | Debugging a Policy: Automatic Action-Policy Testing in AI Planning |
Volume | 32 |
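
The abstract's notion of a bug (a state where some alternative policy π' does better than π) and its bound-based test oracles can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the paper's implementation: the toy unit-cost graph `SUCCESSORS`, the hand-written `POLICY`, all helper names, and the reading of an "optimistic" bound as a lower bound and a "pessimistic" bound as an upper bound on the best achievable cost.

```python
import math
from collections import deque

# Hypothetical toy task: a deterministic graph with unit-cost actions.
SUCCESSORS = {
    "A": ["B", "C"],
    "B": ["G"],
    "C": ["D"],
    "D": ["G"],
    "G": [],
}
GOAL = "G"

# Hypothetical policy under test: maps each non-goal state to its chosen successor.
POLICY = {"A": "C", "B": "G", "C": "D", "D": "G"}


def policy_cost(state, max_steps=100):
    """Cost of following POLICY from state; math.inf if the goal is not reached."""
    steps = 0
    while state != GOAL:
        if state not in POLICY or steps >= max_steps:
            return math.inf
        state = POLICY[state]
        steps += 1
    return steps


def optimal_cost(state):
    """Exact optimal cost via breadth-first search (only feasible on toy tasks)."""
    frontier = deque([(state, 0)])
    seen = {state}
    while frontier:
        current, dist = frontier.popleft()
        if current == GOAL:
            return dist
        for nxt in SUCCESSORS[current]:
            if nxt not in seen:
                seen.add(nxt)
                frontier.append((nxt, dist + 1))
    return math.inf


def oracle(state, lower_bound, upper_bound):
    """Three-valued test oracle built from bounds on the best achievable cost.

    Assumes lower_bound(state) <= optimal cost <= upper_bound(state).
    """
    cost = policy_cost(state)
    if upper_bound(state) < cost:
        return "bug"        # some alternative policy provably does better
    if cost <= lower_bound(state):
        return "not a bug"  # no policy can beat the tested one from this state
    return "unknown"        # the bounds are too loose to decide
```

With exact bounds, `oracle("A", optimal_cost, optimal_cost)` reports a bug, because the policy takes three steps from `A` where two suffice. With trivial bounds (0 and infinity) the oracle can only answer "unknown", which suggests why tighter bounds make for a more useful oracle. If both the policy cost and the lower bound are infinite, the sketch answers "not a bug", matching the abstract's point that π is not to blame for unavoidable failure.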