Predicting Patch Correctness Based on the Similarity of Failing Test Cases

Bibliographic Details
Main Authors Tian, Haoye; Li, Yinghua; Pian, Weiguo; Kaboré, Abdoul Kader; Liu, Kui; Habib, Andrew; Klein, Jacques; Bissyande, Tegawendé F
Format Journal Article
Language English
Published 28.07.2021

Abstract Towards predicting patch correctness in automated program repair (APR), we propose a simple but novel hypothesis on how the link between patch behaviour and failing test specifications can be drawn: similar failing test cases should require similar patches. We then propose BATS, an unsupervised learning-based system that predicts patch correctness by checking patch Behaviour Against failing Test Specification. BATS exploits deep representation learning models for code and patches: for a given failing test case, the yielded embedding is used to compute similarity metrics in the search for similar historical test cases, in order to identify the associated applied patches, which are then used as a proxy for assessing the correctness of generated patches. Experimentally, we first validate our hypothesis by assessing whether ground-truth developer patches cluster together in the same way that their associated failing test cases are clustered. Then, after collecting a large dataset of 1278 plausible patches (written by developers or generated by some 32 APR tools), we use BATS to predict correctness: BATS achieves an AUC between 0.557 and 0.718 and a recall between 0.562 and 0.854 in identifying correct patches. Compared against previous work, we demonstrate that our approach outperforms the state of the art in patch correctness prediction, without the need for large labeled patch datasets, in contrast with prior machine learning-based approaches. While BATS is constrained by the availability of similar test cases, we show that it can still be complementary to existing approaches: used in conjunction with a recent approach implementing supervised learning, BATS improves the overall recall in detecting correct patches. We finally show that BATS can be complementary to the state-of-the-art PATCH-SIM dynamic approach for identifying the correct patches for APR tools.
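
The retrieval idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the toy two-dimensional vectors stand in for the deep code and patch embeddings, and the names `cosine`, `predict_correctness`, `HISTORY`, `test_cutoff`, and `patch_cutoff`, the cutoff values, and the choice of averaging patch similarities are all assumptions made for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def predict_correctness(test_vec, patch_vec, history,
                        test_cutoff=0.8, patch_cutoff=0.5):
    """Judge a candidate patch against patches of similar historical failing tests.

    history: list of (test_embedding, patch_embedding) pairs from past fixes.
    Returns True or False, or None when no historical test is similar enough
    (the availability constraint mentioned in the abstract).
    Both cutoffs are hypothetical values chosen for this sketch.
    """
    # Step 1: retrieve historical patches whose failing tests resemble ours.
    similar_patches = [patch for test, patch in history
                       if cosine(test_vec, test) >= test_cutoff]
    if not similar_patches:
        return None
    # Step 2: use the retrieved patches as a proxy for the expected fix
    # (mean similarity is an assumed aggregation, not taken from the source).
    mean_sim = sum(cosine(patch_vec, p) for p in similar_patches) / len(similar_patches)
    return mean_sim >= patch_cutoff


# Toy history: two past failing tests, each paired with the patch that fixed it.
HISTORY = [
    ([1.0, 0.0], [1.0, 0.0]),
    ([0.0, 1.0], [0.0, 1.0]),
]
```

Returning `None` rather than a guess when no similar test is found keeps the "no prediction possible" case distinct from a negative prediction, which matters when such a predictor is combined with another approach.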
Copyright http://arxiv.org/licenses/nonexclusive-distrib/1.0
DOI 10.48550/arxiv.2107.13296
OpenAccessLink https://arxiv.org/abs/2107.13296
SecondaryResourceType preprint
Subjects Computer Science - Artificial Intelligence; Computer Science - Software Engineering