Will They Like This? Evaluating Code Contributions with Language Models
Popular open-source software projects receive and review contributions from a diverse array of developers, many of whom have little to no prior involvement with the project. A recent survey reported that reviewers consider conformance to the project's code style to be one of the top priorities...
Published in | 2015 IEEE/ACM 12th Working Conference on Mining Software Repositories, pp. 157-167 |
---|---|
Main Authors | Hellendoorn, Vincent J.; Devanbu, Premkumar T.; Bacchelli, Alberto |
Format | Conference Proceeding |
Language | English |
Published | IEEE, 1 May 2015 |
ISSN | 2160-1852 |
DOI | 10.1109/MSR.2015.22 |
Abstract | Popular open-source software projects receive and review contributions from a diverse array of developers, many of whom have little to no prior involvement with the project. A recent survey reported that reviewers consider conformance to the project's code style to be one of the top priorities when evaluating code contributions on GitHub. We propose to quantitatively evaluate the existence and effects of this phenomenon. To this end, we use language models, which have been shown to accurately capture stylistic aspects of code. We find that rejected change sets do contain code significantly less similar to the project than accepted ones; furthermore, the less similar change sets are more likely to be subject to thorough review. Armed with these results, we further investigate whether new contributors learn to conform to the project style and find that experience is positively correlated with conformance to the project's code style. |
---|---|
Author | Hellendoorn, Vincent J. (SORCERERS @ Software Eng. Res. Group, Delft Univ. of Technol., Delft, Netherlands); Devanbu, Premkumar T. (Dept. of Comput. Sci., Univ. of California, Davis, Davis, CA, USA); Bacchelli, Alberto (SORCERERS @ Software Eng. Res. Group, Delft Univ. of Technol., Delft, Netherlands) |
CODEN | IEEPAD |
EISBN | 9780769555942 0769555942 |
PageCount | 11 |
SubjectTerms | code review; Context; Context modeling; Data mining; Entropy; Java; language model; Mathematical model; pull request; Software |
URI | https://ieeexplore.ieee.org/document/7180076 |
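
The abstract describes scoring how similar a change set is to a project with a language model, but this record does not say which model or tokenizer the authors used. As a rough illustration only, the sketch below assumes a bigram model with add-one smoothing over whitespace-separated tokens; `tokenize`, `BigramModel`, and the sample strings are all hypothetical names invented for this example. Lower cross-entropy means the snippet looks more like the project code the model was trained on.

```python
import math
from collections import Counter


def tokenize(code):
    # Assumption: naive whitespace tokenization, not a real lexer.
    return code.split()


class BigramModel:
    """Bigram language model with add-one (Laplace) smoothing."""

    def __init__(self, tokens):
        # Count each token as a context (every token except the last one).
        self.contexts = Counter(tokens[:-1])
        self.bigrams = Counter(zip(tokens, tokens[1:]))
        # Vocabulary size, plus one slot shared by all unseen tokens.
        self.vocab_size = len(set(tokens)) + 1

    def prob(self, prev, cur):
        # Smoothed estimate of P(cur | prev).
        return (self.bigrams[(prev, cur)] + 1) / (self.contexts[prev] + self.vocab_size)

    def cross_entropy(self, tokens):
        # Average negative log2 probability per bigram, in bits.
        pairs = list(zip(tokens, tokens[1:]))
        if not pairs:
            return 0.0
        return -sum(math.log2(self.prob(p, c)) for p, c in pairs) / len(pairs)


# Hypothetical "project" code and two candidate contributions.
project = tokenize("int x = 0 ; int y = 0 ; int z = 0 ;")
model = BigramModel(project)

similar = model.cross_entropy(tokenize("int w = 0 ;"))
different = model.cross_entropy(tokenize("var w := nil"))
print(f"similar: {similar:.2f} bits, different: {different:.2f} bits")
```

In this toy setup the contribution that follows the project's declaration pattern scores lower than the one written in an unrelated style. A realistic study would presumably need a proper lexer, a much larger training corpus, and a stronger smoothing scheme.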