Will They Like This? Evaluating Code Contributions with Language Models

Bibliographic Details
Published in 2015 IEEE/ACM 12th Working Conference on Mining Software Repositories, pp. 157-167
Main Authors Hellendoorn, Vincent J.; Devanbu, Premkumar T.; Bacchelli, Alberto
Format Conference Proceeding
Language English
Published IEEE, 2015-05-01
ISSN 2160-1852
DOI 10.1109/MSR.2015.22

Abstract Popular open-source software projects receive and review contributions from a diverse array of developers, many of whom have little to no prior involvement with the project. A recent survey reported that reviewers consider conformance to the project's code style to be one of the top priorities when evaluating code contributions on GitHub. We propose to quantitatively evaluate the existence and effects of this phenomenon. To this end, we use language models, which have been shown to accurately capture stylistic aspects of code. We find that rejected change sets do contain code that is significantly less similar to the project than accepted ones; furthermore, the less similar change sets are more likely to be subject to thorough review. Armed with these results, we further investigate whether new contributors learn to conform to the project style and find that experience is positively correlated with conformance to the project's code style.
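
The idea of scoring how similar a change set is to a project with a language model can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the model order, smoothing, or tokenization, so the trigram order, add-one smoothing, whitespace tokenization, and the names NgramModel and cross_entropy below are all assumptions made for illustration. The sketch trains an n-gram model on a project's tokens and reports the average number of bits per token needed to predict a contribution; a lower value means the contribution looks more like the project.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return all n-grams of `tokens`, left-padded with start markers."""
    padded = ["<s>"] * (n - 1) + list(tokens)
    return [tuple(padded[i:i + n]) for i in range(len(padded) - n + 1)]


class NgramModel:
    """Hypothetical add-one-smoothed n-gram model over code tokens."""

    def __init__(self, tokens, n=3):
        self.n = n
        grams = ngrams(tokens, n)
        self.counts = Counter(grams)
        self.context_counts = Counter(g[:-1] for g in grams)
        # One extra vocabulary slot stands in for tokens never seen in training.
        self.vocab_size = len(set(tokens)) + 1

    def prob(self, gram):
        """Smoothed probability of the last token of `gram` given its context."""
        numerator = self.counts[gram] + 1
        denominator = self.context_counts[gram[:-1]] + self.vocab_size
        return numerator / denominator

    def cross_entropy(self, tokens):
        """Average bits per token; lower means more similar to the training code."""
        grams = ngrams(tokens, self.n)
        if not grams:
            raise ValueError("cannot score an empty token sequence")
        return -sum(math.log2(self.prob(g)) for g in grams) / len(grams)


if __name__ == "__main__":
    # Toy "project" corpus: the same loop repeated, tokenized on whitespace.
    project = "for ( int i = 0 ; i < n ; i ++ ) { sum += i ; }".split() * 5
    model = NgramModel(project, n=3)

    familiar = "for ( int i = 0 ; i < n ; i ++ ) { sum += i ; }".split()
    unfamiliar = "while True : pass".split()

    print("familiar change set:  ", model.cross_entropy(familiar))
    print("unfamiliar change set:", model.cross_entropy(unfamiliar))
```

In this toy setup the change set that reuses the project's token patterns scores a lower cross-entropy than the one written in an unrelated style, which is the kind of signal the abstract describes comparing between accepted and rejected contributions.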
Author Affiliations
Hellendoorn, Vincent J.: SORCERERS @ Software Eng. Res. Group, Delft Univ. of Technol., Delft, Netherlands
Devanbu, Premkumar T.: Dept. of Comput. Sci., Univ. of California, Davis, Davis, CA, USA
Bacchelli, Alberto: SORCERERS @ Software Eng. Res. Group, Delft Univ. of Technol., Delft, Netherlands
EISBN 9780769555942
0769555942
Subjects code review; Context; Context modeling; Data mining; Entropy; Java; language model; Mathematical model; pull request; Software
URI https://ieeexplore.ieee.org/document/7180076