SoftMon: A Tool to Compare Similar Open-source Software from a Performance Perspective

Bibliographic Details
Published in 2020 IEEE/ACM 17th International Conference on Mining Software Repositories (MSR), pp. 397-408
Main Authors Singh, Shubhankar Suman; Sarangi, Smruti R.
Format Conference Proceeding
Language English
Published ACM 01.05.2020
ISSN 2574-3864
DOI 10.1145/3379597.3387444

Abstract Over the past two decades, a rich ecosystem of open-source software has evolved. For every type of application, there is a wide variety of alternatives. We observed that even when different applications perform similar tasks and are compiled with the same versions of the compiler and the libraries, they perform very differently while running on the same system. Unfortunately, prior work in this area, which compares two codebases for similarities, does not help us find the reasons for the differences in performance. In this paper, we develop a tool, SoftMon, that can compare the codebases of two separate applications and pinpoint the exact set of functions that are disproportionately responsible for differences in performance. Our tool uses machine learning and NLP techniques to analyze why a given open-source application performs worse than its peers, to design bespoke applications that can incorporate specific innovations (identified by SoftMon) in competing applications, and to diagnose performance bugs. We compare a wide variety of large open-source programs such as image editors, audio players, text editors, PDF readers, mail clients, and even full-fledged operating systems (OSs). In all cases, our tool was able to pinpoint, within 200 seconds, a set of at most 10-15 functions that are responsible for the differences. A subsequent manual analysis, assisted by our graph visualization engine, helps us find the reasons. We were able to validate most of the reasons by correlating them with subsequent observations made by developers or with existing technical literature. The manual phase of our analysis is limited to 30 minutes (tested with human subjects).
Author Singh, Shubhankar Suman
Sarangi, Smruti R.
Author_xml – sequence: 1
  givenname: Shubhankar Suman
  surname: Singh
  fullname: Singh, Shubhankar Suman
  email: shubhankar@cse.iitd.ac.in
  organization: IIT Delhi, Computer Science and Engineering
– sequence: 2
  givenname: Smruti R.
  surname: Sarangi
  fullname: Sarangi, Smruti R.
  email: srsarangi@cse.iitd.ac.in
  organization: IIT Delhi, Computer Science
CODEN IEEPAD
ContentType Conference Proceeding
EISBN 1450375170
9781450375177
EISSN 2574-3864
EndPage 408
ExternalDocumentID 10148752
Genre orig-research
PageCount 12
PublicationDate 2020-May
PublicationDateYYYYMMDD 2020-05-01
PublicationTitle 2020 IEEE/ACM 17th International Conference on Mining Software Repositories (MSR)
PublicationTitleAbbrev MSR
PublicationYear 2020
Publisher ACM
StartPage 397
SubjectTerms Codes
Libraries
Machine learning
Manuals
NLP based matching
Operating systems
Performance debugging
Software comparison
Technological innovation
Visualization
Title SoftMon: A Tool to Compare Similar Open-source Software from a Performance Perspective
URI https://ieeexplore.ieee.org/document/10148752