Contextual Documentation Referencing on Stack Overflow

Bibliographic Details
Published in IEEE Transactions on Software Engineering, Vol. 48, no. 1, pp. 135-149
Main Authors Baltes, Sebastian; Treude, Christoph; Robillard, Martin P.
Format Journal Article
Language English
Published New York IEEE 01.01.2022
IEEE Computer Society
Abstract Software engineering is knowledge-intensive and requires software developers to continually search for knowledge, often on community question answering platforms such as Stack Overflow. Such information sharing platforms do not exist in isolation, and part of the evidence that they exist in a broader software documentation ecosystem is the common presence of hyperlinks to other documentation resources found in forum posts. With the goal of helping to improve the information diffusion between Stack Overflow and other documentation resources, we conducted a study to answer the question of how and why documentation is referenced in Stack Overflow threads. We sampled and classified 759 links from two different domains, regular expressions and Android development, to qualitatively and quantitatively analyze the links' context and purpose, including attribution, awareness, and recommendations. We found that links on Stack Overflow serve a wide range of distinct purposes, ranging from citation links attributing content copied into Stack Overflow, through links clarifying concepts using Wikipedia pages, to recommendations of software components and resources for background reading. This purpose spectrum has major corollaries, including our observation that links to documentation resources reflect the information needs typical of a technology domain. We contribute a framework and method to analyze the context and purpose of Stack Overflow links, a public dataset of annotated links, and a description of five major observations about linking practices on Stack Overflow. Those observations include the above-mentioned purpose spectrum, its interplay with documentation resources and application domains, and the fact that links on Stack Overflow often lack context in the form of accompanying quotes or summaries. We further point to potential tool support to enhance the information diffusion between Stack Overflow and other documentation resources.
Author Treude, Christoph
Baltes, Sebastian
Robillard, Martin P.
Author_xml – sequence: 1
  givenname: Sebastian
  orcidid: 0000-0002-2442-7522
  surname: Baltes
  fullname: Baltes, Sebastian
  email: sebastian.baltes@adelaide.edu.au
  organization: University of Adelaide, Adelaide, SA, Australia
– sequence: 2
  givenname: Christoph
  orcidid: 0000-0002-6919-2149
  surname: Treude
  fullname: Treude, Christoph
  email: christoph.treude@adelaide.edu.au
  organization: University of Adelaide, Adelaide, SA, Australia
– sequence: 3
  givenname: Martin P.
  orcidid: 0000-0002-0248-1384
  surname: Robillard
  fullname: Robillard, Martin P.
  email: martin@cs.mcgill.ca
  organization: McGill University, Montréal, QC, Canada
CODEN IESEDJ
CitedBy_id crossref_primary_10_1142_S0218194023500274
crossref_primary_10_1109_TSE_2021_3086494
crossref_primary_10_1007_s10664_023_10325_8
crossref_primary_10_1007_s10664_024_10540_x
crossref_primary_10_1109_TSE_2024_3403108
Cites_doi 10.1177/001316446002000104
10.1007/s10664-016-9430-z
10.1145/2970276.2970357
10.1145/1985793.1985907
10.1007/s10664-010-9150-8
10.1109/ICECCS.2017.30
10.1109/TSE.2013.12
10.1145/2631775.2631809
10.1145/1978942.1979366
10.1145/170035.170072
10.1145/3183519.3183547
10.1109/MSR.2013.6624008
10.1016/j.infsof.2017.10.009
10.2307/2340521
10.1109/ICSME.2018.00018
10.1109/ICSE.2019.00123
10.1109/ICSME.2017.24
10.1145/3173574.3174140
10.1109/MSR.2013.6624011
10.1109/MSR.2015.51
10.1007/978-981-15-0310-8_7
10.1108/eb046814
10.1007/s11280-018-0621-y
10.1007/s10664-018-9650-5
10.1145/3358931.3358937
10.1109/QSIC.2014.27
ContentType Journal Article
Copyright Copyright IEEE Computer Society 2022
Copyright_xml – notice: Copyright IEEE Computer Society 2022
DOI 10.1109/TSE.2020.2981898
DatabaseName IEEE All-Society Periodicals Package (ASPP) 2005–Present
IEEE All-Society Periodicals Package (ASPP) 1998–Present
IEEE Electronic Library (IEL)
CrossRef
ProQuest Computer Science Collection
ProQuest Health & Medical Complete (Alumni)
DatabaseTitle CrossRef
ProQuest Health & Medical Complete (Alumni)
ProQuest Computer Science Collection
DatabaseTitleList
ProQuest Health & Medical Complete (Alumni)
Database_xml – sequence: 1
  dbid: RIE
  name: IEEE Electronic Library (IEL)
  url: https://proxy.k.utb.cz/login?url=https://ieeexplore.ieee.org/
  sourceTypes: Publisher
DeliveryMethod fulltext_linktorsrc
Discipline Computer Science
EISSN 1939-3520
EndPage 149
ExternalDocumentID 10_1109_TSE_2020_2981898
9042358
Genre orig-research
GrantInformation_xml – fundername: Australian Research Council
  grantid: DE180100153
  funderid: 10.13039/501100000923
IEDL.DBID RIE
ISSN 0098-5589
IsPeerReviewed true
IsScholarly true
Issue 1
Language English
License https://ieeexplore.ieee.org/Xplorehelp/downloads/license-information/IEEE.html
https://doi.org/10.15223/policy-029
https://doi.org/10.15223/policy-037
ORCID 0000-0002-0248-1384
0000-0002-2442-7522
0000-0002-6919-2149
PQID 2619023281
PQPubID 21418
PageCount 15
PublicationCentury 2000
PublicationDate 2022-Jan.-1
2022-1-1
20220101
PublicationDateYYYYMMDD 2022-01-01
PublicationDate_xml – month: 01
  year: 2022
  text: 2022-Jan.-1
  day: 01
PublicationDecade 2020
PublicationPlace New York
PublicationPlace_xml – name: New York
PublicationTitle IEEE transactions on software engineering
PublicationTitleAbbrev TSE
PublicationYear 2022
Publisher IEEE
IEEE Computer Society
Publisher_xml – name: IEEE
– name: IEEE Computer Society
StartPage 135
SubjectTerms Community question answering
Context
Documentation
Domains
Electronic publishing
Encyclopedias
hyperlinks
information diffusion
Information dissemination
Internet
Links
Questions
Software
Software development
software documentation
Software engineering
Stack Overflow
Title Contextual Documentation Referencing on Stack Overflow
URI https://ieeexplore.ieee.org/document/9042358
https://www.proquest.com/docview/2619023281
Volume 48