The Importance of the Correlation in Crossover Experiments

Bibliographic Details
Published in	IEEE Transactions on Software Engineering, Vol. 48, no. 8, pp. 2802-2813
Main Authors	Kitchenham, Barbara; Madeyski, Lech; Scanniello, Giuseppe; Gravino, Carmine
Format	Journal Article
Language	English
Published	New York: IEEE / IEEE Computer Society, 01.08.2022
Subjects	Atmospheric measurements; Correlation; crossover design; crossover experiments; Empirical analysis; Empirical software engineering; Estimates; Experiments; Mathematical model; Particle measurements; repeated measures correlation; Size measurement; Software engineering; Time measurement; Training; Within-subjects design

Abstract	Context: In empirical software engineering, crossover designs are popular for experiments comparing software engineering techniques that must be undertaken by human participants. However, their value depends on the correlation (r) between the outcome measures on the same participants. Software engineering theory emphasizes the importance of individual skill differences, so we would expect the values of r to be relatively high. However, few researchers have reported the values of r. Goal: To investigate the values of r found in software engineering experiments. Method: We undertook simulation studies to investigate the theoretical and empirical properties of r. Then we investigated the values of r observed in 35 software engineering crossover experiments.
Results: The level of r obtained by analysing our 35 crossover experiments was small. Estimates based on means, medians, and random effect analysis disagreed but were all between 0.2 and 0.3. As expected, our analyses found large variability among the individual r estimates for small sample sizes, but no indication that r estimates were larger for the experiments with larger sample sizes that exhibited smaller variability. Conclusions: Low observed r values cast doubts on the validity of crossover designs for software engineering experiments. However, if the cause of low r values relates to training limitations or toy tasks, this affects all Software Engineering (SE) experiments involving human participants. For all human-intensive SE experiments, we recommend more intensive training and then tracking the improvement of participants as they practice using specific techniques, before formally testing the effectiveness of the techniques.
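
Illustrative note (not part of the original record or of the authors' analysis): the abstract's two statistical points can be sketched with a small simulation. The sketch assumes a deliberately simplified, hypothetical model in which each participant's two outcomes are standard-normal with a true correlation rho and there are no treatment, period, or carry-over effects. Under that assumption the variance of the within-participant difference is 2 * sigma^2 * (1 - rho), so a low r leaves little variance reduction from the crossover, and the sample r varies widely when the number of participants is small. All function names and parameter values below are made up for the illustration.

```python
import math
import random


def pearson_r(xs, ys):
    """Sample Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def simulate_pairs(n, rho, rng):
    """Draw n (outcome A, outcome B) pairs with true correlation rho.

    Assumed model: standard-normal outcomes, no treatment, period,
    or carry-over effects.
    """
    pairs = []
    for _ in range(n):
        a = rng.gauss(0.0, 1.0)
        b = rho * a + math.sqrt(1.0 - rho ** 2) * rng.gauss(0.0, 1.0)
        pairs.append((a, b))
    return pairs


def spread_of_r(n, rho, replications, rng):
    """Standard deviation of the sample r over repeated simulated experiments."""
    estimates = []
    for _ in range(replications):
        a, b = zip(*simulate_pairs(n, rho, rng))
        estimates.append(pearson_r(a, b))
    mean = sum(estimates) / len(estimates)
    squared = sum((e - mean) ** 2 for e in estimates)
    return math.sqrt(squared / (len(estimates) - 1))


def difference_variance(rho, sigma=1.0):
    """Var(A - B) for equal-variance outcomes: 2 * sigma^2 * (1 - rho)."""
    return 2.0 * sigma ** 2 * (1.0 - rho)


if __name__ == "__main__":
    rng = random.Random(2022)
    # Spread of the estimated r shrinks as the number of participants grows.
    for n in (10, 40, 160):
        print("n =", n, "sd of r =", round(spread_of_r(n, 0.25, 2000, rng), 3))
    # A higher correlation gives a smaller variance for the paired difference.
    for rho in (0.25, 0.8):
        print("rho =", rho, "Var(A - B) =", difference_variance(rho))
```

With independent groups the corresponding variance would be 2 * sigma^2, so under these assumptions a correlation of 0.25 removes only a quarter of it.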
Author details	1. Barbara Kitchenham (ORCID 0000-0002-6134-8460; b.a.kitchenham@keele.ac.uk), School of Computing and Mathematics, Keele University, Staffordshire, U.K.
	2. Lech Madeyski (ORCID 0000-0003-3907-3357; Lech.Madeyski@pwr.edu.pl), Department of Applied Informatics, Wroclaw University of Science and Technology, Wroclaw, Poland
	3. Giuseppe Scanniello (ORCID 0000-0003-0024-7508; giuseppe.scanniello@unibas.it), Department of Mathematics, Computer Science, and Economics, University of Basilicata, Potenza, Italy
	4. Carmine Gravino (ORCID 0000-0002-4394-9035; gravino@unisa.it), Department of Computer Science, University of Salerno, Fisciano, Italy
CODEN	IESEDJ
ContentType	Journal Article
Copyright	Copyright IEEE Computer Society 2022
DOI	10.1109/TSE.2021.3070480
Discipline	Computer Science
EISSN	1939-3520
Genre	orig-research
ISSN	0098-5589
IsDoiOpenAccess	true
IsOpenAccess	true
IsPeerReviewed	true
IsScholarly	true
Issue	8
Language	English
License	https://creativecommons.org/licenses/by/4.0/legalcode
PageCount	12
URI	https://ieeexplore.ieee.org/document/9395223 https://www.proquest.com/docview/2703099111