To Err is (Not) Human: Examining Beliefs about Errors Made by Artificial Intelligence
Published in | Advances in consumer research Vol. 50; pp. 406 - 407 |
---|---|
Main Authors | , , |
Format | Conference Proceeding |
Language | English |
Published | Urbana: Association for Consumer Research, 01.01.2022 |
ISSN | 0098-9258 |
Summary: | Algorithm aversion research has largely demonstrated aversion to algorithms in tasks related to human intelligence. We offer a deeper understanding by investigating lay beliefs about AI per se: we show that consumers believe AI commits fewer total errors, but is more likely to commit a severe error, than humans.

Companies are increasingly relying on artificial intelligence (AI) in various aspects of their operations. Yet a great deal of work has demonstrated that consumers are hesitant to adopt AI, a phenomenon referred to as "algorithm aversion" (Jussupow, Benbasat, and Heinzl 2020). Algorithm aversion has been found to occur in domains where uniquely human capabilities (e.g., moral judgment, accounting for uniqueness, competence at making subjective judgments) are relevant (Bigman and Gray 2018; Granulo, Fuchs, and Puntoni 2020; Longoni and Cian 2020). While this work successfully identifies the contexts in which algorithm aversion is likely to be observed, we know less about its psychological underpinnings or why it is observed in domains where uniquely human skills are not relevant (see Dietvorst, Simmons, and Massey 2014). We propose that to understand the psychological processes driving algorithm aversion, we must understand consumers' beliefs about AI. Specifically, we propose that over many interactions with computers, consumers learn that the magnitude of AI's incorrect responses is not consistent or systematic, and so come to believe that AI cannot differentiate between errors: a greatly incorrect response (a severe error) seems just as likely as a slightly incorrect one (a minor error). Due to this belief, consumers expect AI, compared to a human, to be more likely to make a severe error and are reluctant to adopt it. We demonstrate the existence of this lay theory and its impact on consumer preferences in four studies.

In studies 1a and 1b, we provide initial evidence of people's lay beliefs about the likelihood of relatively minor versus severe errors when a task is performed by AI versus a human. Study 1a investigates lay beliefs in a medical context, and study 1b replicates the results of study 1a in the context of a driverless vehicle. In both studies, participants were shown two line graphs depicting the performance of two service providers (a human vs. robot surgeon in study 1a; a human driver vs. driverless car in study 1b). One line (labeled "Surgeon 1" or "Driver 1") was steeper, illustrating more errors overall, but predominantly minor errors. The other line (labeled "Surgeon 2" or "Driver 2") was flatter, illustrating fewer errors overall but similar occurrences of major and minor errors. In both studies, the majority reported that the flatter line was more representative of AI (study 1a: χ²(1) = 9.33, p = .002; study 1b: χ²(1) = 8.00, p = .005).

In study 2, participants were assigned to one of two conditions in which they were told they would be having an expensive or an inexpensive item delivered. We predicted that people would be more willing to adopt AI in less risky circumstances, because the difference between AI's and a human's error-avoidance tendencies is less relevant when highly consequential errors are implausible. In line with our predictions, participants displayed greater algorithm aversion when the package was expensive (vs. inexpensive; M = 5.35 vs. M = 4.65; t = 2.01, p = .047).

In study 3, we further demonstrate the importance of error likelihood and type in the decision to adopt AI by manipulating error consequentiality within a medical domain. Algorithm aversion should be displayed only when severe errors are possible. Study 3 therefore used a 2 (error type: minor vs. severe) x 2 (medical service provider type: human vs. AI) between-subjects design. Participants were told to imagine they were suffering from acute stomach pain, had a fever, and had gone to the emergency room. In the minor error condition, participants were told they needed an abdominal X-ray and that only minor errors were possible. In the severe error condition, participants were told they needed emergency surgery to remove their appendix and that minor, moderate, and severe errors were possible. As expected, an interaction emerged between error type and medical service provider type (F(1, 296) = 36.5, p < .001). Algorithm aversion was displayed in the severe error condition, where people were more likely to undergo the surgery if it was performed by a human rather than by AI (p < .001). This effect was attenuated in the minor error condition, where there was no preference between service provider types (p = .406).

In study 4, we show that differences in risk aversion impact willingness to adopt AI when a severe error is implicit, but not when it is explicit. The design was a 2 (tram operator type: human vs. AI) x 2 (error severity: high vs. low) x continuous (age) mixed design. Age was chosen as a proxy for risk aversion because it is easily obtained and because younger (vs. older) consumers have been shown to underestimate their risk of serious consequences in the context of driving (Delhomme, Verlhiac, and Martha 2009). We expected that when severe errors were made explicit, age would not impact willingness to ride a human- (vs. AI-) operated tram, and all consumers would prefer a tram operated by a human. However, when only minor errors were made explicit and severe errors were implicit, we expected older (but not younger) consumers to display algorithm aversion. Results revealed a three-way interaction between type of tram operator, error severity, and age (B = -0.7, p = .013). In the high error severity condition, there was no interaction between type of tram operator and age (F < 1). In contrast, in the low-severity condition, there was a significant interaction between type of tram operator and age (B = -0.08, p < .001), such that older (younger) consumers were significantly less (more) likely to ride the tram when it was operated by AI.

Together, our studies reveal how a novel lay belief about AI impacts consumers' willingness to adopt AI across different contexts, furthering our understanding of when and why consumers display algorithm aversion. |
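The χ²(1) statistics reported for studies 1a and 1b are consistent with a one-degree-of-freedom goodness-of-fit test of the observed choice counts against an even split. The Python sketch below shows the form of that computation; the counts, the 50/50 null, and the use of scipy are assumptions for illustration, not the studies' actual data or analysis code.

```python
# Minimal sketch of a chi-square goodness-of-fit test of the kind reported in
# studies 1a-b. The counts below are hypothetical placeholders, not the
# studies' data; they only illustrate the form of the test.
from scipy.stats import chisquare

# Hypothetical: of 70 participants, 48 matched the flatter line (fewer total
# errors, but severe errors as likely as minor ones) to the AI provider.
observed = [48, 22]   # [chose flatter line for AI, chose steeper line for AI]
expected = [35, 35]   # null hypothesis: an even 50/50 split

stat, p = chisquare(f_obs=observed, f_exp=expected)
print(f"chi2(1) = {stat:.2f}, p = {p:.3f}")  # for these placeholder counts, chi2(1) ≈ 9.66
```

The interaction effects reported in studies 3 and 4 could analogously be examined by fitting an ANOVA or regression model that includes the relevant product terms.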