Social Network Analysis for Fraud Detection

Bibliographic Details
Published in Fraud Analytics Using Descriptive, Predictive, and Social Network Techniques pp. 207-278
Main Authors Van Vlasselaer, Veronique; Baesens, Bart; Verbeke, Wouter
Format Book Chapter
Language English
Published United States Wiley 2015
John Wiley & Sons, Incorporated
John Wiley & Sons, Inc
Summary: Over the last decade, the use of social media websites in everyday life has boomed. People continue their conversations on online social network sites such as Facebook, Twitter, LinkedIn, Google+, and Instagram, and share their experiences with acquaintances, friends, and family. It takes only one click to update the rest of the world on your whereabouts. Plenty of options exist to broadcast your current activities: by picture, video, geo-location, links, or just plain text. You are on top of the world, and everybody is watching. And this is where it becomes interesting. Users of online social network sites explicitly reveal their relationships with other people. As a consequence, social network sites are an (almost) perfect mapping of the relationships that exist in the real world. We know who you are, what your hobbies and interests are, to whom you are married, how many children you have, the buddies with whom you run every week, your friends from the wine club, and so on. This whole interconnected network of people who somehow know each other is an extremely interesting source of information and knowledge. Marketing managers no longer have to guess who might influence whom to create the appropriate campaign. It is all there, and that is exactly the problem. Social network sites recognize the richness of the data sources they hold and are not willing to share them as such, free of cost. Moreover, those data are often privatized and regulated, and well hidden from commercial use. On the other hand, social network sites offer many good built-in facilities that let managers and other interested parties launch and manage their marketing campaigns by exploiting the social network, without publishing the exact network representation. However, companies often forget that they can reconstruct (a part of) the social network using in-house data.
Telecommunication providers, for example, have a massive transactional database in which they record the call behavior of their customers. Under the assumption that good friends call each other more often, we can recreate the network and indicate the tie strength between people based on the frequency and/or duration of calls. Internet infrastructure providers might map the relationships between people using their customers' IP addresses. IP addresses that frequently communicate are represented by a stronger relationship. In the end, the IP network will portray the relational structure between people from another point of view, but to a certain extent as observed in reality. Many more examples can be found in the banking, retail, and online gaming industries. The fraud detection domain might also benefit from the analysis of social networks. In this chapter, we underline the social character of fraud. This means that we assume that the probability of someone committing fraud depends on the people (s)he is connected to. This is so-called guilt by association (Koutra et al. 2011). If we know that five friends of Bob are fraudsters, what would we say about Bob? Is he also likely to be a fraudster? If these friends are Bob's only friends, is it more likely that Bob will be influenced to commit fraud? What if Bob has 200 other friends; will the influence of these five fraudsters be the same? In this chapter, we briefly introduce the reader to networks and their applications in a fraud detection setting. One of the main questions answered in this chapter is how unstructured network information can be translated into useful and meaningful characteristics of a subject. We analyze and extract features from the direct neighborhood (that is, the direct associates of a certain person or subject) as well as from the network as a whole (i.e., collective inferencing). Those network-based features can serve as an enrichment of traditional data analysis techniques.
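The two ideas in the summary, tie strength derived from call records and neighborhood features such as the share of fraudulent associates, can be illustrated with a small sketch. This is not the chapter's own code: the record layout `(caller, callee, duration_seconds)`, the function names `build_call_network` and `fraud_neighborhood_features`, and the feature names are all hypothetical, chosen only to make the Bob example concrete.

```python
from collections import defaultdict


def build_call_network(call_records):
    """Aggregate call records into an undirected weighted network.

    Each record is assumed (hypothetically) to be a tuple
    (caller, callee, duration_seconds). Tie strength is kept both as a
    call count and as a total duration.
    """
    network = defaultdict(lambda: defaultdict(lambda: {"calls": 0, "duration": 0}))
    for caller, callee, duration in call_records:
        # Store the tie in both directions so the network is undirected.
        for a, b in ((caller, callee), (callee, caller)):
            tie = network[a][b]
            tie["calls"] += 1
            tie["duration"] += duration
    return network


def fraud_neighborhood_features(network, node, known_fraudsters):
    """Summarize the direct neighborhood of `node` (illustrative features)."""
    neighbors = network.get(node, {})
    degree = len(neighbors)
    fraud_neighbors = [n for n in neighbors if n in known_fraudsters]
    total_calls = sum(tie["calls"] for tie in neighbors.values())
    fraud_calls = sum(neighbors[n]["calls"] for n in fraud_neighbors)
    return {
        "degree": degree,
        "fraud_degree": len(fraud_neighbors),
        # Share of direct associates known to be fraudulent.
        "fraud_ratio": len(fraud_neighbors) / degree if degree else 0.0,
        # Same share, weighted by call frequency (tie strength).
        "weighted_fraud_ratio": fraud_calls / total_calls if total_calls else 0.0,
    }


# Toy data: "bob" only knows the five fraudsters, while "alice" knows the
# same five fraudsters plus 200 other people.
fraudsters = {f"fraud_{i}" for i in range(5)}
calls = [("bob", f, 60) for f in fraudsters]
calls += [("alice", f, 60) for f in fraudsters]
calls += [("alice", f"friend_{i}", 60) for i in range(200)]

network = build_call_network(calls)
bob = fraud_neighborhood_features(network, "bob", fraudsters)
alice = fraud_neighborhood_features(network, "alice", fraudsters)

print(bob["fraud_ratio"])              # 1.0
print(round(alice["fraud_ratio"], 3))  # 0.024
```

Both subjects have five fraudulent associates, yet the relative features differ sharply, which is one way to encode the intuition that five fraudsters among 205 friends may carry less influence than five among five. Whether counts, ratios, or weighted ratios are the more useful features is left open here.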
ISBN:1119133122
9781119133124
DOI:10.1002/9781119146841.ch5