Spatio-temporal mining of software adoption & penetration

Bibliographic Details
Published in: 2013 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM 2013), pp. 878-885
Main Authors: Papalexakis, Evangelos E.; Dumitras, Tudor; Chau, Duen Horng; Prakash, B. Aditya; Faloutsos, Christos
Format: Conference Proceeding
Language: English
Published: ACM and IEEE, 01.08.2013
Summary: How does malware propagate? Does it form spikes over time? Does it resemble the propagation pattern of benign files, such as software patches? Does it spread uniformly over countries? How long does it take for a URL that distributes malware to be detected and shut down? In this work, we answer these questions by analyzing patterns from 22 million malicious (and benign) files, found on 1.6 million hosts worldwide during the month of June 2011. We conduct this study using the WINE database available at Symantec Research Labs. Additionally, we explore the research questions raised by sampling on such large databases of executables. Studying the implications of sampling is important for two reasons: first, sampling reduces the size of the database, making it more accessible to researchers; second, every such data collection can itself be perceived as a sample of the real world. Finally, we discover the SHARKFIN temporal propagation pattern of executable files, the GEOSPLIT pattern in the geographical spread of machines that report executables to Symantec's servers, and the Periodic Power Law (PPL) distribution of the lifetime of URLs, and we show how to efficiently extrapolate crucial properties of the data from a small sample. To the best of our knowledge, our work represents the largest study of propagation patterns of executables.
DOI: 10.1145/2492517.2500244