Latent Clustering Models for Outlier Identification in Telecom Data

Bibliographic Details
Published in	Mobile information systems Vol. 2016; no. 2016; pp. 1 - 11
Main Authors	Hu, Mantian (Mandy), Shim, J. P., Huet, Alexis, Ouyang, Ye
Format	Journal Article
Language	English
Published	Cairo, Egypt Hindawi Publishing Corporation 01.01.2016 Hindawi Limited
Subjects	Clustering; Data analysis; Electronic devices; Gaussian distribution; Information management; Intrusion; Outliers (statistics); Telecommunications; Traffic information; Traffic models; Traffic speed
Summary:	Collected telecom data traffic has boomed in recent years, driven by the development of 4G mobile devices and other similar high-speed machines. The ability to quickly identify unexpected traffic in this stream is critical for mobile carriers, as such traffic can be caused by either fraudulent intrusion or technical problems. Clustering models can help identify these issues by revealing patterns in network data, so that anomalies are caught quickly and previously unseen outliers are highlighted. In this article, we develop and compare clustering models for telecom data, focusing on those that incorporate time-stamp information. Two main models are introduced, solved in detail, and analyzed: Gaussian Probabilistic Latent Semantic Analysis (GPLSA) and time-dependent Gaussian Mixture Models (time-GMM). These models are then compared with other clustering models that do not use time-stamp information, such as the Gaussian model and GMM. We perform computations on both sample and telecom traffic data to show that the efficiency and robustness of GPLSA make it the superior method for detecting outliers, providing results automatically with few tuning parameters and little required expertise.
ISSN:	1574-017X 1875-905X
DOI:	10.1155/2016/1542540
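
The summary's contrast between models that use time-stamp information and those that do not can be illustrated with a small sketch. The code below is not GPLSA or time-GMM as defined in the article; it is a deliberately simplified stand-in that fits one Gaussian per time slot and compares it with a single global Gaussian. The function names, the toy data, and the 3-standard-deviation threshold are all assumptions made for illustration.

```python
import math
from collections import defaultdict


def fit_gaussian(values):
    """Return (mean, population standard deviation) of a list of numbers."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / n
    return mean, math.sqrt(var)


def global_outliers(samples, threshold=3.0):
    """Flag (slot, value) pairs far from the mean of ONE Gaussian fitted to all values."""
    mean, std = fit_gaussian([v for _, v in samples])
    return [(t, v) for t, v in samples
            if std > 0 and abs(v - mean) > threshold * std]


def time_slot_outliers(samples, threshold=3.0):
    """Flag (slot, value) pairs far from the mean of a Gaussian fitted per time slot."""
    by_slot = defaultdict(list)
    for t, v in samples:
        by_slot[t].append(v)
    params = {t: fit_gaussian(vs) for t, vs in by_slot.items()}
    flagged = []
    for t, v in samples:
        mean, std = params[t]
        if std > 0 and abs(v - mean) > threshold * std:
            flagged.append((t, v))
    return flagged


# Hypothetical toy data: (hour of day, traffic volume).
# Hour 3 is a quiet night slot (about 10 units), hour 18 a peak slot (about 100 units).
night = [(3, 10 + (i % 3 - 1)) for i in range(30)]
peak = [(18, 100 + 5 * (i % 3 - 1)) for i in range(30)]
anomaly = [(3, 60)]  # unremarkable overall, but far too high for a night slot
samples = night + anomaly + peak

print("global Gaussian flags:   ", global_outliers(samples))
print("per-slot Gaussian flags: ", time_slot_outliers(samples))
```

In this toy data the night-time value of 60 sits close to the overall mean, so the global model does not flag it, while the per-slot model does. This is only meant to show why conditioning on the time stamp can matter; the article's models are mixture-based and more general.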