Federated Privacy-preserving Collaborative Filtering for On-Device Next App Prediction

Bibliographic Details
Published in	arXiv.org
Main Authors	Sayapin, Albert; Balitskiy, Gleb; Bershatsky, Daniel; Katrutsa, Aleksandr; Frolov, Evgeny; Frolov, Alexey; Oseledets, Ivan; Kharin, Vitaliy
Format	Paper
Language	English
Published	Ithaca: Cornell University Library, arXiv.org, 05.02.2023
Subjects	Collaboration; Electronic devices; Factorization; Federated learning; Filtration; Mobile computing; Privacy; User behavior; User experience

Summary:	In this study, we propose SeqMF, a novel model for predicting the next app launch during mobile device usage. Although this problem can be cast as classical collaborative filtering, it requires modification for three reasons: the data are sequential, the user feedback is distributed among devices, and the transmission of users' data to aggregate common patterns must be protected against leakage. To meet these requirements, we modify the structure of the classical matrix factorization model and adapt the training procedure to sequential learning. Because the data about user experience are distributed among devices, we train the proposed sequential matrix factorization model in a federated learning setup. The approach also includes a new privacy mechanism that guarantees the protection of the data sent from users to the remote server. To demonstrate the efficiency of the proposed model, we use publicly available mobile user behavior data. We compare our model with sequential rules and with models based on the frequency of app launches. The comparison is conducted in static and dynamic environments. The static environment evaluates how our model processes sequential data compared to competitors, so the standard train-validation-test evaluation procedure is used. The dynamic environment emulates the real-world scenario in which users generate new data by running apps on their devices, and evaluates our model in that setting. Our experiments show that the proposed model provides quality comparable to other methods in the static environment. More importantly, our method achieves a better privacy-utility trade-off than competitors in the dynamic environment, which simulates real-world usage more accurately.
ISSN:	2331-8422