Unacceptable, where is my privacy? Exploring Accidental Triggers of Smart Speakers
Main Authors | |
Format | Journal Article |
Language | English |
Published | 02.08.2020 |
Summary: | Voice assistants like Amazon's Alexa, Google's Assistant, or Apple's Siri have become
the primary (voice) interface in smart speakers that can be found in millions of households.
For privacy reasons, these speakers analyze every sound in their environment for their
respective wake word, such as "Alexa" or "Hey Siri," before uploading the audio stream to the
cloud for further processing. Previous work reported that this wake word detection is
inaccurate and can be tricked with similar words or sounds, such as "cocaine noodles" instead
of "OK Google."
In this paper, we perform a comprehensive analysis of such accidental triggers, i.e., sounds
that should not have triggered the voice assistant but did. More specifically, we automate the
process of finding accidental triggers and measure their prevalence across 11 smart speakers
from 8 different manufacturers using everyday media such as TV shows, news, and other kinds of
audio datasets. To systematically detect accidental triggers, we describe a method to
artificially craft such triggers using a pronouncing dictionary and a weighted, phone-based
Levenshtein distance. In total, we have found hundreds of accidental triggers. Moreover, we
explore potential gender and language biases and analyze their reproducibility. Finally, we
discuss the resulting privacy implications of accidental triggers and explore countermeasures
to reduce and limit their impact on users' privacy. To foster additional research on these
sounds that mislead machine learning models, we publish a dataset of more than 1000 verified
triggers as a research artifact. |
DOI: | 10.48550/arxiv.2008.00508 |
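
The summary describes crafting candidate triggers by comparing dictionary pronunciations to the
wake word with a weighted, phone-based Levenshtein distance. The Python sketch below illustrates
that general idea only; the tiny inline dictionary, the vowel/consonant substitution weights, and
the distance threshold are illustrative assumptions, not the authors' actual data or weighting.

```python
# Minimal sketch (not the paper's implementation): rank dictionary words by a
# weighted, phone-based Levenshtein distance to a wake word's pronunciation,
# keeping close matches as candidate accidental triggers.

# Illustrative excerpt of a pronouncing dictionary (ARPAbet, stress digits stripped).
PRON_DICT = {
    "alexa":        ["AH", "L", "EH", "K", "S", "AH"],
    "election":     ["IH", "L", "EH", "K", "SH", "AH", "N"],
    "unacceptable": ["AH", "N", "AH", "K", "S", "EH", "P", "T", "AH", "B", "AH", "L"],
    "ok":           ["OW", "K", "EY"],
    "google":       ["G", "UW", "G", "AH", "L"],
    "cocaine":      ["K", "OW", "K", "EY", "N"],
    "noodles":      ["N", "UW", "D", "AH", "L", "Z"],
}

VOWELS = {"AA", "AE", "AH", "AO", "AW", "AY", "EH", "ER", "EY",
          "IH", "IY", "OW", "OY", "UH", "UW"}


def sub_cost(a: str, b: str) -> float:
    """Assumed weighting: swapping a vowel for a vowel (or a consonant for a
    consonant) is cheaper than crossing the vowel/consonant boundary."""
    if a == b:
        return 0.0
    return 0.5 if (a in VOWELS) == (b in VOWELS) else 1.0


def weighted_levenshtein(p1, p2, ins_del_cost=1.0) -> float:
    """Standard dynamic-programming edit distance over phone sequences,
    using sub_cost() for substitutions."""
    n, m = len(p1), len(p2)
    d = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        d[i][0] = i * ins_del_cost
    for j in range(1, m + 1):
        d[0][j] = j * ins_del_cost
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d[i][j] = min(
                d[i - 1][j] + ins_del_cost,                        # deletion
                d[i][j - 1] + ins_del_cost,                        # insertion
                d[i - 1][j - 1] + sub_cost(p1[i - 1], p2[j - 1]),  # substitution
            )
    return d[n][m]


def candidate_triggers(wake_word: str, threshold: float = 3.0):
    """Return dictionary words whose phonetic distance to the wake word is
    below a freely chosen threshold, sorted from most to least similar."""
    target = PRON_DICT[wake_word]
    scored = [(word, weighted_levenshtein(target, phones))
              for word, phones in PRON_DICT.items() if word != wake_word]
    return sorted([(w, s) for w, s in scored if s <= threshold],
                  key=lambda ws: ws[1])


if __name__ == "__main__":
    for word, score in candidate_triggers("alexa"):
        print(f"{word:>14s}  distance={score:.1f}")
```

With this toy dictionary, "election" scores closest to "alexa" (two cheap substitutions plus one
insertion), which mirrors the kind of near-homophone candidates the abstract describes; a real
search would run the same ranking over a full pronouncing dictionary such as CMUdict.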