Anomalous source detection in high-dimensional sequence data
Format | Patent |
---|---|
Language | English |
Published | 03.09.2024 |
Summary: | Devices and techniques are generally described for evaluation of text data using large n-grams. In various examples, a first vector may be generated for first text data, wherein each element of the vector comprises a value indicating whether the first text data includes a respective n-gram included in a corpus of text data. First label data indicating that a user associated with the first text data has connected to a first computer-implemented service more than a threshold number of times during a past time period may be determined. A first machine learning model may be trained based at least in part on the first vector and the first label data. The first machine learning model may be used to determine a first probability associated with a first n-gram of the first vector. In some examples, at least a first user associated with the first n-gram may be determined. |
---|---|
Bibliography: | Application Number: US202117541833 |
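
The pipeline in the summary (binary n-gram presence vectors, labels derived from a connection-count threshold, a trained model, and a per-n-gram probability) can be illustrated with a minimal sketch. This is not the patented method: the record does not say which model is used, so a smoothed frequency estimate of P(label = 1 | n-gram present) stands in for it. The character-level n-grams, the user IDs, the text strings, the connection counts, the threshold of 50, and all function names are hypothetical and chosen only for illustration.

```python
# Hypothetical sketch: n-gram presence vectors, threshold-based labels,
# and a smoothed per-n-gram probability. Not the patent's actual model.

def char_ngrams(text, n):
    """Return the set of character n-grams of length n found in text."""
    return {text[i:i + n] for i in range(len(text) - n + 1)}

def build_vocabulary(corpus, n):
    """Return a sorted list of every n-gram that appears in the corpus."""
    vocab = set()
    for text in corpus:
        vocab |= char_ngrams(text, n)
    return sorted(vocab)

def presence_vector(text, vocab, n):
    """One element per vocabulary n-gram: 1 if the text contains it, else 0."""
    grams = char_ngrams(text, n)
    return [1 if g in grams else 0 for g in vocab]

def label_users(connection_counts, threshold):
    """Label 1 if the user connected more than `threshold` times, else 0."""
    return {u: int(c > threshold) for u, c in connection_counts.items()}

def ngram_probabilities(vectors, labels, vocab, alpha=1.0):
    """Laplace-smoothed estimate of P(label = 1 | n-gram present).

    This simple estimate is an assumed stand-in for the trained model.
    """
    probs = {}
    for j, gram in enumerate(vocab):
        present = [y for x, y in zip(vectors, labels) if x[j] == 1]
        probs[gram] = (sum(present) + alpha) / (len(present) + 2 * alpha)
    return probs

def users_with_ngram(gram, texts, n):
    """Return the sorted IDs of users whose text contains the n-gram."""
    return sorted(u for u, t in texts.items() if gram in char_ngrams(t, n))

# Hypothetical data: text associated with each user, and how many times
# each user connected to the service during the past time period.
N = 3
texts = {"u1": "abcabc", "u2": "abcxyz", "u3": "xyzxyz", "u4": "qrsxyz"}
connections = {"u1": 120, "u2": 95, "u3": 3, "u4": 7}

users = sorted(texts)
vocab = build_vocabulary(texts.values(), N)
vectors = [presence_vector(texts[u], vocab, N) for u in users]
labels_by_user = label_users(connections, threshold=50)
labels = [labels_by_user[u] for u in users]
probs = ngram_probabilities(vectors, labels, vocab)

print(probs["abc"], users_with_ngram("abc", texts, N))
```

With this toy data, the n-gram `abc` appears only in text from the two frequently connecting users, so its estimate is 0.75, while `xyz` appears for one frequent and two infrequent users and scores 0.4. Looking up the users whose text contains a high-probability n-gram corresponds to the final step of the summary.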