Lempel-Ziv Networks
Main Authors | , , , , , |
---|---|
Format | Journal Article |
Language | English |
Published | 23.11.2022 |
Summary: Sequence processing has long been a central area of machine learning research. Recurrent neural nets have been successful in processing sequences for a number of tasks; however, they are known to be both ineffective and computationally expensive when applied to very long sequences. Compression-based methods have demonstrated more robustness when processing such sequences; in particular, an approach pairing the Lempel-Ziv Jaccard Distance (LZJD) with the k-Nearest Neighbor algorithm has shown promise on long sequence problems (up to $T=200,000,000$ steps) involving malware classification. Unfortunately, use of LZJD is limited to discrete domains. To extend the benefits of LZJD to a continuous domain, we investigate the effectiveness of a deep-learning analog of the algorithm, the Lempel-Ziv Network. While we achieve successful proof of concept, we are unable to improve meaningfully on the performance of a standard LSTM across a variety of datasets and sequence processing tasks. In addition to presenting this negative result, our work highlights the problem of sub-par baseline tuning in newer research areas.
DOI: 10.48550/arxiv.2211.13250
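
The abstract references the LZJD-plus-k-Nearest-Neighbor pipeline. Below is a minimal Python sketch of that pipeline, assuming an LZ78-style parse to build each sequence's set of distinct phrases and the standard Jaccard distance between those sets; the function names and toy byte sequences are illustrative assumptions, not the authors' code.

```python
# Minimal sketch of the LZJD + k-NN pipeline described in the abstract.
# Based on the published definition of LZJD (LZ78-style phrase-set
# construction plus Jaccard distance); names and data are hypothetical.
from collections import Counter


def lz_set(seq: bytes) -> set:
    """Build the set of distinct phrases via LZ78-style parsing."""
    phrases, current = set(), b""
    for b in seq:
        current += bytes([b])
        if current not in phrases:
            phrases.add(current)  # new phrase: record it and restart
            current = b""
    return phrases


def lzjd(a: bytes, b: bytes) -> float:
    """Lempel-Ziv Jaccard Distance: 1 - |A ∩ B| / |A ∪ B|."""
    sa, sb = lz_set(a), lz_set(b)
    return 1.0 - len(sa & sb) / len(sa | sb)


def knn_predict(query: bytes, train: list, k: int = 3):
    """Classify `query` by majority vote among its k nearest
    training sequences under LZJD; `train` holds (sequence, label) pairs."""
    nearest = sorted(train, key=lambda sl: lzjd(query, sl[0]))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


# Toy usage with hypothetical byte sequences:
train = [(b"abababab" * 8, "A"), (b"abcdabcd" * 8, "B"), (b"aaaabbbb" * 8, "A")]
print(knn_predict(b"ababab" * 10, train))
```

Note that practical LZJD implementations approximate the Jaccard computation with min-hash sketches of the phrase sets, which helps make sequences on the order of $T=200,000,000$ steps tractable; the exhaustive set intersection above is kept only for clarity.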