Lempel-Ziv Networks

Bibliographic Details
Main Authors	Saul, Rebecca; Alam, Mohammad Mahmudul; Hurwitz, John; Raff, Edward; Oates, Tim; Holt, James
Format	Journal Article
Language	English
Published	23.11.2022
Subjects	Computer Science - Artificial Intelligence; Computer Science - Learning
Summary:	Sequence processing has long been a central area of machine learning research. Recurrent neural nets have been successful in processing sequences for a number of tasks; however, they are known to be both ineffective and computationally expensive when applied to very long sequences. Compression-based methods have demonstrated more robustness when processing such sequences; in particular, an approach pairing the Lempel-Ziv Jaccard Distance (LZJD) with the k-Nearest Neighbor algorithm has shown promise on long sequence problems (up to $T = 200{,}000{,}000$ steps) involving malware classification. Unfortunately, use of LZJD is limited to discrete domains. To extend the benefits of LZJD to a continuous domain, we investigate the effectiveness of a deep-learning analog of the algorithm, the Lempel-Ziv Network. While we achieve successful proof of concept, we are unable to improve meaningfully on the performance of a standard LSTM across a variety of datasets and sequence processing tasks. In addition to presenting this negative result, our work highlights the problem of sub-par baseline tuning in newer research areas.
DOI:	10.48550/arxiv.2211.13250
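
Illustrative note: the summary refers to pairing the Lempel-Ziv Jaccard Distance (LZJD) with k-Nearest Neighbor on discrete sequences. The sketch below is one minimal way to picture that pairing, not the authors' implementation. It assumes an LZ78-style pass that collects previously unseen substrings into a set and then takes the exact Jaccard distance between two such sets; the published LZJD additionally approximates this with min-hashing, which is omitted here. The names `lz_set`, `lzjd`, and `knn_predict` are hypothetical and chosen only for this example.

```python
from collections import Counter


def lz_set(data: bytes) -> set:
    """Collect the set of new substrings found by an LZ78-style scan."""
    seen = set()
    start = 0
    for end in range(1, len(data) + 1):
        chunk = data[start:end]
        if chunk not in seen:
            # First time this substring appears: record it and restart
            # the window just after it.
            seen.add(chunk)
            start = end
    return seen


def lzjd(a: bytes, b: bytes) -> float:
    """Exact Jaccard distance between the LZ sets of two byte strings.

    Simplification (assumption): no min-hash approximation is used.
    """
    sa, sb = lz_set(a), lz_set(b)
    union = sa | sb
    if not union:
        return 0.0  # two empty inputs are treated as identical
    return 1.0 - len(sa & sb) / len(union)


def knn_predict(query: bytes, examples, k: int = 1):
    """Majority label among the k examples closest to `query` under lzjd.

    `examples` is a list of (bytes, label) pairs.
    """
    ranked = sorted(examples, key=lambda ex: lzjd(query, ex[0]))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


if __name__ == "__main__":
    train = [(b"aaaaaaaa", "A"), (b"bbbbbbbb", "B")]
    print(knn_predict(b"aaaab", train, k=1))
```

Because the distance is defined over sets of discrete substrings, this construction only applies to symbolic input such as bytes, which is the limitation the summary cites as motivation for a continuous, deep-learning analog.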