Shift-reduce Spinal TAG Parsing with Dynamic Programming

Bibliographic Details
Published in: Transactions of the Japanese Society for Artificial Intelligence, Vol. 31, No. 2, pp. J-F83_1 - 8
Main Authors: Hayashi, Katsuhiko; Suzuki, Jun; Nagata, Masaaki
Format: Journal Article
Language: English
Published: Tokyo, The Japanese Society for Artificial Intelligence, 01.03.2016 (Japan Science and Technology Agency)
Summary: The spinal tree adjoining grammar (TAG) parsing model of [Carreras 08] achieves the current state-of-the-art constituent parsing accuracy in the commonly used English Penn Treebank evaluation setting. Unfortunately, the model has the serious drawback of low parsing efficiency, since its Eisner-CKY style parsing algorithm requires O(n^4) computation time for an input of length n. This paper investigates a more practical solution and presents a beam search shift-reduce algorithm for spinal TAG parsing. Since the algorithm runs in O(bn) time, where b is the beam width, it can be expected to provide a significant improvement in parsing speed. However, to achieve faster parsing it must prune a large number of candidates in an exponentially large search space, and it therefore often suffers from severe search errors. In fact, our experiments show that the basic beam search shift-reduce parser does not work well for spinal TAGs. To alleviate this problem, we extend the proposed shift-reduce algorithm with two techniques: the dynamic programming of [Huang 10a] and supertagging. The extended parsing algorithm is about 8 times faster than the Berkeley parser, a well-known fast constituent parser, while offering state-of-the-art performance. Moreover, we conduct experiments on the Keyaki Treebank for Japanese to show that the good performance of our proposed parser is language-independent.
ISSN: 1346-0714, 1346-8030
DOI: 10.1527/tjsai.J-F83
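
To make the complexity claim in the summary concrete, below is a minimal Python sketch of beam-search shift-reduce parsing. It assumes a generic binary SHIFT/REDUCE transition system and a placeholder scoring function; it is not the authors' spinal TAG parser and omits the dynamic programming and supertagging extensions described in the paper.

# Minimal, hypothetical sketch of beam-search shift-reduce parsing.
# The State fields, the binary SHIFT/REDUCE transition system, and the
# stand-in model_score() are illustrative assumptions, not the paper's
# parser. The point is the O(bn) claim: each of the O(n) transition
# steps keeps at most b states.

from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class State:
    stack: Tuple[str, ...] = ()    # partial constituents (here: bracketed strings)
    buffer_idx: int = 0            # index of the next input word to shift
    score: float = 0.0             # cumulative model score of this derivation


def model_score(state: State, action: str) -> float:
    """Stand-in for a learned scoring model (e.g. a structured perceptron)."""
    return 0.0


def legal_actions(state: State, n: int) -> List[str]:
    """SHIFT while input remains; REDUCE while two items sit on the stack."""
    actions = []
    if state.buffer_idx < n:
        actions.append("SHIFT")
    if len(state.stack) >= 2:
        actions.append("REDUCE")
    return actions


def apply_action(state: State, action: str, words: List[str]) -> State:
    new_score = state.score + model_score(state, action)
    if action == "SHIFT":
        return State(state.stack + (words[state.buffer_idx],),
                     state.buffer_idx + 1, new_score)
    # REDUCE: merge the top two stack items into one constituent.
    merged = "(" + state.stack[-2] + " " + state.stack[-1] + ")"
    return State(state.stack[:-2] + (merged,), state.buffer_idx, new_score)


def beam_parse(words: List[str], b: int) -> State:
    n = len(words)
    beam = [State()]
    # A binary shift-reduce derivation uses exactly 2n - 1 actions
    # (n shifts, n - 1 reduces), so the outer loop is O(n).
    for _ in range(2 * n - 1):
        expanded = [apply_action(s, a, words)
                    for s in beam for a in legal_actions(s, n)]
        # Keep only the b highest-scoring states: the pruning that trades
        # exact search for speed and causes the search errors the paper
        # mitigates with dynamic programming and supertagging.
        beam = sorted(expanded, key=lambda s: s.score, reverse=True)[:b]
    finished = [s for s in beam if s.buffer_idx == n and len(s.stack) == 1]
    return max(finished or beam, key=lambda s: s.score)

For example, beam_parse("the cat sat".split(), b=8) returns a state whose single stack entry is one binary bracketing of the input; with a trained scoring model in place of model_score, the beam keeps the b best partial derivations at each step instead of breaking ties arbitrarily.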