Supercombinator set acquired from context-free grammar samples

Bibliographic Details
Published in Computer languages, systems & structures Vol. 54; pp. 1 - 19
Main Authors Sičák, Michal; Kollár, Ján
Format Journal Article
Language English
Published Elsevier Ltd 01.12.2018
Summary:
• We present an algorithm that transforms context-free grammars into a single set of supercombinators.
• We evaluate the algorithm on 62,008 grammar samples obtained from the Groningen Meaning Bank.
• We find the limit of the supercombinator set, which for our sample set is the sequence of Catalan numbers.
• We show how to identify the most common structures in the input grammars.

We present an algorithm that transforms context-free grammars into a non-redundant set of supercombinators. This set contains interconnected lambda calculus supercombinators enriched with grammar operations. The resulting set is scalable and can be extended with new supercombinators created from grammars. We describe the algorithm in detail and then apply it to 62,008 grammar samples to determine the properties and limits of the acquired supercombinator set. We show that this set has a theoretical maximum number of possible supercombinators, and that this limit is given by the sequence of Catalan numbers. We show that in some cases the limit can be reached, provided the input data source is large enough and the size of the supercombinators permitted into the final set is bounded. We also describe another benefit of the algorithm: the identification of the most frequently recurring structures in the input set.
ISSN: 1477-8424, 1873-6866
DOI:10.1016/j.cl.2018.04.001
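
The Catalan-number limit mentioned in the summary can be illustrated with a small sketch. It rests on an assumption that the record itself does not state: that the bounded structures can be modelled as binary tree shapes, which the n-th Catalan number is known to count for n internal nodes. The names `catalan`, `tree_shapes`, and `most_common_shapes` are hypothetical and are not taken from the article; the last one only hints at how recurring structures in a sample set might be tallied.

```python
from collections import Counter
from functools import lru_cache
from math import comb


def catalan(n: int) -> int:
    """Closed form of the n-th Catalan number: binom(2n, n) / (n + 1)."""
    return comb(2 * n, n) // (n + 1)


@lru_cache(maxsize=None)
def tree_shapes(n: int) -> tuple:
    """Enumerate all binary tree shapes with n internal nodes.

    Assumption for illustration only: a leaf is the empty tuple (),
    and an internal node is a pair (left, right).
    """
    if n == 0:
        return ((),)
    shapes = []
    # Split the remaining n - 1 internal nodes between the two subtrees.
    for left_size in range(n):
        for left in tree_shapes(left_size):
            for right in tree_shapes(n - 1 - left_size):
                shapes.append((left, right))
    return tuple(shapes)


def most_common_shapes(samples, k=3):
    """Tally how often each shape occurs in a list of sample shapes."""
    return Counter(samples).most_common(k)


if __name__ == "__main__":
    for n in range(6):
        # The number of enumerated shapes matches the closed form.
        print(n, len(tree_shapes(n)), catalan(n))
```

Capping n in this sketch plays the role of limiting the size of structures admitted into the set: for each size bound the number of distinct shapes is finite, so a sufficiently large and varied sample could in principle exhaust them.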