Supercombinator set acquired from context-free grammar samples
Published in | Computer languages, systems & structures Vol. 54; pp. 1 - 19 |
---|---|
Main Authors | , |
Format | Journal Article |
Language | English |
Published | Elsevier Ltd, 01.12.2018 |
Summary: | • We present an algorithm that transforms context-free grammars into a single set of supercombinators. • We evaluate our algorithm on 62,008 grammar samples obtained from the Groningen Meaning Bank. • We find the limit of the supercombinator set, which for our sample set is the sequence of Catalan numbers. • We show how to identify the most common structures of input grammars.
We present an algorithm that transforms context-free grammars into a non-redundant set of supercombinators. This set contains interconnected lambda-calculus supercombinators that are enriched by grammar operations. The resulting set is scalable and can be extended with new supercombinators created from further grammars. We describe the algorithm in detail and then apply it to 62,008 grammar samples in order to determine the properties and limits of the acquired supercombinator set. We show that this set has a theoretical upper limit on the number of possible supercombinators, and that this limit is given by the sequence of Catalan numbers. We show that in some cases the limit can be reached if the input data source is large enough and the size of the supercombinators admitted into the final set is bounded. We also describe another benefit of our algorithm: the identification of the most frequently recurring structures in the input set. |
---|---|
ISSN: | 1477-8424, 1873-6866 |
DOI: | 10.1016/j.cl.2018.04.001 |
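
The summary states that the size of the supercombinator set is bounded by the sequence of Catalan numbers but does not say which structures are counted. As a rough illustration only, the sketch below assumes the bound refers to distinct binary tree shapes with a given number of internal nodes, which is a standard object counted by the Catalan numbers. The names `catalan` and `tree_shapes` are hypothetical and are not taken from the article.

```python
from functools import lru_cache
from math import comb


def catalan(n: int) -> int:
    """Return the n-th Catalan number using the closed form C(2n, n) / (n + 1)."""
    return comb(2 * n, n) // (n + 1)


@lru_cache(maxsize=None)
def tree_shapes(n: int) -> tuple:
    """Enumerate all binary tree shapes with n internal nodes.

    Assumption (not from the article): a leaf is the empty tuple (),
    and an internal node is a pair (left_subtree, right_subtree).
    """
    if n == 0:
        return ((),)
    shapes = []
    # Split the remaining n - 1 internal nodes between the two subtrees.
    for left_size in range(n):
        for left in tree_shapes(left_size):
            for right in tree_shapes(n - 1 - left_size):
                shapes.append((left, right))
    return tuple(shapes)


if __name__ == "__main__":
    # Compare the closed form with the brute-force enumeration.
    for n in range(8):
        print(n, catalan(n), len(tree_shapes(n)))
```

Under this assumption, a set that admits only structures up to a fixed size can hold at most a Catalan-number count of distinct shapes per size, which is consistent with the summary's remark that the limit is reachable only when the permitted size is bounded.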