Learning Tree Pattern Transformations

Bibliographic Details
Main Authors	Neider, Daniel; Sabellek, Leif; Schmidt, Johannes; Vehlken, Fabian; Zeume, Thomas
Format	Journal Article
Language	English
Published	10.10.2024
Subjects	Computer Science - Artificial Intelligence; Computer Science - Computational Complexity; Computer Science - Databases; Computer Science - Learning

More Information
Summary:	Explaining why and how a tree $t$ structurally differs from another tree $t^*$ is a question that is encountered throughout computer science, including in understanding tree-structured data such as XML or JSON data. In this article, we explore how to learn explanations for structural differences between pairs of trees from sample data: suppose we are given a set $\{(t_1, t_1^*),\dots, (t_n, t_n^*)\}$ of pairs of labelled, ordered trees; is there a small set of rules that explains the structural differences between all pairs $(t_i, t_i^*)$? This raises two research questions: (i) what is a good notion of "rule" in this context?; and (ii) how can sets of rules explaining a data set be learnt algorithmically? We explore these questions from the perspective of database theory by (1) introducing a pattern-based specification language for tree transformations; (2) exploring the computational complexity of variants of the above algorithmic problem, e.g. showing NP-hardness for very restricted variants; and (3) discussing how to solve the problem for data from CS education research using SAT solvers.
DOI:	10.48550/arxiv.2410.07708