A propositionalization method of multi-relational data based on Grammar-Guided Genetic Programming

Bibliographic Details
Published in Expert systems with applications Vol. 168; p. 114263
Main Authors Quintero-Domínguez, Luis A.; Morell, Carlos; Ventura, Sebastián
Format Journal Article
Language English
Published New York Elsevier Ltd 15.04.2021
Elsevier BV
Summary: The propositionalization process tries to find distinctive features of the examples in a database to transform such relational data into a simpler representation. More informative features have a positive impact on the classification capabilities of the learning algorithms. In this work, we propose a new propositionalization method, which generates complex Boolean attributes using Grammar-Guided Genetic Programming (G3P). The generated attributes are compound formulas that combine word items coming from a Bag-of-Words (BoW) representation using Boolean operators. The proposal was assessed against three state-of-the-art simple-instance and multiple-instance propositionalization methods. The experimental results show that the proposed method achieves an improvement in terms of classification accuracy and a considerable reduction in the dimensionality of the resulting datasets.
• Grammar-Guided Genetic Programming is used to generate complex attributes.
• Words coming from a Bag-of-Words representation are combined using Boolean operators.
• Simple-instance and multiple-instance propositionalization were analyzed.
• A considerable reduction in the dimensionality of the resulting datasets is achieved.
ISSN:0957-4174
1873-6793
DOI:10.1016/j.eswa.2020.114263
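
The summary describes attributes built as Boolean formulas over Bag-of-Words items. The Python sketch below is only an illustration of that idea, not the authors' implementation. The grammar, the tuple encoding of formulas, the function names, and the word items (`atom_c`, `bond_double`, and so on) are all assumptions made for this example. The genetic search itself (fitness evaluation, crossover, mutation) is omitted.

```python
import random

# Hypothetical toy grammar (an assumption, not taken from the paper):
#   <expr> ::= <word> | NOT <expr> | <expr> AND <expr> | <expr> OR <expr>
# A formula is encoded as a nested tuple, e.g. ("AND", ("WORD", "a"), ("WORD", "b")).


def random_formula(vocabulary, rng, max_depth=3):
    """Derive a random formula from the toy grammar, bounded by max_depth."""
    if max_depth == 0 or rng.random() < 0.3:
        return ("WORD", rng.choice(vocabulary))
    op = rng.choice(["NOT", "AND", "OR"])
    if op == "NOT":
        return ("NOT", random_formula(vocabulary, rng, max_depth - 1))
    return (
        op,
        random_formula(vocabulary, rng, max_depth - 1),
        random_formula(vocabulary, rng, max_depth - 1),
    )


def evaluate(formula, bag):
    """Evaluate a formula against one example's bag (a set of word items)."""
    tag = formula[0]
    if tag == "WORD":
        return formula[1] in bag
    if tag == "NOT":
        return not evaluate(formula[1], bag)
    left = evaluate(formula[1], bag)
    right = evaluate(formula[2], bag)
    return (left and right) if tag == "AND" else (left or right)


def propositionalize(bags, formulas):
    """Build one Boolean row per example and one column per formula."""
    return [[evaluate(f, bag) for f in formulas] for bag in bags]


# Made-up word items standing in for a BoW view of three relational examples.
bags = [{"atom_c", "bond_double"}, {"atom_c", "atom_n"}, {"atom_o"}]

# Two hand-written compound attributes.
formulas = [
    ("AND", ("WORD", "atom_c"), ("NOT", ("WORD", "atom_n"))),
    ("OR", ("WORD", "atom_o"), ("WORD", "bond_double")),
]

table = propositionalize(bags, formulas)

# A randomly derived candidate, as a genetic search might seed its population.
candidate = random_formula(["atom_c", "atom_n", "atom_o"], random.Random(0))
```

Here `table` has one row per example and one column per formula, so a handful of compound attributes can stand in for a much wider plain word-presence matrix. This is one way to read the dimensionality reduction mentioned in the summary.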