Hybridization of linguistic and statistical methods of text analysis for describing the subject area in the form of a fuzzy ontology


Bibliographic Details
Published in Journal of physics. Conference series Vol. 1679; no. 4; pp. 42008 - 42014
Main Authors Rabin, A V, Petrushevskaya, A A
Format Journal Article
Language English
Published IOP Publishing 01.11.2020
Summary: An algorithm has been developed for selecting n-words and forming a description of the subject area as a fuzzy ontology, based on the hybridization of linguistic and statistical methods of text analysis. The input to the algorithm is text sequences in machine language; the output is a description of the subject area in the form of a fuzzy ontology. The classes of the fuzzy ontology are defined in accordance with the source code using several machine learning algorithms that combine methods of data preparation and prediction. The bootstrap, bagging, and random forest methods were used to classify text sequences. A distinguishing feature of the developed algorithm is that ontology objects need to be represented mainly as one-words, with the number of relations between objects maximized.
ISSN: 1742-6588, 1742-6596
DOI: 10.1088/1742-6596/1679/4/042008
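
The summary names bootstrap and bagging as methods used to classify text sequences. The sketch below illustrates only the general idea of those two techniques and is not the authors' implementation. The base learner `train_word_voter`, the toy sentences, and the two class labels are hypothetical and invented for illustration. A random forest would additionally train decision trees on random feature subsets, which is omitted here.

```python
import random
from collections import Counter


def bootstrap_sample(data, rng):
    """Draw len(data) items from data with replacement (one bootstrap replicate)."""
    return [rng.choice(data) for _ in data]


def train_word_voter(sample):
    """Hypothetical base learner: count how often each word occurs under each label."""
    counts = {}
    for text, label in sample:
        for word in text.lower().split():
            counts.setdefault(word, Counter())[label] += 1
    return counts


def predict_one(model, text):
    """Sum per-label counts of the words in text; return the top label, or None."""
    votes = Counter()
    for word in text.lower().split():
        votes.update(model.get(word, {}))
    return votes.most_common(1)[0][0] if votes else None


def fit_bagging(data, n_models=25, seed=0):
    """Train one base learner per bootstrap replicate of the training data."""
    rng = random.Random(seed)
    return [train_word_voter(bootstrap_sample(data, rng)) for _ in range(n_models)]


def bagging_predict(models, text):
    """Aggregate the base learners by majority vote, ignoring abstentions."""
    predictions = (predict_one(m, text) for m in models)
    votes = Counter(p for p in predictions if p is not None)
    return votes.most_common(1)[0][0] if votes else None


# Toy labelled text sequences (invented for illustration, not from the article).
toy_data = [
    ("fuzzy set membership degree", "fuzzy_logic"),
    ("membership function of a fuzzy set", "fuzzy_logic"),
    ("fuzzy rule inference", "fuzzy_logic"),
    ("decision tree split node", "machine_learning"),
    ("random forest of decision trees", "machine_learning"),
    ("training sample and prediction error", "machine_learning"),
]

models = fit_bagging(toy_data)
print(bagging_predict(models, "fuzzy membership"))
print(bagging_predict(models, "decision trees"))
```

Each base learner sees a different resample of the training data, so individual learners may miss some words entirely; the majority vote smooths out that variation. A query with no known words yields `None` rather than a guessed class.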