Intelligent deep analysis of DNA sequences based on FFGM to enhancement the performance and reduce the computation

Bibliographic Details
Published in Egyptian informatics journal Vol. 24; no. 2; pp. 173 - 190
Main Authors Kadhuim, Zena A., Al-Janabi, Samaher
Format Journal Article
Language English
Published Elsevier B.V 01.07.2023
Elsevier
ISSN 1110-8665
2090-4754
DOI 10.1016/j.eij.2023.02.004

Summary:
• Handles the analysis of the crystal structure of a macrodomain, considered one of today's hot subjects, from DNA sequences.
• Splits any DNA structure using a novel meta-heuristic algorithm with rules.
• Builds RF-FFGM, which finds frequent subgraphs in a short time because it builds a matrix code for connection edges and transforms the matrices into an incidence matrix.
• Performs a deep analysis of graph mining techniques (GSpan, FFSM, Hybrid-Tree-Miner, Approximate Frequent Sub-graph, CloGraMi and FFSM), focusing on the main programming steps, main parameters, advantages, and disadvantages of each algorithm.
• Compares the main parameters that affect sub-graph mining techniques.

In an attempt to improve the analysis of DNA sequences, a new intelligent deep analysis algorithm called reduce frequency based on fast frequency graph mining (RF-FFGM) is established. The algorithm first converts the DNA sequence into an RNA sequence, then splits that sequence into multiple subsequences by using a specific equation to determine the start and end point of each one. Each subsequence is then represented as a subgraph after the bonds between each pair of RNA components (i.e., A, G, U, C) are labeled; these bonds comprise 16 labels, which serve as the Knowledge Constructions (KC) for this work. Next, the steps of FFGM are applied. FFGM was selected after a deep analysis of graph mining techniques (GSpan, FFSM, Hybrid-Tree-Miner, Approximate Frequent Sub-graph, CloGraMi and FFSM) that focused on the main programming steps, main parameters, advantages, and disadvantages of each algorithm. We found that FFGM finds frequent subgraphs in a short time because it builds a matrix code for connection edges and transforms the matrices into an incidence matrix. We also found that, from the second stage onward, FFGM can obtain all the edges that have the highest contact with the other edges, which avoids traversing a sequential path to find duplicate edges. RF-FFGM appears to be a pragmatic algorithm; it proves robust when working with DNA sequences and reduces both computation and time.
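
The preprocessing described in the summary can be illustrated with a minimal Python sketch. It is not the authors' implementation: the fixed-size splitting rule stands in for the paper's start/end equation, which this record does not give, and every function name, the numbering of the 16 labels, and the simple support count are assumptions made for illustration only.

```python
from collections import Counter
from itertools import product

RNA_BASES = "AGUC"
# 16 ordered base-pair labels (AA, AG, ..., CC). The numbering is an assumption.
BOND_LABELS = {a + b: i for i, (a, b) in enumerate(product(RNA_BASES, repeat=2))}


def dna_to_rna(dna: str) -> str:
    """Convert DNA to RNA by replacing thymine (T) with uracil (U)."""
    return dna.upper().replace("T", "U")


def split_fixed(rna: str, size: int) -> list[str]:
    """Hypothetical splitting rule: fixed-size windows (not the paper's equation)."""
    return [rna[i:i + size] for i in range(0, len(rna), size)]


def bond_labels(subseq: str) -> list[str]:
    """Label the bond between each pair of adjacent bases, e.g. 'AU'."""
    return [subseq[i:i + 2] for i in range(len(subseq) - 1)]


def incidence_matrix(subseq: str) -> list[list[int]]:
    """Vertex-by-edge incidence matrix of the path graph for one subsequence.

    Rows are base positions (vertices), columns are bonds (edges);
    an entry is 1 when the vertex is an endpoint of the edge.
    """
    n_vertices = len(subseq)
    n_edges = max(n_vertices - 1, 0)
    matrix = [[0] * n_edges for _ in range(n_vertices)]
    for e in range(n_edges):
        matrix[e][e] = 1      # bond e starts at position e
        matrix[e + 1][e] = 1  # and ends at position e + 1
    return matrix


def frequent_labels(subseqs: list[str], min_support: int) -> dict[str, int]:
    """Keep bond labels that occur in at least min_support subsequences."""
    support = Counter()
    for s in subseqs:
        support.update(set(bond_labels(s)))  # count each label once per subsequence
    return {label: n for label, n in support.items() if n >= min_support}


if __name__ == "__main__":
    rna = dna_to_rna("ATGCATGG")
    parts = split_fixed(rna, 4)
    print(parts)
    print(incidence_matrix(parts[0]))
    print(frequent_labels(parts, min_support=2))
```

Counting single bond labels is only the first level of a frequent-subgraph search; a full miner would go on to grow larger candidate subgraphs from the frequent edges.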