Synthetic Training-Data Generation for ML-based Process Mining Tools
Published in | 2024 14th International Conference on Advanced Computer Information Technologies (ACIT) pp. 705 - 709 |
---|---|
Main Authors | , , |
Format | Conference Proceeding |
Language | English |
Published | IEEE, 19.09.2024 |
Subjects | |
Summary: | This work addresses the challenge of data scarcity in process mining by proposing the creation of synthetic training data using generative models. A comparative analysis is conducted between a Long Short-Term Memory (LSTM) model and a Generative Adversarial Network (GAN) model, using two distinct datasets. Multiple evaluation methods are employed to compare the outputs of the two models in terms of precision, fidelity, diversity, and novelty. Results indicate that while the LSTM accurately reproduces the structure of the initial data, the GAN introduces more variability, offering a wider range of training scenarios. This highlights the potential of GAN-generated data to enhance the effectiveness and reliability of machine learning-based process mining tools. |
---|---|
ISSN: | 2770-5226 |
DOI: | 10.1109/ACIT62333.2024.10712516 |