Multi-output chemometrics model for gasoline compounding

Bibliographic Details
Published in Fuel (Guildford) Vol. 310; p. 122274
Main Authors Bediaga, Harbil; Moreno, María Isabel; Arrasate, Sonia; Vilas, José Luis; Orbe, Lucía; Unzueta, Elías; Mercader, Juan Pérez; González-Díaz, Humberto
Format Journal Article
Language English
Published Kidlington Elsevier Ltd 15.02.2022
Elsevier BV
Summary:
•PTML machine learning is used to model gasoline production in a real refinery.
•The PTML model is multi-label, considering many operations and streams.
•NIR measurements allowed alternative determination of multiple properties.
•NIR data were used in an internal robustness control study of the PTML model.
•The model obtained may predict alternative sources for gasoline compounding.

Computational models may help to reduce research costs by predicting the properties of alternative blends. Nowadays, most efforts focus on predicting a few properties for sets of gasoline samples. However, there are no reports of models able to classify gasoline samples with multiple output properties measured in real-life refinery plants. In this work, an algorithm combining Information Fusion (IF), Perturbation Theory (PT), and Machine Learning (ML), called IFPTML, was used to model real production data with >230,000 outcomes gathered from a petroleum refinery plant. The IF pre-processing phase assembled the working dataset with 44 physicochemical output properties vs. 574 input variables of 4 production lines distributed in 26 data blocks, including 14 different streams and 23 operations carried out in the plant. The PT-calculation phase quantifies the effect of perturbations (deviations) in all input variables using PT operators. Last, the ML-analysis phase involved training Linear Discriminant Analysis (LDA) and Artificial Neural Network (ANN) models. The IFPTML-LDA model presented AUROC = 0.936 with overall Sensitivity (Sn) and Specificity (Sp) ≈ 84–91% for training and validation sets. In an internal control experiment we obtained an IFPTML-FT-NIR model with similar Sn and Sp ≈ 86–97% for >25,000 values of 16 properties measured by the FT-NIR technique, demonstrating the robustness of the algorithm to changes in the experimental techniques used. This model could be useful for the design of new alternative blends (biofuels, refuse-derived fuels, etc.) with lower environmental impact.
ISSN:0016-2361
1873-7153
DOI:10.1016/j.fuel.2021.122274
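
The summary mentions two quantities that a small example may make more concrete: PT operators that express input variables as deviations, and the Sensitivity (Sn) and Specificity (Sp) used to report classifier quality. The Python sketch below is illustrative only and is not the authors' implementation. It assumes that a PT-style operator can be approximated as the deviation of a value from the mean of all values recorded under the same condition (for example, the same stream or operation). The function names `pt_deviations` and `sensitivity_specificity`, and the sample labels, are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def pt_deviations(rows):
    """Return value - mean(values sharing the same condition) for each row.

    rows: list of (condition_label, value) pairs. The deviation-from-group-mean
    form is an assumption made for illustration, not the published operator.
    """
    groups = defaultdict(list)
    for condition, value in rows:
        groups[condition].append(value)
    group_mean = {condition: mean(values) for condition, values in groups.items()}
    return [value - group_mean[condition] for condition, value in rows]


def sensitivity_specificity(y_true, y_pred):
    """Compute (Sn, Sp) for binary labels, where 1 is the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    sn = tp / (tp + fn)  # fraction of positives correctly recovered
    sp = tn / (tn + fp)  # fraction of negatives correctly recovered
    return sn, sp


if __name__ == "__main__":
    # Hypothetical measurements grouped by stream label.
    rows = [("stream_A", 10.0), ("stream_A", 14.0), ("stream_B", 5.0)]
    print(pt_deviations(rows))

    # Hypothetical observed vs. predicted classes.
    print(sensitivity_specificity([1, 1, 0, 0, 1], [1, 0, 0, 1, 1]))
```

In a workflow of the kind the summary describes, deviations like these would serve as classifier inputs, and Sn and Sp would be computed separately for the training and validation sets.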