Applicability of ensemble learning in total organic carbon and porosity evaluation of shales

Bibliographic Details
Published in	Physics of fluids (1994) Vol. 36; no. 10
Main Authors	Zhang, Luchuan; Li, Yibo; Zhang, Lei; Xiao, Dianshi; Zhang, Haijie; Zhang, Xuejuan; Liu, Ruhao; Luo, Tongtong; Xing, Yabing; Chen, Weiming; Jiang, Lin; Chen, Lei; Wang, Bo
Format	Journal Article
Language	English
Published	Melville: American Institute of Physics, 01.10.2024
Subjects	Algorithms; Ensemble learning; Gamma rays; Machine learning; Organic carbon; Parameter identification; Performance evaluation; Porosity; Regression analysis; Shales; Transit time
Summary:	Accurate evaluation of total organic carbon (TOC) content and porosity is essential for assessing shale reservoirs and selecting target intervals. This study uses shales from the western Chongqing area as a case study to examine the applicability and reliability of ensemble learning in evaluating TOC content and porosity. The results indicate that although both the Light Gradient Boosting Machine (LightGBM) and Random Forest (RF) algorithms are suitable for evaluating TOC content and porosity in shales, LightGBM is preferred for its higher accuracy, stronger generalization capability, and faster operating speed. For TOC content evaluation, LightGBM and RF identify the same four most important logging parameters but rank them differently: DEN (compensated density) > GR (gamma ray) > U (uranium) > CNL (compensated neutron) for LightGBM, and DEN > U > GR > CNL for RF. For porosity evaluation, both algorithms identify the same three most important logging parameters in the same order: AC (acoustic transit time) > DEN > U. This similarity may be attributed to the fact that both algorithms use Classification and Regression Tree (CART) as base learners. Dependence plots of SHAP (SHapley Additive exPlanations) values against logging parameters reveal that each logging parameter's contribution to the evaluation model is segmented rather than continuously linear. Given their strong performance, ensemble learning algorithms, especially LightGBM, are highly recommended for shale evaluation.
ISSN:	1070-6631 1089-7666
DOI:	10.1063/5.0233778
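
Note on the summary: it attributes the agreement between LightGBM and RF to their shared CART base learners and reports that each logging parameter's contribution is segmented rather than linear. The following is a minimal sketch of why a CART-style regression split can produce such step-like behaviour. It does not reproduce the authors' models or data: the numbers are made up for illustration, and the helper names `sse`, `best_split`, and `predict` are hypothetical.

```python
from statistics import fmean


def sse(values):
    """Sum of squared errors around the mean (0.0 for an empty list)."""
    if not values:
        return 0.0
    mean = fmean(values)
    return sum((v - mean) ** 2 for v in values)


def best_split(xs, ys):
    """Find the single-feature threshold that minimises the total SSE.

    This is the split criterion a CART regression tree applies at one node.
    Returns (threshold, left_mean, right_mean).
    """
    pairs = sorted(zip(xs, ys))
    best = None
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # equal feature values cannot be separated by a threshold
        left = [y for _, y in pairs[:i]]
        right = [y for _, y in pairs[i:]]
        cost = sse(left) + sse(right)
        if best is None or cost < best[0]:
            threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
            best = (cost, threshold, fmean(left), fmean(right))
    if best is None:
        raise ValueError("need at least two distinct feature values")
    return best[1:]


def predict(x, split):
    """Predict with a one-split tree: a constant value on each side."""
    threshold, left_mean, right_mean = split
    return left_mean if x <= threshold else right_mean


# Made-up illustrative values, NOT data from the study:
# a density-like log reading and a TOC-like target.
density = [2.40, 2.45, 2.50, 2.55, 2.60, 2.65]
toc = [4.1, 3.9, 4.0, 1.2, 1.0, 1.1]

split = best_split(density, toc)
for x in (2.41, 2.49, 2.56, 2.64):
    print(x, round(predict(x, split), 3))
```

Every input on the same side of the threshold receives the same prediction, so the feature's effect changes in steps rather than along a straight line. An ensemble of many such trees sums many steps, which is one plausible reading of the segmented pattern the summary describes in the SHAP dependence plots.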