Systematically modeling and extracting bibliographic metadata of power grid standard documents with LLMs
Published in | Information research Vol. 30; no. iConf; pp. 654-665 |
---|---|
Main Authors | , , , , |
Format | Journal Article |
Language | English |
Published | University of Borås, 11.03.2025 |
Summary: | Introduction. This study addresses the need for systematic representation and extraction of bibliographic metadata from power grid standard documents, which is essential for operational efficiency and knowledge management in the power industry. Method. We developed a two-stage methodology that uses large language models (LLMs) to extract bibliographic metadata. The first stage constructs state grid-oriented instructions for the LLM, and the second stage applies a trustworthiness estimation to ensure the reliability of the extracted metadata. Analysis. Experiments were conducted on 96 state grid PDF samples to test the accuracy of metadata extraction. The performance of different LLMs was evaluated using single and multiple instructions. Results. All models achieved over 70% accuracy, with GPT-4 reaching the highest accuracy of 84%. Multiple instructions outperformed single instructions, highlighting the effectiveness of our approach. Conclusions. This study demonstrates the promising potential of LLMs for data management in the power grid field, with the trustworthiness estimation mechanism significantly enhancing the reliability of the extracted data. |
---|---|
ISSN: | 1368-1613 |
DOI: | 10.47989/ir30iConf47233 |
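
The summary describes the two-stage method only at a high level, so the following is a minimal sketch of how such a pipeline could be wired together, not the authors' implementation. It assumes that trustworthiness is estimated as the share of instructions that agree on a field value; the record does not state how the paper actually computes it. The names `extract_with_trust`, `Extractor`, and `threshold` are hypothetical, and `extractor` stands in for whatever LLM call is used.

```python
from collections import Counter
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical interface: any function that maps a prompt to {field: value}.
# In practice this would wrap an LLM call; no specific API is assumed here.
Extractor = Callable[[str], Dict[str, str]]


def extract_with_trust(
    document_text: str,
    instructions: List[str],
    extractor: Extractor,
    threshold: float = 0.5,
) -> Dict[str, Tuple[Optional[str], float]]:
    """Return {field: (value or None, agreement score)} for one document."""
    # Stage 1: run the extractor once per instruction.
    runs = [extractor(f"{instruction}\n\n{document_text}") for instruction in instructions]
    if not runs:
        return {}

    # Stage 2 (assumed): score each field by agreement across the runs.
    result: Dict[str, Tuple[Optional[str], float]] = {}
    for name in sorted({field for run in runs for field in run}):
        votes = Counter(
            run[name].strip() for run in runs if run.get(name, "").strip()
        )
        if not votes:
            result[name] = (None, 0.0)
            continue
        value, count = votes.most_common(1)[0]
        score = count / len(runs)
        # Keep the value only if more than `threshold` of the runs agree on it.
        result[name] = (value if score > threshold else None, score)
    return result
```

Under this assumption, a field such as the title is accepted only when most instructions return the same string, while fields on which the instructions disagree are returned as `None` with a low score and can be routed to manual review.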