Enhanced Pose Estimation for Badminton Players via Improved YOLOv8-Pose with Efficient Local Attention

Bibliographic Details
Published in	Sensors (Basel, Switzerland) Vol. 25; no. 14; p. 4446
Main Authors	Wu, Yijian, Chen, Zewen, Zhang, Hongxing, Yang, Yulin, Yi, Weichao
Format	Journal Article
Language	English
Published	Switzerland, MDPI AG, 17.07.2025
Subjects	Accuracy; Algorithms; Analysis; Annotations; Artificial Intelligence; Attention - physiology; Badminton; Badminton (Game); badminton scene; Datasets; Deep learning; efficient local attention; human pose estimation; Humans; Localization; Movement - physiology; Neural networks; Posture - physiology; Racquet Sports - physiology; sport analytics; YOLOv8-Pose; Germany
Summary:	With the rapid development of sports analytics and artificial intelligence, accurate human pose estimation in badminton is becoming increasingly important. However, challenges such as the lack of domain-specific datasets and the complexity of athletes’ movements continue to hinder progress in this area. To address these issues, we propose an enhanced pose estimation framework tailored to badminton players, built upon an improved YOLOv8-Pose architecture. In particular, we introduce an efficient local attention (ELA) mechanism that effectively captures fine-grained spatial dependencies and contextual information, thereby significantly improving the keypoint localization accuracy and overall pose estimation performance. To support this study, we construct a dedicated badminton pose dataset comprising 4000 manually annotated samples, captured using a Microsoft Kinect v2 camera. The raw data undergo careful processing and refinement through a combination of depth-assisted annotation and visual inspection to ensure high-quality ground truth keypoints. Furthermore, we conduct an in-depth comparative analysis of multiple attention modules and their integration strategies within the network, offering generalizable insights to enhance pose estimation models in other sports domains. The experimental results show that the proposed ELA-enhanced YOLOv8-Pose model consistently achieves superior accuracy across multiple evaluation metrics, including the mean squared error (MSE), object keypoint similarity (OKS), and percentage of correct keypoints (PCK), highlighting its effectiveness and potential for broader applications in sports vision tasks.
ISSN:	1424-8220
DOI:	10.3390/s25144446
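
The Summary names three evaluation metrics: mean squared error (MSE), object keypoint similarity (OKS), and percentage of correct keypoints (PCK). This record does not give the paper's exact definitions, so the following is only a minimal sketch under common conventions: OKS follows the COCO-style formula exp(-d_i^2 / (2 * s^2 * k_i^2)) averaged over visible keypoints, and PCK counts a keypoint as correct when its error is within alpha times a reference length. The function names, the `kappas` constants, `ref_length`, and the default `alpha=0.2` are illustrative assumptions, not values taken from the article.

```python
import numpy as np


def mse(pred, gt):
    """Mean squared error over all keypoint coordinates."""
    pred, gt = np.asarray(pred, dtype=float), np.asarray(gt, dtype=float)
    return float(np.mean((pred - gt) ** 2))


def oks(pred, gt, area, kappas, visible=None):
    """COCO-style object keypoint similarity for a single person.

    pred, gt : (K, 2) arrays of predicted / ground-truth keypoints
    area     : object scale s^2 (e.g. bounding-box area), assumed > 0
    kappas   : (K,) per-keypoint constants k_i (assumed values)
    visible  : optional (K,) boolean mask of labelled keypoints
    """
    pred, gt = np.asarray(pred, dtype=float), np.asarray(gt, dtype=float)
    kappas = np.asarray(kappas, dtype=float)
    if visible is None:
        visible = np.ones(len(gt), dtype=bool)
    visible = np.asarray(visible, dtype=bool)
    d2 = np.sum((pred - gt) ** 2, axis=1)           # squared distances
    sim = np.exp(-d2 / (2.0 * area * kappas ** 2))  # per-keypoint similarity
    return float(sim[visible].mean())


def pck(pred, gt, ref_length, alpha=0.2):
    """Fraction of keypoints whose error is at most alpha * ref_length."""
    pred, gt = np.asarray(pred, dtype=float), np.asarray(gt, dtype=float)
    dist = np.linalg.norm(pred - gt, axis=1)
    return float(np.mean(dist <= alpha * ref_length))


# Toy example: three keypoints, one of them displaced by (3, 4) pixels.
gt_pts = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0]])
pred_pts = gt_pts.copy()
pred_pts[1] += np.array([3.0, 4.0])

print(mse(pred_pts, gt_pts))
print(oks(pred_pts, gt_pts, area=100.0, kappas=[0.1, 0.1, 0.1]))
print(pck(pred_pts, gt_pts, ref_length=20.0, alpha=0.2))
```

Lower MSE is better, while OKS and PCK both lie in [0, 1] with higher values indicating more accurate keypoints. The choice of reference length for PCK (torso size, head size, or bounding-box diagonal) varies between benchmarks and changes the reported numbers.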