Application of Big Data Analytics to Tennis Match Analysis

Bibliographic Details
Published in	한국체육측정평가학회지 Vol. 22; no. 1; pp. 57 - 68
Main Authors	최형준(Hyongjun Choi), 이윤수(Yun-Soo Lee)
Format	Journal Article
Language	Korean
Published	한국체육측정평가학회 01.03.2020
Subjects	Physical Education; Sports Big Data; Performance Analysis; Kernel Density Estimation

Cover

Loading…

More Information
Summary:	This study applied and evaluated big data analytics, one of the key technologies of the Fourth Industrial Revolution, for tennis performance analysis. Official match records provided by the professional tennis association were split by tournament type (Grand Slam and Masters 1000), and year-by-year trends from 1991 to 2019 were analyzed for each variable. The data comprised the official records of Grand Slam matches (N=19,682) and Masters 1000 matches (N=18,784), with 59 measured variables in total. Python 3.8.1 was used for data collection, and R 3.5.1 was used for data cleaning, preprocessing, statistical processing, and visualization.

Three analyses were performed. The raw data for each variable were compared across tournaments by year; trends in the kernel density estimates were examined to identify the general characteristics of each variable; and a two-way ANOVA was conducted to compare variable means by tournament and year. The findings were as follows. First, when matches from 1991 to 2019 were compared between Grand Slam and Masters 1000, variable values in Grand Slam matches showed a decreasing trend from 2009 onward. Second, the kernel density estimates were similar between the two tournament types for most variables, the exceptions being winner tiebreaks won and loser tiebreaks won. Third, in the two-way ANOVA, only loser tiebreaks won showed no statistically significant difference between years; all other variables differed significantly by tournament and year (p<.05).

KCI Citation Count: 0
ISSN:	1229-4225 2671-9134
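
The summary describes comparing kernel density estimates of each variable between the two tournament types. The following is a minimal sketch of that idea only, not the authors' code: the study reports using R 3.5.1 for this step, whereas this example uses plain Python. The data are synthetic, the variable ("aces per match") and all names such as `gaussian_kde` and `integrate` are hypothetical, and the bandwidth uses the normal reference rule of thumb as an assumption, since the record does not state which bandwidth or kernel was used.

```python
import math
import random


def gaussian_kde(samples, bandwidth=None):
    """Return a function that estimates the density of `samples` using Gaussian kernels."""
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    if bandwidth is None:
        mean = sum(samples) / n
        std = math.sqrt(sum((x - mean) ** 2 for x in samples) / (n - 1))
        # Normal reference rule of thumb (an assumption, not taken from the study).
        bandwidth = 1.06 * std * n ** (-1 / 5)
    if bandwidth <= 0:
        raise ValueError("bandwidth must be positive (are all samples identical?)")
    norm = 1.0 / (n * bandwidth * math.sqrt(2 * math.pi))

    def density(x):
        # Average of Gaussian bumps centred on each observation.
        return norm * sum(math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples)

    return density


def integrate(f, lo, hi, steps=2000):
    """Trapezoidal integration, used only to sanity-check that a density sums to about 1."""
    h = (hi - lo) / steps
    total = 0.5 * (f(lo) + f(hi))
    total += sum(f(lo + i * h) for i in range(1, steps))
    return total * h


# Synthetic stand-in data: NOT the study's records.
random.seed(0)
grand_slam = [random.gauss(10.0, 3.0) for _ in range(200)]
masters_1000 = [random.gauss(7.0, 2.5) for _ in range(200)]

kde_gs = gaussian_kde(grand_slam)
kde_m = gaussian_kde(masters_1000)

# Print both estimated densities on a coarse grid for a side-by-side comparison.
print("  x  grand_slam  masters_1000")
for x in range(0, 21, 5):
    print(f"{x:>3}  {kde_gs(x):10.4f}  {kde_m(x):12.4f}")
```

Plotting the two curves on the same axes for each variable and year would give the kind of visual comparison the summary refers to; where the curves nearly coincide, the variable behaves similarly in both tournament types.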