Adaptive frequency scaled wavelet packet decomposition for frog call classification

Bibliographic Details
Published in Ecological informatics Vol. 32; pp. 134-144
Main Authors Xie, Jie; Towsey, Michael; Zhang, Jinglan; Roe, Paul
Format Journal Article
Language English
Published Elsevier B.V. 01.03.2016
Summary: Environmental changes have put great pressure on biological systems, leading to a rapid decline in biodiversity. To monitor this change and protect biodiversity, animal vocalizations have been widely studied with the aid of acoustic sensors deployed in the field. Consequently, large volumes of acoustic data are collected. However, traditional manual methods that require ecologists to physically visit sites to collect biodiversity data are both costly and time-consuming. It is therefore essential to develop new semi-automated and automated methods to identify species in automated audio recordings. In this study, a novel feature extraction method based on wavelet packet decomposition (WPD) is proposed for frog call classification. After syllable segmentation, the advertisement call of each frog syllable is represented by a spectral peak track, from which track duration, dominant frequency and oscillation rate are calculated. Then, a k-means clustering algorithm is applied to the dominant frequency, and the centroids of the clustering results are used to generate the frequency scale for WPD. Next, a new feature set, named adaptive frequency scaled wavelet packet decomposition sub-band cepstral coefficients, is extracted by performing WPD on the windowed frog calls. Furthermore, the statistics of all feature vectors over each windowed signal are calculated to produce the final feature set. Finally, two well-known classifiers, a k-nearest neighbour classifier and a support vector machine classifier, are used for classification. The experiments use two different datasets from Queensland, Australia: commercial recordings of 18 frog species and James Cook University field recordings of 8 frog species. The weighted classification accuracy of the proposed method is 99.5% for the 18 frog species and 97.4% for the 8 frog species, which outperforms all other comparable methods.
• A frog call classification system is presented.
• Adaptive frequency scaled wavelet packet decomposition is used to extract novel features for frog calls.
• Spectral peak track is used to separate frog calls from the background noise.
• Mel-frequency cepstral coefficients, Mel-scaled wavelet packet decomposition sub-band cepstral coefficients, and adaptive frequency scaled wavelet packet decomposition sub-band cepstral coefficients were compared.
• Adaptive frequency scaled wavelet packet decomposition sub-band cepstral coefficients yield the best classification performance.
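
The summary's step of clustering dominant frequencies and using the centroids to build a frequency scale can be illustrated with a minimal sketch. This is not the authors' implementation: the record does not say how the centroids are mapped onto the WPD tree, so the midpoint rule in `band_edges`, the function names, the quantile initialisation, and the sample frequencies are all assumptions made for illustration.

```python
import numpy as np

def kmeans_1d(values, k, n_iter=100):
    """Plain Lloyd's k-means on 1-D data (illustrative helper, not from the paper)."""
    values = np.asarray(values, dtype=float)
    # Start centroids at evenly spaced quantiles so the run is deterministic.
    centroids = np.quantile(values, (np.arange(k) + 0.5) / k)
    for _ in range(n_iter):
        # Assign each value to its nearest centroid.
        labels = np.argmin(np.abs(values[:, None] - centroids[None, :]), axis=1)
        new_centroids = centroids.copy()
        for j in range(k):
            members = values[labels == j]
            if members.size:  # keep the old centroid if a cluster is empty
                new_centroids[j] = members.mean()
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    return np.sort(centroids)

def band_edges(centroids, f_max):
    """Assumed frequency scale: band edges halfway between adjacent centroids."""
    mids = (centroids[:-1] + centroids[1:]) / 2.0
    return np.concatenate(([0.0], mids, [f_max]))

# Made-up dominant frequencies in Hz, one per syllable (not data from the study).
dominant_hz = np.array([510, 495, 520, 2380, 2410, 2450, 3980, 4020, 4050], dtype=float)

centroids = kmeans_1d(dominant_hz, k=3)
edges = band_edges(centroids, f_max=8000.0)  # 8000 Hz upper limit is an assumption
print("centroids (Hz):", centroids)
print("band edges (Hz):", edges)
```

In a scale built this way, bands are narrow where dominant frequencies crowd together and wide elsewhere, which is the sense in which the frequency scale adapts to the frog species in the dataset.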
Bibliography:http://dx.doi.org/10.1016/j.ecoinf.2016.01.007
ISSN:1574-9541
DOI:10.1016/j.ecoinf.2016.01.007