Feature Evaluation for Underwater Acoustic Object Counting and F0 Estimation

Bibliographic Details
Published in 2022 4th International Conference on Robotics and Computer Vision (ICRCV), pp. 180-185
Main Authors Li, Liming; Song, Sanming; Wang, Li; Ye, Lei; Jing, Yan; Pang, Guofu
Format Conference Proceeding
Language English
Published IEEE 25.09.2022
Summary: When carrying out an underwater acoustic target detection mission, we need to count the number of targets (N), perform source separation when N is greater than one, and retrieve the motion parameters (for example, the shaft frequency, or F0) of each target from the separated noise signals. Although widely adopted in image interpretation, deep learning methods depend strongly on the form and quality of the input data or features, especially in underwater acoustic applications, where strong ambient noise and multipath effects hinder accurate target detection. A thorough evaluation of typical features can therefore provide a reference for feature selection in different tasks. In this paper, we choose CRNN, which has been widely validated in time-series analysis, as the common classifier to evaluate different time-frequency features and their enhanced versions for object counting and F0 estimation. The performance of STFT, GST, LOFAR, DEMON, and MFCCs as input is analyzed for each of the two tasks through simulation and a lake trial. Experimental results on the lake trial dataset show that LOFAR and DEMON perform best in object counting, with accuracies of 96% and 97%, respectively, while DEMON performs better (94%) than LOFAR (83%) in F0 estimation, partly due to the prominent cavitation in our lake trial dataset. STFT and GST show poor robustness in the real environment, while MFCCs fail to cope with either task.
DOI: 10.1109/ICRCV55858.2022.9953234