Content-based retrieval of MP3 songs based on query by singing
Published in | 2004 IEEE International Conference on Acoustics, Speech, and Signal Processing Vol. 5; pp. V - 929 |
---|---|
Main Authors | , |
Format | Conference Proceeding |
Language | English |
Published | Piscataway, N.J.: IEEE, 2004 |
Subjects | |
Summary: | With the growth of multimedia on the Internet, content analysis plays an important role in the human-centered management of that content. We investigate content-based retrieval of MP3 songs through a query-by-singing interface. MDCT (modified discrete cosine transform) spectral coefficients are used directly to represent the tonic characteristics of a short-term sound, and this spectral profile is used for detailed matching between two audio segments. Perceptual features are also computed from the MDCT coefficients for audio classification. Two pre-stages, based on SVM and k-means classification, remove incorrect (or noisy) segment candidates and speed up the subsequent matching process. In addition, exponential key-scaling schemes and time-warping techniques are developed to overcome differences in key and tempo between singers. Experiments show that the retrieval probability reaches up to 76% within the top 5 results for a database of 114 excerpts. |
---|---|
ISBN: | 9780780384842 0780384849 |
ISSN: | 1520-6149 2379-190X |
DOI: | 10.1109/ICASSP.2004.1327264 |
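
The summary mentions exponential key scaling and time warping as the means of coping with key and tempo differences between singers. This record does not give the paper's actual formulation, so the following is only a minimal sketch of the general idea, under stated assumptions: it works on hypothetical pitch contours in Hz rather than on MDCT spectral profiles, it models a key change of k semitones as multiplication by 2^(k/12), and it uses textbook dynamic time warping. The function names `dtw_distance`, `scale_key`, and `best_match_cost` are illustrative and do not come from the paper.

```python
import math


def dtw_distance(a, b):
    """Textbook dynamic time warping cost between two 1-D sequences."""
    n, m = len(a), len(b)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            # Allow a step in the query, in the reference, or in both.
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


def scale_key(pitches_hz, semitones):
    """Assumed key-scaling model: k semitones scales frequency by 2**(k/12)."""
    factor = 2.0 ** (semitones / 12.0)
    return [p * factor for p in pitches_hz]


def to_semitones(pitches_hz):
    """Map Hz to a log scale so that one unit equals one semitone."""
    return [12.0 * math.log2(p) for p in pitches_hz]


def best_match_cost(query_hz, reference_hz, max_shift=6):
    """Try each key shift of the query and keep the lowest DTW cost.

    Returns (cost, shift_in_semitones).
    """
    ref = to_semitones(reference_hz)
    best_cost, best_shift = float("inf"), 0
    for k in range(-max_shift, max_shift + 1):
        candidate = to_semitones(scale_key(query_hz, k))
        c = dtw_distance(candidate, ref)
        if c < best_cost:
            best_cost, best_shift = c, k
    return best_cost, best_shift


# Hypothetical data: the query is the reference sung two semitones higher,
# with two notes held longer (a tempo variation).
reference = [220.0, 220.0, 246.94, 261.63, 293.66]
up_two = 2.0 ** (2.0 / 12.0)
query = [reference[i] * up_two for i in (0, 1, 2, 2, 3, 3, 4)]

cost, shift = best_match_cost(query, reference)
print(f"best shift: {shift} semitones, DTW cost: {cost:.6f}")
```

In this toy case the search recovers the transposition (the query must be shifted down by two semitones) and the warping absorbs the held notes, so the remaining cost is essentially zero. A real system would compare spectral profiles and would likely constrain the warping path.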