Leveraging Features from Background and Salient Regions for Automatic Image Annotation
| Field | Value |
|---|---|
| Published in | Journal of Information Processing, Vol. 20, No. 1, pp. 250-266 |
| Main Authors | Michael Fahrmair (DOCOMO Communications Laboratories Europe GmbH); Supheakmungkol Sarin (GITS, Waseda University); Matthias Wagner (DOCOMO Communications Laboratories Europe GmbH); Wataru Kameyama (GITS, Waseda University) |
| Format | Journal Article |
| Language | English |
| Published | Information Processing Society of Japan, 2012 |
| Copyright | 2012 by the Information Processing Society of Japan |
| Discipline | Computer Science |
| ISSN / EISSN | 1882-6652 |
| DOI | 10.2197/ipsjjip.20.250 |
| Online Access | https://www.jstage.jst.go.jp/article/ipsjjip/20/1/20_1_250/_article/-char/en (open access) |
Abstract: In this era of information explosion, automating the annotation process of digital images is a crucial step towards efficient and effective management of this increasingly high volume of content. It remains, however, a highly challenging task for the research community. One of the main bottlenecks is the lack of integrity and diversity of features. We propose to solve this problem by utilizing 43 image features that cover the holistic content of the image, from the global level to the subject, background, and scene. In our approach, salient regions and the background are separated without prior knowledge. Each of them, together with the whole image, is treated independently for feature extraction. Extensive experiments were designed to show the efficiency and effectiveness of our approach. We chose two publicly available, manually annotated datasets containing images of a diverse nature, namely the Corel5K and ESP Game datasets. Using a sign test with p-value < 0.05, we confirm the superior performance of our approach over the use of a single whole image. Furthermore, our combined feature set gives satisfactory performance compared with recently proposed approaches, especially in terms of generalization, even with just a simple combination. We also obtain better performance with the same feature set than the grid-based approach. More importantly, when our features are used with the state-of-the-art technique, our results show higher performance on a variety of standard metrics.
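The abstract outlines a concrete pipeline: split an image into salient regions and background with no prior knowledge, then extract features independently from the salient region, the background, and the whole image before combining them. The sketch below illustrates that idea under assumptions of its own, not the authors' implementation: OpenCV's spectral-residual saliency detector (from opencv-contrib-python) stands in for the paper's unspecified separation step, and a colour histogram stands in for the paper's 43 features.

```python
# Minimal sketch of the salient/background/whole-image feature pipeline.
# NOT the paper's implementation: the saliency detector and the colour
# histogram below are placeholder choices.
import cv2
import numpy as np

def split_salient_background(image):
    """Return boolean masks (salient, background) for a BGR image."""
    saliency = cv2.saliency.StaticSaliencySpectralResidual_create()
    ok, sal_map = saliency.computeSaliency(image)  # float32 map in [0, 1]
    if not ok:
        raise RuntimeError("saliency computation failed")
    sal_u8 = (sal_map * 255).astype(np.uint8)
    # Binarise with Otsu's threshold; everything below it is background.
    _, mask = cv2.threshold(sal_u8, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    salient = mask.astype(bool)
    return salient, ~salient

def colour_histogram(image, mask=None):
    """Placeholder feature: an L2-normalised 8x8x8 BGR colour histogram."""
    m = None if mask is None else mask.astype(np.uint8) * 255
    hist = cv2.calcHist([image], [0, 1, 2], m, [8, 8, 8], [0, 256] * 3)
    return cv2.normalize(hist, hist).flatten()

def extract_features(image):
    """Compute features independently on the whole image, the salient
    region, and the background, then concatenate them."""
    salient, background = split_salient_background(image)
    return np.concatenate([
        colour_histogram(image),              # global (whole image)
        colour_histogram(image, salient),     # subject (salient region)
        colour_histogram(image, background),  # background
    ])

# Usage: features = extract_features(cv2.imread("photo.jpg"))
```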
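The abstract also reports statistical significance via a sign test at p < 0.05 against the single-whole-image baseline. A toy illustration of that test, with entirely hypothetical per-image scores, using SciPy's exact binomial test:

```python
# Toy sign test of the kind the abstract mentions. The scores below are
# hypothetical, not the paper's results: the test counts how often the
# combined feature set beats the whole-image baseline and asks whether
# that many wins are plausible under a fair coin (p = 0.5).
from scipy.stats import binomtest

combined = [0.42, 0.55, 0.61, 0.38, 0.50, 0.47, 0.59, 0.44]  # hypothetical per-image scores
baseline = [0.35, 0.51, 0.49, 0.40, 0.43, 0.41, 0.52, 0.39]  # hypothetical per-image scores

wins = sum(a > b for a, b in zip(combined, baseline))
trials = sum(a != b for a, b in zip(combined, baseline))  # ties are dropped
result = binomtest(wins, trials, p=0.5, alternative="greater")
print(f"{wins}/{trials} wins, sign-test p = {result.pvalue:.4f}")
# The paper's claim corresponds to result.pvalue < 0.05.
```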