Applying Machine Learning for High-Performance Named-Entity Extraction

Bibliographic Details
Published in Computational Intelligence, Vol. 16, no. 4, pp. 586-595
Main Authors Baluja, Shumeet; Mittal, Vibhu O.; Sukthankar, Rahul
Format Journal Article
Language English
Published Boston, USA and Oxford, UK: Blackwell Publishers Ltd, 01.11.2000
Summary: This paper describes a machine learning approach to building an efficient and accurate name spotting system. Finding names in free text is an important task in many text‐based applications. Most previous approaches were based on hand‐crafted modules encoding language and genre‐specific knowledge. These approaches had at least two shortcomings: they required large amounts of time and expertise to develop and were not easily portable to new languages and genres. This paper describes an extensible system that automatically combines weak evidence from different, easily available sources: part‐of‐speech tags, dictionaries, and surface‐level syntactic information such as capitalization and punctuation. Individually, each piece of evidence is insufficient for robust name detection. However, the combination of evidence, through standard machine learning techniques, yields a system that achieves performance equivalent to the best existing hand‐crafted approaches.
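
As an illustration of the idea in the summary, the following minimal sketch combines several weak cues (capitalization, a dictionary hit, a part-of-speech tag, and sentence-initial position inferred from punctuation) into a single decision. The summary does not say which learner the authors used, so a simple perceptron is swapped in here purely as a stand-in for "standard machine learning techniques". The dictionary contents, the Penn-Treebank-style tags, the training sentence, and all function names are hypothetical and chosen only for the example.

```python
# Hypothetical toy dictionary of known names; a real system would load a large list.
NAME_DICTIONARY = {"mary", "john", "rahul"}
SENTENCE_END = {".", "!", "?"}


def token_features(tagged, i):
    """Turn token i of a list of (word, pos_tag) pairs into weak binary cues."""
    word, pos = tagged[i]
    # Treat the first token as if it followed sentence-ending punctuation.
    prev_word = tagged[i - 1][0] if i > 0 else "."
    return [
        1.0,                                              # bias term
        1.0 if word[:1].isupper() else 0.0,               # starts with a capital
        1.0 if word.lower() in NAME_DICTIONARY else 0.0,  # dictionary hit
        1.0 if pos == "NNP" else 0.0,                     # tagged as proper noun
        1.0 if prev_word in SENTENCE_END else 0.0,        # sentence-initial position
    ]


def predict(weights, features):
    """Return 1 (name) if the weighted sum of cues is positive, else 0."""
    return 1 if sum(w * f for w, f in zip(weights, features)) > 0 else 0


def train_perceptron(examples, epochs=10):
    """Learn one weight per cue with the classic perceptron update rule."""
    weights = [0.0] * len(examples[0][0])
    for _ in range(epochs):
        for features, label in examples:
            error = label - predict(weights, features)
            if error:
                weights = [w + error * f for w, f in zip(weights, features)]
    return weights


def spot_names(weights, tagged):
    """Return the words in a tagged sentence that the model labels as names."""
    return [tagged[i][0] for i in range(len(tagged))
            if predict(weights, token_features(tagged, i))]


# Hypothetical hand-labeled training sentence (1 = part of a name).
TRAIN_SENTENCE = [("Yesterday", "NN"), ("Mary", "NNP"), ("met", "VBD"),
                  ("John", "NNP"), ("in", "IN"), ("Paris", "NNP"), (".", ".")]
TRAIN_LABELS = [0, 1, 0, 1, 0, 1, 0]
TRAINING = [(token_features(TRAIN_SENTENCE, i), label)
            for i, label in enumerate(TRAIN_LABELS)]

if __name__ == "__main__":
    learned = train_perceptron(TRAINING)
    print(spot_names(learned, [("Then", "RB"), ("Rahul", "NNP"),
                               ("left", "VBD"), (".", ".")]))
```

In this toy setting no single cue suffices: "Yesterday" is capitalized but is not a name, and "Paris" is missing from the dictionary, so the learner has to weigh the cues against one another. This mirrors the point of the summary and is not a reconstruction of the published system.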
Bibliography:istex:E064954D767056165A04C0CD37C4D3B1F9E35A40
ArticleID:COIN129
ark:/67375/WNG-LR25TQ01-Z
ISSN: 0824-7935, 1467-8640
DOI: 10.1111/0824-7935.00129