Applying Machine Learning for High-Performance Named-Entity Extraction
Published in | Computational intelligence Vol. 16; no. 4; pp. 586 - 595 |
---|---|
Main Authors | , , |
Format | Journal Article |
Language | English |
Published | Boston, USA and Oxford, UK: Blackwell Publishers Ltd, 01.11.2000 |
Summary: | This paper describes a machine learning approach to building an efficient and accurate name spotting system. Finding names in free text is an important task in many text‐based applications. Most previous approaches were based on hand‐crafted modules encoding language and genre‐specific knowledge. These approaches had at least two shortcomings: They required large amounts of time and expertise to develop and were not easily portable to new languages and genres. This paper describes an extensible system that automatically combines weak evidence from different, easily available sources: parts‐of‐speech tags, dictionaries, and surface‐level syntactic information such as capitalization and punctuation. Individually, each piece of evidence is insufficient for robust name detection. However, the combination of evidence, through standard machine learning techniques, yields a system that achieves performance equivalent to the best existing hand‐crafted approaches. |
---|---|
Bibliography: | istex:E064954D767056165A04C0CD37C4D3B1F9E35A40 ArticleID:COIN129 ark:/67375/WNG-LR25TQ01-Z |
ISSN: | 0824-7935 1467-8640 |
DOI: | 10.1111/0824-7935.00129 |
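
The summary's central idea, that several individually weak cues become useful once a learner combines them, can be illustrated with a small sketch. This is not the paper's system: the record does not say which learner or feature encoding the authors used, so a plain perceptron stands in for "standard machine learning techniques". The word lists, the function-word check used as a crude proxy for part-of-speech tags, and the two training sentences are all assumptions made up for illustration.

```python
# Toy sketch: every name, word list, and sentence here is an illustrative
# assumption, not taken from the paper.

NAME_DICT = {"mary", "john", "boston"}          # hypothetical name dictionary
FUNCTION_WORDS = {"the", "with", "is", "in"}    # crude stand-in for a POS tagger
SENTENCE_END = {".", "!", "?"}


def features(tokens, i):
    """Weak binary cues for tokens[i]; none is reliable on its own."""
    tok = tokens[i]
    prev = tokens[i - 1] if i > 0 else "."
    return [
        1.0,                                            # bias term
        1.0 if tok[:1].isupper() else 0.0,              # capitalized
        1.0 if tok.lower() in NAME_DICT else 0.0,       # dictionary hit
        1.0 if prev in SENTENCE_END else 0.0,           # sentence-initial position
        1.0 if tok.lower() in FUNCTION_WORDS else 0.0,  # function word (POS proxy)
    ]


def predict(weights, x):
    """Label a token as a name (1) when the weighted evidence is positive."""
    score = sum(w * v for w, v in zip(weights, x))
    return 1 if score > 0 else 0


def train(data, max_epochs=200):
    """Perceptron training: adjust weights only on misclassified tokens."""
    weights = [0.0] * 5
    for _ in range(max_epochs):
        mistakes = 0
        for tokens, labels in data:
            for i, y in enumerate(labels):
                x = features(tokens, i)
                error = y - predict(weights, x)
                if error:
                    mistakes += 1
                    weights = [w + error * v for w, v in zip(weights, x)]
        if mistakes == 0:  # every training token is now labeled correctly
            break
    return weights


# Hand-labeled toy data: 1 marks a token that is part of a name.
DATA = [
    ("The meeting with Mary Smith is in Boston .".split(),
     [0, 0, 0, 1, 1, 0, 0, 1, 0]),
    ("John called . However , Smith declined .".split(),
     [1, 0, 0, 0, 0, 1, 0, 0]),
]

if __name__ == "__main__":
    w = train(DATA)
    for tokens, _ in DATA:
        names = [t for i, t in enumerate(tokens) if predict(w, features(tokens, i))]
        print(names)
```

In this toy data, capitalization alone would mislabel "The" and "However", and the dictionary alone would miss "Smith". The learned weights have to trade the cues off against each other, which is the effect the summary attributes to combining evidence.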