Generating Domain Terminologies using Root- and Rule-Based Terms
Published in | Journal of the Washington Academy of Sciences Vol. 104; no. 4; pp. 31 - 78 |
---|---|
Main Authors | , , , , , , , |
Format | Journal Article |
Language | English |
Published | 01.01.2018 |
Summary: | Motivated by the need for flexible, intuitive, reusable, and normalized terminology for guiding search and building ontologies, we present a general approach for generating sets of such terminologies from natural language documents. The terms that this approach generates are root- and rule-based terms, generated by a series of rules designed to be flexible, to evolve, and, perhaps most important, to protect against ambiguity and standardize semantically similar but syntactically distinct phrases to a normal form. This approach combines several linguistic and computational methods that can be automated with the help of training sets to quickly and consistently extract normalized terms. We discuss how this can be extended as natural language technologies improve and how the strategy applies to common use-cases such as search, document entry and archiving, and identifying, tracking, and predicting scientific and technological trends. |
---|---|
ISSN: | 0043-0439 |