Automatic Corpus Selection and Halting Condition Detection for Semantic Asset Expansion

Bibliographic Details
Main Authors: Nagarajan, Meenakshi; Lewis, Neal R; Drews, Clemens; Alba, Alfredo; Kato, Linda H; Gruhl, Daniel F; Mendes, Pablo N
Format: Patent
Language: English
Published: 09.08.2018
Summary: A mechanism is provided in a data processing system comprising at least one processor and at least one memory, the at least one memory comprising instructions executed by the at least one processor to cause the at least one processor to implement an automated lexicon expansion for an identified corpus. For a selected corpus in a set of corpora, the mechanism determines an estimated number of new terms in the selected corpus that are not in the lexicon based on a frequency count of known terms in the selected corpus. Responsive to the estimated number of new terms in the selected corpus being greater than a threshold, the mechanism performs lexicon expansion using the selected corpus to form an expanded lexicon. Responsive to the estimated number of new terms in the selected corpus not being greater than the threshold, the mechanism halts lexicon expansion.
Bibliography: Application Number: US201715835919
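
The select-estimate-expand-or-halt loop described in the summary can be sketched roughly as follows. This is an illustrative sketch only, not the patented method: the summary does not say how the estimate is derived from the frequency counts, so the code swaps in a bias-corrected Chao1-style estimate computed from how many known terms occur exactly once and exactly twice. The names `estimate_new_terms`, `expand_until_halt`, and the `expand` callback are hypothetical, and stopping at the first corpus that falls below the threshold is one possible reading of "halts lexicon expansion".

```python
from collections import Counter
from typing import Callable, Iterable, List, Set


def estimate_new_terms(tokens: Iterable[str], lexicon: Set[str]) -> float:
    """Estimate how many terms missing from the lexicon the corpus may hold.

    ASSUMPTION: a bias-corrected Chao1-style estimate stands in for the
    unspecified estimator. It uses only frequency counts of known terms.
    """
    counts = Counter(t for t in tokens if t in lexicon)
    f1 = sum(1 for c in counts.values() if c == 1)  # known terms seen once
    f2 = sum(1 for c in counts.values() if c == 2)  # known terms seen twice
    return f1 * (f1 - 1) / (2 * (f2 + 1))


def expand_until_halt(
    corpora: Iterable[List[str]],
    lexicon: Set[str],
    threshold: float,
    expand: Callable[[Set[str], List[str]], Set[str]],
) -> Set[str]:
    """Expand the lexicon corpus by corpus until the estimate is too low.

    `expand` is a hypothetical callback that performs the actual lexicon
    expansion and returns the expanded lexicon.
    """
    current = set(lexicon)
    for tokens in corpora:
        if estimate_new_terms(tokens, current) > threshold:
            current = expand(current, tokens)
        else:
            # Halting condition: the estimate is not greater than the threshold.
            break
    return current
```

Passing the expansion step in as a callback keeps the halting check separate from whatever expansion technique is actually used, which the summary leaves open.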