AUTOMATICALLY GENERATING TRAINING DATA

Bibliographic Details
Main Authors BUEHRER GREG, NARASIMHAN MUKUND, AHARI SANAZ, VIOLA PAUL, MCGOVERN ANDREW
Format Patent
Language English
Published 22.12.2011
Summary: Computer-readable media, computer systems, and computing devices facilitate generating binary classifier and entity extractor training data. Seed URLs are selected and URL patterns within the seed URLs are identified. Matching URLs in a data structure are identified and corresponding queries and their associated weights are added to a potential training data set from which training data is selected.
Bibliography: Application Number: US20100818377
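
The pipeline outlined in the summary (seed URLs, URL patterns, matching URLs, weighted queries, selection) can be illustrated with a minimal Python sketch. It is not the patented method. The record does not say how patterns are derived, what the data structure looks like, or how training data is selected, so the following are all assumptions made for illustration: a pattern is taken to be the host plus the first path segment, the data structure is a plain dict mapping each URL to (query, weight) pairs, and selection keeps the highest-weighted queries. The names `url_pattern`, `build_candidates`, `select_training_data`, and the example.com URLs are hypothetical.

```python
from collections import defaultdict
from urllib.parse import urlparse


def url_pattern(url: str) -> str:
    """Assumed pattern rule: host plus first path segment."""
    parts = urlparse(url)
    segments = [s for s in parts.path.split("/") if s]
    first = segments[0] if segments else ""
    return f"{parts.netloc}/{first}"


def build_candidates(seed_urls, click_log):
    """Collect weighted queries for every logged URL that matches a seed pattern.

    click_log is an assumed structure: {url: [(query, weight), ...]}.
    Weights for the same query are summed across matching URLs.
    """
    patterns = {url_pattern(u) for u in seed_urls}
    candidates = defaultdict(float)
    for url, query_weights in click_log.items():
        if url_pattern(url) in patterns:
            for query, weight in query_weights:
                candidates[query] += weight
    return dict(candidates)


def select_training_data(candidates, top_k):
    """Assumed selection rule: keep the top_k queries by total weight."""
    ranked = sorted(candidates.items(), key=lambda kv: (-kv[1], kv[0]))
    return ranked[:top_k]


# Hypothetical inputs for illustration only.
seed_urls = ["http://example.com/movies/title-1"]
click_log = {
    "http://example.com/movies/title-2": [("inception cast", 5.0), ("inception", 2.0)],
    "http://example.com/movies/title-3": [("inception", 4.0)],
    "http://example.com/music/album-9": [("abbey road", 7.0)],
}

candidates = build_candidates(seed_urls, click_log)
training = select_training_data(candidates, top_k=2)
```

In this sketch the music URL does not match the seed pattern, so its query never enters the candidate set. The selected queries could then serve as positive examples for a binary classifier, though the record gives no detail on that step.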