Machine learning system for extracting structured records from web pages and other text sources

Bibliographic Details
Main Authors	BAXTER JONATHAN, SEYMORE KRISTIE
Format	Patent
Language	English
Published	08.06.2006
Subjects	CALCULATING; COMPUTING; COUNTING | ELECTRIC DIGITAL DATA PROCESSING | PHYSICS

Summary:	A method is described for extracting a structured record (190) from a document (100). The structured record includes information related to a predetermined subject matter (120), and this information is organized into categories within the record. The method comprises the steps of identifying a span of text (130) in the document (100) according to criteria associated with the predetermined subject matter, and processing (150) the span of text to extract, from the document (100), at least one text element associated with at least one of the categories of the structured record (190).
Bibliography:	Application Number: US20050291740
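
The two steps named in the summary (identify a span of text, then process it into categorized text elements) can be illustrated with a minimal sketch. This is not the patented method: the title refers to a machine learning system, whereas the sketch below swaps in simple keyword counts and regular expressions purely for illustration. The subject matter (job postings), the keyword list, the category names, and every function name are hypothetical assumptions that do not appear in the record.

```python
import re
from typing import Dict, Optional

# ASSUMPTION: the "predetermined subject matter" is job postings.
# The keywords and patterns below are illustrative, not from the patent.
SUBJECT_KEYWORDS = ("hiring", "position", "salary", "apply")

# ASSUMPTION: hypothetical categories of the structured record.
CATEGORY_PATTERNS = {
    "title": re.compile(r"(?:position|role):\s*([^.\n]+)", re.IGNORECASE),
    "location": re.compile(r"location:\s*([^.\n]+)", re.IGNORECASE),
    "salary": re.compile(r"salary:\s*([^.\n]+)", re.IGNORECASE),
}


def identify_span(document: str, min_hits: int = 2) -> Optional[str]:
    """Step 1: return the first paragraph that meets the subject criteria.

    A paragraph qualifies when it contains at least `min_hits` distinct
    subject keywords (a rule-based stand-in for a learned classifier).
    """
    for paragraph in re.split(r"\n\s*\n", document):
        lowered = paragraph.lower()
        hits = sum(1 for keyword in SUBJECT_KEYWORDS if keyword in lowered)
        if hits >= min_hits:
            return paragraph.strip()
    return None


def extract_record(span: str) -> Dict[str, str]:
    """Step 2: pull text elements out of the span, keyed by category."""
    record: Dict[str, str] = {}
    for category, pattern in CATEGORY_PATTERNS.items():
        match = pattern.search(span)
        if match:
            record[category] = match.group(1).strip()
    return record


def extract_structured_record(document: str) -> Optional[Dict[str, str]]:
    """Run both steps; return None when no qualifying span is found."""
    span = identify_span(document)
    if span is None:
        return None
    return extract_record(span)


SAMPLE_DOCUMENT = """Welcome to our company news page.

We are hiring! Position: Data Engineer
Location: Canberra
Salary: 90000 AUD
Apply by email.

Copyright notice."""

if __name__ == "__main__":
    print(extract_structured_record(SAMPLE_DOCUMENT))
```

Keeping span identification separate from field extraction mirrors the two steps in the summary, so either stage could be replaced by a trained model without touching the other.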