An information retrieving method utilizing webpage visual features and webpage language features and a system using thereof
Main Author | |
---|---|
Format | Patent |
Language | Chinese, English |
Published | 01.02.2017 |
Subjects | |
Summary: | An information retrieval method that uses webpage visual features and webpage language features, and a system using the method, are disclosed. The system includes an analysis result database, a webpage template database, a webpage collecting module, and an analysis module. The webpage template database stores template feature arrays, each corresponding to a target website. Each template feature array includes one or more template visual features and one or more template language features that correspond to template nodes of a DOM tree. The webpage collecting module links the system to a target website to retrieve the webpage feature arrays of a target webpage of that website. The system calculates a degree of similarity between the webpage feature arrays and the template feature arrays corresponding to the same target website. The desired information content can thereby be identified and stored in the analysis result database. |
---|---|
Bibliography: | Application Number: TW20150123950 |
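
The similarity step described in the summary can be illustrated with a minimal Python sketch. The abstract does not say how the similarity degree is computed, so everything here is an assumption made for illustration: the `NodeFeatures` fields, the use of cosine similarity for visual features and Jaccard overlap for language features, the equal weighting, and the 0.8 threshold are all hypothetical and are not taken from the patent.

```python
from dataclasses import dataclass
from math import sqrt


@dataclass(frozen=True)
class NodeFeatures:
    """Features of one DOM node (field layout is a hypothetical example)."""
    visual: tuple[float, ...]   # e.g. x, y, width, height, font size
    language: frozenset[str]    # e.g. tag name and keywords found in the text


def cosine(a, b):
    """Cosine similarity of two numeric vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def jaccard(a, b):
    """Jaccard overlap of two sets; 0.0 if both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def similarity(node, template, visual_weight=0.5):
    """Weighted mix of visual and language similarity (weights are assumed)."""
    visual_score = cosine(node.visual, template.visual)
    language_score = jaccard(node.language, template.language)
    return visual_weight * visual_score + (1 - visual_weight) * language_score


def best_match(page_nodes, template, threshold=0.8):
    """Return the page node most similar to the template node, or None
    when no node reaches the (assumed) threshold."""
    scored = [(similarity(node, template), node) for node in page_nodes]
    score, node = max(scored, key=lambda pair: pair[0], default=(0.0, None))
    return node if score >= threshold else None
```

In this sketch, a template node for a headline might be `NodeFeatures((10, 20, 300, 40, 18), frozenset({"h1", "title"}))`; calling `best_match(page_nodes, template)` on the nodes collected from a target webpage would return the node to store as the desired content, or `None` if the page does not resemble the template.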