A method and apparatus for anti-crawling
Main Authors | , |
---|---|
Format | Patent |
Language | Chinese, English |
Published | 25.01.2019 |
Subjects | |
Summary: | A method and apparatus for anti-crawling are disclosed. The method includes: determining a source HyperText Markup Language (HTML) web page; inserting a noise tag into the source HTML web page, wherein the noise tag includes noise identification information; inserting noise data into the noise tag; and adding a target Cascading Style Sheets (CSS) style to the noise tag to obtain a target HTML web page, wherein the target CSS style prevents the noise data from being displayed when the web page is rendered. When a crawler crawls the target HTML web page, the data it collects contains the noise data, whereas the page shown to the user in the client does not display the noise data. Crawling therefore yields polluted data of little value, normal browsing by the user is unaffected, and the security of the website is effectively improved. |
---|---|
Bibliography: | Application Number: CN201811063154 |
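
The steps listed in the summary (insert a noise tag carrying an identifier, fill it with noise data, attach a CSS style that hides it) can be illustrated with a minimal Python sketch. The record does not say which element, identifier scheme, or CSS property the patent uses, so everything concrete here is an assumption made for illustration: the hypothetical function `add_noise`, the choice of `<span>` as the noise tag, a random class name as the noise identification information, random letters as the noise data, and `display:none` as the target CSS style.

```python
import random
import re
import string


def random_token(rng, length=8):
    """Return a random lowercase token (used as class name or noise text)."""
    return "".join(rng.choice(string.ascii_lowercase) for _ in range(length))


def add_noise(source_html, seed=None):
    """Return (target_html, noise_class) with hidden noise spans added.

    Assumptions (not taken from the patent record): the noise tag is a
    <span>, the noise identification information is a random class name,
    and the target CSS style is `display:none`.
    """
    rng = random.Random(seed)
    noise_class = "n" + random_token(rng)

    def noise_tag():
        # Noise tag = identifier (class attribute) + noise data (text).
        return '<span class="{}">{}</span>'.format(
            noise_class, random_token(rng, 12)
        )

    # Steps 2 and 3: insert one filled noise tag before every closing </p>.
    noisy = re.sub(
        r"</p>",
        lambda m: noise_tag() + m.group(0),
        source_html,
        flags=re.IGNORECASE,
    )

    # Step 4: add a CSS rule that hides every element with the noise class.
    style = "<style>.{}{{display:none}}</style>".format(noise_class)
    if re.search(r"</head>", noisy, flags=re.IGNORECASE):
        target = re.sub(
            r"</head>",
            lambda m: style + m.group(0),
            noisy,
            count=1,
            flags=re.IGNORECASE,
        )
    else:
        # No <head> found: fall back to prepending the style block.
        target = style + noisy
    return target, noise_class


if __name__ == "__main__":
    page = (
        "<html><head><title>t</title></head>"
        "<body><p>Price: 42</p><p>Stock: 7</p></body></html>"
    )
    target_html, cls = add_noise(page, seed=1)
    print(target_html)
```

A crawler that reads the raw text of each paragraph from the target page would collect the random letters together with the real values, while a browser applying the style rule would render only `Price: 42` and `Stock: 7`.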