Data Correlation System And Method
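Main Authors | Gilfanov, Ruslan; Zaytsev, Andrey; Aggarwal, Amit |
---|---|
Format | Patent |
Language | English |
Published | 30.03.2023 |
Subjects | CALCULATING; COMPUTING; COUNTING; ELECTRIC DIGITAL DATA PROCESSING; IMAGE DATA PROCESSING OR GENERATION, IN GENERAL; PHYSICS |
Online Access | Get full text |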
Abstract | A computer system extracts product data from a website and correlates product records from multiple sources with one another as corresponding to the same product. A website is crawled efficiently by rendering webpages using a virtual browser that ignores blacklisted elements, extracts data from objects without rendering, and suppresses retrieval of remote resources. Data is extracted according to engine control statements, each including a selector and an extractor. A website may be crawled repeatedly, and changes in extracted data may be detected and flagged. Engine control statements may be automatically changed in response to detecting a change in the configuration of the website. Images of product records may be correlated with one another by first comparing text of the product records and selecting images for comparison based on composition. Images are compared using a machine learning model. Images determined to be similar may be presented to a human for a correlation decision. |
---|---|
Author | Gilfanov, Ruslan; Zaytsev, Andrey; Aggarwal, Amit |
DatabaseName | esp@cenet |
ExternalDocumentID | US2023096058A1 |
Notes | Application Number: US202117486567 |
OpenAccessLink | https://worldwide.espacenet.com/publicationDetails/biblio?FT=D&date=20230330&DB=EPODOC&CC=US&NR=2023096058A1 |
PublicationDateYYYYMMDD | 2023-03-30 |
RelatedCompanies | The Yes Platform, Inc |
SubjectTerms | CALCULATING; COMPUTING; COUNTING; ELECTRIC DIGITAL DATA PROCESSING; IMAGE DATA PROCESSING OR GENERATION, IN GENERAL; PHYSICS |
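
The abstract says data is extracted according to engine control statements that include a selector and an extractor, and that changes between repeated crawls may be detected and flagged. The record does not give the patent's actual statement format, so the following is only a minimal Python sketch under stated assumptions: `ControlStatement`, `extract_record`, and `changed_fields` are hypothetical names, the selector is an ElementTree path expression, the extractor is a plain function over the matched element's text, and the sample markup is invented for illustration.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass(frozen=True)
class ControlStatement:
    """Hypothetical engine control statement: a selector plus an extractor."""
    field: str
    selector: str                    # ElementTree path that locates the element
    extractor: Callable[[str], str]  # converts the element's text into a value


def extract_record(markup: str,
                   statements: List[ControlStatement]) -> Dict[str, Optional[str]]:
    """Apply every control statement to one well-formed page."""
    root = ET.fromstring(markup)
    record: Dict[str, Optional[str]] = {}
    for stmt in statements:
        node = root.find(stmt.selector)
        # A selector that matches nothing yields None, which may hint that
        # the page layout changed and the statement needs updating.
        if node is None:
            record[stmt.field] = None
        else:
            record[stmt.field] = stmt.extractor("".join(node.itertext()))
    return record


def changed_fields(old: Dict[str, Optional[str]],
                   new: Dict[str, Optional[str]]) -> List[str]:
    """Return the sorted names of fields whose values differ between crawls."""
    return sorted(k for k in old.keys() | new.keys() if old.get(k) != new.get(k))


# Assumed statements and sample pages, invented for this sketch.
STATEMENTS = [
    ControlStatement("name", ".//h1[@class='title']", str.strip),
    ControlStatement("price", ".//span[@class='price']",
                     lambda text: text.strip().lstrip("$")),
]

FIRST_CRAWL = ("<html><body><h1 class='title'> Blue Mug </h1>"
               "<span class='price'>$12.00</span></body></html>")
SECOND_CRAWL = ("<html><body><h1 class='title'>Blue Mug</h1>"
                "<span class='price'>$9.50</span></body></html>")

before = extract_record(FIRST_CRAWL, STATEMENTS)
after = extract_record(SECOND_CRAWL, STATEMENTS)
flagged = changed_fields(before, after)
print(flagged)
```

Keeping the selector and the extractor as separate parts of one statement means a layout change on the site can, in principle, be handled by replacing only the selector while the value-cleaning logic stays the same. Real product pages are rarely well-formed XML, so a production crawler would need an HTML-tolerant parser in place of `ET.fromstring`.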