DATA INGESTION PLATFORM

Bibliographic Details
Main Authors: Cai, Xiao; Wray, Adam Jason; Pedersen, Kaj Orla Peter
Format: Patent
Language: English
Published: 11.11.2021
Summary: Embodiments are directed to data ingestion over a network. Raw data and integrated data associated with a plurality of separate data sources may be provided such that the raw data includes content associated with a plurality of subjects. Categorization models may be employed to categorize the raw data based on various features, such as format, structure, data source, variability, volume, or associated entities. Matching models may be determined based on the categorization of the raw data, the integrated data, and the content associated with the plurality of subjects. The matching models may generate a plurality of unified facts based on the raw data and the integrated data such that each unified fact is associated with a score that reflects the quality of its match with a unified schema.
Bibliography: Application Number: US202117384577
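
The flow described in the summary (categorize raw data, select a matching model from the category, emit unified facts scored against a unified schema) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the patent: the schema fields in `UNIFIED_SCHEMA`, the format-based `categorize` rule, the three `match_*` functions, and the scoring rule (fraction of schema fields filled) are all hypothetical. The sketch also omits the integrated data that the summary mentions as a second input.

```python
import json
from dataclasses import dataclass

# Hypothetical unified schema; the field names are illustrative only.
UNIFIED_SCHEMA = ("name", "employer", "title")


@dataclass
class UnifiedFact:
    subject: str
    fields: dict
    score: float  # fraction of unified-schema fields that were filled


def categorize(raw: str) -> str:
    """Toy categorization model: label raw data by its format."""
    text = raw.strip()
    if text.startswith("{"):
        return "json"
    if "," in text:
        return "csv"
    return "text"


def match_json(raw: str) -> dict:
    """Keep only the keys that appear in the unified schema."""
    record = json.loads(raw)
    return {k: record[k] for k in UNIFIED_SCHEMA if k in record}


def match_csv(raw: str) -> dict:
    """Assumes the columns arrive in unified-schema order."""
    values = [v.strip() for v in raw.split(",")]
    return {k: v for k, v in zip(UNIFIED_SCHEMA, values) if v}


def match_text(raw: str) -> dict:
    """Free text: this toy model recovers only the name."""
    return {"name": raw.strip()}


# The matching model is chosen from the category assigned to the raw data.
MATCHING_MODELS = {"json": match_json, "csv": match_csv, "text": match_text}


def ingest(raw: str) -> UnifiedFact:
    category = categorize(raw)
    fields = MATCHING_MODELS[category](raw)
    score = len(fields) / len(UNIFIED_SCHEMA)
    return UnifiedFact(subject=fields.get("name", ""), fields=fields, score=score)
```

Calling `ingest('{"name": "Ada", "employer": "Acme", "title": "Engineer"}')` fills every schema field, while `ingest("Ada, Acme")` fills two of three and therefore receives a lower score. A real system would presumably use learned models and a richer quality measure; the dictionary dispatch here only shows how a category can select a matcher.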