Aggregation of imprecise and uncertain information in databases
Published in | IEEE transactions on knowledge and data engineering Vol. 13; no. 6; pp. 902 - 912 |
---|---|
Main Authors | McClean, S., Scotney, B., Shapcott, M. |
Format | Journal Article |
Language | English |
Published | New York, IEEE, 01.11.2001, The Institute of Electrical and Electronics Engineers, Inc. (IEEE) |
ISSN | 1041-4347 1558-2191 |
DOI | 10.1109/69.971186 |
Abstract | Information stored in a database is often subject to uncertainty and imprecision. Probability theory provides a well-known and well understood way of representing uncertainty and may thus be used to provide a mechanism for storing uncertain information in a database. We consider the problem of aggregation using an imprecise probability data model that allows us to represent imprecision by partial probabilities and uncertainty using probability distributions. Most work to date has concentrated on providing functionality for extending the relational algebra with a view to executing traditional queries on uncertain or imprecise data. However, for imprecise and uncertain data, we often require aggregation operators that provide information on patterns in the data. Thus, while traditional query processing is tuple-driven, processing of uncertain data is often attribute-driven where we use aggregation operators to discover attribute properties. The aggregation operator that we define uses the Kullback-Leibler information divergence between the aggregated probability distribution and the individual tuple values to provide a probability distribution for the domain values of an attribute or group of attributes. The provision of such aggregation operators is a central requirement in furnishing a database with the capability to perform the operations necessary for knowledge discovery in databases. |
---|---|
Author | McClean, S.; Scotney, B.; Shapcott, M. |
Author_xml | – sequence: 1 givenname: S. surname: McClean fullname: McClean, S. organization: Fac. of Informatics, Ulster Univ., Coleraine, UK – sequence: 2 givenname: B. surname: Scotney fullname: Scotney, B. – sequence: 3 givenname: M. surname: Shapcott fullname: Shapcott, M. |
CODEN | ITKEEH |
CitedBy_id | crossref_primary_10_1007_s10115_005_0211_z crossref_primary_10_1007_s10586_019_03006_z crossref_primary_10_1109_TKDE_2010_191 crossref_primary_10_3724_SP_J_1087_2009_03092 crossref_primary_10_1007_s00500_019_04063_7 crossref_primary_10_1007_s10115_012_0588_4 crossref_primary_10_1109_TEM_2024_3396503 crossref_primary_10_1016_j_datak_2008_08_002 crossref_primary_10_2139_ssrn_3360362 crossref_primary_10_1002_int_22328 crossref_primary_10_1016_j_jii_2024_100710 crossref_primary_10_1007_s10619_008_7031_6 crossref_primary_10_1080_00405000802131160 crossref_primary_10_1109_JSYST_2020_3027716 crossref_primary_10_1016_j_inffus_2023_102026 crossref_primary_10_1109_TITB_2012_2188534 crossref_primary_10_1007_s11227_023_05235_x crossref_primary_10_1016_j_knosys_2012_10_014 crossref_primary_10_1016_S0020_0255_03_00172_5 crossref_primary_10_1109_TSMC_2016_2560533 crossref_primary_10_3724_SP_J_1016_2008_00091 crossref_primary_10_1016_j_aei_2023_102245 crossref_primary_10_1109_TKDE_2003_1161592 crossref_primary_10_1109_TKDE_2010_166 crossref_primary_10_1016_j_future_2009_08_005 crossref_primary_10_1109_TKDE_2008_190 crossref_primary_10_1016_j_knosys_2024_111721 crossref_primary_10_1016_j_ins_2024_120312 crossref_primary_10_1109_TFUZZ_2013_2239650 crossref_primary_10_1016_j_fss_2021_10_017 |
Cites_doi | 10.1016/s0169-023x(98)00039-1 10.1016/0169-023X(95)00029-R 10.1016/0169-023x(92)90040-i 10.1145/191246.191314 10.1109/64.585106 10.1002/(SICI)1098-111X(199710)12:10<763::AID-INT5>3.0.CO;2-W 10.1080/01621459.1971.10482265 10.1214/aos/1176346060 10.1016/0169-023X(95)00038-T 10.1109/69.494166 10.1109/69.43423 10.1111/1467-9868.00075 10.1016/0950-5849(92)90011-D 10.1111/1467-9868.00083 10.1007/BF01263334 10.1109/69.166990 10.1145/169725.169712 10.1109/69.506705 |
ContentType | Journal Article |
Copyright | Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) 2001 |
Copyright_xml | – notice: Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) 2001 |
DBID | RIA RIE AAYXX CITATION 7SC 7SP 8FD JQ2 L7M L~C L~D 7TB FR3 F28 |
DOI | 10.1109/69.971186 |
DatabaseName | IEEE All-Society Periodicals Package (ASPP) 1998–Present IEEE Electronic Library (IEL) CrossRef Computer and Information Systems Abstracts Electronics & Communications Abstracts Technology Research Database ProQuest Computer Science Collection Advanced Technologies Database with Aerospace Computer and Information Systems Abstracts Academic Computer and Information Systems Abstracts Professional Mechanical & Transportation Engineering Abstracts Engineering Research Database ANTE: Abstracts in New Technology & Engineering |
DatabaseTitle | CrossRef Technology Research Database Computer and Information Systems Abstracts – Academic Electronics & Communications Abstracts ProQuest Computer Science Collection Computer and Information Systems Abstracts Advanced Technologies Database with Aerospace Computer and Information Systems Abstracts Professional Mechanical & Transportation Engineering Abstracts Engineering Research Database ANTE: Abstracts in New Technology & Engineering |
DatabaseTitleList | Technology Research Database Computer and Information Systems Abstracts Technology Research Database Computer and Information Systems Abstracts Technology Research Database |
Database_xml | – sequence: 1 dbid: RIE name: IEEE Electronic Library (IEL) url: https://proxy.k.utb.cz/login?url=https://ieeexplore.ieee.org/ sourceTypes: Publisher |
DeliveryMethod | fulltext_linktorsrc |
Discipline | Engineering Computer Science |
EISSN | 1558-2191 |
EndPage | 912 |
ExternalDocumentID | 2632027191 10_1109_69_971186 971186 |
GroupedDBID | -~X .DC 0R~ 1OL 29I 4.4 5GY 5VS 6IK 97E 9M8 AAJGR AARMG AASAJ AAWTH ABAZT ABFSI ABQJQ ABVLG ACGFO ACIWK AENEX AETIX AGQYO AGSQL AHBIQ AI. AIBXA AKJIK AKQYR ALLEH ALMA_UNASSIGNED_HOLDINGS ASUFR ATWAV BEFXN BFFAM BGNUA BKEBE BPEOZ CS3 DU5 E.L EBS EJD F5P HZ~ H~9 ICLAB IEDLZ IFIPE IFJZH IPLJI JAVBF LAI M43 MS~ O9- OCL P2P PQQKQ RIA RIE RNI RNS RXW RZB TAE TAF TN5 UHB VH1 AAYOK AAYXX CITATION RIG 7SC 7SP 8FD JQ2 L7M L~C L~D 7TB FR3 F28 |
ID | FETCH-LOGICAL-c398t-7bffb065c707c30e998d46f25fd5e46e76a20a7e7517bf728fa29fca46f544b63 |
IEDL.DBID | RIE |
ISSN | 1041-4347 |
IngestDate | Fri Sep 05 12:18:04 EDT 2025 Fri Sep 05 07:16:14 EDT 2025 Fri Sep 05 00:11:00 EDT 2025 Fri Sep 05 12:03:18 EDT 2025 Fri Jul 25 06:34:48 EDT 2025 Tue Jul 01 05:16:39 EDT 2025 Thu Apr 24 22:51:37 EDT 2025 Wed Aug 27 02:52:16 EDT 2025 |
IsPeerReviewed | true |
IsScholarly | true |
Issue | 6 |
Language | English |
License | https://ieeexplore.ieee.org/Xplorehelp/downloads/license-information/IEEE.html |
LinkModel | DirectLink |
MergedId | FETCHMERGED-LOGICAL-c398t-7bffb065c707c30e998d46f25fd5e46e76a20a7e7517bf728fa29fca46f544b63 |
Notes | ObjectType-Article-1 SourceType-Scholarly Journals-1 ObjectType-Feature-2 content type line 14 ObjectType-Article-2 ObjectType-Feature-1 content type line 23 |
PQID | 993644627 |
PQPubID | 23500 |
PageCount | 11 |
ParticipantIDs | proquest_miscellaneous_28307171 crossref_citationtrail_10_1109_69_971186 proquest_journals_993644627 ieee_primary_971186 proquest_miscellaneous_914640946 proquest_miscellaneous_26759579 proquest_miscellaneous_29264600 crossref_primary_10_1109_69_971186 |
ProviderPackageCode | CITATION AAYXX |
PublicationCentury | 2000 |
PublicationDate | 2001-11-01 |
PublicationDateYYYYMMDD | 2001-11-01 |
PublicationDate_xml | – month: 11 year: 2001 text: 2001-11-01 day: 01 |
PublicationDecade | 2000 |
PublicationPlace | New York |
PublicationPlace_xml | – name: New York |
PublicationTitle | IEEE transactions on knowledge and data engineering |
PublicationTitleAbbrev | TKDE |
PublicationYear | 2001 |
Publisher | IEEE The Institute of Electrical and Electronics Engineers, Inc. (IEEE) |
Publisher_xml | – name: IEEE – name: The Institute of Electrical and Electronics Engineers, Inc. (IEEE) |
SSID | ssj0008781 |
Score | 1.9288971 |
SourceID | proquest crossref ieee |
SourceType | Aggregation Database Enrichment Source Index Database Publisher |
StartPage | 902 |
SubjectTerms | Agglomeration; Algebra; Data models; Database systems; Deductive databases; Divergence; Information retrieval; Operators; Probability; Probability distribution; Probability theory; Queries; Query processing; Relational databases; Stochastic processes; Studies; Uncertainty |
Title | Aggregation of imprecise and uncertain information in databases |
URI | https://ieeexplore.ieee.org/document/971186 https://www.proquest.com/docview/993644627 https://www.proquest.com/docview/26759579 https://www.proquest.com/docview/28307171 https://www.proquest.com/docview/29264600 https://www.proquest.com/docview/914640946 |
Volume | 13 |
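Illustrative note on the abstract: the aggregation idea it describes, a single probability distribution chosen to be close in Kullback-Leibler divergence to the individual tuple values, can be sketched in a few lines. The sketch below is not taken from the article. The function names, the encoding of domain values as integer indices, and the iterative scheme used for set-valued (partial) entries are all assumptions made for illustration. For tuples that already hold full distributions, the distribution minimising the summed divergence is simply their component-wise mean. For partial values, an expectation-maximisation style update is used here as one plausible approach.

```python
import math


def kl_divergence(q, p):
    """Kullback-Leibler divergence D(q || p) of two discrete distributions."""
    return sum(qi * math.log(qi / pi) for qi, pi in zip(q, p) if qi > 0)


def total_divergence(tuples, p):
    """Summed divergence between each tuple's distribution and p."""
    return sum(kl_divergence(q, p) for q in tuples)


def aggregate_distributions(tuples):
    """Aggregate tuples that each hold a full distribution over the domain.

    The p that minimises sum_r D(q_r || p) is the component-wise mean.
    """
    m = len(tuples)
    k = len(tuples[0])
    return [sum(q[i] for q in tuples) / m for i in range(k)]


def aggregate_partial_values(partial_values, domain_size, iterations=100):
    """Aggregate set-valued entries (assumed encoding: sets of domain indices).

    Each entry says "the true value is one of these". The update below is an
    EM-style iteration chosen for illustration: every entry spreads one unit
    of mass over its candidate values in proportion to the current estimate.
    """
    m = len(partial_values)
    p = [1.0 / domain_size] * domain_size  # start from a uniform estimate
    for _ in range(iterations):
        new_p = [0.0] * domain_size
        for values in partial_values:
            mass = sum(p[i] for i in values)  # current mass on the candidates
            for i in values:
                new_p[i] += p[i] / mass / m
        p = new_p
    return p


if __name__ == "__main__":
    # Hypothetical attribute with a three-value domain, indexed 0, 1, 2.
    uncertain = [[1.0, 0.0, 0.0], [0.5, 0.5, 0.0],
                 [0.0, 0.0, 1.0], [0.2, 0.3, 0.5]]
    print(aggregate_distributions(uncertain))

    # Imprecise entries: "value 0", "value 1", "value 0 or 1".
    imprecise = [{0}, {1}, {0, 1}]
    print(aggregate_partial_values(imprecise, domain_size=3))
```

The result in both cases is a distribution over the attribute's domain rather than a set of matching tuples, which is the attribute-driven style of processing the abstract contrasts with tuple-driven querying.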