Aggregation of imprecise and uncertain information in databases

Bibliographic Details
Published in IEEE Transactions on Knowledge and Data Engineering, Vol. 13, no. 6, pp. 902-912
Main Authors McClean, S.; Scotney, B.; Shapcott, M.
Format Journal Article
Language English
Published New York, IEEE, 1 November 2001
The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
ISSN 1041-4347 (print)
1558-2191 (electronic)
DOI 10.1109/69.971186


Abstract Information stored in a database is often subject to uncertainty and imprecision. Probability theory provides a well-known and well-understood way of representing uncertainty and may thus be used to provide a mechanism for storing uncertain information in a database. We consider the problem of aggregation using an imprecise probability data model that allows us to represent imprecision by partial probabilities and uncertainty using probability distributions. Most work to date has concentrated on providing functionality for extending the relational algebra with a view to executing traditional queries on uncertain or imprecise data. However, for imprecise and uncertain data, we often require aggregation operators that provide information on patterns in the data. Thus, while traditional query processing is tuple-driven, processing of uncertain data is often attribute-driven, where we use aggregation operators to discover attribute properties. The aggregation operator that we define uses the Kullback-Leibler information divergence between the aggregated probability distribution and the individual tuple values to provide a probability distribution for the domain values of an attribute or group of attributes. The provision of such aggregation operators is a central requirement in furnishing a database with the capability to perform the operations necessary for knowledge discovery in databases.
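The abstract does not spell out the operator itself, so the following is only a minimal sketch of one way such an aggregation could be computed, not the authors' definition. It assumes that each tuple stores its attribute value as probability mass over subsets of the domain (a precise value is a singleton with mass 1.0), and it uses the standard EM-style update for incompletely observed categorical data, whose fixed point is a maximum-likelihood estimate and therefore minimises a Kullback-Leibler divergence of this kind. The names `aggregate` and `PartialDistribution` are hypothetical.

```python
from typing import Dict, FrozenSet, Hashable, Iterable, List

# Hypothetical representation (an assumption, not taken from the paper):
# a tuple's attribute value maps subsets of the domain to probability mass.
PartialDistribution = Dict[FrozenSet[Hashable], float]


def aggregate(
    tuples: List[PartialDistribution],
    domain: Iterable[Hashable],
    max_iter: int = 500,
    tol: float = 1e-12,
) -> Dict[Hashable, float]:
    """Estimate one distribution over `domain` from imprecise tuple values.

    Every value appearing in a subset must also appear in `domain`.
    """
    values = list(domain)
    if not tuples or not values:
        raise ValueError("need at least one tuple and one domain value")

    # Start from the uniform distribution so every value has positive mass.
    p = {v: 1.0 / len(values) for v in values}

    for _ in range(max_iter):
        updated = {v: 0.0 for v in values}
        for partial in tuples:
            for subset, mass in partial.items():
                # Share the mass of an imprecise value among its members
                # in proportion to the current estimate.
                denom = sum(p[v] for v in subset)
                if denom <= 0.0:
                    continue
                for v in subset:
                    updated[v] += mass * p[v] / denom
        # Average the per-tuple contributions.
        updated = {v: x / len(tuples) for v, x in updated.items()}

        change = max(abs(updated[v] - p[v]) for v in values)
        p = updated
        if change < tol:
            break
    return p


# Three precise tuples and one imprecise tuple ("a or b").
example = [
    {frozenset({"a"}): 1.0},
    {frozenset({"a"}): 1.0},
    {frozenset({"b"}): 1.0},
    {frozenset({"a", "b"}): 1.0},
]
estimate = aggregate(example, ["a", "b"])
```

In this sketch the imprecise tuple contributes to both candidate values, weighted by the current estimate, which is what makes the processing attribute-driven rather than tuple-driven: the result describes the attribute's distribution across the whole relation rather than any single row.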
Author McClean, S. (Fac. of Informatics, Ulster Univ., Coleraine, UK)
Scotney, B.
Shapcott, M.
CODEN ITKEEH
ContentType Journal Article
Copyright Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) 2001
Discipline Engineering
Computer Science
IsPeerReviewed true
IsScholarly true
License https://ieeexplore.ieee.org/Xplorehelp/downloads/license-information/IEEE.html
SubjectTerms Agglomeration
Algebra
Data models
Database systems
Deductive databases
Divergence
Information retrieval
Operators
Probability
Probability distribution
Probability theory
Queries
Query processing
Relational databases
Stochastic processes
Studies
Uncertainty
URI https://ieeexplore.ieee.org/document/971186