Nearest Neighbor Subsequence Search in Time Series Data
Published in | 2019 IEEE International Conference on Big Data (Big Data) pp. 2057 - 2066 |
---|---|
Main Authors | Ahsan, Ramoza; Bashir, Muzammil; Neamtu, Rodica; Rundensteiner, Elke A.; Sarkozy, Gabor |
Format | Conference Proceeding |
Language | English |
Published | IEEE, 01.12.2019 |
Subjects | |
Online Access | Get full text |
Abstract | Continuous growth in sensor data and other temporal sequence data necessitates efficient retrieval and similarity search support on these big time series datasets. However, finding exact similarity results, especially at the granularity of subsequences, is known to be prohibitively costly for large datasets. In this paper, we thus propose an efficient framework for solving this exact subsequence similarity match problem, called TINN (TIme series Nearest Neighbor search). Exploiting the range interval diversity properties of time series datasets, TINN captures similarity at two levels of abstraction, namely, relationships among subsequences within each long time series and relationships across distinct time series in the dataset. These relationships are compactly organized in an augmented relationship graph model, with the former relationships encoded in similarity vectors at TINN nodes and the latter captured by augmented edge types in the TINN Graph. The query processing strategy deploys novel pruning techniques on the TINN Graph, including node skipping and vertical and horizontal pruning, to significantly reduce the number of time series as well as subsequences to be explored. Comprehensive experiments on synthetic and real-world time series data demonstrate that our TINN model consistently outperforms state-of-the-art approaches while still guaranteeing to retrieve exact matches. |
---|---|
Author | Bashir, Muzammil; Sarkozy, Gabor; Neamtu, Rodica; Ahsan, Ramoza; Rundensteiner, Elke A. |
Author_xml | – sequence: 1 givenname: Ramoza surname: Ahsan fullname: Ahsan, Ramoza organization: Worcester Polytechnic Institute,Worcester MA,USA – sequence: 2 givenname: Muzammil surname: Bashir fullname: Bashir, Muzammil organization: Worcester Polytechnic Institute,Worcester MA,USA – sequence: 3 givenname: Rodica surname: Neamtu fullname: Neamtu, Rodica organization: Worcester Polytechnic Institute,Worcester MA,USA – sequence: 4 givenname: Elke A. surname: Rundensteiner fullname: Rundensteiner, Elke A. organization: Worcester Polytechnic Institute,Worcester MA,USA – sequence: 5 givenname: Gabor surname: Sarkozy fullname: Sarkozy, Gabor organization: Worcester Polytechnic Institute,Worcester MA,USA |
ContentType | Conference Proceeding |
DBID | 6IE 6IL CBEJK RIE RIL |
DOI | 10.1109/BigData47090.2019.9006421 |
DatabaseName | IEEE Electronic Library (IEL) Conference Proceedings IEEE Proceedings Order Plan All Online (POP All Online) 1998-present by volume IEEE Xplore All Conference Proceedings IEEE Xplore IEEE Proceedings Order Plans (POP All) 1998-Present |
DatabaseTitleList | |
Database_xml | – sequence: 1 dbid: RIE name: IEEE Electronic Library (IEL) url: https://proxy.k.utb.cz/login?url=https://ieeexplore.ieee.org/ sourceTypes: Publisher |
DeliveryMethod | fulltext_linktorsrc |
Discipline | Agriculture |
EISBN | 1728108586 9781728108582 |
EndPage | 2066 |
ExternalDocumentID | 9006421 |
Genre | orig-research |
GroupedDBID | 6IE 6IL CBEJK RIE RIL |
ID | FETCH-LOGICAL-i118t-d7943bb51e3df4cc7ef2aeb0f7847cdc22081afe57b653812ad8c335c2fdffba3 |
IEDL.DBID | RIE |
IngestDate | Thu Jun 29 18:38:33 EDT 2023 |
IsPeerReviewed | false |
IsScholarly | false |
Language | English |
LinkModel | DirectLink |
MergedId | FETCHMERGED-LOGICAL-i118t-d7943bb51e3df4cc7ef2aeb0f7847cdc22081afe57b653812ad8c335c2fdffba3 |
PageCount | 10 |
ParticipantIDs | ieee_primary_9006421 |
PublicationCentury | 2000 |
PublicationDate | 2019-Dec. |
PublicationDateYYYYMMDD | 2019-12-01 |
PublicationDate_xml | – month: 12 year: 2019 text: 2019-Dec. |
PublicationDecade | 2010 |
PublicationTitle | 2019 IEEE International Conference on Big Data (Big Data) |
PublicationTitleAbbrev | BigData |
PublicationYear | 2019 |
Publisher | IEEE |
Publisher_xml | – name: IEEE |
Score | 1.7571391 |
SourceID | ieee |
SourceType | Publisher |
StartPage | 2057 |
SubjectTerms | Agriculture; Bridges; Data mining; Indexes; Meteorology; Nearest Neighbor Search; Subsequence Mining; Temperature sensors; Time series analysis; Time Series Data |
Title | Nearest Neighbor Subsequence Search in Time Series Data |
URI | https://ieeexplore.ieee.org/document/9006421 |
hasFullText | 1 |
inHoldings | 1 |
isFullTextHit | |
isPrint | |
link.rule.ids | 310,311,786,790,795,796,802,27956,55107 |
linkProvider | IEEE |
openUrl | ctx_ver=Z39.88-2004&ctx_enc=info%3Aofi%2Fenc%3AUTF-8&rfr_id=info%3Asid%2Fsummon.serialssolutions.com&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook&rft.genre=proceeding&rft.title=2019+IEEE+International+Conference+on+Big+Data+%28Big+Data%29&rft.atitle=Nearest+Neighbor+Subsequence+Search+in+Time+Series+Data&rft.au=Ahsan%2C+Ramoza&rft.au=Bashir%2C+Muzammil&rft.au=Neamtu%2C+Rodica&rft.au=Rundensteiner%2C+Elke+A.&rft.date=2019-12-01&rft.pub=IEEE&rft.spage=2057&rft.epage=2066&rft_id=info:doi/10.1109%2FBigData47090.2019.9006421&rft.externalDocID=9006421 |
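
Illustrative note: the abstract describes exact nearest neighbor search at the granularity of subsequences across a collection of time series. The sketch below is not TINN and does not reproduce its graph or pruning techniques; it is only a brute-force baseline for the same problem statement. The function name `nearest_subsequence`, the choice of Euclidean distance, and the early-abandoning shortcut are assumptions made for illustration, since the record does not state the distance measure the paper uses.

```python
import math
from typing import Sequence, Tuple


def nearest_subsequence(
    query: Sequence[float],
    dataset: Sequence[Sequence[float]],
) -> Tuple[float, int, int]:
    """Exact brute-force search (hypothetical baseline, not TINN).

    Returns (distance, series_index, offset) of the subsequence with the
    same length as `query` that has the smallest Euclidean distance to it.
    If no series is long enough, returns (inf, -1, -1).
    """
    m = len(query)
    if m == 0:
        raise ValueError("query must be non-empty")

    best_sq = math.inf          # best squared distance found so far
    best_series, best_offset = -1, -1

    for s_idx, series in enumerate(dataset):
        # Slide a window of length m over the current series.
        for off in range(len(series) - m + 1):
            acc = 0.0
            for j in range(m):
                diff = series[off + j] - query[j]
                acc += diff * diff
                if acc >= best_sq:
                    break       # early abandoning: cannot beat the best
            else:
                # Loop finished without abandoning, so acc < best_sq.
                best_sq, best_series, best_offset = acc, s_idx, off

    return math.sqrt(best_sq), best_series, best_offset
```

Every window of every series is visited here, so the cost grows with the total number of subsequences; this is the kind of exhaustive exploration that, per the abstract, TINN aims to reduce while still returning exact matches.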