Nearest Neighbor Subsequence Search in Time Series Data

Bibliographic Details
Published in 2019 IEEE International Conference on Big Data (Big Data), pp. 2057-2066
Main Authors Ahsan, Ramoza; Bashir, Muzammil; Neamtu, Rodica; Rundensteiner, Elke A.; Sarkozy, Gabor
Format Conference Proceeding
Language English
Published IEEE 01.12.2019
Abstract Continuous growth in sensor data and other temporal sequence data necessitates efficient retrieval and similarity search support on these big time series datasets. However, finding exact similarity results, especially at the granularity of subsequences, is known to be prohibitively costly for large datasets. In this paper, we thus propose an efficient framework for solving this exact subsequence similarity match problem, called TINN (TIme series Nearest Neighbor search). Exploiting the range interval diversity properties of time series datasets, TINN captures similarity at two levels of abstraction, namely, relationships among subsequences within each long time series and relationships across distinct time series in the dataset. These relationships are compactly organized in an augmented relationship graph model, with the former relationships encoded in similarity vectors at TINN nodes and the latter captured by augmented edge types in the TINN Graph. The query processing strategy deploys novel pruning techniques on the TINN Graph, including node skipping and vertical and horizontal pruning, to significantly reduce the number of time series as well as subsequences to be explored. Comprehensive experiments on synthetic and real-world time series data demonstrate that our TINN model consistently outperforms state-of-the-art approaches while still guaranteeing the retrieval of exact matches.
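To make the problem in the abstract concrete, the following is a minimal brute-force sketch of exact nearest neighbor subsequence search. It is not the TINN method: it has no graph, no similarity vectors, and none of the pruning strategies named above. It only shows the exhaustive scan that such a framework aims to avoid. The Euclidean distance, the function name nearest_subsequence, and the list-of-lists dataset layout are all assumptions made for illustration; the abstract does not name the distance measure used.

```python
import math


def nearest_subsequence(query, dataset):
    """Exhaustively find the subsequence closest to `query`.

    Hypothetical helper for illustration only; assumes Euclidean distance.
    Returns (distance, series_index, start_offset).
    """
    m = len(query)
    if m == 0:
        raise ValueError("query must not be empty")

    best_sq = math.inf      # smallest squared distance seen so far
    best_series = -1
    best_start = -1

    for i, series in enumerate(dataset):
        # Slide a window of length m over every possible start offset.
        for start in range(len(series) - m + 1):
            d = 0.0
            for k in range(m):
                d += (series[start + k] - query[k]) ** 2
                if d >= best_sq:
                    break   # early abandon: this window cannot win
            else:
                # Inner loop finished without abandoning, so d < best_sq.
                best_sq, best_series, best_start = d, i, start

    return math.sqrt(best_sq), best_series, best_start


dataset = [[1.0, 2.0, 3.0, 4.0], [0.0, 5.0, 6.0, 9.0]]
print(nearest_subsequence([5.0, 6.0], dataset))
```

The scan visits every window of every series, so its cost grows with the total number of points times the query length. This is the cost that the abstract describes as prohibitive for large datasets.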
Author Bashir, Muzammil
Sarkozy, Gabor
Neamtu, Rodica
Ahsan, Ramoza
Rundensteiner, Elke A.
Author_xml – sequence: 1
  givenname: Ramoza
  surname: Ahsan
  fullname: Ahsan, Ramoza
  organization: Worcester Polytechnic Institute,Worcester MA,USA
– sequence: 2
  givenname: Muzammil
  surname: Bashir
  fullname: Bashir, Muzammil
  organization: Worcester Polytechnic Institute,Worcester MA,USA
– sequence: 3
  givenname: Rodica
  surname: Neamtu
  fullname: Neamtu, Rodica
  organization: Worcester Polytechnic Institute,Worcester MA,USA
– sequence: 4
  givenname: Elke A.
  surname: Rundensteiner
  fullname: Rundensteiner, Elke A.
  organization: Worcester Polytechnic Institute,Worcester MA,USA
– sequence: 5
  givenname: Gabor
  surname: Sarkozy
  fullname: Sarkozy, Gabor
  organization: Worcester Polytechnic Institute,Worcester MA,USA
ContentType Conference Proceeding
DBID 6IE
6IL
CBEJK
RIE
RIL
DOI 10.1109/BigData47090.2019.9006421
DatabaseName IEEE Electronic Library (IEL) Conference Proceedings
IEEE Proceedings Order Plan All Online (POP All Online) 1998-present by volume
IEEE Xplore All Conference Proceedings
IEEE Xplore
IEEE Proceedings Order Plans (POP All) 1998-Present
DatabaseTitleList
Database_xml – sequence: 1
  dbid: RIE
  name: IEEE Electronic Library (IEL)
  url: https://proxy.k.utb.cz/login?url=https://ieeexplore.ieee.org/
  sourceTypes: Publisher
DeliveryMethod fulltext_linktorsrc
Discipline Agriculture
EISBN 1728108586
9781728108582
EndPage 2066
ExternalDocumentID 9006421
Genre orig-research
GroupedDBID 6IE
6IL
CBEJK
RIE
RIL
ID FETCH-LOGICAL-i118t-d7943bb51e3df4cc7ef2aeb0f7847cdc22081afe57b653812ad8c335c2fdffba3
IEDL.DBID RIE
IngestDate Thu Jun 29 18:38:33 EDT 2023
IsPeerReviewed false
IsScholarly false
Language English
LinkModel DirectLink
MergedId FETCHMERGED-LOGICAL-i118t-d7943bb51e3df4cc7ef2aeb0f7847cdc22081afe57b653812ad8c335c2fdffba3
PageCount 10
ParticipantIDs ieee_primary_9006421
PublicationCentury 2000
PublicationDate 2019-Dec.
PublicationDateYYYYMMDD 2019-12-01
PublicationDate_xml – month: 12
  year: 2019
  text: 2019-Dec.
PublicationDecade 2010
PublicationTitle 2019 IEEE International Conference on Big Data (Big Data)
PublicationTitleAbbrev BigData
PublicationYear 2019
Publisher IEEE
Publisher_xml – name: IEEE
Score 1.7571391
SourceID ieee
SourceType Publisher
StartPage 2057
SubjectTerms Agriculture
Bridges
Data mining
Indexes
Meteorology
Nearest Neighbor Search
Subsequence Mining
Temperature sensors
Time series analysis
Time Series Data
Title Nearest Neighbor Subsequence Search in Time Series Data
URI https://ieeexplore.ieee.org/document/9006421
hasFullText 1
inHoldings 1
isFullTextHit
isPrint