Gaze Prediction in Dynamic 360° Immersive Videos

Bibliographic Details
Published in 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5333–5342
Main Authors Xu, Yanyu; Dong, Yanbing; Wu, Junru; Sun, Zhengzhong; Shi, Zhiru; Yu, Jingyi; Gao, Shenghua
Format Conference Proceeding
Language English
Published IEEE 01.06.2018

Abstract This paper explores gaze prediction in dynamic 360° immersive videos: given a viewer's historical scan path and the VR content, we predict where the viewer will look at an upcoming time. To tackle this problem, we first present a large-scale eye-tracking dataset of dynamic VR scenes. The dataset contains 208 360° videos captured in dynamic scenes, and each video is viewed by at least 31 subjects. Our analysis shows that gaze prediction depends on both the historical scan path and the image content. Within the image content, salient objects readily attract viewers' attention, and saliency is related to both the appearance and the motion of the objects. Because saliency differs with the scale at which it is measured, we propose to compute saliency maps at three spatial scales: the sub-image patch centered at the current gaze point, the sub-image corresponding to the Field of View (FoV), and the panorama image. We then feed both the saliency maps and the corresponding images into a Convolutional Neural Network (CNN) for feature extraction, and use a Long Short-Term Memory (LSTM) network to encode the historical scan path. Finally, we combine the CNN and LSTM features to predict the gaze displacement between the gaze point at the current time and the gaze point at an upcoming time. Extensive experiments validate the effectiveness of our method for gaze prediction in dynamic VR scenes.
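The abstract frames the output as a gaze displacement added to the current gaze point, and it mentions crops centered on the gaze point within a panorama. The following is a minimal sketch of those two geometric steps only, not the authors' implementation. It assumes gaze is expressed as (longitude, latitude) in degrees on an equirectangular panorama, a convention the record does not state. The function names `apply_displacement` and `wrapped_crop_columns` are hypothetical, and the displacement is a placeholder for whatever the CNN and LSTM model would output.

```python
def apply_displacement(gaze, displacement):
    """Return the next gaze point given the current one and a predicted displacement.

    Assumption: both arguments are (longitude, latitude) pairs in degrees.
    Longitude wraps around the sphere into [-180, 180); latitude is clamped
    to [-90, 90] because it cannot wrap.
    """
    lon = (gaze[0] + displacement[0] + 180.0) % 360.0 - 180.0
    lat = max(-90.0, min(90.0, gaze[1] + displacement[1]))
    return (lon, lat)


def wrapped_crop_columns(width, center_x, crop_width):
    """Return pixel column indices of a crop centered at center_x.

    Assumption: the panorama is equirectangular, so columns that run past
    the left or right edge continue on the opposite side.
    """
    start = center_x - crop_width // 2
    return [(start + i) % width for i in range(crop_width)]


# Placeholder displacement; in the paper's setting this would come from the model.
next_gaze = apply_displacement((170.0, 10.0), (20.0, 5.0))
columns = wrapped_crop_columns(width=8, center_x=0, crop_width=4)
```

Predicting a displacement rather than an absolute position keeps the target small and centered on the current gaze, which is why the wraparound has to be handled when the displacement is applied.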
CODEN IEEPAD
ContentType Conference Proceeding
DBID 6IE
6IH
CBEJK
RIE
RIO
DOI 10.1109/CVPR.2018.00559
DatabaseName IEEE Electronic Library (IEL) Conference Proceedings
IEEE Proceedings Order Plan (POP) 1998-present by volume
IEEE Xplore All Conference Proceedings
IEEE Xplore Digital Library (IEL)
IEEE Proceedings Order Plans (POP) 1998-present
DatabaseTitleList
Database_xml – sequence: 1
  dbid: RIE
  name: IEEE Electronic Library (IEL)
  url: https://proxy.k.utb.cz/login?url=https://ieeexplore.ieee.org/
  sourceTypes: Publisher
DeliveryMethod fulltext_linktorsrc
Discipline Applied Sciences
EISBN 9781538664209
1538664208
EISSN 1063-6919
EndPage 5342
ExternalDocumentID 8578657
Genre orig-research
IEDL.DBID RIE
IngestDate Wed Aug 27 02:52:16 EDT 2025
IsPeerReviewed false
IsScholarly true
Language English
LinkModel DirectLink
PageCount 10
ParticipantIDs ieee_primary_8578657
PublicationCentury 2000
PublicationDate 2018-Jun
PublicationDateYYYYMMDD 2018-06-01
PublicationDate_xml – month: 06
  year: 2018
  text: 2018-Jun
PublicationDecade 2010
PublicationTitle 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition
PublicationTitleAbbrev CVPR
PublicationYear 2018
Publisher IEEE
Publisher_xml – name: IEEE
SSID ssj0002683845
ssj0003211698
Score 2.545206
SourceID ieee
SourceType Publisher
StartPage 5333
SubjectTerms Cameras
Games
Gaze tracking
Resists
Saliency detection
Task analysis
Videos
Title Gaze Prediction in Dynamic 360° Immersive Videos
URI https://ieeexplore.ieee.org/document/8578657
hasFullText 1
inHoldings 1
isFullTextHit
isPrint
link http://utb.summon.serialssolutions.com/2.0.0/link/0/eLvHCXMwjV07T8MwELbaTkwFWsRbHhhJm8TvuVAVpKIKlapbVZ8vUoWUIpou_Cp-A78MOwkFIQa2OMpg-2Lf6_vuCLlKJBjuAKIEpY04iCzS3k6PUrV0HFDFyAJRePwgR0_8fi7mDXK948IgYgk-w154LHP5bg3bECrra_97SaGapOkdt4qrtYunpFIzXWfIwph5z0YaXVfzSWLTH8wmjwHLFcCTItQm_dFOpdQmwzYZf82jApE897aF7cHbrxKN_53oPul-8_boZKeRDkgD80PSrg1NWh_jTYckAeLjvwxJmiAYusrpTdWanjIZf7zTuzKe7W9COls5XG-6ZDq8nQ5GUd07IVqZuIiWqYCUqQxULCw3kHqvhydoEq29PKw_pQhOeXNBZRY4SC4zIdFfOA4M00tkR6SVr3M8JlQF3jODmNss40JZy_1GJ6G_UeoEA3dCOmEDFi9VdYxFvfbTv1-fkb0gggpsdU5axesWL7xaL-xlKc9PueKfzw
linkProvider IEEE
linkToHtml http://utb.summon.serialssolutions.com/2.0.0/link/0/eLvHCXMwjV3NTgIxEJ4gHvSECsZ_e_Do4u72d88oAQVCDBJuhE67CTFZjMDFp_IZfDLb3RWN8eCtbfbQdrad6cz3zQBcRQITZhCDyAodMORpoJydHsRyZhhaGVrqicL9geg8sfsJn1TgesOFsdbm4DPb9M08lm8WuPaushvlfi_B5RZsO73Po4KttfGoxEJRVcbIfJ-6t41IVJnPJwqTm9Z4-OjRXB4-yX120h8FVXJ90q5B_2smBYzkuble6Sa-_UrS-N-p7kHjm7lHhhudtA8Vmx1ArTQ1SXmQl3WIPMjHfenDNF40ZJ6R26I4PaEi_Hgn3dyj7e5CMp4bu1g2YNS-G7U6QVk9IZgn4SqYxRxjKlOUIdcswdi9e1hkk0gpJxHtzqlFI53BIFONDAUTKRfWXTkGE6pmlh5CNVtk9giI9MxniiHTacq41Jq5jY58haPYcIrmGOp-A6YvRX6Mabn2k7-HL2GnM-r3pr3u4OEUdr04CujVGVRXr2t77pT8Sl_ksv0ERl-jGA
openUrl ctx_ver=Z39.88-2004&ctx_enc=info%3Aofi%2Fenc%3AUTF-8&rfr_id=info%3Asid%2Fsummon.serialssolutions.com&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook&rft.genre=proceeding&rft.title=2018+IEEE%2FCVF+Conference+on+Computer+Vision+and+Pattern+Recognition&rft.atitle=Gaze+Prediction+in+Dynamic+360%C2%B0+Immersive+Videos&rft.au=Xu%2C+Yanyu&rft.au=Dong%2C+Yanbing&rft.au=Wu%2C+Junru&rft.au=Sun%2C+Zhengzhong&rft.date=2018-06-01&rft.pub=IEEE&rft.eissn=1063-6919&rft.spage=5333&rft.epage=5342&rft_id=info:doi/10.1109%2FCVPR.2018.00559&rft.externalDocID=8578657