Efficient Per-Shot Transformer-Based Bitrate Ladder Prediction for Adaptive Video Streaming
Published in | 2023 IEEE International Conference on Image Processing (ICIP), pp. 1835-1839 |
---|---|
Main Authors | Telili, Ahmed; Hamidouche, Wassim; Fezza, Sid Ahmed; Morin, Luce |
Format | Conference Proceeding |
Language | English |
Published | IEEE, 08.10.2023 |
DOI | 10.1109/ICIP49359.2023.10222094 |
Abstract | Recently, HTTP adaptive streaming (HAS) has become a standard approach for over-the-top (OTT)-based video streaming services due to its ability to provide smooth streaming. In HAS, stream representations are encoded to target a specific bitrate providing a wide range of operating bitrates known as the bitrate ladder. In the past, a fixed bitrate ladder approach for all videos has been widely used. However, such a method does not consider video content, which can vary considerably in motion, texture, and scene complexity. Moreover, building a per-title bitrate ladder based on an exhaustive encoding is quite expensive due to the large encoding parameter space. Thus, alternative solutions allowing accurate and efficient per-title bitrate ladder prediction are in great demand. On the other hand, self-attention-based architectures have achieved tremendous performance in large language models (LLMs) and particularly vision transformers (ViTs) in computer vision tasks. Therefore, this paper investigates ViT's capabilities in building an efficient bitrate ladder without performing any encoding process. We provide the first in-depth analysis of the prediction accuracy and the complexity overhead induced by the ViTs model in predicting the bitrate ladder on a large and diverse video dataset. The source code of the proposed solution and the dataset will be made publicly available. |
---|---|
Author | Fezza, Sid Ahmed; Hamidouche, Wassim; Morin, Luce; Telili, Ahmed |
Author_xml | 1. Telili, Ahmed (Univ. Rennes, INSA Rennes, CNRS, IETR - UMR 6164, Rennes, France); 2. Hamidouche, Wassim (Univ. Rennes, INSA Rennes, CNRS, IETR - UMR 6164, Rennes, France); 3. Fezza, Sid Ahmed (National Higher School of Telecommunications and ICT, Oran, Algeria); 4. Morin, Luce (Univ. Rennes, INSA Rennes, CNRS, IETR - UMR 6164, Rennes, France) |
ContentType | Conference Proceeding |
DOI | 10.1109/ICIP49359.2023.10222094 |
EISBN | 1728198356 9781728198354 |
EndPage | 1839 |
ExternalDocumentID | 10222094 |
Genre | orig-research |
GrantInformation_xml | – fundername: Région Bretagne funderid: 10.13039/501100011697 |
IsDoiOpenAccess | false |
IsOpenAccess | true |
IsPeerReviewed | false |
IsScholarly | true |
Language | English |
OpenAccessLink | https://hal.science/hal-04356639 |
PageCount | 5 |
PublicationDate | 2023-10-08 |
PublicationTitle | 2023 IEEE International Conference on Image Processing (ICIP) |
PublicationTitleAbbrev | ICIP |
PublicationYear | 2023 |
Publisher | IEEE |
StartPage | 1835 |
SubjectTerms | adaptive video streaming; Bit rate; Bitrate ladder; Buildings; Computational modeling; Feature extraction; HEVC; Image coding; Streaming media; Transformers; video compression; vision transformer |
Title | Efficient Per-Shot Transformer-Based Bitrate Ladder Prediction for Adaptive Video Streaming |
URI | https://ieeexplore.ieee.org/document/10222094 |