MobileNetV2: Inverted Residuals and Linear Bottlenecks

Bibliographic Details
Published in 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 4510-4520
Main Authors Sandler, Mark; Howard, Andrew; Zhu, Menglong; Zhmoginov, Andrey; Chen, Liang-Chieh
Format Conference Proceeding
Language English
Published IEEE 01.06.2018

Abstract In this paper we describe a new mobile architecture, MobileNetV2, that improves the state-of-the-art performance of mobile models on multiple tasks and benchmarks as well as across a spectrum of different model sizes. We also describe efficient ways of applying these mobile models to object detection in a novel framework we call SSDLite. Additionally, we demonstrate how to build mobile semantic segmentation models through a reduced form of DeepLabv3, which we call Mobile DeepLabv3. MobileNetV2 is based on an inverted residual structure where the shortcut connections are between the thin bottleneck layers. The intermediate expansion layer uses lightweight depthwise convolutions to filter features as a source of non-linearity. Additionally, we find that it is important to remove non-linearities in the narrow layers in order to maintain representational power. We demonstrate that this improves performance and provide the intuition that led to this design. Finally, our approach allows decoupling of the input/output domains from the expressiveness of the transformation, which provides a convenient framework for further analysis. We measure our performance on ImageNet [1] classification, COCO object detection [2], and VOC image segmentation [3]. We evaluate the trade-offs between accuracy and the number of operations measured by multiply-adds (MAdd), as well as actual latency and the number of parameters.
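
The block structure summarized in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: it omits batch normalization, assumes a height x width x channels tensor layout, and uses hypothetical names (inverted_residual, w_expand, w_depthwise, w_project) and an illustrative expansion factor of 6. It shows the three stages the abstract names: a 1x1 expansion from a thin bottleneck, a 3x3 depthwise convolution in the expanded space (both followed by ReLU6 here), and a 1x1 projection back to a thin bottleneck with no non-linearity, with the shortcut connecting the two thin ends.

```python
import numpy as np

def relu6(x):
    # ReLU capped at 6, applied only in the wide (expanded) layers.
    return np.clip(x, 0.0, 6.0)

def pointwise_conv(x, w):
    # 1x1 convolution: x is (H, W, C_in), w is (C_in, C_out).
    return x @ w

def depthwise_conv3x3(x, w, stride=1):
    # One 3x3 filter per channel, zero padding of 1.
    # x is (H, W, C), w is (3, 3, C).
    h, wd, c = x.shape
    xp = np.pad(x, ((1, 1), (1, 1), (0, 0)))
    out_h = (h - 1) // stride + 1
    out_w = (wd - 1) // stride + 1
    out = np.zeros((out_h, out_w, c), dtype=x.dtype)
    for i in range(3):
        for j in range(3):
            patch = xp[i:i + stride * out_h:stride,
                       j:j + stride * out_w:stride, :]
            out += patch * w[i, j]
    return out

def inverted_residual(x, w_expand, w_depthwise, w_project, stride=1):
    # Thin bottleneck -> wide expansion, with non-linearity.
    expanded = relu6(pointwise_conv(x, w_expand))
    # Lightweight per-channel filtering in the expanded space.
    filtered = relu6(depthwise_conv3x3(expanded, w_depthwise, stride))
    # Wide -> thin projection, deliberately linear (no activation).
    projected = pointwise_conv(filtered, w_project)
    # Shortcut between the thin layers, only when shapes match.
    if stride == 1 and projected.shape == x.shape:
        return x + projected
    return projected

rng = np.random.default_rng(0)
c_in, t = 4, 6  # t: illustrative expansion factor (an assumption)
x = rng.standard_normal((8, 8, c_in))
w_expand = rng.standard_normal((c_in, c_in * t)) * 0.1
w_depthwise = rng.standard_normal((3, 3, c_in * t)) * 0.1
w_project = rng.standard_normal((c_in * t, c_in)) * 0.1

y = inverted_residual(x, w_expand, w_depthwise, w_project)
print(y.shape)  # (8, 8, 4)
```

Because the projection has no activation, the block output can take negative values, which is one way to read the abstract's remark about removing non-linearities in the narrow layers. With stride 2 the spatial size changes, so this sketch skips the shortcut in that case.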
DOI 10.1109/CVPR.2018.00474
EISBN 9781538664209; 1538664208
EISSN 1063-6919
Subjects Computational modeling; Computer architecture; Manifolds; Neural networks; Task analysis
URI https://ieeexplore.ieee.org/document/8578572