MobileNetV2: Inverted Residuals and Linear Bottlenecks

Bibliographic Details
Published in	2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 4510-4520
Main Authors	Sandler, Mark; Howard, Andrew; Zhu, Menglong; Zhmoginov, Andrey; Chen, Liang-Chieh
Format	Conference Proceeding
Language	English
Published	IEEE 01.06.2018
Subjects	Computational modeling; Computer architecture; Manifolds; Neural networks; Task analysis
Abstract	In this paper we describe a new mobile architecture, MobileNetV2, that improves the state-of-the-art performance of mobile models on multiple tasks and benchmarks, as well as across a spectrum of different model sizes. We also describe efficient ways of applying these mobile models to object detection in a novel framework we call SSDLite. Additionally, we demonstrate how to build mobile semantic segmentation models through a reduced form of DeepLabv3, which we call Mobile DeepLabv3. MobileNetV2 is based on an inverted residual structure where the shortcut connections are between the thin bottleneck layers. The intermediate expansion layer uses lightweight depthwise convolutions to filter features as a source of non-linearity. Additionally, we find that it is important to remove non-linearities in the narrow layers in order to maintain representational power. We demonstrate that this improves performance and provide an intuition that led to this design. Finally, our approach allows decoupling of the input/output domains from the expressiveness of the transformation, which provides a convenient framework for further analysis. We measure our performance on ImageNet [1] classification, COCO object detection [2], and VOC image segmentation [3]. We evaluate the trade-offs between accuracy, the number of operations measured by multiply-adds (MAdd), actual latency, and the number of parameters.
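The block structure and the cost measures named in the abstract can be made concrete with a small sketch. The Python below is not the authors' code: the class name `InvertedResidual`, its fields, the default expansion factor of 6, the 3x3 depthwise kernel, and the "same" padding rule are all illustrative assumptions. It only describes the three layers of one block (expansion, depthwise filtering, linear projection) and counts convolution weights and multiply-adds from first principles, ignoring biases and normalization.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InvertedResidual:
    """Illustrative descriptor of one inverted residual block (hypothetical names)."""

    in_channels: int      # width of the thin input bottleneck
    out_channels: int     # width of the thin output bottleneck
    expansion: int = 6    # assumed expansion factor for the wide middle layer
    kernel: int = 3       # assumed depthwise kernel size
    stride: int = 1

    @property
    def hidden(self) -> int:
        # Channel count of the intermediate expansion layer.
        return self.in_channels * self.expansion

    @property
    def has_shortcut(self) -> bool:
        # The shortcut joins the two thin bottlenecks, so their shapes must match.
        return self.stride == 1 and self.in_channels == self.out_channels

    def layers(self):
        # (name, input channels, output channels, followed by a non-linearity?)
        return [
            ("1x1 expansion conv", self.in_channels, self.hidden, True),
            ("depthwise conv", self.hidden, self.hidden, True),
            ("1x1 linear projection", self.hidden, self.out_channels, False),
        ]

    def params(self) -> int:
        # Convolution weights only; biases and normalization are ignored.
        expand = self.in_channels * self.hidden
        depthwise = self.kernel ** 2 * self.hidden
        project = self.hidden * self.out_channels
        return expand + depthwise + project

    def madds(self, height: int, width: int) -> int:
        # Assumes "same" padding: output size is ceil(input size / stride).
        out_h = -(-height // self.stride)
        out_w = -(-width // self.stride)
        expand = height * width * self.in_channels * self.hidden
        depthwise = out_h * out_w * self.kernel ** 2 * self.hidden
        project = out_h * out_w * self.hidden * self.out_channels
        return expand + depthwise + project


def full_conv_params(channels_in: int, channels_out: int, kernel: int = 3) -> int:
    # Weight count of an ordinary (non-depthwise) convolution, for comparison.
    return kernel ** 2 * channels_in * channels_out


block = InvertedResidual(in_channels=24, out_channels=24)
print(block.hidden, block.has_shortcut)       # 144 True
print(block.params(), block.madds(56, 56))    # 8208 25740288
print(full_conv_params(144, 144))             # 186624
```

In this example the depthwise layer holds 9 x 144 = 1,296 of the block's 8,208 weights, whereas a full 3x3 convolution at the same 144-channel width would need 186,624, which is one way to read the abstract's description of the expansion layer as lightweight. The final projection is marked as having no non-linearity, matching the statement that non-linearities are removed in the narrow layers.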
CODEN	IEEPAD
ContentType	Conference Proceeding
DBID	6IE 6IH CBEJK RIE RIO
DOI	10.1109/CVPR.2018.00474
DatabaseName	IEEE Electronic Library (IEL) Conference Proceedings IEEE Proceedings Order Plan (POP) 1998-present by volume IEEE Xplore All Conference Proceedings IEEE Electronic Library (IEL) IEEE Proceedings Order Plans (POP) 1998-present
DatabaseTitleList
Database_xml	– sequence: 1 dbid: RIE name: IEEE Electronic Library (IEL) url: https://proxy.k.utb.cz/login?url=https://ieeexplore.ieee.org/ sourceTypes: Publisher
DeliveryMethod	fulltext_linktorsrc
Discipline	Applied Sciences
EISBN	9781538664209 1538664208
EISSN	1063-6919
EndPage	4520
ExternalDocumentID	8578572
Genre	orig-research
GroupedDBID	6IE 6IH 6IL 6IN AAWTH ABLEC ADZIZ ALMA_UNASSIGNED_HOLDINGS BEFXN BFFAM BGNUA BKEBE BPEOZ CBEJK CHZPO IEGSK IJVOP OCL RIE RIL RIO
ID	FETCH-LOGICAL-i175t-ff1fe6c53b39fdd0c4b41531f162b69f28ed2c1be31ce0935de6c6aff4381d473
IEDL.DBID	RIE
IngestDate	Wed Aug 27 02:52:16 EDT 2025
IsPeerReviewed	false
IsScholarly	true
Language	English
LinkModel	DirectLink
MergedId	FETCHMERGED-LOGICAL-i175t-ff1fe6c53b39fdd0c4b41531f162b69f28ed2c1be31ce0935de6c6aff4381d473
PageCount	11
ParticipantIDs	ieee_primary_8578572
PublicationCentury	2000
PublicationDate	2018-Jun
PublicationDateYYYYMMDD	2018-06-01
PublicationDate_xml	– month: 06 year: 2018 text: 2018-Jun
PublicationDecade	2010
PublicationTitle	2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition
PublicationTitleAbbrev	CVPR
PublicationYear	2018
Publisher	IEEE
Publisher_xml	– name: IEEE
SSID	ssj0002683845 ssj0003211698
Score	2.6440444
SourceID	ieee
SourceType	Publisher
StartPage	4510
SubjectTerms	Computational modeling; Computer architecture; Manifolds; Neural networks; Task analysis
Title	MobileNetV2: Inverted Residuals and Linear Bottlenecks
URI	https://ieeexplore.ieee.org/document/8578572
hasFullText	1
inHoldings	1
isFullTextHit
isPrint