A vision transformer for decoding surgeon activity from surgical videos

Bibliographic Details
Published in: Nature Biomedical Engineering, Vol. 7; no. 6; pp. 780-796
Main Authors: Kiyasseh, Dani; Ma, Runzhuo; Haque, Taseen F.; Miles, Brian J.; Wagner, Christian; Donoho, Daniel A.; Anandkumar, Animashree; Hung, Andrew J.
Format: Journal Article
Language: English
Published: London: Nature Publishing Group UK, 01.06.2023
Summary: The intraoperative activity of a surgeon has substantial impact on postoperative outcomes. However, for most surgical procedures, the details of intraoperative surgical actions, which can vary widely, are not well understood. Here we report a machine learning system leveraging a vision transformer and supervised contrastive learning for the decoding of elements of intraoperative surgical activity from videos commonly collected during robotic surgeries. The system accurately identified surgical steps, actions performed by the surgeon, the quality of these actions and the relative contribution of individual video frames to the decoding of the actions. Through extensive testing on data from three different hospitals located in two different continents, we show that the system generalizes across videos, surgeons, hospitals and surgical procedures, and that it can provide information on surgical gestures and skills from unannotated videos. Decoding intraoperative activity via accurate machine learning systems could be used to provide surgeons with feedback on their operating skills, and may allow for the identification of optimal surgical behaviour and for the study of relationships between intraoperative factors and postoperative outcomes.
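
The abstract names the two main ingredients of the system: a vision-transformer video encoder and supervised contrastive learning. As an illustration only (not the authors' released code), the sketch below implements the standard supervised contrastive (SupCon) loss of Khosla et al. (2020) over a batch of frame-level embeddings such as those a vision-transformer encoder might produce; the embedding dimension, temperature and toy surgical-step labels are assumptions chosen purely for the demonstration.

```python
# Minimal, illustrative sketch of supervised contrastive learning
# (Khosla et al., 2020). Not the authors' implementation: the encoder,
# shapes and labels below are hypothetical placeholders.
import torch
import torch.nn.functional as F


def supervised_contrastive_loss(features: torch.Tensor,
                                labels: torch.Tensor,
                                temperature: float = 0.1) -> torch.Tensor:
    """SupCon loss over a batch of embeddings.

    features: (batch, dim) embeddings, e.g. pooled ViT frame features.
    labels:   (batch,) integer class labels (e.g. surgical step or gesture).
    """
    features = F.normalize(features, dim=1)
    sim = features @ features.T / temperature          # pairwise similarities
    batch = labels.shape[0]

    # Exclude self-similarity; positives are other samples with the same label.
    logits_mask = ~torch.eye(batch, dtype=torch.bool, device=sim.device)
    positive_mask = (labels.unsqueeze(0) == labels.unsqueeze(1)) & logits_mask

    # Numerically stable log-softmax over all non-self pairs.
    sim = sim - sim.max(dim=1, keepdim=True).values.detach()
    exp_sim = torch.exp(sim) * logits_mask
    log_prob = sim - torch.log(exp_sim.sum(dim=1, keepdim=True))

    # Average log-probability of positives, per anchor that has positives.
    pos_counts = positive_mask.sum(dim=1)
    mean_log_prob_pos = (log_prob * positive_mask).sum(dim=1) / pos_counts.clamp(min=1)
    return -(mean_log_prob_pos[pos_counts > 0]).mean()


if __name__ == "__main__":
    # Toy usage: 8 frame embeddings with 3 hypothetical surgical-step labels.
    torch.manual_seed(0)
    embeddings = torch.randn(8, 128)
    step_labels = torch.tensor([0, 0, 1, 1, 2, 2, 0, 1])
    print(supervised_contrastive_loss(embeddings, step_labels))
```

In this formulation, embeddings of frames that share a label (for example, the same surgical step) are pulled together while all other pairs are pushed apart, which is the role the abstract attributes to supervised contrastive learning in structuring the video representation.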
ISSN: 2157-846X
DOI: 10.1038/s41551-023-01010-8