PROCESSING VIDEOS BASED ON TEMPORAL STAGES
Disclosed is a technical solution to process a video that captures actions to be performed for completing a task based on a chronological sequence of stages within the task. An example system may identify an action sequence from an instruction for the task. The system inputs the action sequence into...
Saved in:
Main Authors | , , , , |
---|---|
Format | Patent |
Language | English |
Published |
20.04.2023
|
Subjects | |
Online Access | Get full text |
Cover
Loading…
Summary: | Disclosed is a technical solution to process a video that captures actions to be performed for completing a task based on a chronological sequence of stages within the task. An example system may identify an action sequence from an instruction for the task. The system inputs the action sequence into a trained model (e.g., a recurrent neural network), which outputs the chronological sequence of stages. The RNN may be trained through self-supervised learning. The system may input the video and the chronological sequence of stages into another trained model, e.g., a temporal convolutional network. The other trained model may include hidden layers arranged before an attention layer. The hidden layers may extract features from the video and feed the features into the attention layer. The attention layer may determine attention weights of the features based on the chronological sequence of stages. |
---|---|
Bibliography: | Application Number: US202218050757 |