Sports and human action recognition in videos.
ResNet (2+1)D is a video network that explicitly factorizes each 3D convolution into two separate, successive operations: a 2D spatial convolution followed by a 1D temporal convolution. It is used for video understanding applications.
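To make the factorization concrete, the sketch below compares the weight count of a full t x d x d 3D convolution with a (2+1)D pair made of a 1 x d x d spatial convolution and a t x 1 x 1 temporal convolution. The function names, the 64-channel example, and the 3x3x3 kernel size are illustrative assumptions, not values taken from this model. The intermediate channel rule is the one commonly given for R(2+1)D blocks, chosen so the factorized pair has about as many weights as the 3D convolution it replaces. Biases and normalization layers are ignored.

```python
# Illustrative sketch only: helper names and example sizes are assumptions,
# not part of the deployed model or any library API.

def conv3d_params(c_in, c_out, t=3, d=3):
    """Weights in one full t x d x d 3D convolution (bias ignored)."""
    return c_in * c_out * t * d * d


def mid_channels(c_in, c_out, t=3, d=3):
    """Intermediate width that roughly matches the 3D weight count.

    Assumed rule: floor(t*d^2*c_in*c_out / (d^2*c_in + t*c_out)).
    """
    return (t * d * d * c_in * c_out) // (d * d * c_in + t * c_out)


def conv2plus1d_params(c_in, c_out, t=3, d=3, c_mid=None):
    """Weights in a 2D spatial conv followed by a 1D temporal conv."""
    if c_mid is None:
        c_mid = mid_channels(c_in, c_out, t, d)
    spatial = c_in * c_mid * d * d   # 1 x d x d kernel, applied per frame
    temporal = c_mid * c_out * t     # t x 1 x 1 kernel, applied across frames
    return spatial + temporal


full = conv3d_params(64, 64)
factored = conv2plus1d_params(64, 64)
print(mid_channels(64, 64), full, factored)  # prints: 144 110592 110592
```

With these example sizes the two variants have the same number of weights. In a real block a nonlinearity sits between the spatial and temporal convolutions, so the factorized form adds nonlinearity without adding parameters.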
Model checkpoint: Kinetics-400
Input resolution: 112x112
Number of parameters: 31.5M
Model size (float): 120 MB
Model size (w8a8): 30.8 MB
Camera
Action Recognition
Source Model: BSD-3-CLAUSE
Deployable Model: AI-HUB-MODELS-LICENSE