video classification 3d cnn pytorch

Video Classification Using 3D ResNets

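This is PyTorch code for video (action) classification using a 3D ResNet trained with the corresponding training code.
The 3D ResNet was trained on the Kinetics dataset, which includes 400 action classes.
In score mode, the code takes videos as input and outputs class names and predicted class scores for every 16 frames.
In feature mode, it outputs 512-dimensional features (taken after global average pooling) for every 16 frames.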
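To make the "every 16 frames" unit concrete, the sketch below splits a video of a given length into consecutive 16-frame clips. It assumes the clips do not overlap and that a trailing remainder shorter than 16 frames is dropped; neither detail is stated here, so the real code may behave differently. The function name `clip_segments` is hypothetical.

```python
def clip_segments(n_frames, clip_len=16):
    """Return 1-based inclusive (start, end) frame ranges for consecutive clips.

    Assumptions (not confirmed by the README): clips do not overlap, and a
    trailing remainder shorter than clip_len is dropped.
    """
    if clip_len <= 0:
        raise ValueError("clip_len must be positive")
    segments = []
    start = 1
    # Keep adding clips while a full clip still fits inside the video.
    while start + clip_len - 1 <= n_frames:
        segments.append((start, start + clip_len - 1))
        start += clip_len
    return segments
```

Under these assumptions, a 40-frame video yields two clips, frames 1 to 16 and 17 to 32, so score mode would report two sets of class scores and feature mode two 512-dimensional vectors.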

A Torch (Lua) version of this code is also available.

Requirements

PyTorch

 conda install pytorch torchvision cuda80 -c soumith

FFmpeg, FFprobe

 wget http://johnvansickle.com/ffmpeg/releases/ffmpeg-release-64bit-static.tar.xz
 tar xvf ffmpeg-release-64bit-static.tar.xz
 cd ./ffmpeg-3.3.3-64bit-static/; sudo cp ffmpeg ffprobe /usr/local/bin;

Python 3
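Before running the classifier, it can help to confirm that the external tools are actually on the PATH. The following is a minimal sketch using only the standard library; the helper name `check_requirements` is hypothetical and not part of this repository.

```python
import shutil


def check_requirements(tools=("ffmpeg", "ffprobe")):
    """Map each tool name to its resolved path, or None if it is not on PATH."""
    return {tool: shutil.which(tool) for tool in tools}
```

Calling `check_requirements()` returns a dictionary such as `{"ffmpeg": "/usr/local/bin/ffmpeg", "ffprobe": None}`; any `None` entry points to a tool that still needs to be installed.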

Preparation

Download this code.
Download the pretrained model.
- ResNeXt-101 achieved the best performance in our experiments. (See the paper for details.)

Usage

Assume the input video files are located in ./videos.
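The commands in this section also pass `--input ./input`, whose format is not described here. The sketch below assumes it is a plain text file listing one video file name per line, relative to `--video_root`; check that assumption against `main.py` before relying on it. The helper name `write_input_list` and the extension list are hypothetical.

```python
import os

# Assumed set of video extensions; adjust to match your files.
VIDEO_EXTENSIONS = (".mp4", ".avi", ".mov", ".mkv")


def write_input_list(video_root, list_path, extensions=VIDEO_EXTENSIONS):
    """Write the video file names found in video_root to list_path, one per line.

    Assumption: the --input file is a newline-separated list of file names.
    Returns the sorted list of names that were written.
    """
    names = sorted(
        name
        for name in os.listdir(video_root)
        if name.lower().endswith(extensions)
        and os.path.isfile(os.path.join(video_root, name))
    )
    with open(list_path, "w") as f:
        for name in names:
            f.write(name + "\n")
    return names
```

For the layout used in this section, the call would be `write_input_list("./videos", "./input")`.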

To calculate class scores for every 16 frames, use --mode score.

 python main.py --input ./input --video_root ./videos --output ./output.json --model ./resnet-34-kinetics.pth --mode score
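Because score mode produces one prediction per 16-frame clip, a single label per video requires some aggregation. The sketch below takes a majority vote over clip labels. The JSON layout it expects (a list of entries, each with a `"video"` name and a `"clips"` list whose items carry a `"label"`) is an assumption for illustration and is not documented here; inspect your own `output.json` and adapt the keys. Both function names are hypothetical.

```python
import json
from collections import Counter


def load_results(path):
    """Load the JSON file written via --output."""
    with open(path) as f:
        return json.load(f)


def video_level_labels(results):
    """Majority-vote the per-clip labels into one label per video.

    Assumed (hypothetical) schema:
    [{"video": str, "clips": [{"label": str, ...}, ...]}, ...]
    Videos with no clips map to None.
    """
    labels = {}
    for entry in results:
        votes = Counter(clip["label"] for clip in entry["clips"])
        labels[entry["video"]] = votes.most_common(1)[0][0] if votes else None
    return labels
```

A typical call would be `video_level_labels(load_results("./output.json"))`. Averaging the per-clip score vectors before taking the top class is a common alternative to voting.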

To visualize the classification results, use generate_result_video/generate_result_video.py.

To calculate video features for every 16 frames, use --mode feature.

 python main.py --input ./input --video_root ./videos --output ./output.json --model ./resnet-34-kinetics.pth --mode feature
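Feature mode yields one vector per 16-frame clip, so a video of arbitrary length produces a variable number of vectors. One simple way to get a fixed-size descriptor per video is to average them over time, sketched below in plain Python. This pooling step is an illustrative choice and not part of this repository, and how the clip vectors are stored in `output.json` is not specified here, so the function takes them as a plain list of lists.

```python
def mean_pool(clip_features):
    """Average a list of equal-length feature vectors into a single vector."""
    if not clip_features:
        raise ValueError("need at least one clip feature vector")
    dim = len(clip_features[0])
    if any(len(vec) != dim for vec in clip_features):
        raise ValueError("all feature vectors must have the same length")
    n = len(clip_features)
    # Element-wise mean across clips.
    return [sum(vec[i] for vec in clip_features) / n for i in range(dim)]
```

With the 512-dimensional features described above, the result is a single 512-dimensional vector regardless of video length.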

Citation

If you use this code, please cite the following:

 @article{hara3dcnns,
  author={Kensho Hara and Hirokatsu Kataoka and Yutaka Satoh},
  title={Can Spatiotemporal 3D CNNs Retrace the History of 2D CNNs and ImageNet?},
  journal={arXiv preprint},
  volume={arXiv:1711.09577},
  year={2017},
}
