Ho Kei Cheng,Alexander Schwing
伊利诺伊大学Urbana-Champaign
[arxiv] [PDF] [项目页面]
处理长期阻塞:
很长的视频;蒙面层插入:
资料来源:https://www.youtube.com/watch?v=q5xr0f4a0iu
外域案件:
资料来源:kaguya -sama:爱是战争 - 天才之战的爱与大脑 - EP.3; A-1图片
我们将视频对象分割(VOS)框架为第一和最重要的是记忆问题。先验作品主要使用一种类型的特征内存。这可以是网络权重(IE,在线学习),最后框架分割(例如,MaskTrack),空间隐藏表示形式(例如,基于cons-RNN的方法),空间 - 注意特征(例如,STM,STCN,AOT)或某种长期的紧凑型特征(例如,AFB,AFB-urr)。
记忆跨度短的方法对变化并不强大,而具有较大内存库的方法则遭受计算和GPU内存使用情况的灾难性增加。生成后,尝试长期注意的VoS(例如AFB-ERR)热切地压缩功能,从而导致功能分辨率损失。
我们的方法灵感来自Atkinson-Shiffrin人类记忆模型,该模型具有感官记忆,工作记忆和长期内存。这些内存商店在我们的内存阅读机制中具有不同的临时量表,并相互补充。它在短期和长期视频数据集中的性能都很好,可以轻松处理10,000多个帧的视频。
首先,在get_started.md之后安装所需的Python软件包和数据集。
有关培训,请参见训练。
有关推论,请参见推理。md。
如果您觉得此回购有用,请引用我们的论文!
@inproceedings { cheng2022xmem ,
title = { {XMem}: Long-Term Video Object Segmentation with an Atkinson-Shiffrin Memory Model } ,
author = { Cheng, Ho Kei and Alexander G. Schwing } ,
booktitle = { ECCV } ,
year = { 2022 }
}本文开发的相关项目:
@inproceedings { cheng2021stcn ,
title = { Rethinking Space-Time Networks with Improved Memory Coverage for Efficient Video Object Segmentation } ,
author = { Cheng, Ho Kei and Tai, Yu-Wing and Tang, Chi-Keung } ,
booktitle = { NeurIPS } ,
year = { 2021 }
}
@inproceedings { cheng2021mivos ,
title = { Modular Interactive Video Object Segmentation: Interaction-to-Mask, Propagation and Difference-Aware Fusion } ,
author = { Cheng, Ho Kei and Tai, Yu-Wing and Tang, Chi-Keung } ,
booktitle = { CVPR } ,
year = { 2021 }
}我们在交互式演示中使用F-BRS:https://github.com/saic-vul/fbrs_interactive_segmentation
如果要引用数据集:
Bibtex
@inproceedings { shi2015hierarchicalECSSD ,
title = { Hierarchical image saliency detection on extended CSSD } ,
author = { Shi, Jianping and Yan, Qiong and Xu, Li and Jia, Jiaya } ,
booktitle = { TPAMI } ,
year = { 2015 } ,
}
@inproceedings { wang2017DUTS ,
title = { Learning to Detect Salient Objects with Image-level Supervision } ,
author = { Wang, Lijun and Lu, Huchuan and Wang, Yifan and Feng, Mengyang
and Wang, Dong, and Yin, Baocai and Ruan, Xiang } ,
booktitle = { CVPR } ,
year = { 2017 }
}
@inproceedings { FSS1000 ,
title = { FSS-1000: A 1000-Class Dataset for Few-Shot Segmentation } ,
author = { Li, Xiang and Wei, Tianhan and Chen, Yau Pun and Tai, Yu-Wing and Tang, Chi-Keung } ,
booktitle = { CVPR } ,
year = { 2020 }
}
@inproceedings { zeng2019towardsHRSOD ,
title = { Towards High-Resolution Salient Object Detection } ,
author = { Zeng, Yi and Zhang, Pingping and Zhang, Jianming and Lin, Zhe and Lu, Huchuan } ,
booktitle = { ICCV } ,
year = { 2019 }
}
@inproceedings { cheng2020cascadepsp ,
title = { {CascadePSP}: Toward Class-Agnostic and Very High-Resolution Segmentation via Global and Local Refinement } ,
author = { Cheng, Ho Kei and Chung, Jihoon and Tai, Yu-Wing and Tang, Chi-Keung } ,
booktitle = { CVPR } ,
year = { 2020 }
}
@inproceedings { xu2018youtubeVOS ,
title = { Youtube-vos: A large-scale video object segmentation benchmark } ,
author = { Xu, Ning and Yang, Linjie and Fan, Yuchen and Yue, Dingcheng and Liang, Yuchen and Yang, Jianchao and Huang, Thomas } ,
booktitle = { ECCV } ,
year = { 2018 }
}
@inproceedings { perazzi2016benchmark ,
title = { A benchmark dataset and evaluation methodology for video object segmentation } ,
author = { Perazzi, Federico and Pont-Tuset, Jordi and McWilliams, Brian and Van Gool, Luc and Gross, Markus and Sorkine-Hornung, Alexander } ,
booktitle = { CVPR } ,
year = { 2016 }
}
@inproceedings { denninger2019blenderproc ,
title = { BlenderProc } ,
author = { Denninger, Maximilian and Sundermeyer, Martin and Winkelbauer, Dominik and Zidan, Youssef and Olefir, Dmitry and Elbadrawy, Mohamad and Lodhi, Ahsan and Katam, Harinandan } ,
booktitle = { arXiv:1911.01911 } ,
year = { 2019 }
}
@inproceedings { shapenet2015 ,
title = { {ShapeNet: An Information-Rich 3D Model Repository} } ,
author = { Chang, Angel Xuan and Funkhouser, Thomas and Guibas, Leonidas and Hanrahan, Pat and Huang, Qixing and Li, Zimo and Savarese, Silvio and Savva, Manolis and Song, Shuran and Su, Hao and Xiao, Jianxiong and Yi, Li and Yu, Fisher } ,
booktitle = { arXiv:1512.03012 } ,
year = { 2015 }
}