Ho Kei Cheng, Alexander Schwing
University of Illinois Urbana-Champaign
[arXiv] [PDF] [Project Page]
Handling long-term occlusion:
Very long video; masked layer insertion:
Source: https://www.youtube.com/watch?v=q5xr0f4a0iu
Out-of-domain case:
Source: Kaguya-sama: Love Is War, Ep. 3; A-1 Pictures
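We frame video object segmentation (VOS), first and foremost, as a memory problem. Prior works mostly use a single type of feature memory. This can take the form of network weights (i.e., online learning), last-frame segmentation (e.g., MaskTrack), spatial hidden representations (e.g., Conv-RNN-based methods), spatial-attentional features (e.g., STM, STCN, AOT), or some form of long-term compact features (e.g., AFB-URR).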
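Methods with a short memory span are not robust to changes, while methods with a large memory bank suffer a catastrophic increase in computation and GPU memory usage. Attempts at long-term attentional VOS, such as AFB-URR, compress features eagerly as soon as they are generated, which leads to a loss of feature resolution.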
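To make the memory-growth concern concrete, the following is a rough back-of-the-envelope sketch of how a memory bank scales if features from every memorized frame are kept. The stride, channel sizes, and resolution are illustrative assumptions, not values taken from this repository.

```python
def memory_bank_bytes(num_frames, height, width, stride=16,
                      key_dim=64, value_dim=512, bytes_per_value=4):
    """Bytes needed to keep key and value features for every memorized frame.

    All defaults (stride, key_dim, value_dim, float32 storage) are
    hypothetical values chosen only for illustration.
    """
    tokens_per_frame = (height // stride) * (width // stride)
    per_frame = tokens_per_frame * (key_dim + value_dim) * bytes_per_value
    return num_frames * per_frame


def affinity_entries(num_frames, height, width, stride=16):
    """Entries in the attention matrix between one query frame and the bank."""
    tokens = (height // stride) * (width // stride)
    return num_frames * tokens * tokens


if __name__ == "__main__":
    # Storage and attention cost both grow linearly with the number of
    # memorized frames, so long videos quickly become expensive.
    for frames in (10, 100, 1_000, 10_000):
        gigabytes = memory_bank_bytes(frames, 480, 854) / 1e9
        entries = affinity_entries(frames, 480, 854)
        print(f"{frames:>6} frames: {gigabytes:8.2f} GB, {entries:.2e} affinity entries")
```

Under these assumed numbers, both the stored features and the per-frame attention matrix grow linearly with the number of memorized frames, which is why an unbounded memory bank does not scale to very long videos.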
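Our method is inspired by the Atkinson-Shiffrin human memory model, which has a sensory memory, a working memory, and a long-term memory. These memory stores operate at different temporal scales and complement each other in our memory reading mechanism. The method performs well on both short-term and long-term video datasets and handles videos with more than 10,000 frames with ease.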
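The idea of three stores with different temporal scales can be illustrated with a toy data structure. This is only a sketch under stated assumptions: the class name, the capacities, the "memorize every N frames" rule, and the subsampling used for consolidation are all hypothetical placeholders and do not reproduce how XMem actually updates or consolidates its memory.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class ToyThreeStoreMemory:
    """Toy three-store memory (hypothetical; not the XMem implementation)."""
    working_capacity: int = 4      # assumed limit on working-memory entries
    long_term_capacity: int = 8    # assumed limit on long-term entries
    sensory: object = None         # overwritten on every frame
    working: deque = field(default_factory=deque)
    long_term: list = field(default_factory=list)

    def update(self, frame_idx, feature, memorize_every=2):
        # Sensory memory: always holds only the most recent frame.
        self.sensory = (frame_idx, feature)
        # Working memory: updated only every few frames.
        if frame_idx % memorize_every == 0:
            self.working.append((frame_idx, feature))
        if len(self.working) > self.working_capacity:
            self._consolidate()

    def _consolidate(self):
        # Placeholder policy: keep the newest working entry and move every
        # other older entry into a bounded long-term store.
        *old, newest = self.working
        self.long_term.extend(old[::2])
        self.long_term = self.long_term[-self.long_term_capacity:]
        self.working = deque([newest])

    def stored_frames(self):
        """Frame indices currently held by each store."""
        return {
            "sensory": None if self.sensory is None else self.sensory[0],
            "working": [idx for idx, _ in self.working],
            "long_term": [idx for idx, _ in self.long_term],
        }


if __name__ == "__main__":
    memory = ToyThreeStoreMemory()
    for t in range(10_000):
        memory.update(t, feature=f"feat{t}")
    print(memory.stored_frames())
```

The point of the sketch is that the total number of stored entries stays bounded no matter how many frames are processed, while the three stores still cover the most recent frame, the recent past, and the distant past respectively.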
First, install the required Python packages and datasets by following get_started.md.
For training, see TRAINING.md.
For inference, see INFERENCE.md.
If you find this repo useful, please cite our paper!
@inproceedings { cheng2022xmem ,
title = { {XMem}: Long-Term Video Object Segmentation with an Atkinson-Shiffrin Memory Model } ,
author = { Cheng, Ho Kei and Schwing, Alexander G. } ,
booktitle = { ECCV } ,
year = { 2022 }
}
Related projects that this paper is developed upon:
@inproceedings { cheng2021stcn ,
title = { Rethinking Space-Time Networks with Improved Memory Coverage for Efficient Video Object Segmentation } ,
author = { Cheng, Ho Kei and Tai, Yu-Wing and Tang, Chi-Keung } ,
booktitle = { NeurIPS } ,
year = { 2021 }
}
@inproceedings { cheng2021mivos ,
title = { Modular Interactive Video Object Segmentation: Interaction-to-Mask, Propagation and Difference-Aware Fusion } ,
author = { Cheng, Ho Kei and Tai, Yu-Wing and Tang, Chi-Keung } ,
booktitle = { CVPR } ,
year = { 2021 }
}
We use f-BRS in the interactive demo: https://github.com/saic-vul/fbrs_interactive_segmentation
If you want to cite the datasets:
BibTeX
@article { shi2015hierarchicalECSSD ,
title = { Hierarchical Image Saliency Detection on Extended CSSD } ,
author = { Shi, Jianping and Yan, Qiong and Xu, Li and Jia, Jiaya } ,
journal = { TPAMI } ,
year = { 2015 }
}
@inproceedings { wang2017DUTS ,
title = { Learning to Detect Salient Objects with Image-level Supervision } ,
author = { Wang, Lijun and Lu, Huchuan and Wang, Yifan and Feng, Mengyang
and Wang, Dong and Yin, Baocai and Ruan, Xiang } ,
booktitle = { CVPR } ,
year = { 2017 }
}
@inproceedings { FSS1000 ,
title = { FSS-1000: A 1000-Class Dataset for Few-Shot Segmentation } ,
author = { Li, Xiang and Wei, Tianhan and Chen, Yau Pun and Tai, Yu-Wing and Tang, Chi-Keung } ,
booktitle = { CVPR } ,
year = { 2020 }
}
@inproceedings { zeng2019towardsHRSOD ,
title = { Towards High-Resolution Salient Object Detection } ,
author = { Zeng, Yi and Zhang, Pingping and Zhang, Jianming and Lin, Zhe and Lu, Huchuan } ,
booktitle = { ICCV } ,
year = { 2019 }
}
@inproceedings { cheng2020cascadepsp ,
title = { {CascadePSP}: Toward Class-Agnostic and Very High-Resolution Segmentation via Global and Local Refinement } ,
author = { Cheng, Ho Kei and Chung, Jihoon and Tai, Yu-Wing and Tang, Chi-Keung } ,
booktitle = { CVPR } ,
year = { 2020 }
}
@inproceedings { xu2018youtubeVOS ,
title = { {YouTube-VOS}: A Large-Scale Video Object Segmentation Benchmark } ,
author = { Xu, Ning and Yang, Linjie and Fan, Yuchen and Yue, Dingcheng and Liang, Yuchen and Yang, Jianchao and Huang, Thomas } ,
booktitle = { ECCV } ,
year = { 2018 }
}
@inproceedings { perazzi2016benchmark ,
title = { A benchmark dataset and evaluation methodology for video object segmentation } ,
author = { Perazzi, Federico and Pont-Tuset, Jordi and McWilliams, Brian and Van Gool, Luc and Gross, Markus and Sorkine-Hornung, Alexander } ,
booktitle = { CVPR } ,
year = { 2016 }
}
@article { denninger2019blenderproc ,
title = { BlenderProc } ,
author = { Denninger, Maximilian and Sundermeyer, Martin and Winkelbauer, Dominik and Zidan, Youssef and Olefir, Dmitry and Elbadrawy, Mohamad and Lodhi, Ahsan and Katam, Harinandan } ,
journal = { arXiv:1911.01911 } ,
year = { 2019 }
}
@article { shapenet2015 ,
title = { {ShapeNet: An Information-Rich 3D Model Repository} } ,
author = { Chang, Angel Xuan and Funkhouser, Thomas and Guibas, Leonidas and Hanrahan, Pat and Huang, Qixing and Li, Zimo and Savarese, Silvio and Savva, Manolis and Song, Shuran and Su, Hao and Xiao, Jianxiong and Yi, Li and Yu, Fisher } ,
journal = { arXiv:1512.03012 } ,
year = { 2015 }
}