Ho Kei Cheng, Alexander Schwing
University of Illinois Urbana-Champaign
[arXiv] [PDF] [Project Page]
Handling long-term occlusion:
Very long video; masked layer insertion:
Source: https://www.youtube.com/watch?v=Q5XR0F4A0IU
Out-of-domain case:
Source: Kaguya-sama: Love Is War, Ep. 3; A-1 Pictures
We frame Video Object Segmentation (VOS), first and foremost, as a memory problem. Prior works mostly use a single type of feature memory. This can take the form of network weights (i.e., online learning), last-frame segmentation (e.g., MaskTrack), spatial hidden representations (e.g., Conv-RNN-based methods), spatial-attentional features (e.g., STM, STCN, AOT), or some sort of long-term compact features (e.g., AFB-URR).
Methods with a short memory span are not robust to changes, while those with a large memory bank suffer a catastrophic increase in computation and GPU memory usage. Attempts at long-term attentional VOS such as AFB-URR compress features eagerly, as soon as they are generated, which leads to a loss of feature resolution.
Our method is inspired by the Atkinson-Shiffrin human memory model, which has a sensory memory, a working memory, and a long-term memory. These memory stores have different temporal scales and complement each other in our memory reading mechanism. The method performs well on both short-term and long-term video datasets, handling videos with more than 10,000 frames with ease.
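As a rough illustration of why memory stores with different temporal scales keep usage bounded on long videos, consider the toy sketch below. It is not XMem's implementation: the class name, the update interval, the capacities, and the averaging-based consolidation are all assumptions made for illustration, and scalar floats stand in for feature maps.

```python
from collections import deque


class ToyThreeStoreMemory:
    """Toy three-store memory. All names and sizes are hypothetical."""

    def __init__(self, working_every=5, working_capacity=10, long_term_capacity=50):
        self.working_every = working_every        # add to working memory every N frames
        self.working_capacity = working_capacity  # consolidate once this is exceeded
        self.sensory = None                       # overwritten on every frame
        self.working = []                         # recent entries at full detail
        self.long_term = deque(maxlen=long_term_capacity)  # compact, bounded store

    def update(self, frame_idx, feature):
        # Shortest time scale: always reflects the latest frame.
        self.sensory = feature
        # Medium time scale: sample every few frames.
        if frame_idx % self.working_every == 0:
            self.working.append(feature)
        # Longest time scale: summarize the older half of working memory
        # into a single compact entry (assumed here to be a plain average).
        if len(self.working) > self.working_capacity:
            half = len(self.working) // 2
            old, self.working = self.working[:half], self.working[half:]
            self.long_term.append(sum(old) / len(old))

    def size(self):
        sensory_count = 0 if self.sensory is None else 1
        return sensory_count + len(self.working) + len(self.long_term)


memory = ToyThreeStoreMemory()
for t in range(10_000):
    memory.update(t, float(t))

# A single bank that keeps every 5th frame grows linearly with video length.
naive_bank_size = len(range(0, 10_000, 5))
print("toy three-store entries:", memory.size())
print("naive bank entries:", naive_bank_size)
```

In this toy setup the three-store size never exceeds one sensory entry plus the working and long-term capacities, whereas the single bank keeps growing with the number of frames.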
First, install the required Python packages and datasets following GETTING_STARTED.md.
For training, see TRAINING.md.
For inference, see INFERENCE.md.
Please cite our paper if you find this repo useful!
@inproceedings { cheng2022xmem ,
title = { {XMem}: Long-Term Video Object Segmentation with an Atkinson-Shiffrin Memory Model } ,
author = { Cheng, Ho Kei and Schwing, Alexander G. } ,
booktitle = { ECCV } ,
year = { 2022 }
}
Related projects that this paper builds upon:
@inproceedings { cheng2021stcn ,
title = { Rethinking Space-Time Networks with Improved Memory Coverage for Efficient Video Object Segmentation } ,
author = { Cheng, Ho Kei and Tai, Yu-Wing and Tang, Chi-Keung } ,
booktitle = { NeurIPS } ,
year = { 2021 }
}
@inproceedings { cheng2021mivos ,
title = { Modular Interactive Video Object Segmentation: Interaction-to-Mask, Propagation and Difference-Aware Fusion } ,
author = { Cheng, Ho Kei and Tai, Yu-Wing and Tang, Chi-Keung } ,
booktitle = { CVPR } ,
year = { 2021 }
}
We use f-BRS in the interactive demo: https://github.com/saic-vul/fbrs_interactive_segmentation
And if you want to cite the datasets:
BibTeX
@article { shi2015hierarchicalECSSD ,
title = { Hierarchical image saliency detection on extended CSSD } ,
author = { Shi, Jianping and Yan, Qiong and Xu, Li and Jia, Jiaya } ,
journal = { TPAMI } ,
year = { 2015 }
}
@inproceedings { wang2017DUTS ,
title = { Learning to Detect Salient Objects with Image-level Supervision } ,
author = { Wang, Lijun and Lu, Huchuan and Wang, Yifan and Feng, Mengyang and Wang, Dong and Yin, Baocai and Ruan, Xiang } ,
booktitle = { CVPR } ,
year = { 2017 }
}
@inproceedings { FSS1000 ,
title = { FSS-1000: A 1000-Class Dataset for Few-Shot Segmentation } ,
author = { Li, Xiang and Wei, Tianhan and Chen, Yau Pun and Tai, Yu-Wing and Tang, Chi-Keung } ,
booktitle = { CVPR } ,
year = { 2020 }
}
@inproceedings { zeng2019towardsHRSOD ,
title = { Towards High-Resolution Salient Object Detection } ,
author = { Zeng, Yi and Zhang, Pingping and Zhang, Jianming and Lin, Zhe and Lu, Huchuan } ,
booktitle = { ICCV } ,
year = { 2019 }
}
@inproceedings { cheng2020cascadepsp ,
title = { {CascadePSP}: Toward Class-Agnostic and Very High-Resolution Segmentation via Global and Local Refinement } ,
author = { Cheng, Ho Kei and Chung, Jihoon and Tai, Yu-Wing and Tang, Chi-Keung } ,
booktitle = { CVPR } ,
year = { 2020 }
}
@inproceedings { xu2018youtubeVOS ,
title = { {YouTube-VOS}: A Large-Scale Video Object Segmentation Benchmark } ,
author = { Xu, Ning and Yang, Linjie and Fan, Yuchen and Yue, Dingcheng and Liang, Yuchen and Yang, Jianchao and Huang, Thomas } ,
booktitle = { ECCV } ,
year = { 2018 }
}
@inproceedings { perazzi2016benchmark ,
title = { A benchmark dataset and evaluation methodology for video object segmentation } ,
author = { Perazzi, Federico and Pont-Tuset, Jordi and McWilliams, Brian and Van Gool, Luc and Gross, Markus and Sorkine-Hornung, Alexander } ,
booktitle = { CVPR } ,
year = { 2016 }
}
@inproceedings { denninger2019blenderproc ,
title = { BlenderProc } ,
author = { Denninger, Maximilian and Sundermeyer, Martin and Winkelbauer, Dominik and Zidan, Youssef and Olefir, Dmitry and Elbadrawy, Mohamad and Lodhi, Ahsan and Katam, Harinandan } ,
booktitle = { arXiv:1911.01911 } ,
year = { 2019 }
}
@inproceedings { shapenet2015 ,
title = { {ShapeNet: An Information-Rich 3D Model Repository} } ,
author = { Chang, Angel Xuan and Funkhouser, Thomas and Guibas, Leonidas and Hanrahan, Pat and Huang, Qixing and Li, Zimo and Savarese, Silvio and Savva, Manolis and Song, Shuran and Su, Hao and Xiao, Jianxiong and Yi, Li and Yu, Fisher } ,
booktitle = { arXiv:1512.03012 } ,
year = { 2015 }
}
Contact: [email protected]