semantic segmentation

Python

v0.2.6

Easy-to-use and customizable SOTA semantic segmentation models with abundant datasets in PyTorch.

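A major rework is coming! Stay tuned...

A lot has changed since 2022; nowadays there are even open-world segmentation models (Segment Anything). However, traditional segmentation models are still in demand for high-accuracy and custom use cases. This repo will be updated for the new PyTorch version, with updated models and documentation on how to use it with custom datasets.

Expected release date: May 2024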

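Planned features:

Rework of the whole training pipeline
Baseline pre-trained models
New and updated ideas
Easy integration with SOTA backbone models (tutorial)
Custom dataset tutorial
Distributed training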

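Current features to be discarded:

The number of datasets provided will be reduced. Instead, tutorials for representative datasets will remain.
The number of models provided will be reduced. Instead, valuable tricks and modules will remain and can be easily integrated with any model.
Augmentations will be replaced with the official torchvision v2 transforms.
Conversion to other frameworks and inference with them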

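Features

Applicable to the following tasks:
- Scene parsing
- Human parsing
- Face parsing
- Medical image segmentation (coming soon)
20+ datasets
15+ SOTA backbones
10+ SOTA semantic segmentation models
PyTorch, ONNX, TFLite, and OpenVINO export and inference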

Model Zoo

Supported backbones:

ResNet (CVPR 2016)
ResNetD (ArXiv 2018)
MobileNetV2 (CVPR 2018)
MobileNetV3 (ICCV 2019)
MiT (NeurIPS 2021)
ResT (NeurIPS 2021)
MicroNet (ICCV 2021)
ResNet+ (ArXiv 2021)
PVTv2 (CVMJ 2022)
PoolFormer (CVPR 2022)
ConvNeXt (CVPR 2022)
UniFormer (ArXiv 2022)
VAN (ArXiv 2022)
DaViT (ArXiv 2022)

Supported heads/methods:

FCN (CVPR 2015)
UPerNet (ECCV 2018)
BiSeNetv1 (ECCV 2018)
FPN (CVPR 2019)
SFNet (ECCV 2020)
SegFormer (NeurIPS 2021)
FaPN (ICCV 2021)
CondNet (IEEE SPL 2021)
Light-Ham (ICLR 2021)
Lawin (ArXiv 2022)
TopFormer (CVPR 2022)

Supported standalone models:

BiSeNetv2 (IJCV 2021)
DDRNet (ArXiv 2021)

Supported modules:

PPM (CVPR 2017)
PSA (ArXiv 2021)

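See Models for benchmarks and available pre-trained models.

Check Backbones for the supported backbones.

Note: Most of the methods do not have pre-trained models. Combining different models with pre-trained weights in one repository is very difficult, and resources for re-training them myself are limited.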

Supported Datasets

Scene parsing:

ADE20K
CityScapes
COCO-Stuff
CamVid
PASCAL-Context
Mapillary Vistas
SUN RGB-D

Human parsing:

MHPv2
MHPv1
LIP
CCIHP
CIHP
ATR

Face parsing:

HELEN
LaPa
iBugMask
CelebAMaskHQ
FaceSynthetics

Others:

SUIM

For more details and dataset preparation, see Datasets.

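Available Augmentations

Check the accompanying notebook to try out the augmentation effects.

Pixel-level transforms:

ColorJitter (brightness, contrast, saturation, hue)
Gamma, Sharpness, AutoContrast, Equalize, Posterize
GaussianBlur, Grayscale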

Spatial-level transforms:

Affine, RandomRotation
HorizontalFlip, VerticalFlip
CenterCrop, RandomCrop
Pad, ResizePad, Resize
RandomResizedCrop
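
In segmentation, a spatial-level transform has to be applied identically to the image and its label mask, whereas a pixel-level transform touches only the image. The following is a minimal pure-Python sketch of that idea, not the repo's implementation: the helper names `hflip_pair` and `random_crop_pair` and the nested-list image representation are assumptions made for illustration.

```python
import random


def hflip_pair(image, mask):
    """Flip image and mask left-right together so labels stay aligned."""
    return [row[::-1] for row in image], [row[::-1] for row in mask]


def random_crop_pair(image, mask, size, rng):
    """Crop the same randomly placed (height, width) window from both grids."""
    crop_h, crop_w = size
    height, width = len(image), len(image[0])
    top = rng.randint(0, height - crop_h)    # randint is inclusive on both ends
    left = rng.randint(0, width - crop_w)

    def window(grid):
        return [row[left:left + crop_w] for row in grid[top:top + crop_h]]

    return window(image), window(mask)


# Toy 3x4 "image" (grey values) and its label mask (class ids).
image = [[10, 20, 30, 40], [50, 60, 70, 80], [90, 100, 110, 120]]
mask = [[0, 0, 1, 1], [0, 2, 2, 1], [3, 3, 2, 1]]

flipped_image, flipped_mask = hflip_pair(image, mask)
cropped_image, cropped_mask = random_crop_pair(image, mask, (2, 2), random.Random(0))
```

Passing an explicit `random.Random` instance keeps the crop reproducible, which helps when checking that image and mask stay aligned.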

Usage

Installation

python >= 3.6
torch >= 1.8.1
torchvision >= 0.9.1

Then, clone the repo and install the project with:

$ git clone https://github.com/sithu31296/semantic-segmentation
$ cd semantic-segmentation
$ pip install -e .

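Configuration

Create a configuration file in configs. A sample configuration for the ADE20K dataset is provided at configs/ade20k.yaml. Then edit the fields as needed. This configuration file is required by all training, evaluation, and prediction scripts.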

Training

To train with a single GPU:

$ python tools/train.py --cfg configs/CONFIG_FILE.yaml

To train with multiple GPUs, set the DDP field in the config file to true and run as follows:

$ python -m torch.distributed.launch --nproc_per_node=2 --use_env tools/train.py --cfg configs/<CONFIG_FILE_NAME>.yaml
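
With `--use_env`, the launcher passes each process its rank through environment variables (`RANK`, `LOCAL_RANK`, `WORLD_SIZE`) instead of a `--local_rank` command-line argument. A minimal sketch of how a training script might read them is shown here; the `DistInfo` class and the `read_dist_env` helper are hypothetical names and are not taken from `tools/train.py`.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class DistInfo:
    rank: int        # global index of this process
    local_rank: int  # index of this process on its node (usually the GPU id)
    world_size: int  # total number of processes

    @property
    def is_main(self):
        return self.rank == 0


def read_dist_env(environ=os.environ):
    """Read launcher variables, falling back to a single-process setup."""
    return DistInfo(
        rank=int(environ.get("RANK", 0)),
        local_rank=int(environ.get("LOCAL_RANK", 0)),
        world_size=int(environ.get("WORLD_SIZE", 1)),
    )


# Simulate what the second of two launched processes would see.
info = read_dist_env({"RANK": "1", "LOCAL_RANK": "1", "WORLD_SIZE": "2"})
```

Typically only the main process (`is_main`) writes logs and checkpoints, while every process uses `local_rank` to pick its GPU.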

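Evaluation

Make sure to set MODEL_PATH in the configuration file to your trained model directory.

$ python tools/val.py --cfg configs/<CONFIG_FILE_NAME>.yaml

To evaluate with multi-scale and flip, change the ENABLE field in MSF to true and run the same command as above.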
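
Multi-scale and flip evaluation generally means running the model on resized and horizontally mirrored copies of the input, mapping each prediction back to the original geometry, and averaging the results. The sketch below illustrates that procedure in pure Python on a single score map with nearest-neighbour resizing; real implementations usually average per-class logits on tensors with bilinear interpolation. The names `predict_msf`, `resize_nearest`, and `toy_model` are hypothetical and this is not the code in `tools/val.py`.

```python
def resize_nearest(grid, out_h, out_w):
    """Nearest-neighbour resize of a 2D list to (out_h, out_w)."""
    in_h, in_w = len(grid), len(grid[0])
    return [[grid[r * in_h // out_h][c * in_w // out_w] for c in range(out_w)]
            for r in range(out_h)]


def hflip(grid):
    """Mirror a 2D list left-right."""
    return [row[::-1] for row in grid]


def predict_msf(model, image, scales, flip=True):
    """Average model scores over several scales and, optionally, a flip."""
    height, width = len(image), len(image[0])
    total = [[0.0] * width for _ in range(height)]
    count = 0
    for scale in scales:
        scaled_h = max(1, round(height * scale))
        scaled_w = max(1, round(width * scale))
        scaled = resize_nearest(image, scaled_h, scaled_w)
        variants = [(scaled, False)]
        if flip:
            variants.append((hflip(scaled), True))
        for inputs, flipped in variants:
            scores = model(inputs)
            if flipped:
                scores = hflip(scores)  # undo the flip before averaging
            scores = resize_nearest(scores, height, width)
            for r in range(height):
                for c in range(width):
                    total[r][c] += scores[r][c]
            count += 1
    return [[value / count for value in row] for row in total]


def toy_model(grid):
    """Stand-in for a network: scores each pixel independently."""
    return [[pixel / 4 for pixel in row] for row in grid]


image = [[0, 1, 2, 3], [4, 3, 2, 1]]
averaged = predict_msf(toy_model, image, scales=(1.0, 2.0), flip=True)
```

The scale values here are illustrative only; the scales actually used are whatever the MSF section of the config file specifies.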

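Inference

To run inference, edit the following parameters of the config file:

Change MODEL >> NAME and BACKBONE to your desired pretrained model.
Change DATASET >> NAME to the dataset the pretrained model was trained on.
Set TEST >> MODEL_PATH to the pretrained weights of the model to test.
Change TEST >> FILE to the file or image folder path you want to test.
Test results will be saved in SAVE_DIR.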

## example using ade20k pretrained models
$ python tools/infer.py --cfg configs/ade20k.yaml
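
The config edits for inference can be pictured as overrides on a nested mapping, where `SECTION >> FIELD` names a field inside a section. The sketch below is illustrative only: the `with_overrides` helper is hypothetical, the nesting (for example `SAVE_DIR` at the top level) is an assumption, and all values are placeholders; only the field names come from the steps listed in this section.

```python
import copy


def with_overrides(cfg, overrides):
    """Return a copy of cfg with 'SECTION >> FIELD' paths replaced."""
    updated = copy.deepcopy(cfg)
    for path, value in overrides.items():
        *parents, leaf = [part.strip() for part in path.split(">>")]
        node = updated
        for key in parents:
            node = node[key]  # a KeyError here means the section is missing
        if leaf not in node:
            raise KeyError(f"unknown config field: {path}")
        node[leaf] = value
    return updated


# Placeholder values; the layout is assumed, not copied from the repo.
base_cfg = {
    "SAVE_DIR": "output",
    "MODEL": {"NAME": "SegFormer", "BACKBONE": "MiT-B2"},
    "DATASET": {"NAME": "ADE20K"},
    "TEST": {"MODEL_PATH": "", "FILE": ""},
}

infer_cfg = with_overrides(base_cfg, {
    "TEST >> MODEL_PATH": "path/to/weights.pth",
    "TEST >> FILE": "path/to/images",
})
```

Rejecting unknown fields catches typos in a field name early, before a long-running script is started with a silently ignored setting.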

Example test result (SegFormer-B2):

[Image: test_result]

Convert to other frameworks (ONNX, CoreML, OpenVINO, TFLite)

To convert to ONNX and CoreML, run:

$ python tools/export.py --cfg configs/<CONFIG_FILE_NAME>.yaml

To convert to OpenVINO and TFLite, see torch_optimize.

Inference (ONNX, OpenVINO, TFLite)

## ONNX Inference
$ python scripts/onnx_infer.py --model <ONNX_MODEL_PATH> --img-path <TEST_IMAGE_PATH>

## OpenVINO Inference
$ python scripts/openvino_infer.py --model <OpenVINO_MODEL_PATH> --img-path <TEST_IMAGE_PATH>

## TFLite Inference
$ python scripts/tflite_infer.py --model <TFLite_MODEL_PATH> --img-path <TEST_IMAGE_PATH>

References

https://github.com/coincheung/bisenet
https://github.com/open-mmlab/mmsegementation
https://github.com/rwightman/pytorch-image-models

Citations

@article{xie2021segformer,
  title={SegFormer: Simple and Efficient Design for Semantic Segmentation with Transformers},
  author={Xie, Enze and Wang, Wenhai and Yu, Zhiding and Anandkumar, Anima and Alvarez, Jose M and Luo, Ping},
  journal={arXiv preprint arXiv:2105.15203},
  year={2021}
}

@misc{xiao2018unified,
  title={Unified Perceptual Parsing for Scene Understanding}, 
  author={Tete Xiao and Yingcheng Liu and Bolei Zhou and Yuning Jiang and Jian Sun},
  year={2018},
  eprint={1807.10221},
  archivePrefix={arXiv},
  primaryClass={cs.CV}
}

@article{hong2021deep,
  title={Deep Dual-resolution Networks for Real-time and Accurate Semantic Segmentation of Road Scenes},
  author={Hong, Yuanduo and Pan, Huihui and Sun, Weichao and Jia, Yisong},
  journal={arXiv preprint arXiv:2101.06085},
  year={2021}
}

@misc{zhang2021rest,
  title={ResT: An Efficient Transformer for Visual Recognition}, 
  author={Qinglong Zhang and Yubin Yang},
  year={2021},
  eprint={2105.13677},
  archivePrefix={arXiv},
  primaryClass={cs.CV}
}

@misc{huang2021fapn,
  title={FaPN: Feature-aligned Pyramid Network for Dense Image Prediction}, 
  author={Shihua Huang and Zhichao Lu and Ran Cheng and Cheng He},
  year={2021},
  eprint={2108.07058},
  archivePrefix={arXiv},
  primaryClass={cs.CV}
}

@misc{wang2021pvtv2,
  title={PVTv2: Improved Baselines with Pyramid Vision Transformer}, 
  author={Wenhai Wang and Enze Xie and Xiang Li and Deng-Ping Fan and Kaitao Song and Ding Liang and Tong Lu and Ping Luo and Ling Shao},
  year={2021},
  eprint={2106.13797},
  archivePrefix={arXiv},
  primaryClass={cs.CV}
}

@article{Liu2021PSA,
  title={Polarized Self-Attention: Towards High-quality Pixel-wise Regression},
  author={Huajun Liu and Fuqiang Liu and Xinyi Fan and Dong Huang},
  journal={arXiv preprint arXiv:2107.00782},
  year={2021}
}

@misc{chao2019hardnet,
  title={HarDNet: A Low Memory Traffic Network}, 
  author={Ping Chao and Chao-Yang Kao and Yu-Shan Ruan and Chien-Hsiang Huang and Youn-Long Lin},
  year={2019},
  eprint={1909.00948},
  archivePrefix={arXiv},
  primaryClass={cs.CV}
}

@inproceedings{sfnet,
  title={Semantic Flow for Fast and Accurate Scene Parsing},
  author={Li, Xiangtai and You, Ansheng and Zhu, Zhen and Zhao, Houlong and Yang, Maoke and Yang, Kuiyuan and Tong, Yunhai},
  booktitle={ECCV},
  year={2020}
}

@article{Li2020SRNet,
  title={Towards Efficient Scene Understanding via Squeeze Reasoning},
  author={Xiangtai Li and Xia Li and Ansheng You and Li Zhang and Guang-Liang Cheng and Kuiyuan Yang and Y. Tong and Zhouchen Lin},
  journal={ArXiv},
  year={2020},
  volume={abs/2011.03308}
}

@ARTICLE{Yucondnet21,
  author={Yu, Changqian and Shao, Yuanjie and Gao, Changxin and Sang, Nong},
  journal={IEEE Signal Processing Letters}, 
  title={CondNet: Conditional Classifier for Scene Segmentation}, 
  year={2021},
  volume={28},
  number={},
  pages={758-762},
  doi={10.1109/LSP.2021.3070472}
}

@misc{yan2022lawin,
  title={Lawin Transformer: Improving Semantic Segmentation Transformer with Multi-Scale Representations via Large Window Attention}, 
  author={Haotian Yan and Chuang Zhang and Ming Wu},
  year={2022},
  eprint={2201.01615},
  archivePrefix={arXiv},
  primaryClass={cs.CV}
}

@misc{yu2021metaformer,
  title={MetaFormer is Actually What You Need for Vision}, 
  author={Weihao Yu and Mi Luo and Pan Zhou and Chenyang Si and Yichen Zhou and Xinchao Wang and Jiashi Feng and Shuicheng Yan},
  year={2021},
  eprint={2111.11418},
  archivePrefix={arXiv},
  primaryClass={cs.CV}
}

@misc{wightman2021resnet,
  title={ResNet strikes back: An improved training procedure in timm}, 
  author={Ross Wightman and Hugo Touvron and Hervé Jégou},
  year={2021},
  eprint={2110.00476},
  archivePrefix={arXiv},
  primaryClass={cs.CV}
}

@misc{liu2022convnet,
  title={A ConvNet for the 2020s}, 
  author={Zhuang Liu and Hanzi Mao and Chao-Yuan Wu and Christoph Feichtenhofer and Trevor Darrell and Saining Xie},
  year={2022},
  eprint={2201.03545},
  archivePrefix={arXiv},
  primaryClass={cs.CV}
}

@misc{li2022uniformer,
  title={UniFormer: Unifying Convolution and Self-attention for Visual Recognition}, 
  author={Kunchang Li and Yali Wang and Junhao Zhang and Peng Gao and Guanglu Song and Yu Liu and Hongsheng Li and Yu Qiao},
  year={2022},
  eprint={2201.09450},
  archivePrefix={arXiv},
  primaryClass={cs.CV}
}
