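Welcome to the Anakin GitHub.
Anakin is a cross-platform, high-performance inference engine. It was originally developed by Baidu engineers and is deployed at large scale in industrial products.
Please refer to our release announcement to track the latest features of Anakin.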
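Flexibility
Anakin is a cross-platform, high-performance inference engine that supports a wide range of neural network architectures and different hardware platforms. It is easy to run Anakin on GPU / x86 / ARM platforms.
Anakin has been integrated with NVIDIA TensorRT, and the integration API is open source. Developers can call the API directly or modify it as needed, which gives more flexibility for different development requirements.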
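High performance
To get the most out of the hardware, we optimized forward prediction at several levels.
Automatic graph fusion. For a given algorithm, the goal of all performance optimization is to keep the ALU as busy as possible. Operator fusion effectively reduces memory access and keeps the ALU busy.
Memory reuse. Forward prediction is a one-way computation. We reuse memory between the inputs and outputs of different operators, which reduces the overall memory overhead.
Assembly-level optimization. Saber, the underlying DNN library of Anakin, is deeply optimized at the assembly level.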
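The first two ideas can be illustrated with a small Python sketch. This is not Anakin's or Saber's implementation; the function names `fuse_elementwise` and `assign_buffers`, the toy operators, and the tensor lifetimes are all hypothetical and chosen only to show the principle.

```python
from typing import Callable, Dict, List, Tuple


def fuse_elementwise(ops: List[Callable[[float], float]]) -> Callable[[List[float]], List[float]]:
    """Compose element-wise ops so the data is traversed once, not once per op."""
    def fused(xs: List[float]) -> List[float]:
        out = []
        for x in xs:
            for op in ops:  # apply every op while the value is still "in register"
                x = op(x)
            out.append(x)
        return out
    return fused


def assign_buffers(lifetimes: List[Tuple[str, int, int]]) -> Dict[str, int]:
    """Greedy buffer sharing.

    Each lifetime is (tensor name, step that produces it, last step that reads it).
    A tensor may take over a buffer once the previous owner is no longer read.
    """
    assignment: Dict[str, int] = {}
    busy_until: List[int] = []  # busy_until[b] = last step at which buffer b is read
    for name, start, end in sorted(lifetimes, key=lambda t: t[1]):
        for buf, last_use in enumerate(busy_until):
            if last_use < start:  # previous owner is dead, reuse its buffer
                busy_until[buf] = end
                assignment[name] = buf
                break
        else:  # no free buffer, allocate a new one
            busy_until.append(end)
            assignment[name] = len(busy_until) - 1
    return assignment


# Hypothetical scale -> bias -> ReLU chain, fused into a single pass.
fused_op = fuse_elementwise([lambda x: 2.0 * x, lambda x: x + 1.0, lambda x: max(x, 0.0)])
print(fused_op([-1.0, 0.5]))

# Hypothetical 4-operator chain: each output is read only by the next operator.
plan = assign_buffers([("t0", 0, 1), ("t1", 1, 2), ("t2", 2, 3), ("t3", 3, 4)])
print(plan)
```

In this toy chain, four intermediate tensors fit into two physical buffers because each tensor is dead as soon as the next operator has consumed it, which is the property a one-way forward pass provides.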
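CPU: Intel(R) Xeon(R) CPU 5117 @ 2.0GHz
GPU: Tesla P4
CUDA: CUDA8
cuDNN: v7
Latency (ms) and memory (MB)
Anakin's counterpart is the widely recognized high-performance inference engine NVIDIA TensorRT 5. For models that TensorRT 5 does not support, we use custom plugins.

| batch_size | RT latency FP32 (ms) | Anakin2 latency FP32 (ms) | RT memory (MB) | Anakin2 memory (MB) |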
|---|---|---|---|---|
| 1 | 8.52532 | 8.2387 | 1090.89 | 702 |
| 2 | 14.1209 | 13.8772 | 1056.02 | 768.76 |
| 4 | 24.4529 | 24.3391 | 1002.17 | 840.54 |
| 8 | 46.7956 | 46.3309 | 1098.98 | 935.61 |

| batch_size | RT latency FP32 (ms) | Anakin2 latency FP32 (ms) | RT latency INT8 (ms) | Anakin2 latency INT8 (ms) | RT memory FP32 (MB) | Anakin2 memory FP32 (MB) |
|---|---|---|---|---|---|---|
| 1 | 4.6447 | 3.0863 | 1.78892 | 1.61537 | 1134.88 | 311.25 |
| 2 | 6.69187 | 5.13995 | 2.71136 | 2.70022 | 1108.86 | 382 |
| 4 | 11.1943 | 9.20513 | 4.16771 | 4.77145 | 885.96 | 406.86 |
| 8 | 19.8769 | 17.1976 | 6.2798 | 8.68197 | 813.84 | 532.61 |

| batch_size | RT latency (ms) | Anakin2 latency (ms) | RT latency INT8 (ms) | Anakin2 latency INT8 (ms) | RT memory (MB) | Anakin2 memory (MB) |
|---|---|---|---|---|---|---|
| 1 | 9.98695 | 5.44947 | 2.81031 | 2.74399 | 1159.16 | 500.5 |
| 2 | 17.3489 | 8.85699 | 4.8641 | 4.69473 | 1158.73 | 492 |
| 4 | 20.6198 | 16.8214 | 7.11608 | 8.45324 | 1021.68 | 541.08 |
| 8 | 31.9653 | 33.5015 | 11.2403 | 15.4336 | 914.49 | 611.54 |
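
To make the comparison easier to read, the following sketch turns the last table above into a latency speedup and a memory ratio per batch size. The numbers are copied from the "RT latency", "Anakin2 latency", "RT memory" and "Anakin2 memory" columns; the names `ROWS` and `compare` are illustrative only.

```python
# (batch_size, TensorRT latency ms, Anakin2 latency ms, TensorRT memory MB, Anakin2 memory MB)
# Values copied from the non-INT8 latency columns and the memory columns of the table above.
ROWS = [
    (1, 9.98695, 5.44947, 1159.16, 500.5),
    (2, 17.3489, 8.85699, 1158.73, 492.0),
    (4, 20.6198, 16.8214, 1021.68, 541.08),
    (8, 31.9653, 33.5015, 914.49, 611.54),
]


def compare(rows):
    """Return {batch_size: (latency speedup of Anakin2 over TensorRT, Anakin2/TensorRT memory ratio)}."""
    result = {}
    for batch, rt_ms, anakin_ms, rt_mb, anakin_mb in rows:
        result[batch] = (rt_ms / anakin_ms, anakin_mb / rt_mb)
    return result


summary = compare(ROWS)
for batch, (speedup, mem_ratio) in summary.items():
    print(f"batch {batch}: speedup {speedup:.2f}x, memory {mem_ratio:.0%} of TensorRT")
```

A speedup above 1 means Anakin2 is faster. For this table the value falls below 1 at batch size 8, so the latency advantage shrinks as the batch grows while the memory ratio stays below 1 throughout.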
CPU: Intel(R) Xeon(R) CPU E5-2650 v4 @ 2.20GHz with HT, for FP32 tests
CPU: Intel(R) Xeon(R) Gold 6271 CPU @ 2.60GHz with HT, for INT8 tests
System: CentOS 6.3 with GCC 4.8.2, for the benchmark between Anakin and Intel Caffe
8 threads in parallel
Anakin's counterpart is Intel Caffe (1.1.6) with MKLML.

| net_name | batch_size | Anakin2 latency (2650 v4) FP32 (ms) | Caffe latency (2650 v4) FP32 (ms) | Anakin2 latency INT8 (6271) (ms) |
|---|---|---|---|---|
| RESNET50 | 1 | 20.6201 | 24.1369 | 3.20866 |
| RESNET50 | 2 | 39.2286 | 43.1096 | 5.44311 |
| RESNET50 | 4 | 77.1392 | 81.8814 | 9.93424 |
| RESNET50 | 8 | 152.941 | 158.321 | 19.5618 |
| VGG16 | 1 | 55.6132 | 70.532 | 15.3181 |
| VGG16 | 2 | 96.5034 | 131.451 | 22.5082 |
| VGG16 | 4 | 180.479 | 247.926 | 37.2974 |
| VGG16 | 8 | 346.619 | 485.44 | 67.6682 |
| Mobilenetv1 | 1 | 3.98104 | 5.42775 | 0.926546 |
| Mobilenetv1 | 2 | 7.27079 | 9.16058 | 1.35007 |
| Mobilenetv1 | 4 | 14.4029 | 16.2505 | 2.37271 |
| Mobilenetv1 | 8 | 29.1651 | 29.8381 | 3.75992 |
| VGG16_SSD | 1 | 125.948 | 143.412 | |
| VGG16_SSD | 2 | 247.242 | 266.22 | |
| VGG16_SSD | 4 | 488.377 | 510.978 | |
| VGG16_SSD | 8 | 972.762 | 995.407 | |
| Mobilenetv2 | 1 | 3.78504 | 23.0066 | |
| Mobilenetv2 | 2 | 7.24622 | 65.9301 | |
| Mobilenetv2 | 4 | 13.7638 | 85.3893 | |
| Mobilenetv2 | 8 | 28.4093 | 131.669 | |
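
Latency per batch can be converted into throughput, which is often easier to reason about when choosing a batch size. The sketch below does this for the ResNet50 rows of the table above; `images_per_second` and `RESNET50` are illustrative names. The FP32 and INT8 numbers were measured on different CPUs, so they should not be read as a pure FP32 versus INT8 comparison.

```python
def images_per_second(batch_size: int, latency_ms: float) -> float:
    """Throughput implied by the latency of one batch."""
    return batch_size * 1000.0 / latency_ms


# batch_size -> (Anakin2 FP32 ms on E5-2650 v4, Anakin2 INT8 ms on Gold 6271),
# copied from the ResNet50 rows of the table above.
RESNET50 = {
    1: (20.6201, 3.20866),
    2: (39.2286, 5.44311),
    4: (77.1392, 9.93424),
    8: (152.941, 19.5618),
}

throughput = {
    batch: (images_per_second(batch, fp32_ms), images_per_second(batch, int8_ms))
    for batch, (fp32_ms, int8_ms) in RESNET50.items()
}

for batch, (fp32_ips, int8_ips) in throughput.items():
    print(f"batch {batch}: FP32 {fp32_ips:.1f} img/s, INT8 {int8_ips:.1f} img/s")
```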
CPU: Kirin 980
CPU: Snapdragon 652
CPU: Snapdragon 855
CPU: RK3399
Anakin's counterpart is ncnn (20190320). In this benchmark we test ARMv7 and ARMv8 separately.
Latency (ms) of one batch

| Kirin 980 | Anakin FP32 | | | Anakin Int8 | | | NCNN FP32 | | | NCNN INT8 | | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| | 1 thread | 2 threads | 4 threads | 1 thread | 2 threads | 4 threads | 1 thread | 2 threads | 4 threads | 1 thread | 2 threads | 4 threads |
| Mobilenet_v1 | 34.172 | 19.369 | 12.723 | 37.588 | 20.692 | 13.280 | 45.420 | 24.220 | 16.730 | 50.560 | 27.820 | 20.010 |
| Mobilenet_v2 | 30.489 | 17.784 | 12.327 | 29.581 | 17.208 | 15.307 | 30.390 | 17.310 | 12.900 | |||
| Mobilenet_ssd | 71.609 | 37.477 | 28.952 | 88.220 | 70.070 | 66.430 | 103.700 | 85.160 | 85.320 | |||
| RESNET50 | 255.748 | 137.842 | 104.628 | 1299.480 | 695.830 | 498.010 | 243.360 | 131.100 | 89.800 | |||
| Shufflenetv1 | 11.544 | 8.931 | 7.027 | 12.810 | 9.390 | 8.030 | ||||||
| Shufflenetv2 | 11.687 | 7.899 | 5.321 | 20.402 | 11.529 | 9.061 | ||||||
| Squeezenet | 28.580 | 16.638 | 14.435 | |||||||||
| Googlenet | 93.917 | 52.742 | 40.301 | 130.875 | 72.522 | 54.204 |

| Snapdragon 855 | Anakin FP32 | | | Anakin Int8 | | | NCNN FP32 | | | NCNN INT8 | | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| | 1 thread | 2 threads | 4 threads | 1 thread | 2 threads | 4 threads | 1 thread | 2 threads | 4 threads | 1 thread | 2 threads | 4 threads |
| Mobilenet_v1 | 32.019 | 19.024 | 10.491 | 34.363 | 20.292 | 10.382 | 37.110 | 22.310 | 13.520 | 47.430 | 28.350 | 15.830 |
| Mobilenet_v2 | 28.533 | 17.455 | 10.433 | 24.487 | 15.182 | 9.133 | 25.060 | 15.970 | 11.250 | |||
| Mobilenet_ssd | 66.454 | 41.397 | 23.639 | 101.560 | 69.380 | 43.930 | 136.420 | 91.010 | 47.490 | |||
| RESNET50 | 201.362 | 132.133 | 78.300 | 1141.290 | 724.090 | 385.990 | 229.020 | 138.450 | 82.060 | |||
| Shufflenetv1 | 10.153 | 7.101 | 5.327 | 11.610 | 8.020 | 5.870 | ||||||
| Shufflenetv2 | 10.868 | 6.713 | 4.526 | 17.306 | 10.987 | 6.788 | ||||||
| Squeezenet | 25.880 | 16.134 | 9.697 | |||||||||
| Googlenet | 85.774 | 54.518 | 34.025 | 118.120 | 73.686 | 41.865 |

| Snapdragon 652 | Anakin FP32 | | | Anakin Int8 | | | NCNN FP32 | | | NCNN INT8 | | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| | 1 thread | 2 threads | 4 threads | 1 thread | 2 threads | 4 threads | 1 thread | 2 threads | 4 threads | 1 thread | 2 threads | 4 threads |
| Mobilenet_v1 | 109.994 | 54.937 | 33.174 | 83.887 | 43.639 | 24.665 | 123.320 | 122.670 | 65.100 | 128.800 | 154.370 | 125.570 |
| Mobilenet_v2 | 80.712 | 46.314 | 30.874 | 69.340 | 43.590 | 31.864 | 89.920 | 90.900 | 55.320 | |||
| Mobilenet_ssd | 246.459 | 121.684 | 134.019 | 248.190 | 138.170 | 142.350 | 247.020 | 145.080 | 211.000 | |||
| RESNET50 | 673.285 | 346.287 | 378.065 | 880.940 | 514.190 | 533.760 | 313.630 | |||||
| Shufflenetv1 | 34.948 | 26.635 | 21.571 | 39.950 | 25.520 | 20.180 | ||||||
| Shufflenetv2 | 35.530 | 21.440 | 16.434 | 49.498 | 29.116 | 19.346 | ||||||
| Squeezenet | 87.037 | 47.192 | 28.663 | |||||||||
| Googlenet | 268.023 | 148.533 | 95.624 | 236.492 | 131.510 | 81.561 |

| RK3399 | Anakin FP32 | | Anakin Int8 | | NCNN FP32 | | NCNN INT8 | |
|---|---|---|---|---|---|---|---|---|
| | 1 thread | 2 threads | 1 thread | 2 threads | 1 thread | 2 threads | 1 thread | 2 threads |
| Mobilenet_v1 | 111.317 | 60.008 | 87.201 | 45.693 | 149.270 | 91.200 | 142.790 | 86.140 |
| Mobilenet_v2 | 105.767 | 60.899 | 79.065 | 53.914 | 118.530 | 86.900 | ||
| Mobilenet_ssd | 232.923 | 128.337 | 268.900 | 157.860 | 256.560 | 149.730 | ||
| RESNET50 | 671.800 | 369.386 | 1029.300 | 571.230 | 569.250 | 344.830 | ||
| Shufflenetv1 | 38.761 | 25.971 | ||||||
| Shufflenetv2 | 36.220 | 22.095 | 51.879 | 30.351 | ||||
| Squeezenet | 98.489 | 54.863 | ||||||
| Googlenet | 274.166 | 159.429 | 235.085 | 133.044 |

Latency (ms) of one batch

| Kirin 980 | Anakin FP32 | | | Anakin Int8 | | | NCNN FP32 | | | NCNN INT8 | | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| | 1 thread | 2 threads | 4 threads | 1 thread | 2 threads | 4 threads | 1 thread | 2 threads | 4 threads | 1 thread | 2 threads | 4 threads |
| Mobilenet_v1 | 39.051 | 19.813 | 14.184 | 39.026 | 22.048 | 14.250 | 50.240 | 26.850 | 20.010 | 92.900 | 49.420 | 37.160 |
| Mobilenet_v2 | 36.052 | 19.550 | 14.507 | 32.656 | 19.641 | 15.735 | 35.890 | 20.730 | 18.550 | |||
| Mobilenet_ssd | 83.474 | 44.530 | 33.116 | 99.960 | 53.160 | 84.360 | 180.000 | 91.380 | 68.140 | |||
| RESNET50 | 291.478 | 158.954 | 129.484 | 1412.37 | 766.62 | 560.760 | 355.010 | 189.18 | 133.410 | |||
| Shufflenetv1 | 11.909 | 9.761 | 7.441 | 16.030 | 10.660 | 8.120 | ||||||
| Shufflenetv2 | 11.755 | 7.983 | 6.289 | 21.968 | 14.111 | 9.888 | ||||||
| Squeezenet | 30.148 | 20.908 | 17.084 | |||||||||
| Googlenet | 108.210 | 65.798 | 58.630 | 140.886 | 79.910 | 60.693 |

| Snapdragon 855 | Anakin FP32 | | | Anakin Int8 | | | NCNN FP32 | | | NCNN INT8 | | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| | 1 thread | 2 threads | 4 threads | 1 thread | 2 threads | 4 threads | 1 thread | 2 threads | 4 threads | 1 thread | 2 threads | 4 threads |
| Mobilenet_v1 | 34.015 | 20.064 | 11.410 | 42.222 | 21.532 | 11.746 | 41.150 | 24.870 | 18.420 | 79.180 | 48.470 | 24.530 |
| Mobilenet_v2 | 30.742 | 18.507 | 11.354 | 24.628 | 15.133 | 9.079 | 30.060 | 19.220 | 15.520 | |||
| Mobilenet_ssd | 69.749 | 44.010 | 26.000 | 85.030 | 62.770 | 48.940 | 154.600 | 138.700 | 82.140 | |||
| RESNET50 | 218.581 | 146.509 | 92.899 | 1380.340 | 996.410 | 540.660 | 324.720 | 261.920 | 126.270 | |||
| Shufflenetv1 | 11.032 | 7.430 | 5.369 | 13.390 | 9.270 | 6.360 | ||||||
| Shufflenetv2 | 11.372 | 7.120 | 4.728 | 19.393 | 12.278 | 7.719 | ||||||
| Squeezenet | 27.860 | 17.538 | 10.729 | |||||||||
| Googlenet | 100.719 | 69.509 | 49.021 | 127.982 | 83.369 | 50.275 |

| Snapdragon 652 | Anakin FP32 | | | Anakin Int8 | | | NCNN FP32 | | | NCNN INT8 | | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| | 1 thread | 2 threads | 4 threads | 1 thread | 2 threads | 4 threads | 1 thread | 2 threads | 4 threads | 1 thread | 2 threads | 4 threads |
| Mobilenet_v1 | 121.982 | 63.004 | 37.325 | 86.672 | 45.728 | 26.354 | 130.740 | 140.850 | 81.810 | 184.630 | 192.730 | 144.740 |
| Mobilenet_v2 | 89.113 | 50.609 | 35.291 | 72.679 | 45.888 | 33.887 | 94.520 | 101.380 | 65.570 | |||
| Mobilenet_ssd | 236.466 | 132.293 | 86.335 | 270.630 | 295.520 | 174.280 | 350.640 | 286.420 | 243.850 | |||
| RESNET50 | 751.528 | 405.433 | 255.699 | 2762.890 | 1447.070 | 883.730 | 664.180 | 369.020 | ||||
| Shufflenetv1 | 36.883 | 23.718 | 15.144 | 53.660 | 33.450 | 23.330 | ||||||
| Shufflenetv2 | 36.933 | 26.353 | 20.507 | 53.243 | 31.083 | 21.550 | ||||||
| Squeezenet | 92.748 | 51.936 | 33.027 | |||||||||
| Googlenet | 296.092 | 179.542 | 125.509 | 242.505 | 140.083 | 89.646 |

| RK3399 | Anakin FP32 | | Anakin Int8 | | NCNN FP32 | | NCNN INT8 | |
|---|---|---|---|---|---|---|---|---|
| | 1 thread | 2 threads | 1 thread | 2 threads | 1 thread | 2 threads | 1 thread | 2 threads |
| Mobilenet_v1 | 116.981 | 65.033 | 87.768 | 47.617 | 155.830 | 98.520 | 201.800 | 116.440 |
| Mobilenet_v2 | 118.229 | 70.567 | 83.790 | 55.413 | 126.530 | 90.930 | ||
| Mobilenet_ssd | 237.196 | 134.508 | 292.130 | 183.650 | 361.570 | 200.370 | ||
| RESNET50 | 725.582 | 413.995 | 2883.120 | 1632.800 | 702.660 | 404.970 | ||
| Shufflenetv1 | 41.094 | 27.353 | ||||||
| Shufflenetv2 | 37.660 | 23.489 | 53.558 | 32.122 | ||||
| Squeezenet | 104.519 | 59.402 | ||||||
| Googlenet | 305.304 | 190.897 | 244.855 | 142.493 |
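
The multi-thread columns can be summarized as parallel efficiency, that is, single-thread latency divided by (thread count times multi-thread latency). The sketch below applies this to the Mobilenet_v1 row of the RK3399 table directly above, the only row in that table with all eight values present. The names `parallel_efficiency` and `MOBILENET_V1_RK3399` are illustrative.

```python
def parallel_efficiency(t1_ms: float, tn_ms: float, n_threads: int) -> float:
    """1.0 means perfect scaling: n threads cut the latency by a factor of n."""
    return t1_ms / (n_threads * tn_ms)


# engine -> (1-thread latency ms, 2-thread latency ms),
# copied from the Mobilenet_v1 row of the RK3399 table above.
MOBILENET_V1_RK3399 = {
    "Anakin FP32": (116.981, 65.033),
    "Anakin Int8": (87.768, 47.617),
    "NCNN FP32": (155.830, 98.520),
    "NCNN INT8": (201.800, 116.440),
}

efficiency = {
    engine: parallel_efficiency(t1, t2, 2)
    for engine, (t1, t2) in MOBILENET_V1_RK3399.items()
}

for engine, value in efficiency.items():
    print(f"{engine}: {value:.0%} efficiency with 2 threads")
```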
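Everything you need is in the doc index.
We also provide tutorial documentation in English and Chinese.
User guide
Here you can find how the project works, descriptions of the C++ interface, and code examples. You can also learn about the model converter here.
Developer guide
You may want to learn more about Anakin's internals and help make it better. See how to add a custom device and how to add a custom device operator.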
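How to contribute
We appreciate your contributions!
You are welcome to submit questions and bug reports as GitHub issues.
Anakin is provided under the Apache-2.0 license.
Anakin refers to the following projects: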