AppleNeuralHash2ONNX

Convert Apple's NeuralHash model for CSAM detection to ONNX.

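Introduction

Apple NeuralHash is a perceptual hashing method for images based on neural networks. It tolerates image resizing and compression. The hashing steps are as follows:

Convert the image to RGB.
Resize the image to 360x360.
Normalize the RGB values to the [-1, 1] range.
Perform inference with the NeuralHash model.
Compute the dot product of a 96x128 matrix with the resulting vector of 128 floats.
Apply a binary step to the resulting vector of 96 floats.
Convert the vector of 1.0 and 0.0 values to bits, resulting in 96-bit binary data.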
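The normalization and the last three steps are plain arithmetic and can be sketched without the model. The following is a minimal illustration in pure Python, not the project's own script. The function names are hypothetical, and two details are assumptions: the binary step maps values >= 0 to 1, and bits are packed most significant first.

```python
from typing import List, Sequence


def normalize(value: int) -> float:
    """Map an 8-bit channel value (0..255) into the [-1, 1] range."""
    return value / 255.0 * 2.0 - 1.0


def project(matrix: Sequence[Sequence[float]], vector: Sequence[float]) -> List[float]:
    """Multiply a rows x cols matrix by a vector of length cols."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def binary_step(values: Sequence[float]) -> List[int]:
    """Assumed threshold: non-negative values become 1, negative values 0."""
    return [1 if v >= 0 else 0 for v in values]


def bits_to_hex(bits: Sequence[int]) -> str:
    """Pack bits, most significant first, into a zero-padded hex string."""
    as_int = int("".join(str(b) for b in bits), 2)
    return format(as_int, "0{}x".format(len(bits) // 4))


# Toy example: a 4x2 matrix stands in for the real 96x128 one.
toy_matrix = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [1.0, 1.0]]
toy_vector = [0.5, -0.25]
toy_hash = bits_to_hex(binary_step(project(toy_matrix, toy_vector)))
```

With the real 96x128 matrix, the same chain would yield a 24-character hex string, since 96 bits correspond to 24 hex digits.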

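In this project, we convert Apple's NeuralHash model to ONNX format. A demo script for testing the model is also included.

Prerequisites

OS

Both macOS and Linux will work. In the following sections, Debian is used for the Linux examples.

LZFSE decoder

macOS: Install by running brew install lzfse.
Linux: Build and install from the lzfse source.

Python

Python 3.6 and above should work. Install the following dependencies: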

pip install onnx coremltools

Conversion Guide

Step 1: Get the NeuralHash model

You will need 4 files from a recent macOS or iOS build:

neuralhash_128x96_seed1.dat
NeuralHashv3b-current.espresso.net
NeuralHashv3b-current.espresso.shape
NeuralHashv3b-current.espresso.weights
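Before moving on, it can help to confirm that all four files ended up in the working directory. The helper below is a small sketch with a hypothetical function name. The file names are spelled as in the commands later in this guide, and the check is case-sensitive on case-sensitive file systems.

```python
from pathlib import Path
from typing import List

# File names as they appear in the commands later in this guide.
REQUIRED_FILES = (
    "neuralhash_128x96_seed1.dat",
    "NeuralHashv3b-current.espresso.net",
    "NeuralHashv3b-current.espresso.shape",
    "NeuralHashv3b-current.espresso.weights",
)


def missing_files(directory: str) -> List[str]:
    """Return the required file names that are not present in `directory`."""
    base = Path(directory)
    return [name for name in REQUIRED_FILES if not (base / name).is_file()]
```

An empty list from missing_files("NeuralHash") would mean the directory is ready for the decoding step.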

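Option 1: From macOS or a jailbroken iOS device (recommended)

If you have a recent version of macOS (11.4+) or jailbroken iOS (14.7+) installed, simply grab these files from /System/Library/Frameworks/Vision.framework/Resources/ (on macOS) or /System/Library/Frameworks/Vision.framework/ (on iOS).

Option 2: From an iOS IPSW

Download any .ipsw of a recent iOS build (14.7+) from ipsw.me.
Unpack the file: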

cd /path/to/ipsw/file
mkdir unpacked_ipsw
cd unpacked_ipsw
unzip ../*.ipsw

Locate the system image:

ls -lh

What you need is the largest .dmg file, for example 018-63036-003.dmg.
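Picking the largest .dmg can also be done programmatically instead of reading the ls output. This is a minimal sketch with a hypothetical function name. It only looks at files directly inside the given directory.

```python
from pathlib import Path
from typing import Optional


def largest_dmg(directory: str) -> Optional[Path]:
    """Return the largest .dmg file directly inside `directory`, or None."""
    candidates = [p for p in Path(directory).glob("*.dmg") if p.is_file()]
    return max(candidates, key=lambda p: p.stat().st_size, default=None)
```

Called on the unpacked_ipsw directory, it would return the path of the image to mount in the next step.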

Mount the system image. On macOS, simply open the file in Finder. On Linux, run the following commands:

# Build and install apfs-fuse
sudo apt install fuse libfuse3-dev bzip2 libbz2-dev cmake g++ git libattr1-dev zlib1g-dev
git clone https://github.com/sgan81/apfs-fuse.git
cd apfs-fuse
git submodule init
git submodule update
mkdir build
cd build
cmake ..
make
sudo make install
sudo ln -s /bin/fusermount /bin/fusermount3
# Mount image
mkdir rootfs
apfs-fuse 018-63036-003.dmg rootfs

The required files are under /System/Library/Frameworks/Vision.framework/ in the mounted path.

Put them in the same directory:

mkdir NeuralHash
cd NeuralHash
cp /System/Library/Frameworks/Vision.framework/Resources/NeuralHashv3b-current.espresso.* .
cp /System/Library/Frameworks/Vision.framework/Resources/neuralhash_128x96_seed1.dat .

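Step 2: Decode the model structure and shapes

Compiled Core ML models normally store their structure as JSON in model.espresso.net and model.espresso.shape. NeuralHash's model does the same, but the files are compressed with LZFSE.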

dd if=NeuralHashv3b-current.espresso.net bs=4 skip=7 | lzfse -decode -o model.espresso.net
dd if=NeuralHashv3b-current.espresso.shape bs=4 skip=7 | lzfse -decode -o model.espresso.shape
cp NeuralHashv3b-current.espresso.weights model.espresso.weights
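The dd invocation skips 7 blocks of 4 bytes, that is, the first 28 bytes of each file, before handing the rest to lzfse. A Python sketch of the same idea follows. The function names are hypothetical, and it assumes the lzfse command-line tool is on the PATH and reads its input from stdin, as in the shell pipeline above.

```python
import json
import subprocess
from typing import Any

HEADER_BYTES = 4 * 7  # dd bs=4 skip=7 drops the first 28 bytes


def strip_header(raw: bytes) -> bytes:
    """Drop the leading 28 bytes, mirroring `dd bs=4 skip=7`."""
    return raw[HEADER_BYTES:]


def decode_to_json(src_path: str, dst_path: str) -> Any:
    """Strip the header, decompress with the lzfse CLI, and parse the JSON."""
    with open(src_path, "rb") as f:
        payload = strip_header(f.read())
    # Same invocation as the shell command above; input arrives on stdin.
    subprocess.run(["lzfse", "-decode", "-o", dst_path], input=payload, check=True)
    with open(dst_path, "r", encoding="utf-8") as f:
        return json.load(f)
```

If decode_to_json returns without raising, the output file parsed as valid JSON, which is a quick sanity check that the header offset and decompression worked.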

Step 3: Convert the model to ONNX

cd ..
git clone https://github.com/AsuharietYgvar/TNN.git
cd TNN
python3 tools/onnx2tnn/onnx-coreml/coreml2onnx.py ../NeuralHash

The resulting model is NeuralHash/model.onnx.

Usage

Inspect the model

Netron is a perfect tool for this purpose.
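For a quick look without a GUI, the onnx package installed earlier can report a few basic facts about the converted file. This is only a sketch with a hypothetical function name. It is not part of this project, and it lists graph nodes and input/output names, nothing more.

```python
from typing import Dict, List, Union


def summarize_model(path: str) -> Dict[str, Union[int, List[str]]]:
    """Load an ONNX file and return its node count and input/output names."""
    import onnx  # imported here so this file loads even without onnx installed

    model = onnx.load(path)
    return {
        "nodes": len(model.graph.node),
        "inputs": [i.name for i in model.graph.input],
        "outputs": [o.name for o in model.graph.output],
    }
```

Calling summarize_model("NeuralHash/model.onnx") would give the names needed when feeding the model from other code.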

Calculate the neural hash with onnxruntime

Install the required libraries:

pip install onnxruntime pillow

Run nnhash.py on an image:

python3 nnhash.py /path/to/model.onnx /path/to/neuralhash_128x96_seed1.dat image.jpg

Example output:

ab14febaa837b6c1484c35e6
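For readers who want to call the model from their own code, the first four hashing steps (RGB conversion, resize, normalization, inference) might look roughly like the sketch below. It is an illustration, not a copy of nnhash.py. The function name is hypothetical, and the NCHW input layout with a batch of one is an assumption. It relies on numpy, which is installed as a dependency of onnxruntime.

```python
from typing import List


def run_neuralhash_model(model_path: str, image_path: str) -> List[float]:
    """Preprocess an image, run the ONNX model, and return its flattened output."""
    # Third-party imports are kept local so this file can be imported without them.
    import numpy as np
    import onnxruntime
    from PIL import Image

    image = Image.open(image_path).convert("RGB").resize((360, 360))
    arr = np.asarray(image, dtype=np.float32) / 255.0 * 2.0 - 1.0  # scale to [-1, 1]
    # Assumption: the model expects a batch of one image in NCHW layout.
    arr = arr.transpose(2, 0, 1).reshape(1, 3, 360, 360)

    session = onnxruntime.InferenceSession(
        model_path, providers=["CPUExecutionProvider"]
    )
    input_name = session.get_inputs()[0].name
    outputs = session.run(None, {input_name: arr})
    return outputs[0].flatten().tolist()
```

The returned list is the vector that the remaining steps multiply by the 96x128 matrix before thresholding.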

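Note: The neural hash generated here might be a few bits off from one generated on an iOS device. This is expected, since different iOS devices generate slightly different hashes anyway. The reason is that neural networks rely on floating-point calculations, whose accuracy depends heavily on the hardware. For smaller networks this makes no difference, but NeuralHash has 200+ layers, which results in significant cumulative errors.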

Device	Hash
iPad Pro 10.5-inch	`2b186faa6b36ffcc4c4635e1`
M1 Mac	`2b5c6faa6bb7bdcc4c4731a1`
iOS Simulator	`2b5c6faa6bb6bdcc4c4731a1`
ONNX Runtime	`2b5c6faa6bb6bdcc4c4735a1`
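To put a number on "a few bits off", the hashes in the table can be compared bit by bit. The sketch below counts differing bits (the Hamming distance) against the ONNX Runtime result. The function name and the choice of reference are illustrative.

```python
def hamming_distance(hex_a: str, hex_b: str) -> int:
    """Count differing bits between two equal-length hex strings."""
    if len(hex_a) != len(hex_b):
        raise ValueError("hashes must have the same length")
    return bin(int(hex_a, 16) ^ int(hex_b, 16)).count("1")


# Hashes copied from the table above.
HASHES = {
    "iPad Pro 10.5-inch": "2b186faa6b36ffcc4c4635e1",
    "M1 Mac": "2b5c6faa6bb7bdcc4c4731a1",
    "iOS Simulator": "2b5c6faa6bb6bdcc4c4731a1",
    "ONNX Runtime": "2b5c6faa6bb6bdcc4c4735a1",
}

reference = HASHES["ONNX Runtime"]
distances = {name: hamming_distance(value, reference) for name, value in HASHES.items()}
```

For these four values, the distances to the ONNX Runtime hash are 7 bits for the iPad Pro, 2 for the M1 Mac, and 1 for the iOS Simulator, out of 96 bits.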