
Tiny Vector Database

A lightweight vector database designed for small projects.

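Features

Just-in-time (JIT) compilation that optimizes vector operations by fixing the vector size at compile time.
Uses Eigen to accelerate vector operations.
Processes vectors using only Python lists, with no other third-party data format required.
Stores vectors as base64-encoded strings in an SQLite database.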
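
The storage idea in the last feature can be illustrated with a minimal sketch that uses only the Python standard library. The table layout, the float32 packing, and the helper names `encode_vector` and `decode_vector` are assumptions made for illustration; they are not tiny_vectordb's actual schema or API.

```python
import base64
import sqlite3
import struct


def encode_vector(vec):
    """Pack a list of numbers as little-endian float32 and return a base64 string (hypothetical helper)."""
    raw = struct.pack(f"<{len(vec)}f", *vec)
    return base64.b64encode(raw).decode("ascii")


def decode_vector(text):
    """Reverse encode_vector: base64 string back to a list of floats (hypothetical helper)."""
    raw = base64.b64decode(text)
    return list(struct.unpack(f"<{len(raw) // 4}f", raw))


# Illustrative schema: one row per vector, keyed by a string id.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE vectors (id TEXT PRIMARY KEY, data TEXT NOT NULL)")
conn.execute("INSERT INTO vectors VALUES (?, ?)", ("id1", encode_vector([1.0, 2.0, 3.5])))

(stored,) = conn.execute("SELECT data FROM vectors WHERE id = ?", ("id1",)).fetchone()
restored = decode_vector(stored)
conn.close()
```

Storing the vector as text keeps the database readable by any SQLite client, at the cost of roughly one third more space than the raw bytes.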

Performance
More than 10x faster than NumPy-based vector operations.

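Development status

It is currently compatible with g++ or clang++.
You may need to modify the compile_config parameter in the VectorDatabase initialization to inject your own compile command.
To make it work with other compilers, you may need to change the tiny_vectordb.jit module.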

Installation

pip install tiny_vectordb

That's it!

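Uninstall

The package emits some compiled files into the source directory, and these may not be removed automatically by pip uninstall. If you want a clean uninstall, first run the following command manually.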

python -c "import tiny_vectordb; tiny_vectordb.cleanup()"

After that, you can safely uninstall the package:

pip uninstall tiny_vectordb

Usage:

from tiny_vectordb import VectorDatabase

collection_configs = [
    {
        "name": "hello",
        "dimension": 256,
    },
    {
        "name": "world",
        "dimension": 1000,
    },
]
database = VectorDatabase("test.db", collection_configs)
collection = database["hello"]

# add vectors
collection.setBlock(
    ["id1", "id2"],            # ids
    [[1] * 256, [2] * 256],    # vectors
)

# search for nearest vectors
search_ids, search_scores = collection.search([1.9] * 256)

For more usage, see example.py.

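Design notes:

NumPy arrays are not used in the database because I want it to be as lightweight as possible, and lists of numbers can be converted directly to JSON for communication over HTTP requests.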
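
As a small illustration of the JSON point, plain lists of numbers survive a round trip through the standard `json` module without any conversion step. The payload shape and its field names below are hypothetical and only serve the example; they are not a format defined by tiny_vectordb.

```python
import json

# Hypothetical payload shape; the field names are illustrative only.
payload = {
    "ids": ["id1", "id2"],
    "vectors": [[1.0] * 4, [2.0] * 4],
}

body = json.dumps(payload)     # string that could be sent as an HTTP request body
received = json.loads(body)    # what the receiving side would decode
```

A NumPy array in the same position would need an explicit conversion such as `.tolist()` before `json.dumps` accepts it.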
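Data is always stored in contiguous memory to ensure the best search performance.
Because additions and deletions therefore involve memory reallocation, they are best done in batches.
Here are some useful functions for batch operations: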

class VectorCollection(Generic[NumVar]):
    def addBlock(self, ids: list[str], vectors: list[list[NumVar]]) -> None: ...
    def setBlock(self, ids: list[str], vectors: list[list[NumVar]]) -> None: ...
    def deleteBlock(self, ids: list[str]) -> None: ...
    def getBlock(self, ids: list[str]) -> list[list[NumVar]]: ...
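
To see why batching matters when data lives in one contiguous buffer, here is a toy model in pure Python. It assumes, as a simplification, that every add call rebuilds the whole buffer; the `ContiguousStore` class and its `add_block` method are hypothetical and do not reflect how tiny_vectordb is actually implemented.

```python
from array import array


class ContiguousStore:
    """Toy model: all vectors live in one flat float buffer that is rebuilt on every add call."""

    def __init__(self, dimension):
        self.dimension = dimension
        self.ids = []
        self.data = array("f")
        self.rebuilds = 0  # number of times the buffer was copied

    def add_block(self, ids, vectors):
        if len(ids) != len(vectors):
            raise ValueError("ids and vectors must have the same length")
        new_data = array("f", self.data)  # copy existing contents into a fresh buffer
        for vec in vectors:
            if len(vec) != self.dimension:
                raise ValueError("wrong vector dimension")
            new_data.extend(vec)
        self.data = new_data
        self.ids.extend(ids)
        self.rebuilds += 1


ids = [f"id{i}" for i in range(100)]
vectors = [[float(i)] * 8 for i in range(100)]

# One call per vector: the buffer is copied 100 times.
one_by_one = ContiguousStore(8)
for vec_id, vec in zip(ids, vectors):
    one_by_one.add_block([vec_id], [vec])

# One call for the whole batch: the buffer is copied once.
batched = ContiguousStore(8)
batched.add_block(ids, vectors)
```

Both stores end up with identical contents, but the one-by-one version copies the growing buffer on every call, which is the cost the batch functions are meant to avoid.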