client vector search

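A client-side vector search library that can embed, search, and cache. It works in the browser and on the server.

It outperforms OpenAI's text-embedding-ada-002 and is faster than Pinecone and other vector DBs.

I'm the founder of searchbase.app and we needed this for our product and customers. We'll be using this library in production. You can be sure it will be maintained and improved.

- Embed documents using transformers by default: gte-small (~30MB).
- Calculate cosine similarity between embeddings.
- Create an index and search on the client side.
- Cache vectors with browser caching support.

Lots of improvements are coming!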

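Roadmap

Our goal is to build a super simple, fast vector search that works with a few hundred to a few thousand vectors. ~1K vectors per user covers 99% of the use cases.

We'll initially keep things super simple and under 100ms.

TODOs

- Add an HNSW index that works in Node and browser environments, without relying on HNSW binding libs
- Add a proper testing suite and CI/CD for the lib
  - Simple health tests
    - Mock @xenova/transformers for jest, which is not happy with it
  - Performance tests: recall, memory usage, CPU usage, etc.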

Installation

npm i client-vector-search

Quickstart

This library provides a plug-and-play solution for embedding and vector search. It is designed to be easy to use, efficient, and versatile. Here is a quick start guide:

  import { getEmbedding, EmbeddingIndex } from 'client-vector-search';

  // getEmbedding is an async function, so you need to use 'await' or '.then()' to get the result
  const embedding = await getEmbedding("Apple"); // Returns embedding as number[]

  // Each object should have an 'embedding' property of type number[]
  const initialObjects = [
    { id: 1, name: "Apple", embedding: embedding },
    { id: 2, name: "Banana", embedding: await getEmbedding("Banana") },
    { id: 3, name: "Cheddar", embedding: await getEmbedding("Cheddar") },
    { id: 4, name: "Space", embedding: await getEmbedding("Space") },
    { id: 5, name: "database", embedding: await getEmbedding("database") },
  ];
  const index = new EmbeddingIndex(initialObjects); // Creates an index

  // The query should be an embedding of type number[]
  const queryEmbedding = await getEmbedding('Fruit'); // Query embedding
  const results = await index.search(queryEmbedding, { topK: 5 }); // Returns top similar objects

  // Specify the storage type to persist the index
  await index.saveIndex('indexedDB');
  // Search the stored index; a separate name avoids redeclaring 'results'
  const storedResults = await index.search(queryEmbedding, {
    topK: 5,
    useStorage: 'indexedDB',
    // storageOptions: { // use only if you overrode the defaults
    //   indexedDBName: 'clientVectorDB',
    //   indexedDBObjectStoreName: 'ClientEmbeddingStore',
    // },
  });

  console.log(results, storedResults);

  await index.deleteIndexedDB(); // if you overrode default, specify db name

Troubleshooting

NextJS

To use the library inside a NextJS project, update the next.config.js file to include the following:

  module.exports = {
    // Override the default webpack configuration
    webpack: (config) => {
      // See https://webpack.js.org/configuration/resolve/#resolvealias
      config.resolve.alias = {
        ...config.resolve.alias,
        sharp$: false,
        "onnxruntime-node$": false,
      };
      return config;
    },
  };

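Model load after page is loaded

You can initialize the model before using it to generate embeddings. This ensures that the model is already loaded when you first need it and provides a better UX.

  import { initializeModel } from "client-vector-search"
  ...
    useEffect(() => {
      try {
        // Start loading the model as soon as the component mounts
        initializeModel();
      } catch (e) {
        console.log(e);
      }
    }, []);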
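
The warm-up idea above can be illustrated without the library. The following is a minimal sketch, assuming the expensive load should run at most once even if several components ask for it. The names `once` and `fakeLoadModel` are hypothetical and are not part of client-vector-search; the sketch does not describe how `initializeModel` is implemented.

```typescript
// Hypothetical helper: wraps an async loader so it runs at most once.
type Loader<T> = () => Promise<T>;

function once<T>(load: Loader<T>): Loader<T> {
  let pending: Promise<T> | null = null;
  return () => {
    if (pending === null) {
      pending = load().catch((err) => {
        pending = null; // forget the failed attempt so a later call can retry
        throw err;
      });
    }
    return pending;
  };
}

// Stand-in for an expensive model download (assumption, for illustration only).
let loadCount = 0;
const fakeLoadModel = async (): Promise<string> => {
  loadCount += 1;
  return "model-ready";
};

const ensureModel = once(fakeLoadModel);
const firstCall = ensureModel(); // triggers the load
const secondCall = ensureModel(); // reuses the same in-flight promise
```

Calling `ensureModel()` early (for example on page load) and again right before embedding keeps the call sites simple, because both share one promise.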

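Usage Guide

This guide provides a step-by-step walkthrough of the library's main features. It covers everything from generating embeddings for a string to performing operations on the index such as adding, updating, and removing objects. It also includes instructions on how to save the index to a database and perform search operations within it.

Until we have reference documentation, you can find all the methods and their usage in this guide. Each step is accompanied by a code snippet that illustrates the method being discussed. Follow along and try the examples in your own environment to get a better understanding of how everything works.

Let's get started!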

Step 1: Generate Embeddings for a String

Generate an embedding for a given string using the getEmbedding method.

  const embedding = await getEmbedding("Apple"); // Returns embedding as number[]

Note: getEmbedding is asynchronous; make sure to use await.

Step 2: Calculate Cosine Similarity

Calculate the cosine similarity between two embeddings.

  const similarity = cosineSimilarity(embedding1, embedding2, 6);

Note: both embeddings must have the same length.
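
For readers who want to see what the calculation involves, here is a minimal sketch of cosine similarity, written independently of the library. It assumes the third argument in the call above is the number of decimal places to round to; that reading, and the name `cosineSimilaritySketch`, are assumptions rather than the library's documented behavior.

```typescript
// Illustrative re-implementation, not the library's source code.
function cosineSimilaritySketch(a: number[], b: number[], precision = 6): number {
  if (a.length !== b.length) {
    throw new Error(`Length mismatch: ${a.length} vs ${b.length}`);
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  // A zero vector has no direction; returning 0 is a convention chosen here.
  if (normA === 0 || normB === 0) return 0;
  const similarity = dot / (Math.sqrt(normA) * Math.sqrt(normB));
  return Number(similarity.toFixed(precision));
}

const sameDirection = cosineSimilaritySketch([1, 2, 3], [2, 4, 6]); // parallel vectors
const orthogonal = cosineSimilaritySketch([1, 0], [0, 1]); // perpendicular vectors
```

Parallel vectors score 1 and perpendicular vectors score 0, regardless of their magnitudes.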

Step 3: Create an Index

Create an index with an initial array of objects. Each object must have an 'embedding' property.

  const initialObjects = [...];
  const index = new EmbeddingIndex(initialObjects);

Step 4: Add to the Index

Add an object to the index.

  const objectToAdd = { id: 6, name: 'Cat', embedding: await getEmbedding('Cat') };
  index.add(objectToAdd);

Step 5: Update the Index

Update an existing object in the index.

  const vectorToUpdate = { id: 6, name: 'Dog', embedding: await getEmbedding('Dog') };
  index.update({ id: 6 }, vectorToUpdate);

Step 6: Remove from the Index

Remove an object from the index.

  index.remove({ id: 6 });

Step 7: Retrieve from the Index

Retrieve an object from the index.

  const vector = index.get({ id: 1 });
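
Steps 4 to 7 all address objects through a filter such as `{ id: 6 }`. The sketch below shows one way such filter-based operations could work on an in-memory array, assuming a filter matches when every one of its keys is strictly equal on the object. `TinyIndex` and `matchesFilter` are hypothetical names for illustration; the library's actual matching rules may differ.

```typescript
type Filter = Record<string, unknown>;
type Item = { embedding: number[]; [key: string]: unknown };

// Assumption: a filter matches when all of its keys are strictly equal on the item.
const matchesFilter = (item: Item, filter: Filter): boolean =>
  Object.keys(filter).every((key) => item[key] === filter[key]);

class TinyIndex {
  private items: Item[];

  constructor(initial: Item[] = []) {
    this.items = [...initial]; // copy so the caller's array is not mutated
  }

  add(item: Item): void {
    this.items.push(item);
  }

  get(filter: Filter): Item | undefined {
    return this.items.find((item) => matchesFilter(item, filter));
  }

  update(filter: Filter, replacement: Item): void {
    const position = this.items.findIndex((item) => matchesFilter(item, filter));
    if (position === -1) throw new Error("No object matches the filter");
    this.items[position] = replacement;
  }

  remove(filter: Filter): void {
    this.items = this.items.filter((item) => !matchesFilter(item, filter));
  }

  size(): number {
    return this.items.length;
  }
}

// Toy embeddings stand in for real model output.
const tiny = new TinyIndex([{ id: 1, name: "Apple", embedding: [1, 0] }]);
tiny.add({ id: 6, name: "Cat", embedding: [0, 1] });
tiny.update({ id: 6 }, { id: 6, name: "Dog", embedding: [0.5, 0.5] });
const updatedName = tiny.get({ id: 6 })?.name;
tiny.remove({ id: 6 });
const afterRemoval = tiny.get({ id: 6 });
```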

Step 8: Search the Index

Search the index with a query embedding.

  const queryEmbedding = await getEmbedding('Fruit');
  const results = await index.search(queryEmbedding, { topK: 5 });
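
Conceptually, a search over a few hundred to a few thousand vectors can be done by brute force: score every object against the query and keep the best `topK`. The following is a minimal sketch of that idea with hand-written toy vectors. The function names and the `{ similarity, object }` result shape are assumptions for illustration, not a description of the library's internals.

```typescript
interface Hit<T> {
  similarity: number;
  object: T;
}

// Plain cosine similarity; assumes both vectors have the same length.
function cosineScore(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return normA === 0 || normB === 0 ? 0 : dot / Math.sqrt(normA * normB);
}

// Brute-force top-K: score everything, sort descending, keep the first topK.
function searchTopK<T extends { embedding: number[] }>(
  objects: T[],
  query: number[],
  topK = 3,
): Hit<T>[] {
  return objects
    .map((object) => ({ similarity: cosineScore(query, object.embedding), object }))
    .sort((x, y) => y.similarity - x.similarity)
    .slice(0, topK);
}

// Toy 3-dimensional vectors; real embeddings come from the model.
const toyObjects = [
  { id: 1, name: "Apple", embedding: [0.9, 0.1, 0.0] },
  { id: 2, name: "Banana", embedding: [0.8, 0.2, 0.1] },
  { id: 3, name: "Space", embedding: [0.0, 0.1, 0.9] },
];
const toyHits = searchTopK(toyObjects, [1, 0, 0], 2);
```

A linear scan like this costs one pass over the index per query, which is why it stays practical at the ~1K-vector scale the roadmap targets.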

Step 9: Print the Index

Print the entire index to the console.

  index.printIndex();

Step 10: Save the Index to IndexedDB (for browsers)

Save the index to a persistent IndexedDB database.

  await index.saveIndex("indexedDB", { DBName: "clientVectorDB", objectStoreName: "ClientEmbeddingStore" })

Important: Search in IndexedDB

Perform a search operation on the index stored in IndexedDB.

  const results = await index.search(queryEmbedding, {
    topK: 5,
    useStorage: "indexedDB",
    storageOptions: { // only if you want to override the default options, defaults are below
      indexedDBName: 'clientVectorDB',
      indexedDBObjectStoreName: 'ClientEmbeddingStore'
    }
  });

Delete Database

To delete an entire database.

  await IndexedDbManager.deleteIndexedDB("clientVectorDB");

Delete Object Store

To delete an object store from a database.

  await IndexedDbManager.deleteIndexedDBObjectStore("clientVectorDB", "ClientEmbeddingStore");

Retrieve All Objects

To retrieve all objects from a specific object store.

  const allObjects = await IndexedDbManager.getAllObjectsFromIndexedDB("clientVectorDB", "ClientEmbeddingStore");

Additional Information

Version: 1.0.0
Type: Other source code
Updated: 2025-03-05
Size: 64.63KB
Source: GitHub
