EmbedPG is a Node.js API service for storing and searching vector embeddings in PostgreSQL with the pgvector extension. This project is an early version, published to find out whether it is useful to others.
Vector databases are very useful but often expensive and restrictive. I created EmbedPG to make them easier and cheaper to use for projects of any size. The main cost is the cloud infrastructure itself: a PostgreSQL instance and server space. EmbedPG helps you set up a vector database quickly, with easy-to-use API endpoints and a command-line tool.
EmbedPG stores and searches embeddings using PostgreSQL with the pgvector extension. You can find pgvector here: pgvector on GitHub.
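To make the idea of searching embeddings concrete, here is a minimal sketch of how a nearest-neighbour query against pgvector could be assembled. The distance operators (`<->`, `<#>`, `<=>`) and the `[1,2,3]` text format come from pgvector itself; the `documents` table, the `embedding` column, and both function names are hypothetical and are not taken from the EmbedPG codebase.

```typescript
// Distance operators defined by the pgvector extension.
const DISTANCE_OPERATORS = {
  l2: "<->", // Euclidean (L2) distance
  innerProduct: "<#>", // negative inner product
  cosine: "<=>", // cosine distance
} as const;

type Metric = keyof typeof DISTANCE_OPERATORS;

// Serialize a numeric array into pgvector's text format, e.g. "[1,2,3]".
function toVectorLiteral(embedding: number[]): string {
  if (embedding.length === 0 || embedding.some((x) => !Number.isFinite(x))) {
    throw new Error("embedding must be a non-empty array of finite numbers");
  }
  return `[${embedding.join(",")}]`;
}

// Build a parameterized nearest-neighbour query.
// Assumption: "documents" and "embedding" are placeholder table and column names.
function buildSearchQuery(
  embedding: number[],
  limit: number,
  metric: Metric = "l2",
): { text: string; values: (string | number)[] } {
  const op = DISTANCE_OPERATORS[metric];
  return {
    text:
      `SELECT id, embedding ${op} $1::vector AS distance ` +
      `FROM documents ORDER BY embedding ${op} $1::vector LIMIT $2`,
    values: [toVectorLiteral(embedding), limit],
  };
}
```

The returned `text` and `values` pair matches the shape that common PostgreSQL clients accept for parameterized queries, so the vector is never concatenated into the SQL string.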
pgvector supports:
Yes, there are cloud solutions that support pgvector:
EmbedPG leverages several key technologies and packages to deliver its functionality:
Before you begin the installation process, ensure that you have the following prerequisites installed:
npm install -g pnpm
EmbedPG requires PostgreSQL with the pgvector extension. You can set this up using:
pgvector repository available at pgvector on GitHub.
# Pull the Docker image
docker pull arisrayelyan/pgvector:latest
# Run the Docker container
docker run -d \
  --name pgvector \
  -e POSTGRES_PASSWORD=postgres \
  -e POSTGRES_USER=postgres \
  -e POSTGRES_DB=postgres \
  -p 5432:5432 arisrayelyan/pgvector
Clone the EmbedPG Repository
git clone git@github.com:arisrayelyan/embed-pg.git
Navigate to the Project Directory
cd embed-pg
Install Dependencies
pnpm install
Set Up Environment Variables
Copy the .env.example file to .env and set the environment variables as needed.
# General Settings
NODE_ENV=development
# Database Configuration
DB_USERNAME=postgres
DB_PASSWORD=postgres
DB_NAME=postgres
DB_HOST=localhost
DB_PORT=5432
# Server Settings
PORT=3000
CORS_ORIGINS=http://localhost:3000 # Set the allowed origins for CORS
# OpenAI Configuration
OPENAI_API_KEY=""
OPEN_AI_MODEL=""
OPEN_AI_API_ENDPOINT=""
Note: OpenAI environment variables are required when EmbedPG needs to handle embedding requests for you.
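As an illustration of how these variables fit together, the sketch below validates them and derives a PostgreSQL connection URL. It is not EmbedPG's actual configuration loader: `loadConfig`, `requireVar`, and `AppConfig` are hypothetical names, and treating `CORS_ORIGINS` as a comma-separated list is an assumption.

```typescript
type Env = Record<string, string | undefined>;

interface AppConfig {
  databaseUrl: string;
  port: number;
  corsOrigins: string[];
  openAiEnabled: boolean;
}

// Return the variable's value, or fail fast when it is missing or empty.
function requireVar(env: Env, name: string): string {
  const value = env[name];
  if (value === undefined || value === "") {
    throw new Error(`Missing required environment variable: ${name}`);
  }
  return value;
}

// Hypothetical loader: turns the .env values shown above into a typed config.
function loadConfig(env: Env): AppConfig {
  const user = encodeURIComponent(requireVar(env, "DB_USERNAME"));
  const password = encodeURIComponent(requireVar(env, "DB_PASSWORD"));
  const host = requireVar(env, "DB_HOST");
  const dbPort = requireVar(env, "DB_PORT");
  const dbName = requireVar(env, "DB_NAME");

  const port = Number(env["PORT"] ?? "3000");
  if (!Number.isInteger(port) || port <= 0) {
    throw new Error("PORT must be a positive integer");
  }

  return {
    databaseUrl: `postgres://${user}:${password}@${host}:${dbPort}/${dbName}`,
    port,
    // Assumption: multiple allowed origins are separated by commas.
    corsOrigins: (env["CORS_ORIGINS"] ?? "")
      .split(",")
      .map((origin) => origin.trim())
      .filter((origin) => origin.length > 0),
    // The OpenAI settings are only needed when EmbedPG creates embeddings for you.
    openAiEnabled: Boolean(env["OPENAI_API_KEY"]),
  };
}
```

In a Node.js process this would typically be called as `loadConfig(process.env)` once at startup, so that a missing database setting fails immediately rather than on the first query.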
After setting up your environment variables and installing EmbedPG, you're ready to set up the components needed for your service to operate effectively.
pnpm generate:collections: This command launches an interactive command line tool that guides you through generating all necessary components for any new collection you want to add. This includes services, API endpoints, database entities, and migrations, ensuring your vector database service is comprehensive and ready to handle specific data needs.
pnpm generate:token: This command generates a new API token for your service, which you can use to authenticate requests to your EmbedPG service.
pnpm start: Start the service in production mode.
pnpm build: Build the application for production.
pnpm dev: Start the service in development mode and apply database migrations.
pnpm dev:db migration: Apply database migrations in development mode.
  --create flag creates a new migration file.
  --up flag applies all pending migrations.
  --down flag rolls back the last migration.
  --to flag applies all migrations up to a specific migration.
pnpm db migration: Apply database migrations in production mode. Use the same flags as in development mode.
pnpm lint: Check the source code for style and programming errors.
pnpm lint:fix: Automatically fix linting errors in the source code.
When you generate a new collection, EmbedPG creates the following files:
src directory to include the new files.
Note: You can customize the generated files to suit your specific needs. But do not remove the ! embedPg comment in the files, as EmbedPG uses this to identify the generated files.
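Because the marker comment is easy to delete by accident while customizing a generated file, a small guard can help. The following is only a sketch: it assumes the marker appears inside a `//` or `/* ... */` comment, which may not match how EmbedPG itself detects generated files, and `hasEmbedPgMarker` is a hypothetical helper.

```typescript
// The marker text that EmbedPG expects to find in generated files.
const EMBEDPG_MARKER = "! embedPg";

// Return true when any comment line in the source still carries the marker.
// Assumption: the marker lives in a line comment or a block-comment line.
function hasEmbedPgMarker(source: string): boolean {
  return source.split(/\r?\n/).some((line) => {
    const trimmed = line.trim();
    const isComment =
      trimmed.startsWith("//") ||
      trimmed.startsWith("/*") ||
      trimmed.startsWith("*");
    return isComment && trimmed.includes(EMBEDPG_MARKER);
  });
}
```

A check like this could run in a pre-commit hook over the generated files, reporting any file for which the function returns `false`.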
Here is the API documentation for EmbedPG.
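For orientation, the sketch below prepares an authenticated request using a token created with pnpm generate:token. Everything specific in it is an assumption made for illustration: the `/collections/<name>/search` route, the `Bearer` authorization scheme, the request body fields, and the `buildSearchRequest` helper are all hypothetical, so consult the API documentation for the real routes and headers.

```typescript
interface PreparedRequest {
  url: string;
  init: {
    method: string;
    headers: Record<string, string>;
    body: string;
  };
}

// Prepare (but do not send) a search request against a collection.
// Assumption: the route, the Bearer scheme, and the body shape are placeholders.
function buildSearchRequest(
  baseUrl: string,
  token: string,
  collection: string,
  query: string,
  limit = 5,
): PreparedRequest {
  if (token.trim() === "") {
    throw new Error("An API token is required");
  }
  // Drop trailing slashes so the joined URL has exactly one separator.
  const base = baseUrl.replace(/\/+$/, "");
  return {
    url: `${base}/collections/${encodeURIComponent(collection)}/search`,
    init: {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        Authorization: `Bearer ${token}`,
      },
      body: JSON.stringify({ query, limit }),
    },
  };
}
```

The result can be passed straight to `fetch(url, init)`. Keeping request construction separate from sending makes it possible to unit test the headers and body without a running service.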
Before deploying EmbedPG to a production environment, ensure you have set up the necessary environment variables and configurations. Also make sure that you are running PostgreSQL with the pgvector extension (see the section Cloud Solutions Supporting pgVector).
Run the following command to build the application for production:
pnpm build
After the build completes, you will have a dist directory with the compiled code. Deploy this code to your server and run the following commands to start the service.
First, run the database migrations:
pnpm prod:db migration --up
Generate a new API token:
pnpm generate:token
Then start the service:
pnpm start
You can also deploy EmbedPG using Docker.
Build the Docker image (Note: Make sure you have set up the necessary environment variables):
./scripts/build-server.sh
This project is licensed under the MIT License. See the LICENSE file for details.
Thank you for your interest in contributing to EmbedPG! See the CONTRIBUTING.md file for guidelines on how to contribute to the project.