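This project is a Flask-based API designed to retrieve documents using Pinecone for vector search. It includes features like rate limiting, caching, background scraping, and Docker support.
The application uses Flask, Pinecone, Hugging Face’s transformers library (BERT embeddings), Flask-Limiter, Flask-Caching, BeautifulSoup, SQLite, and Docker.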
We started by setting up the basic Flask application and API endpoints:
/health: A simple endpoint to check if the API is running.
/search: An endpoint to query Pinecone with text embeddings and retrieve results.
For each query, we generate embeddings using a pre-trained BERT model (via Hugging Face’s transformers library). These embeddings are used to perform vector searches using Pinecone.
We integrated Pinecone, a vector database, to store and query document embeddings. This allows efficient and fast retrieval of documents based on similarity search.
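To make the similarity search concrete, here is a minimal pure-Python sketch of what a vector query does conceptually: it ranks stored embeddings by cosine similarity to the query embedding and keeps the best top_k. This is not Pinecone's API or its implementation. The function names `cosine_similarity` and `top_k_matches` and the toy 3-dimensional vectors are assumptions made for illustration.

```python
import math
from typing import Dict, List, Tuple


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # Treat a zero vector as dissimilar to everything.
    return dot / (norm_a * norm_b)


def top_k_matches(
    query: List[float], index: Dict[str, List[float]], top_k: int = 3
) -> List[Tuple[str, float]]:
    """Score every stored vector against the query and keep the best top_k."""
    scored = [(doc_id, cosine_similarity(query, vec)) for doc_id, vec in index.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Hypothetical toy index: document IDs mapped to made-up embeddings.
toy_index = {
    "doc-a": [1.0, 0.0, 0.0],
    "doc-b": [0.9, 0.1, 0.0],
    "doc-c": [0.0, 1.0, 0.0],
}
results = top_k_matches([1.0, 0.0, 0.0], toy_index, top_k=2)
```

A brute-force scan like this is linear in the number of documents; a vector database exists precisely to avoid that cost at scale.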
We implemented rate limiting using Flask-Limiter to restrict each user to at most 5 requests per minute.
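The idea behind a "5 requests per minute" limit can be sketched with a small sliding-window counter. This is a standard-library illustration only, not Flask-Limiter's implementation or API. The class name `SlidingWindowLimiter`, the per-key bookkeeping, and the injectable `clock` parameter are assumptions introduced here.

```python
import time
from collections import defaultdict, deque
from typing import Callable, Deque, Dict


class SlidingWindowLimiter:
    """Allow at most `limit` calls per `window_seconds` for each key."""

    def __init__(
        self,
        limit: int = 5,
        window_seconds: float = 60.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.limit = limit
        self.window_seconds = window_seconds
        self.clock = clock  # Injectable so the behavior can be tested without sleeping.
        self._hits: Dict[str, Deque[float]] = defaultdict(deque)

    def allow(self, key: str) -> bool:
        """Record a request for `key` and report whether it is within the limit."""
        now = self.clock()
        hits = self._hits[key]
        # Drop timestamps that have aged out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.limit:
            return False  # Over the limit: the caller would respond with HTTP 429.
        hits.append(now)
        return True
```

In an API, the key would typically be a user ID or client address, and a rejected request would receive a 429 (Too Many Requests) response.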
We added caching using Flask-Caching. Caching ensures that identical queries are served from memory, reducing the need to hit the database and vector search engine repeatedly. Cached results expire after 5 minutes.
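The caching behavior described above (identical queries served from memory, entries expiring after 5 minutes) can be sketched as a small time-to-live cache. This is a standard-library illustration, not Flask-Caching's API. The class name `TTLCache`, the `get_or_compute` method, and the injectable `clock` are assumptions made for the example.

```python
import time
from typing import Any, Callable, Dict, Hashable, Tuple


class TTLCache:
    """In-memory cache whose entries expire `ttl_seconds` after being stored."""

    def __init__(
        self, ttl_seconds: float = 300.0, clock: Callable[[], float] = time.monotonic
    ) -> None:
        self.ttl_seconds = ttl_seconds
        self.clock = clock  # Injectable so expiry can be tested without waiting.
        self._store: Dict[Hashable, Tuple[float, Any]] = {}

    def get_or_compute(self, key: Hashable, compute: Callable[[], Any]) -> Any:
        """Return the cached value for `key`, recomputing it if missing or expired."""
        now = self.clock()
        entry = self._store.get(key)
        if entry is not None and now < entry[0]:
            return entry[1]  # Fresh hit: skip the expensive call entirely.
        value = compute()
        self._store[key] = (now + self.ttl_seconds, value)
        return value
```

For a search endpoint, the key would need to include everything that affects the result, for example the query text together with top_k.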
We implemented a background scraper that can scrape a user-provided website for articles or data and update the Pinecone index with new documents. Pages are parsed with BeautifulSoup.
We Dockerized the project using a Dockerfile. This allows the project to be deployed in any environment with consistent behavior across systems.
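As a rough illustration of the background scraper described above, the sketch below runs a scrape in a separate thread so the request that triggered it can return immediately. To stay dependency-free it uses the standard library's `html.parser` as a stand-in for BeautifulSoup, and it only extracts paragraph text. The names `ParagraphExtractor` and `scrape_in_background`, and the `fetch` and `store` callbacks (which would download a page and write to the index, respectively), are hypothetical.

```python
import threading
from html.parser import HTMLParser
from typing import Callable, List


class ParagraphExtractor(HTMLParser):
    """Collect the text of each <p> element (stand-in for BeautifulSoup)."""

    def __init__(self) -> None:
        super().__init__()
        self.paragraphs: List[str] = []
        self._depth = 0  # How many <p> elements we are currently inside.
        self._buffer: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag == "p":
            self._depth += 1

    def handle_endtag(self, tag):
        if tag == "p" and self._depth > 0:
            self._depth -= 1
            if self._depth == 0:
                text = "".join(self._buffer).strip()
                if text:  # Skip paragraphs that contain only whitespace.
                    self.paragraphs.append(text)
                self._buffer = []

    def handle_data(self, data):
        if self._depth > 0:
            self._buffer.append(data)


def scrape_in_background(
    url: str, fetch: Callable[[str], str], store: Callable[[str, str], None]
) -> threading.Thread:
    """Fetch `url`, extract paragraphs, and pass each one to `store` in a worker thread."""

    def worker() -> None:
        parser = ParagraphExtractor()
        parser.feed(fetch(url))
        parser.close()
        for text in parser.paragraphs:
            store(url, text)

    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    return thread
```

Passing `fetch` and `store` in as callables keeps the sketch testable without network access or a real index.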
project/
├── app.py # Main Flask application
├── database.py # Database setup for user management
├── cache.py # Caching configuration
├── limiter.py # Rate limiting configuration
├── utils.py # Utility functions (embedding, Pinecone query)
├── scraping.py # Background scraping logic
├── requirements.txt # Python dependencies
├── Dockerfile # Docker configuration
├── .env # Environment variables (not committed to version control)
├── .dockerignore # Ignore unnecessary files in the Docker build
└── README.md # Project documentation
app.py: Contains the Flask application and all API routes.
database.py: Handles the setup and schema for user management using SQLite.
cache.py: Manages caching for faster response times.
limiter.py: Implements rate-limiting functionality.
utils.py: Provides helper functions for generating embeddings and querying Pinecone.
scraping.py: Contains the logic for background scraping and updating the Pinecone index.
Dockerfile: Used to build and run the application in a Docker container.
git clone <repository-url>
cd project
python -m venv venv
source venv/bin/activate # On Windows, use venv\Scripts\activate
pip install -r requirements.txt
Create a .env file in the project root and add your Pinecone API key and environment:
PINECONE_API_KEY=your_pinecone_api_key
PINECONE_ENVIRONMENT=your_pinecone_environment
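A small startup check can catch a missing key before the first request fails. The sketch below parses KEY=VALUE lines and reports which required settings are absent. It is an illustration only: how the application actually loads its .env file is not stated here, and the helper names `parse_env_lines` and `missing_keys` are hypothetical.

```python
from typing import Dict, Iterable, List

# The two settings named in the .env example above.
REQUIRED_KEYS = ("PINECONE_API_KEY", "PINECONE_ENVIRONMENT")


def parse_env_lines(lines: Iterable[str]) -> Dict[str, str]:
    """Parse simple KEY=VALUE lines, ignoring blanks and # comments."""
    values: Dict[str, str] = {}
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Strip optional surrounding quotes from the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_keys(values: Dict[str, str]) -> List[str]:
    """Return the required keys that are absent or empty."""
    return [key for key in REQUIRED_KEYS if not values.get(key)]
```

Typical use would be to read the file with `open(".env")`, pass the file object to `parse_env_lines`, and exit with a clear message if `missing_keys` returns anything.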
To set up the database, run the following code:
>>> from app import db, app
>>> with app.app_context():
...     db.create_all()
python app.py
The app will be running at http://localhost:5000.
docker build -t flask-app .
docker run -p 5000:5000 flask-app
Now, your app will be running at http://localhost:5000.
URL: /health
Method: GET
Description: Checks if the API is running.
Response:
{
    "status": "API is running"
}
URL: /search
Method: POST
Description: Search documents based on text queries.
Request Body:
{
    "query": "Your search query",
    "user_id": "user123",
    "top_k": 3
}
Response: Returns a list of matching documents based on the query.
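For reference, a client call to /search could look roughly like the sketch below, which builds the POST request with the standard library. It assumes the server is running locally on port 5000 and accepts a JSON body with a `Content-Type: application/json` header. The helper name `build_search_request` is hypothetical, and the response is left unparsed because its exact shape is not specified here.

```python
import json
import urllib.request


def build_search_request(
    base_url: str, query: str, user_id: str, top_k: int = 3
) -> urllib.request.Request:
    """Build (but do not send) a POST request for the /search endpoint."""
    payload = {"query": query, "user_id": user_id, "top_k": top_k}
    return urllib.request.Request(
        url=f"{base_url}/search",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_search_request("http://localhost:5000", "Your search query", "user123")

# To send it against a running server:
# with urllib.request.urlopen(request) as response:
#     print(json.load(response))
```

Keep in mind that repeated calls from the same client may be rejected once the rate limit is exceeded.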
URL: /start_scraping
Method: POST
Description: Starts the background scraping process for a specific site.
Request Body:
{
    "url": "https://example.com"
}
Response:
{
    "message": "Started scraping for https://example.com"
}
API logs are written to api.log.
Background scraping logs are written to scraping.log.