RAG (Retrieval-Augmented Generation) app integrated with a voice assistant and knowledge base management system.
This application integrates a RAG (Retrieval-Augmented Generation) model with a voice assistant, allowing users to interact with the system via voice or text input. Additionally, it includes a knowledge base management system, enabling users to add, view, and delete documents used by the RAG model via URLs.
The application is deployed on Streamlit Share and can be accessed at the following URL:
LangChain is a framework designed for building applications that leverage language models. It provides tools for connecting language models to external data sources, enabling more complex and contextual interactions.
The application uses several OpenAI models to provide conversational capabilities and document retrieval:
gpt-3.5-turbo) to generate responses based on user queries and previous conversation context.whisper-1) for automatic speech recognition to transcribe audio inputs from users.Additionally, Cohere Re-ranker (default: rerank-english-v2.0) to improve the relevance of retrieved documents by re-ranking them based on their relevance to the query.
DeepLake is used as a vector store to store and retrieve document embeddings. It facilitates efficient similarity search and retrieval of relevant documents from the knowledge base.
Apify is a web scraping and automation platform that allows for the extraction of data from websites. It is used to scrape documents from URLs provided by users and store them in the knowledge base.
Streamlit is an open-source app framework that allows for the creation of custom web applications for machine learning and data science projects with minimal effort. It is used here to build the user interface of the application.
To install the application locally, you need to have Docker installed on your machine. Then, run following commands:
docker build -t rag-with-knowledge-base-management .docker run -p 8501:8501 rag-with-knowledge-base-managementThe application should now be accessible at http://localhost:8501.
Please make sure to add your API keys to the .env file before running the application. The following keys inside .env.example need to be filled in:
OPENAI_API_KEY - OpenAI API keyCOHERE_API_KEY - Cohere API keyAPIFY_API_TOKEN - Apify API tokenACTIVELOOP_TOKEN - ActiveLoop API tokenACTIVELOOP_ORG_ID - ActiveLoop organization IDDistributed under the open-source Apache 2.0 License. See LICENSE for more information.
Following repositories were useful in building this project: