Large Language Models are designed to perform human-language tasks effectively: translation, summarization, reasoning, classification, and capturing the contextual information, semantics, and syntax of language. But since the emergence of GPT-3.5 and open-source instruct models like Llama-2 and Zephyr, we have started using LLMs for knowledge tasks: asking them about theories in quantum physics or statistics, solving problems, debugging code, providing information about specific domain use cases, and more. Many of them can answer a lot of these questions, but once we ask about things an LLM doesn't know or never memorized, it hallucinates and gives wrong answers.
A solution to this problem, one that lets us use LLMs effectively and feed them up-to-date information, is to combine vector databases with prompt-engineering techniques and build Retrieval-Augmented Generation (RAG) systems. This technique reduces hallucination by providing the LLM with the context (knowledge) it needs to answer questions accurately.
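To make the idea concrete, here is a minimal sketch of the retrieve-then-augment flow, assuming a running Qdrant instance, an existing collection named "news", and payloads that store document text under a "text" key (the collection and payload names are hypothetical):

```python
from qdrant_client import QdrantClient
from sentence_transformers import SentenceTransformer

# Same embedding model that is cloned in the setup steps below.
encoder = SentenceTransformer("sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2")
client = QdrantClient(host="localhost", port=6333)

def build_rag_prompt(question: str, top_k: int = 3) -> str:
    # 1. Embed the question with the same model used to index the documents.
    query_vector = encoder.encode(question).tolist()
    # 2. Retrieve the most similar documents from the vector database.
    hits = client.search(collection_name="news", query_vector=query_vector, limit=top_k)
    context = "\n".join(hit.payload["text"] for hit in hits)
    # 3. Augment the prompt so the LLM answers from retrieved knowledge
    #    instead of relying only on what it memorized during training.
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )
```

The returned string is what you would send to the LLM; everything the model needs to answer is carried inside the prompt itself.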
To set up the project, first create the local data directories:

```bash
mkdir data/db
mkdir data/news
```

Then download the embedding model from Hugging Face (the weights are pulled via Git LFS):

```bash
mkdir sts
cd sts
git lfs install
git clone https://huggingface.co/sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2
```

Install the project dependencies and start the Qdrant vector database:

```bash
poetry install
make qdrant
```

The last step is the simplest, just run this command:

```bash
make run date='2023-11-09'
```
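Once everything is running, a quick way to verify the setup (the local model path and Qdrant's default port 6333 are assumptions based on the steps above):

```python
from qdrant_client import QdrantClient
from sentence_transformers import SentenceTransformer

# Load the model cloned above from its local path instead of downloading it at runtime.
encoder = SentenceTransformer("sts/paraphrase-multilingual-MiniLM-L12-v2")
vector = encoder.encode("sanity check")
print("embedding dimension:", len(vector))  # MiniLM-L12-v2 produces 384-dimensional vectors

# Confirm the Qdrant instance started by `make qdrant` is reachable.
client = QdrantClient(host="localhost", port=6333)
print(client.get_collections())
```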