When preparing for university exams, a study partner helps students discover knowledge gaps and clarify doubts about the topics covered in class. While LLM-based chatbots such as ChatGPT, Phind and Claude already help students, they cannot offer help specific to the lectures and material of a student's university courses. We propose a system to fine-tune chatbots on the material of specific courses. With it, we create study buddies for the courses of a typical university student, able to answer doubts, generate practice questions and more!
It's possible to test the chatbot at this link.

The implicit scope of the project (and of the entire course) is to build a scalable infrastructure that can host our MLOps workflow. For this reason, the traditional monolithic ML pipeline is split into three separate processes: the Feature Pipeline, the Training Pipeline and the Inference Pipeline.
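The split above can be sketched as three independent processes that only communicate through shared stores. This is an illustrative toy, not the project's actual code: the names (`run_feature_pipeline`, `FEATURE_STORE`, `MODEL_REGISTRY`) are hypothetical, and the in-memory dicts stand in for the external feature store and model registry a real deployment would use.

```python
# Toy sketch of the three-pipeline split. Each pipeline could run as a
# separate process/host, coupled only through the two stores below.

FEATURE_STORE = {}   # stands in for an external feature store
MODEL_REGISTRY = {}  # stands in for an external model registry

def run_feature_pipeline(raw_docs):
    """Turn raw course material into features and persist them."""
    FEATURE_STORE["chunks"] = [d.strip().lower() for d in raw_docs]

def run_training_pipeline():
    """Read features, 'train' a model, and register the artifact."""
    chunks = FEATURE_STORE["chunks"]
    # A real pipeline would fine-tune an LLM here; this toy just
    # builds a first-word lookup table as a stand-in artifact.
    MODEL_REGISTRY["model"] = {c.split()[0]: c for c in chunks}

def run_inference_pipeline(query):
    """Load the registered model and serve an answer."""
    model = MODEL_REGISTRY["model"]
    return model.get(query.lower(), "I don't know")

run_feature_pipeline(["Gradient descent minimises a loss function"])
run_training_pipeline()
print(run_inference_pipeline("gradient"))
# → gradient descent minimises a loss function
```

Because each stage reads its inputs from a store rather than from the previous stage's memory, the stages can be scheduled, scaled and re-run independently.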

The Feature Pipeline is in charge of:
There are several options to run the Feature Pipeline:
- Run the `FeaturePipeline/Reading.ipynb` notebook
- Run `FeaturePipeline/FeaturePipeline.py` with `python3 FeaturePipeline/FeaturePipeline.py`
- A slightly modified copy of the latter, `FeaturePipeline/FeaturePipeline_modal.py`, can be run on the Modal hosting service with `modal [run|deploy] FeaturePipeline/FeaturePipeline.py`
The Training Pipeline is in charge of:
To execute the Training Pipeline, run the notebook `TrainingPipeline/FineTuning.ipynb`.
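The README does not state which fine-tuning method the notebook uses, but with limited computational resources a parameter-efficient approach such as LoRA is a common choice for this kind of project. As a heavily hedged illustration (this is the general LoRA idea, not the notebook's code), a NumPy sketch of the core trick: the pretrained weight `W` stays frozen, and only a low-rank update `B @ A` is trained.

```python
import numpy as np

# Sketch of the LoRA idea often used for LLM fine-tuning (assumption:
# the project may use a different method). The frozen weight W is
# adapted by a low-rank product B @ A, so only r*(d_in+d_out)
# parameters are trained instead of d_in*d_out.
rng = np.random.default_rng(0)

d_in, d_out, r = 8, 8, 2                    # rank r << min(d_in, d_out)
W = rng.standard_normal((d_out, d_in))      # frozen pretrained weight
A = rng.standard_normal((r, d_in)) * 0.01   # trainable
B = np.zeros((d_out, r))                    # trainable, zero-initialised

def forward(x):
    # Effective weight is W + B @ A; with B = 0 at init, the adapted
    # model starts out identical to the pretrained one.
    return (W + B @ A) @ x

x = rng.standard_normal(d_in)
assert np.allclose(forward(x), W @ x)

# Trainable parameters vs. full fine-tuning:
print(A.size + B.size, "vs", W.size)  # → 32 vs 64
```

The parameter saving (here 32 vs 64; far larger for real LLM weight matrices) is what makes fine-tuning feasible on the modest hardware the report mentions.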
The Inference Pipeline is in charge of:
To execute the Inference Pipeline, run `streamlit run chatbot_app.py`.
While, experimentally, fine-tuning alone is not enough to make the foundation model consistently better than a non-fine-tuned one, the RAG-enabled chatbot not only answers the user's questions correctly following the original material, but also gives (mostly) correct references to where each answer comes from, an essential feature for a student preparing for a university exam!
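The references come naturally from the RAG retrieval step: each retrieved chunk keeps a pointer to its source, which is passed to the model alongside the text. A minimal sketch of that step, using bag-of-words cosine similarity (the real system presumably uses dense embeddings and a vector store; the chunk contents and source labels here are made up for illustration):

```python
from collections import Counter
import math

# Toy RAG retrieval: each chunk carries its source reference, so the
# chatbot can cite where an answer comes from. A real system would use
# embeddings instead of this bag-of-words cosine similarity.
CHUNKS = [
    ("lecture2.pdf, slide 5",
     "gradient descent updates weights against the loss gradient"),
    ("lecture7.pdf, slide 12",
     "dropout randomly disables units to reduce overfitting"),
]

def bow(text):
    return Counter(text.lower().split())

def cosine(a, b):
    num = sum(a[t] * b[t] for t in a)
    den = (math.sqrt(sum(v * v for v in a.values()))
           * math.sqrt(sum(v * v for v in b.values())))
    return num / den if den else 0.0

def retrieve(question, k=1):
    q = bow(question)
    ranked = sorted(CHUNKS, key=lambda c: cosine(q, bow(c[1])), reverse=True)
    return ranked[:k]

ref, text = retrieve("what does dropout do?")[0]
# The reference is prepended to the context fed to the LLM, so the
# answer can quote it back to the student.
prompt = f"Answer using this context.\n[{ref}] {text}\nQ: what does dropout do?"
print(ref)  # → lecture7.pdf, slide 12
```

Since the citation is attached to the chunk rather than generated by the model, it stays (mostly) correct even when the model's wording drifts.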
Fine-tuning does not work as well as intended due to the limited amount of training material and computational resources available. As future work, we want to improve the knowledge-extraction process and use more computational power to address the problems discussed in the report.