Vector_DB_with_LLM
FastAPI is a modern, fast (high-performance) web framework for building APIs with Python 3.8+ based on standard Python type hints. This repository provides a way to send records to the Prometheus-Export application.
LangChain (https://github.com/langchain-ai/langchain) is an open-source framework for developing applications powered by large language models (LLMs). LLMs are large deep learning models, pre-trained on large amounts of data, that can generate responses to user queries, for example answering questions or creating images from text-based prompts. A prompt is the query a person uses to elicit a response from an LLM.
VectorDB (e.g., Milvus, FAISS (Facebook AI Similarity Search), Chroma, Qdrant, Pinecone): a vector database (VectorDB) is designed to store and manage vectors. Vector data refers to numerical representations of objects, which can be used for similarity search, clustering, and other tasks. (Reference: https://krishna-yogik.medium.com/vectordb-tutorial-a-beginners-guide-06dc333fac2f)
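To make the similarity-search idea concrete, here is a minimal pure-Python sketch of what a vector database does conceptually: it stores vectors and ranks them by cosine similarity to a query vector. The toy three-dimensional vectors, the `store` dictionary, and the `top_k` helper are illustrative assumptions, not part of this repository; real engines such as Milvus or FAISS typically rely on specialized indexes rather than this brute-force scan.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a zero vector has no direction, so treat it as dissimilar
    return dot / (norm_a * norm_b)


def top_k(query, store, k=2):
    """Brute-force search: score every stored vector and keep the k best."""
    scored = [(doc_id, cosine_similarity(query, vec)) for doc_id, vec in store.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Hypothetical embeddings; a real system would obtain these from an embedding model.
store = {
    "cat": [0.9, 0.1, 0.0],
    "dog": [0.8, 0.2, 0.1],
    "car": [0.0, 0.1, 0.9],
}

results = top_k([1.0, 0.0, 0.0], store, k=2)
print(results)
```

With these toy vectors the query is closest to "cat", then "dog", while "car" points in a nearly orthogonal direction and is dropped.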
RAG (Retrieval-Augmented Generation) is an AI technique that lets companies automatically embed their most current and relevant proprietary data directly into their LLM prompts.
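The retrieve-then-prompt flow behind RAG can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the word-overlap `retrieve` function stands in for a real vector search, `build_prompt` is a hypothetical helper, and the sample sentences are paraphrased from the KTX sample document used later in this README. No LLM is called; the sketch stops at the prompt that would be sent.

```python
def retrieve(question, documents, k=2):
    """Toy retriever: rank documents by the number of words shared with the question."""
    question_words = set(question.lower().split())
    ranked = sorted(
        documents,
        key=lambda doc: len(question_words & set(doc.lower().split())),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(question, context_docs):
    """Place the retrieved passages in front of the question."""
    context = "\n".join(f"- {doc}" for doc in context_docs)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )


# Hypothetical knowledge base standing in for proprietary data.
documents = [
    "KTX fares may change depending on the operator.",
    "Streamlit runs Python scripts as web apps.",
    "The minimum KTX fare is 8,100 won.",
]

question = "What is the minimum KTX fare?"
prompt = build_prompt(question, retrieve(question, documents))
print(prompt)
```

In a real pipeline the retriever would query one of the vector databases listed above, and the resulting prompt would be passed to the LLM.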
Apache Flink: Apache Kafka and Apache Flink are two powerful tools in big data and stream processing. While Kafka is known for its robust messaging system, Flink excels at real-time stream processing and analytics.
Streamlit is an open-source Python framework for data scientists and AI/ML engineers to deliver interactive data apps.
    pip install streamlit
    pip install streamlit-chat
    streamlit run [streamlit-filename.py] [--server.port 30001]
    streamlit run app.py

    black==23.3.0
    mypy==1.4.1
    pre-commit==3.3.3
    watchdog
    pytest

Gradio (https://www.gradio.app/guides/quickstart): Gradio is an open-source Python package that lets you quickly build a demo or web application for your machine learning model, API, or any arbitrary Python function. You can then share a link to your demo or web application in just a few seconds using Gradio's built-in sharing features.
Run Gradio with ./gradio-start.sh (install or upgrade it first with pip install --upgrade gradio).
Elasticsearch Gen-AI: https://github.com/elastic-codotbot
GitHub Codespaces (https://github.com/features/codespaces, https://velog.io/@profile_exe/github-codespaces): GitHub Codespaces gets you up and coding faster with fully configured, secure cloud development environments native to GitHub.
PyAutoGUI lets your Python scripts control the mouse and keyboard to automate interactions with other applications. The API is designed to be simple. PyAutoGUI works on Windows, macOS, and Linux, and runs on Python 2 and 3. To install with pip, run pip install pyautogui. See the installation page for more details (https://pyautogui.readthedocs.io/en/latest/).
    import time

    import pyautogui

    # Every 10 seconds: print the cursor position, then move to and click (100, 200)
    while True:
        print(pyautogui.position())
        pyautogui.moveTo(100, 200)
        pyautogui.click(100, 200)
        # pyautogui.moveTo(200, 200, duration=0.5)
        time.sleep(10)

Jupyter (./jupyter-notebook.sh): http://localhost:8889/tree/Langchain/workflow/jupyter-workflow

Install Python 3.9 from source:

    sudo yum install gcc openssl-devel bzip2-devel libffi-devel zlib-devel git
    wget https://www.python.org/ftp/python/3.9.0/Python-3.9.0.tgz
    tar -zxvf Python-3.9.0.tgz    # or: tar -xvf Python-3.9.0.tgz
    cd Python-3.9.0
    ./configure --libdir=/usr/lib64
    sudo make
    sudo make altinstall

    # python3 -m venv .venv --without-pip
    sudo yum install python3-pip
    sudo ln -s /usr/lib64/python3.9/lib-dynload/ /usr/local/lib/python3.9/lib-dynload

Install OpenSSL, then Python 3.11:

    # -- From Python 3.10 onward, OpenSSL needs to be installed
    # openssl
    cd /usr/local/src
    wget https://www.openssl.org/source/openssl-1.1.1t.tar.gz
    tar xvf openssl-1.1.1t.tar.gz
    cd openssl-1.1.1t/
    ./config --prefix=/usr/local/ssl --openssldir=/usr/local/ssl shared zlib
    make
    sudo make install
    export LDFLAGS="-L/usr/local/ssl/lib"
    export CPPFLAGS="-I/usr/local/ssl/include"

    # Check the OpenSSL version
    /usr/local/ssl/bin/openssl version
    export LD_LIBRARY_PATH=/usr/local/ssl/lib:$LD_LIBRARY_PATH
    echo $LD_LIBRARY_PATH

    sudo yum install gcc openssl-devel bzip2-devel libffi-devel zlib-devel git
    wget https://www.python.org/ftp/python/3.11.0/Python-3.11.0.tgz
    tar -zxvf Python-3.11.0.tgz    # or: tar -xvf Python-3.11.0.tgz
    cd Python-3.11.0
    # Adding the --with-openssl-rpath=auto option lets Python find the correct OpenSSL library path automatically
    # ./configure --libdir=/usr/lib64 --with-openssl=/usr/local/ssl --with-openssl-rpath=auto
    ./configure --libdir=/usr/lib64 --with-openssl=/usr/bin/ssl --with-openssl-rpath=auto
    sudo make
    sudo make altinstall

    # -- Error occurs when installing packages via pip like below
    (.venv) -bash-4.2$ pip install elasticsearch==7.13
    WARNING: pip is configured with locations that require TLS/SSL, however the ssl module in Python is not available.
    WARNING: Retrying (Retry(total=4, connect=None, read=None, redirect=None, status=None)) after connection broken by 'SSLError("Can't connect to HTTPS URL because the SSL module is not available.")': /simple/elasticsearch/
    WARNING: Retrying (Retry(total=3, connect=None, read=None, redirect=None, status=None)) after connection broken by 'SSLError("Can't connect to HTTPS URL because the SSL module is not available.")': /simple/elasticsearch/
    ERROR: Operation cancelled by user

Create the virtual environment and install the dependencies:

    # --
    python -m venv .venv
    source .venv/bin/activate

    # -- Swagger
    pip install poetry
    poetry add fastapi
    poetry add uvicorn
    poetry add gunicorn
    poetry add pytz
    poetry add httpx
    poetry add pytest
    poetry add pytest-cov
    poetry add requests
    poetry add pyyaml
    poetry add elasticsearch==7.13
    poetry add python-dotenv
    poetry add jupyter
    # --

    # -- Vector
    poetry config virtualenvs.in-project true
    pip install poetry
    poetry init
    poetry add openai langchain langchainhub tiktoken chromadb langchain-community bs4 python-dotenv
    poetry add sentence-transformers
    poetry add pypdf
    poetry add docx2txt
    poetry add faiss-cpu
    poetry add requests
    pip install --q openai langchain langchainhub tiktoken chromadb langchain-community bs4

    # When an error like this occurs:
    # ImportError: urllib3 v2 only supports OpenSSL 1.1.1+, currently the 'ssl' module is compiled with 'OpenSSL 1.0.2k-fips 26 Jan 2017'.
    # See: https://github.com/urllib3/urllib3/issues/2168
    pip install urllib3==1.26.18
    pip install pytz
    pip install requests==2.27.1

./service-start.sh

    pip install git+https://github.com/jm/git_pip_install.git
    # or git+https://oss.navercorp.com/nsml/nsml_notebook.git@branch_name
    # or git+https://oss.navercercorp.com/nsml

./Langchain/workflow/

    curl -X 'POST' 'http://localhost:7001/vector/uploadfile' \
      -H 'accept: application/json' \
      -H 'Content-Type: multipart/form-data' \
      -F '[email protected];type=application/msword'

    {
      "filename": [
        { "index": { "_index": "test_context", "_type": "search" } },
        { "ES_UPLOADED": "JSON_FORMAT", "CONTENT": "This is a test Word document for the TemplatePackage example site." }
      ]
    }

    python ./Langchain/workflow/text_loader.py

    ***
    type : <class 'list'> / len : 1
    data : [Document(metadata={'source': 'C:\Users\euiyoung.hwang\Git_Workspace\Vector_DB_with_LLM/Data/Sample.hwp'}, page_content='KTX 노선도ﺎĀ Ā※ KTX 소요 시간과 운임은 철도청의 사정에 따라 변동할 수 있습니다.Ā서울용산광명Ā천안아산대전서대전동대구익산논산김제밀양정읍구포장성광주부산광주송정나주목포ĀྠĀ 경부선 ྠĀ 호남선ĀĀKTX 소요시간호남선 (서울~천안아산 경부선과 동일)서울서대전논산익산김제정읍장성광주광주송정나주목포시:분6:226:517:15-7:38--8:06-8:38경부선서울용산광명천안아산대전동대구밀양구포부산시:분5:305:456:246:307:187:498:148:27KTX 운임안내최저운임8,100원 (월~목 요금제 기준)호남선 (단위:원)행신8,1008,10015,10024,50026,40028,80031,20032,80035,00037,10039,00040,40044,80040,000용산8,10013,30022,70024,70027,20029,60031,20033,40036,30037,50038,90043,30038,400광명11,30020,70022,60025,10028,60029,40031,60034,50036,40037,10041,50036,700천안아산9,40011,40014,00017,70019,10021,70024,70026,60027,70032,20027,300서대전8,1008,1008,30010,00012,70015,90018,10019,70024,30019,300계룡8,1008,1008,10010,70014,00016,10017,70022,50017,400논산8,1008,1008,10011,40013,50015,10020,20014,800익산8,1008,1008,1009,80011,40016,50011,100김제8,1008,1008,1009,70014,8009,300정읍8,1008,1008,10012,1008,100장성8,1008,1008,9008,100광주송정8,1008,100-나주8,100-목포-광주경부선 (단위:원)행신8,1008,10014,10022,80039,70044,20047,80049,100서울8,10012,70021,40038,40043,00046,60047,900광명10,50019,20036,50041,00044,60046,000천안아산8,70025,60029,40032,90034,200대전16,90022,10025,30026,700동대구8,1009,30010,800밀양8,1008,100구포8,100부산')]
    page_content : KTX 노선도ﺎĀ Ā※ KTX 소요 시간과 운임은 철도청의 사정에 따라 변동할 수 있습니다.Ā서울용산광명Ā천안아산대전서대전동대구익산논산김제밀양정읍구포장성광주부산광주송정나주목포ĀྠĀ 경부선 ྠĀ 호남선ĀĀKTX 소요시간호남선 (서울~천안아산 경부선과 동일)서울서대전논산익산김제정읍장성광주광주송정나주목포시:분6:226:517:15-7:38--8:06-8:38경부선서울용산광명천안아산대전동대구밀양구포부산시:분5:305:456:246:307:187:498:148:27KTX 운임안내최저운임8,100원 (월~목 요금제 기준)호남선 (단위:원)행신8,1008,10015,10024,50026,40028,80031,20032,80035,00037,10039,00040,40044,80040,000용산8,10013,30022,70024,70027,20029,60031,20033,40036,30037,50038,90043,30038,400광명11,30020,70022,60025,10028,60029,40031,60034,50036,40037,10041,50036,700천안아산9,40011,40014,00017,70019,10021,70024,70026,60027,70032,20027,300서대전8,1008,1008,30010,00012,70015,90018,10019,70024,30019,300계룡8,1008,1008,10010,70014,00016,10017,70022,50017,400논산8,1008,1008,10011,40013,50015,10020,20014,800익산8,1008,1008,1009,80011,40016,50011,100김제8,1008,1008,1009,70014,8009,300정읍8,1008,1008,10012,1008,100장성8,1008,1008,9008,100광주송정8,1008,100-나주8,100-목포-광주경부선 (단위:원)행신8,1008,10014,10022,80039,70044,20047,80049,100서울8,10012,70021,40038,40043,00046,60047,900광명10,50019,20036,50041,00044,60046,000천안아산8,70025,60029,40032,90034,200대전16,90022,10025,30026,700동대구8,1009,30010,800밀양8,1008,100구포8,100부산
    ***
    [
      { "index": { "_index": "test_context", "_type": "search" } },
      { "ES_UPLOADED": "JSON_FORMAT", "CONTENT": "KTX 노선도ﺎĀ Ā※ KTX 소요 시간과 운임은 철도청의 사정에 따라 변동할 수 있습니다.Ā서울용산광명Ā천안아산대전서대전동대구익산논산김제밀양정읍구포장성광주부산광주송정나주목포ĀྠĀ 경부선 ྠĀ 호남선ĀĀKTX 소요시간호남선 (서울~천안아산 경부선과 동일)서울서대전논산익산김제정읍장성광주광주송정나주목포시:분6:226:517:15-7:38--8:06-8:38경부선서울용산광명천안아산대전동대구밀양구포부산시:분5:305:456:246:307:187:498:148:27KTX 운임안내최저운임8,100원 (월~목 요금제 기준)호남선 (단위:원)행신8,1008,10015,10024,50026,40028,80031,20032,80035,00037,10039,00040,40044,80040,000용산8,10013,30022,70024,70027,20029,60031,20033,40036,30037,50038,90043,30038,400광명11,30020,70022,60025,10028,60029,40031,60034,50036,40037,10041,50036,700천안아산9,40011,40014,00017,70019,10021,70024,70026,60027,70032,20027,300서대전8,1008,1008,30010,00012,70015,90018,10019,70024,30019,300계룡8,1008,1008,10010,70014,00016,10017,70022,50017,400논산8,1008,1008,10011,40013,50015,10020,20014,800익산8,1008,1008,1009,80011,40016,50011,100김제8,1008,1008,1009,70014,8009,300정읍8,1008,1008,10012,1008,100장성8,1008,1008,9008,100광주송정8,1008,100-나주8,100-목포-광주경부선 (단위:원)행신8,1008,10014,10022,80039,70044,20047,80049,100서울8,10012,70021,40038,40043,00046,60047,900광명10,50019,20036,50041,00044,60046,000천안아산8,70025,60029,40032,90034,200대전16,90022,10025,30026,700동대구8,1009,30010,800밀양8,1008,100구포8,100부산" }
    ]

http://localhost:7001/docs
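The upload response above is an action line followed by a document line, which is the shape the Elasticsearch bulk API expects. The sketch below shows one way such a pair could be built and serialized as newline-delimited JSON. The helper names `to_bulk_actions` and `to_ndjson` are hypothetical, and whether this repository actually posts the pair to the `_bulk` endpoint is an assumption; the field names and the `test_context` index are copied from the output shown above.

```python
import json


def to_bulk_actions(text, index_name="test_context"):
    """Build the action/document pair shown in the upload response above."""
    return [
        # Action line: where the document should be indexed.
        {"index": {"_index": index_name, "_type": "search"}},
        # Source line: the document body itself.
        {"ES_UPLOADED": "JSON_FORMAT", "CONTENT": text},
    ]


def to_ndjson(actions):
    """Serialize one JSON object per line; a bulk body must end with a newline."""
    return "\n".join(json.dumps(action, ensure_ascii=False) for action in actions) + "\n"


actions = to_bulk_actions("This is a test Word document for the TemplatePackage example site.")
body = to_ndjson(actions)
print(body)
```

`ensure_ascii=False` keeps non-ASCII text such as the Korean sample content readable in the serialized body instead of escaping it.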
    source .venv/bin/activate
    poetry run py.test -v --junitxml=test-reports/junit/pytest.xml --cov-report html --cov tests/

./pytest.sh

    $ ./pytest.sh
    tests\test_api.py::test_skip SKIPPED (no way of currently testing this)  [ 50%]
    tests\test_api.py::test_api PASSED                                       [100%]

    ---------- coverage: platform win32, python 3.11.7-final-0 -----------
    Name                                Stmts   Miss  Cover   Missing
    ----------------------------------------------------------------
    config\config.py                        8      4    50%   16-20, 33
    config\log_config.py                   32      1    97%   42
    controller\__init__.py                  0      0   100%
    controller\cluster_controller.py       14      0   100%
    injector.py                            25      0   100%
    main.py                                23      9    61%   38-57, 69
    service\__init__.py                     0      0   100%
    service\es_search_handler.py          139    103    26%   32-88, 131-132, 139-155, 160-173, 178-191, 196-213, 219-254, 259-271, 276-287, 292-304, 309-315
    service\es_util.py                     21     16    24%   5-11, 16-19, 24-35, 40-41
    service\query_builder.py               42     27    36%   21-40, 48-51, 64-83, 89-98, 102-137
    service\status_handler.py              13      2    85%   12, 19
    tests\__init__.py                       0      0   100%
    tests\conftest.py                       8      0   100%
    tests\test_api.py                       9      1    89%   7
    ----------------------------------------------------------------
    TOTAL                                 334    163    51%
    $

CircleCI (./circleci/config.yml): CircleCI is a continuous integration and continuous delivery platform that helps software teams work smarter and faster. With CircleCI, every commit kicks off a new job on the platform, and the code is built, tested, and deployed.

GitHub Actions (./.github/workflows/build-and-test.yml): GitHub Actions is a continuous integration and continuous delivery (CI/CD) platform that lets you automate your build, test, and deployment pipeline. You can create workflows that build and test every pull request to your repository, or deploy merged pull requests to production.

    # -- /etc/systemd/system/vector_interface_api.service
    [Unit]
    Description=Swagger ES Service

    [Service]
    User=devuser
    Group=devuser
    Type=simple
    ExecStart=/bin/bash /home/devuser/Git_Repo/service-start.sh
    ExecStop=/usr/bin/killall vector_interface_api

    [Install]
    WantedBy=default.target

    # Service command
    sudo systemctl daemon-reload
    sudo systemctl start vector_interface_api.service
    sudo systemctl status vector_interface_api.service
    sudo systemctl stop vector_interface_api.service

    sudo service vector_interface_api status/stop/start