Téléchargement langport - Téléchargement du code source langport

langport

Code Source AI

0.3.11

Télécharger

Chouchou

architecture

Langport est une plate-forme de service de grande langue open source. Notre objectif est de construire un service d'inférence LLM super rapide.

Ce projet est inspiré par LMSYS / FASTCHAT, nous espérons que la plate-forme de service est légère et rapide, mais FastChat comprend d'autres fonctionnalités telles que la formation et l'évaluation le compliquent.

Les caractéristiques de base comprennent:

Support de transformateurs à étreindre.
Support GGML (LLAMA.CPP).
Un système de service distribué pour les modèles de pointe.
Support de génération de streaming avec diverses stratégies de décodage.
Inférence par lots pour un débit plus élevé.
Prise en charge des modèles d'encodeur uniquement, de décodeur uniquement et d'encodeur.
API RESTFul compatible Openai.
API RESTful compatible en faussespilote.
API RESTful compatible en étreinte.
API RESTful compatible tabby.

Architectures de modèle de support

LLAMA, LLAMA2, GLM, BLOOM, OPT, GPT2, GPT NEO, GPT BIG CODE et ainsi de suite.

Modèles testés

Ningyu, lama, lama2, vicuna, chatglm, chatglm2, falcon, starcoder, wizardlm, interlm, openbuddy, firefly, codegen, phoenix, rwkv, stablelm et ainsi de suite.

Nouvelles

[2024/01/13] Présentez le ChatProto .
[2023/08/04] Inférence dynamique du lot.
[2023/07/16] Prise en charge la quantification INT4.
[2023/07/13] Paramètre LOGPROBS de génération de support.
[2023/06/18] Ajouter GGML (LLAMA.CPP GPT.CPP STARCODER.CPP ETC.) Soutien des travailleurs.
[2023/06/09] Ajouter un soutien de travailleur LLAMA.CPP.
[2023/06/01] Ajoutez un support de travailleur d'intégration de Bert FaceFface.
[2023/06/01] Ajoutez une prise en charge de l'API de génération de texte HuggingFace.
[2023/06/01] Ajouter une prise en charge de l'API Tabby.
[2023/05/23] Ajoutez un script de test de débit de chat.
[2023/05/22] Nouvelle architecture distribuée.
[2023/05/14] L'inférence du lot est prise en charge.
[2023/05/10] Le projet Langport a commencé.

Installer

Méthode 1: avec pip

pip install langport

ou:

pip install git+https://github.com/vtuber-plan/langport.git

Si vous avez besoin d'un travailleur de la génération GGML, utilisez cette commande:

pip install langport[ggml]

Si vous souhaitez utiliser GPU:

CT_CUBLAS=1 pip install langport[ggml]

Méthode 2: De la source

Cloner ce référentiel

git clone https://github.com/vtuber-plan/langport.git
cd langport

Installer le package

pip install --upgrade pip
pip install -e .

Démarrage rapide

Il est simple de démarrer un service API de chat local:

Tout d'abord, démarrez un processus de travailleur dans le terminal:

python -m langport.service.server.generation_worker --port 21001 --model-path < your model path >

Ensuite, démarrez un service API dans un autre terminal:

python -m langport.service.gateway.openai_api

Maintenant, vous pouvez utiliser l'API d'inférence par le protocole OpenAI.

Démarrer le serveur

Il est simple de démarrer un service API de chat de nœud unique:

python -m langport.service.server.generation_worker --port 21001 --model-path < your model path >
python -m langport.service.gateway.openai_api

Si vous avez besoin d'un seul serveur API Embeddings Node:

python -m langport.service.server.embedding_worker --port 21002 --model-path bert-base-chinese --gpus 0 --num-gpus 1
python -m langport.service.gateway.openai_api --port 8000 --controller-address http://localhost:21002

Si vous avez besoin de l'API Embeddings ou d'autres fonctionnalités, vous pouvez déployer un cluster d'inférence distribué:

python -m langport.service.server.dummy_worker --port 21001
python -m langport.service.server.generation_worker --model-path < your model path > --neighbors http://localhost:21001
python -m langport.service.server.embedding_worker --model-path < your model path > --neighbors http://localhost:21001
python -m langport.service.gateway.openai_api --controller-address http://localhost:21001

En pratique, la passerelle peut se connecter à n'importe quel nœud pour distribuer des tâches d'inférence:

python -m langport.service.server.dummy_worker --port 21001
python -m langport.service.server.generation_worker --port 21002 --model-path < your model path > --neighbors http://localhost:21001
python -m langport.service.server.generation_worker --port 21003 --model-path < your model path > --neighbors http://localhost:21001 http://localhost:21002
python -m langport.service.server.generation_worker --port 21004 --model-path < your model path > --neighbors http://localhost:21001 http://localhost:21003
python -m langport.service.server.generation_worker --port 21005 --model-path < your model path > --neighbors http://localhost:21001 http://localhost:21004
python -m langport.service.gateway.openai_api --controller-address http://localhost:21003 # 21003 is OK!
python -m langport.service.gateway.openai_api --controller-address http://localhost:21002 # Any worker is also OK!

Exécutez la génération de texte avec plusieurs GPU:

python -m langport.service.server.generation_worker --port 21001 --model-path < your model path > --gpus 0,1 --num-gpus 2
python -m langport.service.gateway.openai_api

Exécutez la génération de texte avec un travailleur GGML:

python -m langport.service.server.ggml_generation_worker --port 21001 --model-path < your model path > --gpu-layers < num layer to gpu (resize this for your VRAM) >

Exécutez Openai Forward Server:

python -m langport.service.server.chatgpt_generation_worker --port 21001 --api-url < url > --api-key < key >

Licence

Langport est publié sous la licence logicielle Apache.

Voir aussi

Langport-docs
Langport-source

Histoire des étoiles

Développer

Informations supplémentaires

Version 0.3.11
Type Code Source AI
Date de mise à jour 2025-09-09
taille 323.39KB
Provenant de Github

Applications connexes

ML stack

2025-07-01
awesome free chatgpt

2025-01-04
pywin_contextmenu

2025-08-31
promptl

2025-02-17
tick.chat

2025-09-16
FastLoRAChat

2025-09-03

Recommandé pour vous

chat.petals.dev

Autre code source

1.0.0
GPT Prompt Templates

Autre code source

1.0.0
GPTyped

Autre code source

GPTyped 1.0.5
ML stack

Code Source AI

1.0.0
awesome free chatgpt

Code Source AI

1.0.0
pywin_contextmenu

Code Source AI

Version update
Google Dorks

Autre code source

1.0
shepherd

Autre code source

v6.1.6-react-shepherd: Prepare Release (#3063)
mongo express

Autre code source

v1.1.0-rc-3

Actualités connexes Tout