Uni-Api

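Introduction

For personal use, one/new-api is too complex and carries many commercial features that individuals do not need. If you do not want a complicated frontend interface and would like support for more models, you can try uni-api. It is a project for unified management of large language model APIs: you call multiple backend services through a single unified API interface, all of them are converted to the OpenAI format, and load balancing is supported. Currently supported backend services include OpenAI, Anthropic, Gemini, Vertex, Cohere, Groq, Cloudflare, OpenRouter, and more.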

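Features

No frontend: API channels are configured purely through a configuration file. You can run your own API station just by writing one file, and the documentation includes a detailed, beginner-friendly configuration guide.
Unified management of multiple backend services, supporting OpenAI-format providers such as OpenAI, DeepSeek, OpenRouter, and other APIs. Supports OpenAI DALL-E 3 image generation.
Also supports Anthropic, Gemini, Vertex AI, Cohere, Groq, and Cloudflare. Vertex supports both the Claude and Gemini APIs.
Supports native tool use (function calling) for OpenAI, Anthropic, Gemini, and Vertex.
Supports the native image recognition APIs of OpenAI, Anthropic, Gemini, and Vertex.
Supports four types of load balancing.
1. Channel-level weighted load balancing, which distributes requests according to channel weights. Not enabled by default; channel weights must be configured.
2. Vertex regional load balancing with high concurrency, which can increase Gemini and Claude concurrency by up to (number of APIs * number of regions) times. Enabled automatically, with no additional configuration.
3. Apart from Vertex region-level load balancing, all APIs support channel-level sequential load balancing, which improves the immersive translation experience. Not enabled by default; set SCHEDULING_ALGORITHM to round_robin.
4. Automatic API key-level round-robin load balancing across multiple API keys in a single channel.
Supports automatic retry: when an API channel fails to respond, the next API channel is retried automatically.
Supports channel cooldown: when an API channel fails to respond, the channel is automatically excluded and cooled down for a period of time, and requests to it stop. After the cooldown period ends, it is automatically restored until it fails again, at which point it is cooled down again.
Supports fine-grained model timeout settings, so each model can have its own timeout duration.
Supports fine-grained permission control. Wildcards can be used to set the specific models available to the channels of each API key.
Supports rate limiting. You can set the maximum number of requests per time period as an integer, for example 2/min (2 per minute), 5/hour (5 per hour), 10/day (10 per day), 10/month (10 per month), or 10/year (10 per year). The default is 60/min.
Supports multiple standard OpenAI-format interfaces: /v1/chat/completions, /v1/images/generations, /v1/audio/transcriptions, /v1/moderations, /v1/models.
Supports OpenAI moderation for moral review of user messages. If an inappropriate message is found, an error message is returned. This reduces the risk of the backend API being banned by providers.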
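
The rate-limit strings in the feature list can be read as comma-separated count/unit pairs. The following is a minimal Python sketch of how such a string might be parsed and checked, not uni-api's actual code. The names `UNIT_SECONDS`, `parse_rate_limit`, and `allowed` are hypothetical, and both the unit lengths (a 30-day month, a 365-day year) and the sliding-window check are assumptions.

```python
import re

# Assumed unit names and lengths in seconds; uni-api's real table may differ.
UNIT_SECONDS = {
    "min": 60,
    "hour": 3600,
    "day": 86400,
    "month": 30 * 86400,
    "year": 365 * 86400,
}


def parse_rate_limit(spec):
    """Turn a string such as '15/min,10/day' into (count, window_seconds) pairs."""
    limits = []
    for part in spec.split(","):
        match = re.fullmatch(r"\s*(\d+)\s*/\s*([a-z]+)\s*", part)
        if match is None or match.group(2) not in UNIT_SECONDS:
            raise ValueError(f"invalid rate limit: {part!r}")
        limits.append((int(match.group(1)), UNIT_SECONDS[match.group(2)]))
    return limits


def allowed(limits, request_times, now):
    """Sliding-window check: every (count, window) pair must still have room."""
    return all(
        sum(1 for t in request_times if now - t < window) < count
        for count, window in limits
    )
```

With several constraints in one string, a request passes only if every window still has room, which is one plausible reading of "multiple frequency constraints".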

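Usage

To start uni-api, a configuration file is required. There are two ways to supply it:

The first is to set the CONFIG_URL environment variable to the URL of the configuration file, which is downloaded automatically when uni-api starts.
The second is to mount a configuration file named api.yaml into the container.

Method 1: Mount the `api.yaml` configuration file to start uni-api

You must fill in the configuration file in advance, and it must be named api.yaml. You can configure multiple models; each model can be configured with multiple backend services, and load balancing is supported. Below is an example of the minimal api.yaml configuration file that can run: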

providers:
  - provider: provider_name # Service provider name, such as openai, anthropic, gemini, openrouter, can be any name, required
    base_url: https://api.your.com/v1/chat/completions # Backend service API address, required
    api: sk-YgS6GTi0b4bEabc4C # Provider's API Key, required, automatically uses base_url and api to get all available models through the /v1/models endpoint.
  # Multiple providers can be configured here, each provider can configure multiple API Keys, and each API Key can configure multiple models.
api_keys:
  - api: sk-Pkj60Yf8JFWxfgRmXQFWyGtWUddGZnmi3KlvowmRWpWpQxx # API Key, user request uni-api requires API key, required
  # This API Key can use all models, that is, it can use all models in all channels set under providers, without needing to add available channels one by one.

Detailed advanced configuration of api.yaml:

providers:
  - provider: provider_name # Service provider name, such as openai, anthropic, gemini, openrouter, can be any name, required
    base_url: https://api.your.com/v1/chat/completions # Backend service API address, required
    api: sk-YgS6GTi0b4bEabc4C # Provider's API Key, required
    model: # Optional, if model is not configured, all available models will be automatically obtained through base_url and api via the /v1/models endpoint.
      - gpt-4o # Usable model name, required
      - claude-3-5-sonnet-20240620: claude-3-5-sonnet # Rename model, claude-3-5-sonnet-20240620 is the provider's model name, claude-3-5-sonnet is the renamed name, you can use a simple name to replace the original complex name, optional
      - dall-e-3

  - provider: anthropic
    base_url: https://api.anthropic.com/v1/messages
    api: # Supports multiple API Keys, multiple keys automatically enable polling load balancing, at least one key, required
      - sk-ant-api03-bNnAOJyA-xQw_twAA
      - sk-ant-api02-bNnxxxx
    model:
      - claude-3-5-sonnet-20240620: claude-3-5-sonnet # Rename model, claude-3-5-sonnet-20240620 is the provider's model name, claude-3-5-sonnet is the renamed name, you can use a simple name to replace the original complex name, optional
    tools: true # Whether to support tools, such as generating code, generating documents, etc., default is true, optional

  - provider: gemini
    base_url: https://generativelanguage.googleapis.com/v1beta # base_url supports v1beta/v1, only for Gemini model use, required
    api: # Supports multiple API Keys, multiple keys automatically enable polling load balancing, at least one key, required
      - AIzaSyAN2k6IRdgw123
      - AIzaSyAN2k6IRdgw456
      - AIzaSyAN2k6IRdgw789
    model:
      - gemini-1.5-pro
      - gemini-1.5-flash-exp-0827: gemini-1.5-flash # After renaming, the original model name gemini-1.5-flash-exp-0827 cannot be used, if you want to use the original name, you can add the original name in the model, just add the line below to use the original name
      - gemini-1.5-flash-exp-0827 # Add this line, both gemini-1.5-flash-exp-0827 and gemini-1.5-flash can be requested
    tools: true
    preferences:
      api_key_rate_limit: 15/min # Each API Key can request up to 15 times per minute, optional. The default is 999999/min. Supports multiple frequency constraints: 15/min,10/day
      # api_key_rate_limit: # You can set different frequency limits for each model
      #   gemini-1.5-flash: 15/min,1500/day
      #   gemini-1.5-pro: 2/min,50/day
      #   default: 4/min # If the model does not set the frequency limit, use the frequency limit of default
      api_key_cooldown_period: 60 # Each API Key will be cooled down for 60 seconds after encountering a 429 error. Optional, the default is 0 seconds. When set to 0, the cooling mechanism is not enabled. When there are multiple API keys, the cooling mechanism will take effect.
      api_key_schedule_algorithm: round_robin # Set the request order of multiple API Keys, optional. The default is round_robin, and the optional values are: round_robin, random. It will take effect when there are multiple API keys. round_robin is polling load balancing, and random is random load balancing.
      model_timeout: # Model timeout, in seconds, default 100 seconds, optional
        gemini-1.5-pro: 10 # Model gemini-1.5-pro timeout is 10 seconds
        gemini-1.5-flash: 10 # Model gemini-1.5-flash timeout is 10 seconds
        default: 10 # Model does not have a timeout set, use the default timeout of 10 seconds, when requesting a model not in model_timeout, the timeout is also 10 seconds, if default is not set, uni-api will use the default timeout set by the environment variable TIMEOUT, the default timeout is 100 seconds
      proxy: socks5://[username]:[password]@[ip]:[port] # Proxy address, optional. Supports socks5 and http proxies, default is not used.

  - provider: vertex
    project_id: gen-lang-client-xxxxxxxxxxxxxx # Description: Your Google Cloud project ID. Format: String, usually composed of lowercase letters, numbers, and hyphens. How to obtain: You can find your project ID in the project selector of the Google Cloud Console.
    private_key: "-----BEGIN PRIVATE KEY-----\nxxxxx\n-----END PRIVATE" # Description: Private key for Google Cloud Vertex AI service account. Format: A JSON formatted string containing the private key information of the service account. How to obtain: Create a service account in Google Cloud Console, generate a JSON formatted key file, and then set its content as the value of this environment variable.
    client_email: xxxxxxxxxx@xxxxxxx.iam.gserviceaccount.com # Description: Email address of the Google Cloud Vertex AI service account. Format: Usually a string like "service-account-name@project-id.iam.gserviceaccount.com". How to obtain: Generated when creating a service account, or you can view the service account details in the "IAM and Admin" section of the Google Cloud Console.
    model:
      - gemini-1.5-pro
      - gemini-1.5-flash
      - gemini-1.5-pro: gemini-1.5-pro-search # Only supports using the gemini-1.5-pro-search model to request uni-api when using the Vertex Gemini API, to automatically use the Google official search tool.
      - claude-3-5-sonnet@20240620: claude-3-5-sonnet
      - claude-3-opus@20240229: claude-3-opus
      - claude-3-sonnet@20240229: claude-3-sonnet
      - claude-3-haiku@20240307: claude-3-haiku
    tools: true
    notes: https://xxxxx.com/ # You can put the provider's website, notes, official documentation, optional

  - provider: cloudflare
    api: f42b3xxxxxxxxxxq4aoGAh # Cloudflare API Key, required
    cf_account_id: 8ec0xxxxxxxxxxxxe721 # Cloudflare Account ID, required
    model:
      - '@cf/meta/llama-3.1-8b-instruct': llama-3.1-8b # Rename model, @cf/meta/llama-3.1-8b-instruct is the provider's original model name, must be enclosed in quotes, otherwise yaml syntax error, llama-3.1-8b is the renamed name, you can use a simple name to replace the original complex name, optional
      - '@cf/meta/llama-3.1-8b-instruct' # Must be enclosed in quotes, otherwise yaml syntax error

  - provider: other-provider
    base_url: https://api.xxx.com/v1/messages
    api: sk-bNnAOJyA-xQw_twAA
    model:
      - causallm-35b-beta2ep-q6k: causallm-35b
      - anthropic/claude-3-5-sonnet
    tools: false
    engine: openrouter # Force the use of a specific message format, currently supports gpt, claude, gemini, openrouter native format, optional

api_keys:
  - api: sk-KjjI60Yf0JFWxfgRmXqFWyGtWUd9GZnmi3KlvowmRWpWpQRo # API Key, required for users to use this service
    model: # Models that can be used by this API Key, required. Default channel-level polling load balancing is enabled, and each request model is requested in sequence according to the model configuration. It is not related to the original channel order in providers. Therefore, you can set different request sequences for each API key.
      - gpt-4o # Usable model name, can use all gpt-4o models provided by providers
      - claude-3-5-sonnet # Usable model name, can use all claude-3-5-sonnet models provided by providers
      - gemini/* # Usable model name, can only use all models provided by providers named gemini, where gemini is the provider name, * represents all models
    role: admin

  - api: sk-pkhf60Yf0JGyJxgRmXqFQyTgWUd9GZnmi3KlvowmRWpWqrhy
    model:
      - anthropic/claude-3-5-sonnet # Usable model name, can only use the claude-3-5-sonnet model provided by the provider named anthropic. Models with the same name from other providers cannot be used. This syntax will not match the model named anthropic/claude-3-5-sonnet provided by other-provider.
      - <anthropic/claude-3-5-sonnet> # By adding angle brackets on both sides of the model name, it will not search for the claude-3-5-sonnet model under the channel named anthropic, but will take the entire anthropic/claude-3-5-sonnet as the model name. This syntax can match the model named anthropic/claude-3-5-sonnet provided by other-provider. But it will not match the claude-3-5-sonnet model under anthropic.
      - openai-test/text-moderation-latest # When message moderation is enabled, the text-moderation-latest model under the channel named openai-test can be used for moderation.
      - sk-KjjI60Yd0JFWtxxxxxxxxxxxxxxwmRWpWpQRo/* # Support using other API keys as channels
    preferences:
      SCHEDULING_ALGORITHM: fixed_priority # When SCHEDULING_ALGORITHM is fixed_priority, use fixed priority scheduling, always execute the channel of the first model with a request. Default is enabled, SCHEDULING_ALGORITHM default value is fixed_priority. SCHEDULING_ALGORITHM optional values are: fixed_priority, round_robin, weighted_round_robin, lottery, random.
      # When SCHEDULING_ALGORITHM is random, use random polling load balancing, randomly request the channel of the model with a request.
      # When SCHEDULING_ALGORITHM is round_robin, use polling load balancing, request the channel of the model used by the user in order.
      AUTO_RETRY: true # Whether to automatically retry, automatically retry the next provider, true for automatic retry, false for no automatic retry, default is true. Also supports setting a number, indicating the number of retries.
      rate_limit: 15/min # Supports rate limiting, each API Key can request up to 15 times per minute, optional. The default is 999999/min. Supports multiple frequency constraints: 15/min,10/day
      # rate_limit: # You can set different frequency limits for each model
      #   gemini-1.5-flash: 15/min,1500/day
      #   gemini-1.5-pro: 2/min,50/day
      #   default: 4/min # If the model does not set the frequency limit, use the frequency limit of default
      ENABLE_MODERATION: true # Whether to enable message moderation, true for enable, false for disable, default is false, when enabled, it will moderate the user's message, if inappropriate messages are found, an error message will be returned.

  # Channel-level weighted load balancing configuration example
  - api: sk-KjjI60Yd0JFWtxxxxxxxxxxxxxxwmRWpWpQRo
    model:
      - gcp1/*: 5 # The number after the colon is the weight, weight only supports positive integers.
      - gcp2/*: 3 # The size of the number represents the weight, the larger the number, the greater the probability of the request.
      - gcp3/*: 2 # In this example, there are a total of 10 weights for all channels, and 10 requests will have 5 requests for the gcp1/* model, 3 requests for the gcp2/* model, and 2 requests for the gcp3/* model.

    preferences:
      SCHEDULING_ALGORITHM: weighted_round_robin # Only when SCHEDULING_ALGORITHM is weighted_round_robin and the above channel has weights, it will request according to the weighted order. Use weighted polling load balancing, request the channel of the model with a request according to the weight order. When SCHEDULING_ALGORITHM is lottery, use lottery polling load balancing, request the channel of the model with a request according to the weight randomly. Channels without weights automatically fall back to round_robin polling load balancing.
      AUTO_RETRY: true

preferences: # Global configuration
  model_timeout: # Model timeout, in seconds, default 100 seconds, optional
    gpt-4o: 10 # Model gpt-4o timeout is 10 seconds, gpt-4o is the model name, when requesting models like gpt-4o-2024-08-06, the timeout is also 10 seconds
    claude-3-5-sonnet: 10 # Model claude-3-5-sonnet timeout is 10 seconds, when requesting models like claude-3-5-sonnet-20240620, the timeout is also 10 seconds
    default: 10 # Model does not have a timeout set, use the default timeout of 10 seconds, when requesting a model not in model_timeout, the default timeout is 10 seconds, if default is not set, uni-api will use the default timeout set by the environment variable TIMEOUT, the default timeout is 100 seconds
    o1-mini: 30 # Model o1-mini timeout is 30 seconds, when requesting models starting with o1-mini, the timeout is 30 seconds
    o1-preview: 100 # Model o1-preview timeout is 100 seconds, when requesting models starting with o1-preview, the timeout is 100 seconds
  cooldown_period: 300 # Channel cooldown time, in seconds, default 300 seconds, optional. When a model request fails, the channel will be automatically excluded and cooled down for a period of time, and will not request the channel again. After the cooldown time ends, the model will be automatically restored until the request fails again, and it will be cooled down again. When cooldown_period is set to 0, the cooling mechanism is not enabled.
  error_triggers: # Error triggers, when the message returned by the model contains any of the strings in the error_triggers, the channel will return an error. Optional
    - The bot's usage is covered by the developer
    - process this request due to overload or policy
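
The comments on the `api_keys` model entries describe four rule shapes: a bare model name, `provider/*`, `provider/model`, and `<whole/name>`. The Python sketch below shows one way those rules could be evaluated, assuming each candidate is a (provider name, exposed model name) pair. The function `rule_matches` is hypothetical and is not taken from uni-api's source.

```python
def rule_matches(rule, provider, model):
    """Return True if an api_keys model rule admits this provider/model pair.

    `model` is the name exposed to users, that is, the renamed name if the
    provider entry renames it.
    """
    # <name>: treat everything inside the brackets as a literal model name.
    if rule.startswith("<") and rule.endswith(">"):
        return model == rule[1:-1]
    # provider/model or provider/*: restrict the rule to one provider.
    if "/" in rule:
        rule_provider, rule_model = rule.split("/", 1)
        return provider == rule_provider and rule_model in ("*", model)
    # Bare model name: any provider offering that model qualifies.
    return model == rule
```

Checking the angle-bracket form first is what lets a model whose literal name contains a slash, such as other-provider's anthropic/claude-3-5-sonnet, be addressed without being mistaken for a provider prefix.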

Mount the configuration file and start the uni-api Docker container:

docker run --user root -p 8001:8000 --name uni-api -dit \
-v ./api.yaml:/home/api.yaml \
yym68686/uni-api:latest

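Method 2: Start uni-api using the `CONFIG_URL` environment variable

After writing the configuration file as described in Method 1, upload it to cloud storage, obtain a direct link to the file, and then use the CONFIG_URL environment variable to start the uni-api Docker container: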

docker run --user root -p 8001:8000 --name uni-api -dit \
-e CONFIG_URL=http://file_url/api.yaml \
yym68686/uni-api:latest

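Environment variables

CONFIG_URL: The download address of the configuration file; it can be a local file or a remote file. Optional.
TIMEOUT: The request timeout, 100 seconds by default. It controls how long uni-api waits for an unresponsive channel before switching to the next one. Optional.
DISABLE_DATABASE: Whether to disable the database, false by default. Optional.

Vercel remote deployment

After clicking the one-click deploy button, set the environment variable CONFIG_URL to the direct link of the configuration file and DISABLE_DATABASE to true, then click Create to create the project. After deployment, manually set Function Max Duration to 60 seconds under Settings -> Functions in the Vercel project panel, then open the Deployments menu and click Redeploy so that the timeout becomes 60 seconds. If you do not redeploy, the default timeout stays at the original 10 seconds. Note that you should not delete the Vercel project and recreate it; instead, click Redeploy in the Deployments menu of the currently deployed Vercel project so that the Function Max Duration change takes effect.

Ubuntu deployment

In the repository's Releases, find the latest version of the binary for your platform, for example a file named uni-api-linux-x86_64-0.0.99.pex. Download the binary to the server and run it: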

wget https://github.com/yym68686/uni-api/releases/download/v0.0.99/uni-api-linux-x86_64-0.0.99.pex
chmod +x uni-api-linux-x86_64-0.0.99.pex
./uni-api-linux-x86_64-0.0.99.pex

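Serv00 remote deployment (FreeBSD 14.0)

First, log in to the panel. Under Additional services, click the Run your own applications tab to allow your own programs to run, then go to Port reservation in the panel and open a random port.

If you do not have your own domain name, go to WWW websites in the panel and delete the default domain name provided. Then create a new domain whose name is the one you just deleted. After clicking Advanced settings, set the Website type to Proxy domain, with the Proxy port pointing to the port you just opened. Do not select Use HTTP.

Log in to the Serv00 server over SSH and run the following commands: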

git clone --depth 1 -b main --quiet https://github.com/yym68686/uni-api.git
cd uni-api
python -m venv uni-api
tmux new -s uni-api
source uni-api/bin/activate
export CFLAGS="-I/usr/local/include"
export CXXFLAGS="-I/usr/local/include"
export CC=gcc
export CXX=g++
export MAX_CONCURRENCY=1
export CPUCOUNT=1
export MAKEFLAGS="-j1"
CMAKE_BUILD_PARALLEL_LEVEL=1 cpuset -l 0 pip install -vv -r requirements.txt
cpuset -l 0 pip install -vv -r requirements.txt

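Press Ctrl+B, then D to detach from tmux, and wait a few hours for the installation to finish. Once it has finished, run the following commands: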

tmux attach -t uni-api
source uni-api/bin/activate
export CONFIG_URL=http://file_url/api.yaml
export DISABLE_DATABASE=true
# Modify the port, xxx is the port, modify it yourself, corresponding to the port opened in the panel Port reservation
sed -i '' 's/port=8000/port=xxx/' main.py
sed -i '' 's/reload=True/reload=False/' main.py
python main.py

Press Ctrl+B, then D to detach from tmux so that the program keeps running in the background. At this point you can use uni-api in other chat clients. curl test script:

curl -X POST https://xxx.serv00.net/v1/chat/completions \
-H 'Content-Type: application/json' \
-H 'Authorization: Bearer sk-xxx' \
-d '{"model": "gpt-4o","messages": [{"role": "user","content": "Hello"}]}'
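
The same test request can be built from Python using only the standard library. This is a minimal sketch, not part of uni-api: `build_chat_request` and `send_chat_request` are hypothetical helper names, the host and key are the placeholders from the curl command, and the response is assumed to follow the standard OpenAI chat completion shape.

```python
import json
import urllib.request


def build_chat_request(base_url, api_key, prompt, model="gpt-4o"):
    """Build the POST request that the curl command sends."""
    payload = {"model": model, "messages": [{"role": "user", "content": prompt}]}
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


def send_chat_request(request, timeout=100):
    """Send the request and return the assistant's reply text."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.loads(response.read())
    return body["choices"][0]["message"]["content"]
```

Calling `send_chat_request(build_chat_request("https://xxx.serv00.net", "sk-xxx", "Hello"))` would perform the same round trip as the curl script once the placeholders are replaced with a real host and key.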

Reference documents:

https://docs.serv00.com/python/

https://linux.do/t/topic/201181

https://linux.do/t/topic/218738

Docker local deployment

Start the container

# Set CONFIG_URL only if no local configuration file is mounted.
# Mount ./api.yaml only if CONFIG_URL is not set.
# Mount ./uniapi_db only if you want to save statistical data.
docker run --user root -p 8001:8000 --name uni-api -dit \
-e CONFIG_URL=http://file_url/api.yaml \
-v ./api.yaml:/home/api.yaml \
-v ./uniapi_db:/home/data \
yym68686/uni-api:latest

Or, if you want to use Docker Compose, here is a docker-compose.yml example:

services:
  uni-api:
    container_name: uni-api
    image: yym68686/uni-api:latest
    environment:
      - CONFIG_URL=http://file_url/api.yaml # If a local configuration file is already mounted, there is no need to set CONFIG_URL
    ports:
      - 8001:8000
    volumes:
      - ./api.yaml:/home/api.yaml # If CONFIG_URL is already set, there is no need to mount the configuration file
      - ./uniapi_db:/home/data # If you do not want to save statistical data, there is no need to mount this folder

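CONFIG_URL is the URL of a remote configuration file that can be downloaded automatically. For example, if it is inconvenient to modify the configuration file on a particular platform, you can upload the file to a hosting service and give uni-api the direct link to download, which is the CONFIG_URL. If you are using a locally mounted configuration file, you do not need to set CONFIG_URL. CONFIG_URL is intended for cases where mounting the configuration file is inconvenient.

Run the Docker Compose container in the background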

docker-compose pull
docker-compose up -d

Docker Build

docker build --no-cache -t uni-api:latest -f Dockerfile --platform linux/amd64 .
docker tag uni-api:latest yym68686/uni-api:latest
docker push yym68686/uni-api:latest

One-click restart of the Docker image

set -eu
docker pull yym68686/uni-api:latest
docker rm -f uni-api
docker run --user root -p 8001:8000 -dit --name uni-api \
-e CONFIG_URL=http://file_url/api.yaml \
-v ./api.yaml:/home/api.yaml \
-v ./uniapi_db:/home/data \
yym68686/uni-api:latest
docker logs -f uni-api

RESTful curl test

curl -X POST http://127.0.0.1:8000/v1/chat/completions \
-H "Content-Type: application/json" \
-H "Authorization: Bearer ${API}" \
-d '{"model": "gpt-4o","messages": [{"role": "user", "content": "Hello"}],"stream": true}'
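
With "stream": true the reply arrives as server-sent events rather than a single JSON body. The sketch below shows how a client might extract the text, assuming uni-api emits the standard OpenAI streaming format: `data: {json}` lines carrying `choices[].delta.content`, terminated by `data: [DONE]`. The function name `iter_stream_text` is hypothetical.

```python
import json


def iter_stream_text(lines):
    """Yield text fragments from OpenAI-style streaming response lines."""
    for raw in lines:
        # HTTP clients often hand back bytes; normalise to str first.
        if isinstance(raw, bytes):
            raw = raw.decode("utf-8")
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separator lines and SSE comments
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            break  # end-of-stream sentinel
        chunk = json.loads(data)
        for choice in chunk.get("choices", []):
            text = choice.get("delta", {}).get("content")
            if text:
                yield text
```

Because the function accepts any iterable of lines, it can be fed an open HTTP response object directly or a list of captured lines when debugging.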

pex Linux packaging:

VERSION=$(cat VERSION)
pex -D . -r requirements.txt \
    -c uvicorn \
    --inject-args 'main:app --host 0.0.0.0 --port 8000' \
    --platform linux_x86_64-cp-3.10.12-cp310 \
    --interpreter-constraint '==3.10.*' \
    --no-strip-pex-env \
    -o uni-api-linux-x86_64-${VERSION}.pex

macOS packaging:

VERSION=$(cat VERSION)
pex -r requirements.txt \
    -c uvicorn \
    --inject-args 'main:app --host 0.0.0.0 --port 8000' \
    -o uni-api-macos-arm64-${VERSION}.pex

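Sponsors

We thank the following sponsors for their support:

@PowerHunter: ¥2000
@IOI: ¥50

How to sponsor us

If you would like to support our project, you can sponsor us in the following ways:

PayPal
USDT-TRC20, USDT-TRC20 wallet address: TLFbqSv5pDu5he43mVmK1dNx7yBMFeN7d8
WeChat
Alipay

Thank you for your support!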

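FAQ

Why does the error Error processing request or performing moral check: 404: No matching model found always appear?

Setting ENABLE_MODERATION to false fixes this issue. When ENABLE_MODERATION is true, the API must be able to use the text-moderation-latest model. If you have not provided text-moderation-latest in the provider model settings, an error is raised indicating that the model cannot be found.

How do I prioritize requests to a specific channel, and how do I set channel priority?

Set the channel order directly in api_keys. No other settings are required. Sample configuration file: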

providers:
  - provider: ai1
    base_url: https://xxx/v1/chat/completions
    api: sk-xxx

  - provider: ai2
    base_url: https://xxx/v1/chat/completions
    api: sk-xxx

api_keys:
  - api: sk-1234
    model:
      - ai2/*
      - ai1/*

In this way, ai2 is requested first, and if it fails, ai1 is requested.
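
The "first ai2, then ai1" behaviour amounts to walking the configured list in order and stopping at the first success. The Python sketch below illustrates that idea only; `request_with_fallback` and the `send` callback are hypothetical names, and catching every exception is a simplification.

```python
def request_with_fallback(channels, send):
    """Try channels in configured order; return (channel, response) on first success."""
    errors = {}
    for channel in channels:
        try:
            return channel, send(channel)
        except Exception as exc:  # a real gateway would catch narrower error types
            errors[channel] = exc
    raise RuntimeError(f"all channels failed: {errors}")
```

The order of the `channels` argument plays the role of the `model` list under the API key, so reordering that list is all it takes to change priority.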

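What is the behavior behind the various scheduling algorithms, such as fixed_priority, weighted_round_robin, lottery, random, and round_robin?

All scheduling algorithms are enabled by setting api_keys.(api).preferences.SCHEDULING_ALGORITHM in the configuration file to one of the following values: fixed_priority, weighted_round_robin, lottery, random, round_robin.

fixed_priority: Fixed-priority scheduling. All requests are always executed by the first channel that has the model the user requested. If an error occurs, it switches to the next channel. This is the default scheduling algorithm.
weighted_round_robin: Weighted round-robin load balancing. Channels that have the requested model are requested in the weight order set in api_keys.(api).model in the configuration file.
lottery: Lottery load balancing. A channel that has the requested model is chosen at random according to the weights set in api_keys.(api).model in the configuration file.
round_robin: Round-robin load balancing. Channels that have the requested model are requested in the order configured in api_keys.(api).model in the configuration file. See the previous question for how to set channel priority.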
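
The algorithms above differ only in which channel is tried first for each request. The following Python sketch models each one as an infinite generator of first picks. It is an illustration under assumptions, not uni-api's implementation: all function names are hypothetical, and weighted round-robin is implemented by simply repeating each channel according to its weight, whereas the real scheduler may interleave channels differently.

```python
import itertools
import random


def fixed_priority(channels):
    """Always pick the first configured channel."""
    while True:
        yield channels[0]


def round_robin(channels):
    """Cycle through channels in configured order."""
    return itertools.cycle(channels)


def weighted_round_robin(weights):
    """Cycle through channels, repeating each one `weight` times per round."""
    expanded = [name for name, weight in weights.items() for _ in range(weight)]
    return itertools.cycle(expanded)


def lottery(weights, rng=random):
    """Pick a channel at random, with probability proportional to its weight."""
    names = list(weights)
    while True:
        yield rng.choices(names, weights=[weights[n] for n in names], k=1)[0]
```

With weights of 5, 3, and 2, any 10 consecutive picks from `weighted_round_robin` contain exactly 5, 3, and 2 of the respective channels, while `lottery` only approaches that ratio over many requests.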

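How should base_url be filled in correctly?

Except for some special channels shown in the advanced configuration, all OpenAI-format providers need base_url to be filled in completely, which means base_url must end with /v1/chat/completions. If you are using GitHub Models, base_url should be https://models.inference.ai.azure.com/chat/completions, not Azure's URL.

How does the model timeout work? What is the priority between channel-level timeout settings and global model timeout settings?

Channel-level timeout settings take priority over global model timeout settings. The priority order is: channel-level model timeout > channel-level default timeout > global model timeout > global default timeout > the TIMEOUT environment variable.

By adjusting the model timeout, you can avoid timeout errors on some channels. If you encounter the error {'error': '500', 'details': 'fetch_response_stream Read Response Timeout'}, try increasing the model timeout.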
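
The precedence chain can be expressed as a short lookup. The Python sketch below assumes the two `model_timeout` tables are plain dictionaries and that a key such as gpt-4o also applies to longer names that start with it, as the configuration comments describe. `resolve_timeout` and `_lookup` are hypothetical names, and choosing the longest matching prefix is an assumption.

```python
import os


def _lookup(table, model):
    """Exact match first, then the longest configured key that prefixes the model."""
    if model in table:
        return table[model]
    prefixes = [key for key in table if key != "default" and model.startswith(key)]
    if prefixes:
        return table[max(prefixes, key=len)]
    return None


def resolve_timeout(model, channel_timeouts=None, global_timeouts=None, env=None):
    """Apply: channel model > channel default > global model > global default > TIMEOUT."""
    env = os.environ if env is None else env
    for table in (channel_timeouts or {}, global_timeouts or {}):
        value = _lookup(table, model)
        if value is None:
            value = table.get("default")
        if value is not None:
            return value
    # Last resort: the TIMEOUT environment variable, 100 seconds if unset.
    return int(env.get("TIMEOUT", "100"))
```

Each table is consulted fully, model entry first and then its `default`, before moving on to the next, which is what makes a channel-level default outrank a global per-model entry.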

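How does api_key_rate_limit work? How do I set the same rate limit for multiple models?

If you want to set the same rate limit for the four models gemini-1.5-pro-latest, gemini-1.5-pro, gemini-1.5-pro-001, and gemini-1.5-pro-002 at once, you can configure it like this:

api_key_rate_limit:
  gemini-1.5-pro: 1000/min

This matches all models whose names contain the string gemini-1.5-pro. The rate limit for all four models, gemini-1.5-pro-latest, gemini-1.5-pro, gemini-1.5-pro-001, and gemini-1.5-pro-002, is set to 1000/min. The matching logic of the api_key_rate_limit field is described below, using this sample configuration:

api_key_rate_limit:
  gemini-1.5-pro: 1000/min
  gemini-1.5-pro-002: 500/min

Suppose a request arrives for the model gemini-1.5-pro-002.

First, uni-api tries to match the model exactly in api_key_rate_limit. Since a rate limit is set for gemini-1.5-pro-002, the rate limit for gemini-1.5-pro-002 is 500/min. If the requested model is instead gemini-1.5-pro-latest, api_key_rate_limit has no entry for gemini-1.5-pro-latest, so uni-api looks for a configured model that shares a prefix with gemini-1.5-pro-latest. That entry is gemini-1.5-pro, so the rate limit for gemini-1.5-pro-latest is 1000/min.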
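
The two-step matching can be sketched in a few lines of Python. This is an illustration rather than uni-api's code: `lookup_rate_limit` is a hypothetical name, the text describes the second step both as "contains" and as "prefix" so substring containment is assumed here, and preferring the longest matching key is also an assumption. The `default` key and the 999999/min fallback come from the configuration comments.

```python
def lookup_rate_limit(model, limits, fallback="999999/min"):
    """Resolve the rate limit for a model from an api_key_rate_limit mapping."""
    # Step 1: an exact entry always wins.
    if model in limits:
        return limits[model]
    # Step 2: otherwise use a configured name contained in the requested name.
    candidates = [key for key in limits if key != "default" and key in model]
    if candidates:
        return limits[max(candidates, key=len)]
    # Step 3: fall back to the mapping's default entry, then the global default.
    return limits.get("default", fallback)
```

With the sample configuration, gemini-1.5-pro-002 resolves to 500/min through the exact match, while gemini-1.5-pro-latest and gemini-1.5-pro-001 resolve to 1000/min through the shorter gemini-1.5-pro entry.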
