You may have heard of the famous Neuro-sama, or Mu Jimeng from China. Would you like an AI virtual character of your own to stream, chat, and play games with you? The open-source Zerolan Live Robot aims to make that dream a reality, and it only requires a consumer-grade graphics card!
Zerolan Live Robot is a multifunctional live streaming robot (AI VTuber). It can automatically read danmaku (chat messages) in a Bilibili live room, observe designated windows on the computer screen and understand their content, control game characters in Minecraft, and respond in voice chat with emotional speech.
Its associated projects are KonekoMinecraftBot, zerolan-core, zerolan-data, and zerolan-ui.
Note
This project is under continuous development; the current version is 2.0. You can follow the developer's Bilibili account, Akagawa Tsurumi_Channel, which is training an AI catgirl based on this project and broadcasts the latest progress from time to time.
The following briefly lists what this project supports:
| Supported item | Details |
|---|---|
| Live streaming platform | Bilibili, Twitch |
| Large language model | THUDM/GLM-4, THUDM/ChatGLM3, Qwen/Qwen-7B-Chat, 01ai/Yi-6B-Chat, augmxnt/shisa-7b-v1 |
| Automatic speech recognition model | iic/speech_paraformer_asr |
| Speech synthesis model | RVC-Boss/GPT-SoVITS |
| Image captioning model | Salesforce/blip-image-captioning-large |
| Optical character recognition model | paddlepaddle/PaddleOCR |
| Video captioning model | iic/multi-modal_hitea_video-captioning_base_en |
| External callable tools | Firefox browser, Baidu Baike, Moegirlpedia |
| Game plugin | Minecraft |
Caution
Zerolan Live Robot 2.0 is incompatible with the older 1.0 version, so you may need to reconfigure your environment and reinstall dependencies.
The Zerolan framework consists of Zerolan Live Robot, Zerolan Core, Zerolan Data, and Zerolan UI. The following table briefly describes the uses of each project:
| Project name | Purpose |
|---|---|
| Zerolan Live Robot | The control framework of the live streaming robot; it collects environmental data, analyzes it comprehensively, and responds with actions. |
| Zerolan Core | The core modules that provide AI inference services to the robot, such as service-oriented Web APIs for large language models. |
| Zerolan Data | Defines the data formats exchanged between services via network requests. |
| Zerolan UI | A GUI based on PyQt6, including topmost pop-up windows, prompt sounds, etc. |
Important
This step is required!
Please move here to complete the deployment of Zerolan Core; this project relies heavily on that core service.
Run the following commands to create and activate a virtual environment, then install the dependencies required by this project:

```shell
conda create --name ZerolanLiveRobot python=3.10
conda activate ZerolanLiveRobot
pip install -r requirements.txt
```

If you are on the dev development branch, you may need to install these dependencies manually:

```shell
pip install git+https://github.com/AkagawaTsurunaki/zerolan-ui.git@dev
pip install git+https://github.com/AkagawaTsurunaki/zerolan-data.git@dev
```

Find the resources/config.template.yaml configuration file, rename it to config.yaml, then modify it to the configuration you need according to the comments in the file.
In the pipeline configuration section, note that server_url must contain the protocol, IP, and port number, e.g. http://127.0.0.1:11001 or https://myserver.com:11451. This is the network address where you deployed Zerolan Core. Each type of model may use a different port.
Tip
Does your server expose only one port? Then try forwarding your requests with Nginx.
In the service configuration section, note that host should contain only the IP address, and port only the port number.
The game.platform field supports minecraft, and the live_stream field supports bilibili, twitch, and youtube.
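Putting the notes above together, a hypothetical config.yaml fragment might look like the sketch below. The key nesting is inferred from the field paths mentioned in this document and may differ from the real template, so always start from resources/config.template.yaml:

```yaml
# Hypothetical sketch only -- consult resources/config.template.yaml for the real structure.
pipeline:
  llm:
    server_url: http://127.0.0.1:11001   # protocol + IP + port of Zerolan Core
service:
  host: 127.0.0.1   # IP address only
  port: 11000       # port number only (example value)
game:
  platform: minecraft
live_stream:
  platform: bilibili   # or twitch / youtube
```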
Tip
Documentation that may help you obtain an API key for each live streaming platform:
- Bilibili: Get the information required for the Credential class
- Twitch: Twitch Developers - Authentication
- YouTube: Obtaining authorization credentials
The value of character.chat.filter.strategy can be default.
character.chat.filter.bad_words can be filled with a list of words to filter.
The character.chat.injected_history array must have an even number of elements; that is, the last message must be a response from the AI.
character.chat.max_history specifies the maximum number of messages retained, i.e., the size of the message window.
character.speech.prompts_dir indicates where your TTS reference audio files are stored. File names should follow the format [language][emotion tag]text content.wav, for example [zh][开心]哇!今天真是一个好天气.wav. The language tag only supports zh, en, and ja; the emotion tag is arbitrary, as long as the large language model can distinguish it; the text content is the transcript of the speech in that audio file.
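The character settings above can likewise be sketched as a hypothetical YAML fragment. The nesting and example values are assumptions based on the key paths in this document, not the project's actual schema:

```yaml
# Hypothetical sketch only -- verify against resources/config.template.yaml.
character:
  chat:
    filter:
      strategy: default
      bad_words: ["badword1", "badword2"]   # example placeholders
    injected_history:                # must contain an even number of messages,
      - "Hello!"                     # user message
      - "Hi! Nice to meet you."      # ...ending with an AI response
    max_history: 20                  # message window size (example value)
  speech:
    prompts_dir: resources/prompts   # e.g. contains [zh][开心]哇!今天真是一个好天气.wav
```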
Caution
Microsoft Edge may have a memory leak, so this project does not support it.
The only supported value of external_tool.browser.driver is firefox.
external_tool.browser.profile_dir ensures that your account logins and other session data are not lost while the browser is under Selenium's control. If left blank, the program will try to detect the profile location automatically (though it is not guaranteed to find it).
Tip
Before starting, it is recommended to use an API testing tool such as Postman to verify that the machine running this project can reach Zerolan Core. Zerolan Live Robot prints some advice when a pipeline connection error occurs, but you will still need to troubleshoot manually.
Use the following command to run the main program of Zerolan Live Robot:

```shell
python main.py
```

Note
This step is optional.
This project and KonekoMinecraftBot implement a set of interfaces that let this project control robots in Minecraft. If you need this feature, please move here for details.
The older Zerolan Live Robot 1.0 used simple per-second polling to read environment information from cache lists in each service module. Zerolan Live Robot 2.0 switched to an event-driven design.
In this project, the robot runs by sending and processing a series of events; in other words, without an event, the robot will not respond.
Each Event contains an event name, which is essentially a string. All event names used in this project are defined in common.enumerator.EventEnum, and you can extend it with your own event names. Take the handling of user speech input as an example: its event is called EventEnum.SERVICE_VAD_SPEECH_CHUNK.
emitter is a global object that handles event dispatch and listener execution. emitter always runs in the main thread; however, multiple threads run concurrently while the whole system is running, because each thread may have its own EventEmitter instance.
Use the decorator @emitter.on(EventEnum.SOME_EVENT) to register a listener quickly. A listener can be either a synchronous or an asynchronous function. To send an event, use the asynchronous method emitter.emit(EventEnum.SOME_EVENT, *args, **kwargs).
For example, when the system detects a human voice, the SERVICE_VAD_SPEECH_CHUNK event is sent, and all listeners registered for this event are called to perform some processing:
```python
@emitter.on(EventEnum.SERVICE_VAD_SPEECH_CHUNK)
async def on_service_vad_speech_chunk(speech: bytes, channels: int, sample_rate: int):
    response = ...  # Suppose the speech recognition result is obtained here
    await emitter.emit(EventEnum.PIPELINE_ASR, response)  # Emit the ASR event
```

The listener here is on_service_vad_speech_chunk, essentially a function that is called when SERVICE_VAD_SPEECH_CHUNK occurs and accepts several parameters. These parameters are specified entirely by the event sender.
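To make the pattern concrete, the emitter described above can be sketched as a minimal, self-contained EventEmitter. This is an illustrative reimplementation under stated assumptions, not the project's actual class, and the event names are a hypothetical subset for the demo:

```python
import asyncio
import inspect
from collections import defaultdict
from enum import Enum


class EventEnum(Enum):
    # Hypothetical subset of the project's event names
    SERVICE_VAD_SPEECH_CHUNK = "service/vad/speech-chunk"
    PIPELINE_ASR = "pipeline/asr"


class EventEmitter:
    """Minimal sketch: register listeners per event, await them on emit."""

    def __init__(self):
        self._listeners = defaultdict(list)

    def on(self, event):
        def decorator(func):
            self._listeners[event].append(func)
            return func
        return decorator

    async def emit(self, event, *args, **kwargs):
        for listener in self._listeners[event]:
            result = listener(*args, **kwargs)
            if inspect.isawaitable(result):  # supports sync and async listeners
                await result


emitter = EventEmitter()
received = []


@emitter.on(EventEnum.PIPELINE_ASR)
async def on_pipeline_asr(text: str):
    received.append(text)


asyncio.run(emitter.emit(EventEnum.PIPELINE_ASR, "hello"))
print(received)  # -> ['hello']
```

Supporting both synchronous and asynchronous listeners in one registry is what lets a decorator like @emitter.on accept either kind of function.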
Pipeline is the main means of communicating with Zerolan Core. Using a pipeline is very simple: pass in a configuration object to get a usable pipeline object, then call its predict or stream_predict method to use the AI models in Zerolan Core.
Taking the large language model as an example, specify the address of the target server (the open port of your Zerolan Core deployment) and pass an LLMPipelineConfig object to LLMPipeline to establish the pipeline.
```python
config = LLMPipelineConfig(server_url="...")
llm = LLMPipeline(config)
query = LLMQuery(text="Hello, what's your name?", history=[])
prediction = llm.predict(query)
print(prediction.response)
```

This should get a reply from the model.
If you want to know more implementation details, check the data definitions in Zerolan Data; you may also need to read them together with the pipeline implementations and the app.py file in Zerolan Core. Simply put, they are all HTTP-based.
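Since the pipelines are HTTP-based, a predict call conceptually serializes the query to JSON and POSTs it to a Zerolan Core endpoint. The sketch below only builds such a request: the LLMQuery fields mirror the example above, but this dataclass and the /llm/predict path are illustrative assumptions, not the project's documented schema or route:

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class LLMQuery:
    """Hypothetical mirror of the query structure in Zerolan Data."""
    text: str
    history: list = field(default_factory=list)


def build_predict_request(server_url: str, query: LLMQuery) -> tuple:
    """Build the (url, json_body) pair a pipeline might POST to Zerolan Core."""
    url = f"{server_url}/llm/predict"  # endpoint path is an assumption
    body = json.dumps(asdict(query))
    return url, body


url, body = build_predict_request("http://127.0.0.1:11001", LLMQuery(text="Hello"))
print(url)   # -> http://127.0.0.1:11001/llm/predict
print(body)  # -> {"text": "Hello", "history": []}
```

An actual pipeline would then send this body with an HTTP client and deserialize the JSON response into a prediction object.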
| Module | Purpose | Supported content |
|---|---|---|
| browser | Selenium-based browser control | Opening Firefox, searching, and closing the browser |
| device | Microphone, screenshot, and speaker control | Tested on Windows only |
| filter | Dialogue filter | Simple matching filter |
| game | Game control plugin | See KonekoMinecraftBot for details |
| live_stream | Danmaku reading from live streaming platforms | Bilibili, Twitch, YouTube |
| vad | Voice activity detection | Detection based on an energy threshold |
After startup, the log shows "In its context, the requested address is invalid."
Solution: check whether host is configured correctly in the configuration file. If you only want local access, specify 127.0.0.1.
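One quick way to check whether a host value is usable on the current machine is to try binding a listening socket to it. In the sketch below, 127.0.0.1 should always succeed, while 203.0.113.7 (a documentation-only address that is not assigned to your machine) fails with the same "requested address is not valid" class of error:

```python
import socket


def can_bind(host: str, port: int = 0) -> bool:
    """Return True if a listening socket can be bound to (host, port)."""
    try:
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
            s.bind((host, port))  # port 0 lets the OS pick any free port
        return True
    except OSError:
        return False


print(can_bind("127.0.0.1"))    # loopback is always bindable -> True
print(can_bind("203.0.113.7"))  # non-local address: bind fails -> False
```

If can_bind returns False for the host in your config, the address does not belong to this machine's network interfaces.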
This project uses the MIT License. Please do not use this software for illegal purposes.
Feel free to enjoy open source!
MIT License
Copyright (c) 2024 AkagawaTsurunaki
Email : [email protected]
Github : AkagawaTsurunaki
Bilibili : Akagawa Tsurumi_Channel