AivisSpeech Download - AivisSpeech Source code download

AivisSpeech

AI Source Code

1.0.0

Download

AivisSpeech

? AivisSpeech: AI V oice I mitation S ystem - Text to Speech Software

AivisSpeech is Japanese speech synthesis software based on VOICEVOX's editor UI.
It incorporates the Japanese speech synthesis engine AivisSpeech Engine, which allows you to easily generate emotional voices.

? Download AivisSpeech / ? Download AivisSpeech Engine

To users
Operating environment
Supported speech synthesis models
- Supported model architecture
- Where to place the model file
Development Policy
Building a development environment
development
license

To users

If you are looking to use AivisSpeech, please visit the AivisSpeech official website.

This page contains information mainly for developers.
Below is a document for users:

How to use
Frequently Asked Questions / Q&A
inquiry

Operating environment

Compatible with PCs with Windows and macOS.
To start AivisSpeech, your PC needs at least 3GB of free memory (RAM).

Windows: Windows 10 (22H2 or later)/Windows 11
macOS: macOS 13 Ventura or later

Note

We have not actively verified the operation on Macs with Intel CPUs.
Macs with Intel CPUs are already out of production, and it is becoming more difficult to prepare a verification and build environment. We recommend using this product on a Mac with Apple Silicon as much as possible.

Warning

On Windows 10, we only check the operation of version 22H2.
There have been reports of cases where AivisSpeech Engine crashes and fails to start on older versions of Windows 10 that have no support.
From a security perspective, we strongly recommend that you use Windows 10 environments only update to the minimum version 22H2 before using the service.

Supported speech synthesis models

The AivisSpeech Engine, built into AivisSpeech, supports speech synthesis model files in the AIVMX ( Ai vis V oice M odel for ONN X ) ( .aivmx ) format.

AIVM ( Ai vis V oice M odel) / AIVMX ( Ai vis V oice M odel for ONN X ) is an open file format for AI speech synthesis models that combines pre-trained models, hyperparameters, style vectors, speaker metadata (names, overviews, licenses, icons, voice samples, etc.) into one file .

For more information about the AIVM specifications and AIVM/AIVMX files, please refer to the AIVM specifications developed in the Aivis Project.

Note

"AIVM" is also a general term for both AIVM/AIVMX format specifications and metadata specifications.
Specifically, the AIVM file is a model file in "Safetensors format with AIVM metadata added", and the AIVMX file is a model file in "ONNX format with AIVM metadata added".
"AIVM Metadata" refers to various metadata that is linked to a trained model as defined in the AIVM specification.

Important

The AivisSpeech Engine is also a reference implementation of the AIVM specification, but it is deliberately designed to support only AIVMX files.
This eliminates dependency on PyTorch, reduces installation size and provides fast CPU inference with ONNX Runtime.

Tip

AIVM Generator allows you to generate AIVM/AIVMX files from existing speech synthesis models and edit the metadata of existing AIVM/AIVMX files!

Supported model architecture

The AIVMX files for the following model architectures are available:

Style-Bert-VITS2
Style-Bert-VITS2 (JP-Extra)

Note

The AIVM metadata specification allows you to define multilingual speakers, but AivisSpeech Engine, like VOICEVOX ENGINE, only supports Japanese speech synthesis.
Therefore, even if you use a speech synthesis model that supports English or Chinese, speech synthesis other than Japanese cannot be performed.

Where to place the model file

Place the AIVMX files in the following folders for each OS:

Windows: C:Users(ユーザー名)AppDataRoamingAivisSpeech-EngineModels
macOS: ~/Library/Application Support/AivisSpeech-Engine/Models
Linux: ~/.local/share/AivisSpeech-Engine/Models

The actual folder path will be displayed as Models directory: in the logs immediately after starting AivisSpeech Engine.

Tip

When using AivisSpeech, you can easily add speech synthesis models from the AivisSpeech UI screen!
For end users, we recommend that you add a speech synthesis model using this method.

Important

The deployment folder for the development version (if you are running AivisSpeech Engine while not built with PyInstaller) is not AivisSpeech-Engine or below AivisSpeech-Engine-Dev .

Development Policy

VOICEVOX is a huge software that is still actively developed today.
Therefore, AivisSpeech is developing the latest version of VOICEVOX under the following policy:

To make it easier to follow the latest version of VOICEVOX, keep the changes to the minimum possible.
- Rebrand from VOICEVOX to AivisSpeech only where necessary
Do not refactor
- It is easy to expect conflicts with VOICEVOX, and you are not familiar with the entire code.
Even if a function is not used in AivisSpeech (such as the singing voice synthesis function), the code will not be deleted.
- This also helps to avoid conflicts
- Disabling unused codes is not deleted, but comment out
Documents are not updated due to difficulties in maintenance and follow-up
- As a result, none of the documents have been updated and do not reflect changes in AivisSpeech
Due to changes for AivisSpeech, it is difficult to maintain the test code, so test code will not be updated.
- In particular, the E2E test does not work properly due to the large changes in the UI.
- Some of the test code has been modified to suit AivisSpeech, but it has not been verified and is left unattended.

Building a development environment

The steps differ from the original VOICEVOX.
Node.js 20.12.2 must be installed beforehand.

 # 依存関係をすべてインストール
npm ci

# .env.development を .env にコピー
# # コピーした .env を編集する必要はない
cp .env.development .env

# macOS のみ、.env.production を編集
nano .env.production
--------------------
# executionFilePath を "AivisSpeech-Engine/run.exe" から "../Resources/AivisSpeech-Engine/run" に書き換える
# # executionFilePath は、npm run electron:build でビルドした製品ビルドの AivisSpeech の起動時に使用される
...
VITE_DEFAULT_ENGINE_INFOS= ` [
    {
        " uuid " : " 1b4a5014-d9fd-11ee-b97d-83c170a68ed3 " ,
        " name " : " AivisSpeech Engine " ,
        " executionEnabled " : true,
        " executionFilePath " : " ../Resources/AivisSpeech-Engine/run " ,
        " executionArgs " : [],
        " host " : " http://127.0.0.1:10101 "
    }
] `
...
--------------------

# 事前に別のターミナルで AivisSpeech Engine を起動しておく
# # AivisSpeech Engine の開発環境は別途構築する必要がある
cd ../AivisSpeech-Engine
poetry run task serve

development

The steps differ from the original VOICEVOX.

 # 開発環境で Electron 版 AivisSpeech を起動
npm run electron:serve

# 開発環境でブラウザ版 AivisSpeech を起動
npm run browser:serve

# Electron 版 AivisSpeech をビルド
npm run electron:build

# ブラウザ版 AivisSpeech (WIP) をビルド
npm run browser:build

# コードフォーマットを自動修正
npm run format

# コードフォーマットをチェック
npm run lint

# OpenAPI Generator による自動生成コードを更新
npm run openapi:generate

# 依存ライブラリのライセンス情報を生成
# # VOICEVOX と異なり、音声合成エンジンとのライセンス情報との統合は行わない
# # エディタ側で別途エンジンマニフェストから取得したライセンス情報を表示できるようにしているため不要
npm run license:generate

license

Of the dual licenses of the base VOICEVOX/VOICEVOX ENGINE, only LGPL-3.0 will be inherited independently.

The following and docs/ The following documents carry on the original VOICEVOX document without modification. It is not guaranteed that the contents of these documents will also apply to AivisSpeech.

VOICEVOX

This is the editor for VOICEVOX.

(The engine is VOICEVOX ENGINE, the core is VOICEVOX CORE, and the overall structure can be found here.)

To users

This is a development page. For information on how to use it, please visit the VOICEVOX official website.

For those who want to contribute to the project

The VOICEVOX project welcomes interested participants. We have a guide explaining the contribution procedure.

When you think of contributions, you might think of creating a program, but there are various ways to participate, such as writing a document, generating tests, and participating in discussions in proposals for improvements. We also offer tasks that welcome beginners, so we look forward to your participation.

VOICEVOX's editors use Electron, TypeScript, Vue, Vuex, and more, making the overall structure difficult to understand.
We will introduce the structure in how to walk the code, so we hope it will help you develop it.

When creating a pull request to resolve an issue, please either tell the Issue that you have begun working on it or create a Draft pull request first to avoid working on the same issue as someone else.

Development discussions and chats are held on the VOICEVOX unofficial Discord server. Please feel free to join us.

Design Guidelines

Please refer to UX/UI Design Policy.

Environmental construction

Please install the version of Node.js listed in .node-version.
Node.js management tools (such as nvs and Volta) make it easy to install and automatically switch Node.js.

After installing Node.js, fork this repository and git clone it.

Installing dependent libraries

Run the following command to install and update the dependent libraries:

npm ci

execution

Engine preparation

Copy .env.production to create .env , and specify vv-engine/run.exe in the product version of VOICEVOX in executionFilePath in VITE_DEFAULT_ENGINE_INFOS to vv-engine/run.exe in the product version of VOICEVOX to run.

If you have not changed the installation location on Windows, specify C:/Users/(ユーザー名)/AppData/Local/Programs/VOICEVOX/vv-engine/run.exe .
Please note that the path delimiter is not /

If you are using VOICEVOX.app for macOS, please specify /path/to/VOICEVOX.app/Resources/MacOS/vv-engine/run .

For Linux, specify vv-engine/run command included in the tar.gz version available from Releases. For the AppImage version, you can mount the file system using $ /path/to/VOICEVOX.AppImage --appimage-mount .

If you are running an engine API server separately from running the VOICEVOX editor, you do not need to specify executionFilePath , but instead, set executionEnabled to false . This applies even if you are running the product version of VOICEVOX.

If you want to change the destination endpoint of the engine API, change host in VITE_DEFAULT_ENGINE_INFOS .

Running Electron

 # 開発しやすい環境で実行
npm run electron:serve

# ビルド時に近い環境で実行
npm run electron:serve -- --mode production

The voice synthesis engine repository is here: https://github.com/VOICEVOX/voicevox_engine

Running a Storybook

You can develop components using Storybooks.

npm run storybook

You can view the Storybook for the main branch from VOICEVOX/preview-pages.
https://voicevox.github.io/preview-pages/preview/branch-main/storybook/index.html

Running the browser version (in development)

Start the separate speech synthesis engine and run the following to access the displayed localhost:

npm run browser:serve

Additionally, the build results for the main branch have been deployed in VOICEVOX/preview-pages.
https://voicevox.github.io/preview-pages/preview/branch-main/editor/index.html
Now you need to start the speech synthesis engine on your local PC.

Build

npm run electron:build

Build with Github Actions

You can build by turning on Actions in the repository that you forked and launching build.yml with workflow_dispatch. The deliverables will be uploaded to Release.

test

Unit Test

Run the tests under ./tests/unit/ and the Storybook tests.

npm run test:unit
npm run test-watch:unit # 監視モード
npm run test-ui:unit # VitestのUIを表示
npm run test:unit -- --update # スナップショットの更新

Note

The tests under ./tests/unit change the environment in which the test is run depending on the file name.

.node.spec.ts : Node.js environment
.browser.spec.ts : Browser environment (Chromium)
.spec.ts : Browser environment (emulatory using happy-dom)

Browser End to End Test

Run End to End tests such as UI and speech synthesis that do not require Electron functionality.

Note

Tests that rewrite settings for some engines are now only performed on CI (Github Actions).

npm run test:browser-e2e
npm run test-watch:browser-e2e # 監視モード
npm run test-watch:browser-e2e -- --headed # テスト中の UI を表示
npm run test-ui:browser-e2e # Playwright の UI を表示

You can also generate test patterns because you use Playwright. Run the following command while the browser version is running :

npx playwright codegen http://localhost:5173/  --viewport-size=1024,630

For more information, see Test generator in the Playwright documentation.

Visual Regression Testing for Storybook

Compare screenshots of Storybook components and show differences if any changes are made.

Note

This test can only be run on Windows.

npm run test:storybook-vrt
npm run test-watch:storybook-vrt # 監視モード
npm run test-ui:storybook-vrt # Playwright の UI を表示

Update screenshots

Browser End to End tests and Storybooks provide Visual Regression Testing. Currently, VRT testing is performed only on Windows. You can update your screenshot by following the steps below:

If you want to update with Github Actions

Enable GitHub Actions in the forked repository settings.
Under Actions > General > Workflow permissions in the repository settings, select Read and write permissions.
Commit the string [update snapshots] in the commit message.
```
git commit -m " UIを変更 [update snapshots] "
```
Once Github Workflow is complete, the updated screenshot will be committed.

After pulling, push an empty commit and run the test again.

git commit --allow-empty -m " （テストを再実行） "
git push

Note

You can automatically rerun the test by creating a token and adding it to Secrets.

Go to Fine-granted Tokens.
Enter an appropriate name, grant access toユーザー名/voicevox , and under Repository permissions, select Read and write under Contents.
Example configuration
Create a token and copy the string.
Open Settings > Secrets and variables > Actions > New repository secret inユーザー名/voicevox repository.
Enter PUSH_TOKEN in the name, paste the string above and add Secrets.

If you want to update locally

Only those that support the local PC's OS will be updated.

npm run test:browser-e2e -- --update-snapshots

Electron End to End Test

Run End to End tests that require Electron functionality, including engine startup and termination.

npm run test:electron-e2e
npm run test-watch:electron-e2e # 監視モード

Generating dependent library license information

Licensing information for dependent libraries is automatically generated when building with Github Workflow. You can generate it with the following command:

 # get licenses.json from voicevox_engine as engine_licenses.json

npm run license:generate -- -o voicevox_licenses.json
npm run license:merge -- -o public/licenses.json -i engine_licenses.json -i voicevox_licenses.json

Code Format

Format the code. Please execute it before sending a pull request.

npm run fmt

Lint (static analysis)

Static analysis of your code to prevent bugs. Please execute it before sending a pull request.

npm run lint

Typo check

I use typos to check my typos.

npm run typos

You can check the type. If there are any files that should be excluded from false judgments or checks, please follow the instructions in the configuration file to edit _typos.toml .

Type check

TypeScript type checking.

npm run typecheck

Markdownlint

Performs a Markdown grammar check.

npm run markdownlint

Shellcheck

Do a ShellScript grammar check. For installation instructions, please see here.

shellcheck ./build/ * .sh

OpenAPI generator

Run the following command while the speech synthesis engine is running.

curl http://127.0.0.1:50021/openapi.json > openapi.json

npx openapi-generator-cli generate 
    -i openapi.json 
    -g typescript-fetch 
    -o src/openapi/ 
    --additional-properties " modelPropertyNaming=camelCase,supportsES6=true,withInterfaces=true,typescriptThreePlus=true "

npm run fmt

Updating the OpanAPI generator

You can check and install a new version using the following command:

npx openapi-generator-cli version-manager list

Running debugging in VS Code

Under development builds such as npm scripts serve and electron:serve , sourcemap is output in the vite used for the build, so the source code and the output code are mapped.

Copying .vscode/launch.template.json to create .vscode/launch.json and .vscode/tasks.template.json .vscode/tasks.json enable tasks that allow for debugging by running development builds from VS Code.

license

It is a dual license with LGPL v3 and separate licenses that do not require the publication of the source code. If you would like to obtain a separate license, please ask Heho.
X Account: @hiho_karuta

Expand

Additional Information

Version 1.0.0
Type AI Source Code
Update Time 2025-08-23
size 10MB
From Github

Related Applications

ML stack

2025-07-01
awesome free chatgpt

2025-01-04
pywin_contextmenu

2025-08-31
promptl

2025-02-17
tick.chat

2025-09-16
FastLoRAChat

2025-09-03

Recommended for You

chat.petals.dev

Other source code

1.0.0
GPT Prompt Templates

Other source code

1.0.0
GPTyped

Other source code

GPTyped 1.0.5
ML stack

AI Source Code

1.0.0
awesome free chatgpt

AI Source Code

1.0.0
pywin_contextmenu

AI Source Code

Version update
Google Dorks

Other source code

1.0
shepherd

Other source code

v6.1.6-react-shepherd: Prepare Release (#3063)
mongo express

Other source code

v1.1.0-rc-3

Related Information All