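Awesome Music Generation

Welcome to MG²!

We have updated the CLMP training and fine-tuning code and documentation! Come check it out. [2024-11-09]

We released the MelodySet dataset. [2024-11-08]

We released the MusicSet dataset! Come and try it. [2024-11-05]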

Try our demo first!

→ Click here! ←

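This repository contains the implementation of the music generation model MG², the first novel approach that uses melody to guide music generation. Despite a very simple method and extremely limited resources, it achieves excellent performance.

Anyone can use this model to generate personalized background music for short videos on platforms such as TikTok, YouTube Shorts, and Meta Reels. In addition, fine-tuning the model on your own private music dataset is very cost-effective.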

Video

You can watch the introduction video.

→ Click here! ←

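Online service

You can now try generating music with your own prompts on our online service.

→ Click here! ←

Tip: To generate high-quality music with MG², write detailed, descriptive prompts that provide rich context and specific musical elements.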

Quick Start

To get started with MG², follow the steps below:

Step 1: Clone the repository

git clone https://github.com/shaopengw/Awesome-Music-Generation.git
cd Awesome-Music-Generation

Step 2: Set up the Conda environment

 # Create and activate the environment from the provided environment file
conda env create -f environment.yml
conda activate MMGen_quickstart

Step 3: Download the checkpoints from HuggingFace

 # Ensure that the checkpoints are stored in the following directory structure
Awesome-Music-Generation/
└── data/
    └── checkpoints/

Step 4: Modify the PYTHONPATH environment variable in the quick_start.sh script

# Update the paths to reflect your local environment setup
# Replace:
export PYTHONPATH=/mnt/sda/quick_start_demonstration/Awesome-Music-Generation:$PYTHONPATH
export PYTHONPATH=/mnt/sda/quick_start_demonstration/Awesome-Music-Generation/data:$PYTHONPATH
# With:
export PYTHONPATH=/your/local/path/Awesome-Music-Generation:$PYTHONPATH
export PYTHONPATH=/your/local/path/Awesome-Music-Generation/data:$PYTHONPATH
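
If you prefer not to edit the script by hand, the substitution in Step 4 can be sketched in Python. This is only an illustration: the function name `rewrite_pythonpath` is hypothetical and not part of the repository, and it only touches lines that begin with `export PYTHONPATH=`.

```python
def rewrite_pythonpath(script_text: str, old_root: str, new_root: str) -> str:
    """Replace old_root with new_root, but only on PYTHONPATH export lines."""
    out = []
    for line in script_text.splitlines():
        if line.lstrip().startswith("export PYTHONPATH="):
            line = line.replace(old_root, new_root)
        out.append(line)
    return "\n".join(out) + "\n"


# Example input mirroring the two export lines shown in Step 4.
example = (
    "export PYTHONPATH=/mnt/sda/quick_start_demonstration/Awesome-Music-Generation:$PYTHONPATH\n"
    "export PYTHONPATH=/mnt/sda/quick_start_demonstration/Awesome-Music-Generation/data:$PYTHONPATH\n"
)
rewritten = rewrite_pythonpath(
    example, "/mnt/sda/quick_start_demonstration", "/your/local/path"
)
print(rewritten)
```

To apply it for real, read quick_start.sh into a string, pass it through the function with your own checkout location, and write the result back.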

Step 5: Grant execute permission to the script

chmod +x quick_start.sh

Step 6: Run the quick start script

bash quick_start.sh

Let the script run for a few minutes. When it finishes, the results will be available in the following directory:

Awesome-Music-Generation/log/latent_diffusion/quick_start/quick_start

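Dataset

MusicSet

We introduce the newly proposed MusicSet dataset, which contains approximately 150,000 high-quality 10-second music-text pairs.

Dataset structure for CLMP

We propose CLMP (Contrastive Language-Music Pretraining) to align text descriptions, music waveforms, and melodies before training the diffusion module. We use WebDataset as the data loader for music waveforms and text descriptions, and a separate data loader for melodies. For CLMP training, MusicSet is organized as follows: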

# Ensure that the training data packaged in WebDataset format is organized as follows:
clmp/
└── dataset/
    └── MusicSet/
        └──train/pretrain0.tar
                 pretrain1.tar
                 pretrain2.tar
                 ...
        └──valid/
        └──test/
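
WebDataset shards are ordinary tar archives in which files that share a base name form one sample. The sketch below uses only the standard library to list which extensions are present for each sample in a shard, which can help confirm that every waveform has its text description. The function name `group_shard_members` is hypothetical, the `.flac` and `.json` member names in the demo are assumptions rather than the repository's documented layout, and splitting on the last dot is a simplification of how WebDataset derives sample keys.

```python
import io
import os
import tarfile
import tempfile
from collections import defaultdict
from pathlib import PurePosixPath


def group_shard_members(tar_path: str) -> dict:
    """Map each sample key (member name without its suffix) to its set of extensions."""
    groups = defaultdict(set)
    with tarfile.open(tar_path, "r") as tar:
        for member in tar.getmembers():
            if not member.isfile():
                continue
            path = PurePosixPath(member.name)
            groups[str(path.with_suffix(""))].add(path.suffix)
    return dict(groups)


# Demo on a throwaway shard with empty placeholder files (assumed extensions).
with tempfile.TemporaryDirectory() as tmp:
    shard = os.path.join(tmp, "pretrain0.tar")
    with tarfile.open(shard, "w") as tar:
        for name in ("00040020.flac", "00040020.json", "00009570.flac"):
            info = tarfile.TarInfo(name)
            info.size = 0
            tar.addfile(info, io.BytesIO(b""))
    demo = group_shard_members(shard)

# Samples with fewer extensions than expected are incomplete.
incomplete = sorted(key for key, exts in demo.items() if len(exts) < 2)
print(incomplete)
```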

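Dataset structure for the diffusion module

The dataset for the diffusion module is structured as follows:

(Note that you must convert .flac files to .wav format.)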

Awesome-Music-Generation/
└── data/
    └── dataset/
       └── audioset/
           └── wav/00040020.wav
                   00009570.wav
                   ...
       └── metadata/dataset_root.json
           └── MusicSet/
               └── datafiles/train.json
                             valid.json
                             test.json

Below is an example of dataset_root.json:

{
    "MusicSet": "/mnt/data/wmz/Awesome-Music-Generation/data/dataset/audioset",
    "comments": {},
    "metadata": {
        "path": {
            "MusicSet": {
                "train": "./data/dataset/metadata/MusicSet/datafiles/train.json",
                "test": "./data/dataset/metadata/MusicSet/datafiles/test.json",
                "val": "./data/dataset/metadata/MusicSet/datafiles/valid.json",
                "class_label_indices": ""
            }
        }
    }
}
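
A file with this shape can also be generated rather than hand-edited. The following is a minimal sketch that mirrors the keys of the example above; `build_dataset_root` is a hypothetical helper, not a function from the repository, and the audio root passed in is a placeholder.

```python
import json


def build_dataset_root(
    audio_root: str,
    datafiles_dir: str = "./data/dataset/metadata/MusicSet/datafiles",
) -> dict:
    """Build a dict with the same keys as the dataset_root.json example."""
    return {
        "MusicSet": audio_root,
        "comments": {},
        "metadata": {
            "path": {
                "MusicSet": {
                    "train": f"{datafiles_dir}/train.json",
                    "test": f"{datafiles_dir}/test.json",
                    "val": f"{datafiles_dir}/valid.json",
                    "class_label_indices": "",
                }
            }
        },
    }


# Placeholder audio root; replace it with your own checkout location.
config = build_dataset_root(
    "/your/local/path/Awesome-Music-Generation/data/dataset/audioset"
)
print(json.dumps(config, indent=4))
```

Writing the result with `json.dump` guarantees well-formed quoting, which is easy to get wrong when editing paths by hand.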

Below is an example of train.json:

{
    "data": [
        {
            "wav": "wav/00040020.wav",
            "seg_label": "",
            "labels": "",
            "caption": "The song starts with the high and fuzzy tone of an alarm bell beeping until a button is pressed, which triggers the grungy sound of an electric guitar being played in a rock style. The beat then counts to four, enhancing the overall rhythm."
        },
        {
            "wav": "wav/00009570.wav",
            "seg_label": "",
            "labels": "",
            "caption": "This lively song features a male vocalist singing humorous lyrics over a medium-fast tempo of 106.0 beats per minute. Accompanied by keyboard harmony, acoustic guitar, steady drumming, and simple bass lines, the catchy tune is easy to sing along with. Set in the key of B major, the chord sequence includes Abm7, F#/G#, and Emaj7. With its spirited and animated feel, this fun track is sure to keep listeners engaged from start to finish."
        }
    ]
}
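
For a private dataset, a datafile in this layout can be assembled from (file name, caption) pairs. The sketch below assumes each caption is stored as a single string and that `seg_label` and `labels` stay empty, as in the example; `make_entry` and `build_datafile` are hypothetical names, and the captions in the demo are placeholders.

```python
import json


def make_entry(wav_name: str, caption: str) -> dict:
    """One record in the layout of the train.json example."""
    return {
        "wav": f"wav/{wav_name}",
        "seg_label": "",
        "labels": "",
        "caption": caption,
    }


def build_datafile(pairs) -> dict:
    """Wrap (wav_name, caption) pairs in the top-level 'data' list."""
    return {"data": [make_entry(name, caption) for name, caption in pairs]}


# Placeholder captions for illustration only.
datafile = build_datafile(
    [
        ("00040020.wav", "A rock track led by a grungy electric guitar."),
        ("00009570.wav", "A lively song with a male vocalist and acoustic guitar."),
    ]
)
print(json.dumps(datafile, indent=4))
```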

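MelodySet

We will release MelodySet, which contains processed melodies for MusicCaps and MusicBench. We extract the melodies with Basic Pitch and organize them as melody triplets. MelodySet is a subset of MusicSet: each waveform file (.wav) has a corresponding melody file (.txt) with the same filename prefix. For example, 00040020.wav corresponds to 00040020.txt, and all melodies are placed in a single directory.

The organization of music waveforms and text descriptions is the same as in MusicSet. Therefore, we only show the dataset structure of the melody part: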

your_path/
└── melody_text/00040020.txt
               00009570.txt
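
Because waveform and melody files are matched purely by filename prefix, a quick consistency check before training can catch missing melodies. This is a sketch under the assumption that the .wav and .txt files each sit flat in one directory; `find_missing_melodies` is a hypothetical helper, and the demo runs on empty placeholder files in a temporary directory.

```python
import tempfile
from pathlib import Path


def find_missing_melodies(wav_dir, melody_dir) -> list:
    """Return sorted filename prefixes that have a .wav but no matching .txt."""
    wav_stems = {p.stem for p in Path(wav_dir).glob("*.wav")}
    txt_stems = {p.stem for p in Path(melody_dir).glob("*.txt")}
    return sorted(wav_stems - txt_stems)


# Demo: two waveforms, only one of which has a melody file.
with tempfile.TemporaryDirectory() as tmp:
    wav_dir = Path(tmp, "wav")
    melody_dir = Path(tmp, "melody_text")
    wav_dir.mkdir()
    melody_dir.mkdir()
    for name in ("00040020.wav", "00009570.wav"):
        (wav_dir / name).touch()
    (melody_dir / "00040020.txt").touch()
    missing = find_missing_melodies(wav_dir, melody_dir)

print(missing)  # ['00009570']
```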

Below is an example of a melody, which consists of melody triplets:

<G4>,<114>,<79>|<A4>,<119>,<81>|<B2>,<159>,<0>|<G4>,<117>,<62>|<A4>,<91>,<77>|<D3>,<202>,<0>|<B4>,<92>,<72>|<A4>,<95>,<77>|<B4>,<98>,<80>|<G3>,<200>,<0>|<A4>,<151>,<30>|<G4>,<95>,<77>|<A4>,<93>,<82>|<F#3>,<146>,<0>|<A2>,<201>,<0>|<G2>,<116>,<117>|<G3>,<149>,<0>|<B2>,<122>,<75>|<D3>,<110>,<77>|<B4>,<206>,<0>|<B4>,<113>,<111>|<B3>,<90>,<95>|<A3>,<110>,<57>|<E5>,<113>,<41>|<G3>,<177>,<0>|<D#5>,<119>,<73>|<B3>,<119>,<32>|<C4>,<108>,<78>|<E5>,<111>,<49>|<F#5>,<117>,<82>|<E5>,<111>,<78>|<F#5>,<114>,<82>|<G3>,<151>,<0>|<G5>,<95>,<73>|<F#5>,<91>,<81>|<G5>,<92>,<78>|<A3>,<143>,<43>|<E4>,<202>,<0>|<F#5>,<152>,<30>|<E5>,<98>,<86>|<D#4>,<139>,<8>|<B3>,<142>,<0>|<F#5>,<94>,<68>|<B3>,<111>,<120>|<G3>,<114>,<84>|<B3>,<118>,<83>|<E3>,<122>,<81>|<G5>,<231>,<0>|<E4>,<234>,<0>|<F#5>,<118>,<63>|<E5>,<114>,<79>|<G3>,<118>,<37>|<D5>,<122>,<76>|<C#5>,<119>,<78>|<E5>,<119>,<77>|<B3>,<100>,<78>|<B4>,<123>,<57>|<E5>,<112>,<71>|<A3>,<209>,<0>|<G5>,<123>,<105>|<A4>,<154>,<0>|<F#5>,<124>,<73>|<A3>,<136>,<22>|<C#4>,<205>,<0>|<E5>,<125>,<28>|<F#5>,<121>,<74>|<A5>,<115>,<72>|<D3>,<144>,<0>|<E3>,<95>,<81>|<E5>,<122>,<62>|<A5>,<115>,<76>|<F#3>,<106>,<84>|<D5>,<117>,<48>|<C5>,<125>,<74>|<D3>,<102>,<74>|<B4>,<120>,<50>|<A4>,<123>,<76>|<B4>,<116>,<80>|<D5>,<117>,<79>|<D4>,<319>,<0>|<A4>,<113>,<65>|<C4>,<114>,<42>|<D5>,<116>,<78>|<B3>,<108>,<84>|<G4>,<114>,<43>
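
The textual format above is straightforward to parse: triplets are separated by `|`, and each of the three fields is wrapped in angle brackets. The sketch below returns each triplet as a (note name, integer, integer) tuple. This document does not define what the two numeric fields denote, so they are left unnamed here; `parse_melody` is a hypothetical helper, and the regular expression assumes note names with an optional sharp, as in the sample.

```python
import re

# One triplet: <note>,<int>,<int>, e.g. <F#3>,<146>,<0>
TRIPLET_RE = re.compile(r"<([A-G]#?\d+)>,<(\d+)>,<(\d+)>")


def parse_melody(text: str) -> list:
    """Parse a '|'-separated melody string into (note, int, int) tuples."""
    triplets = []
    for chunk in text.strip().split("|"):
        match = TRIPLET_RE.fullmatch(chunk.strip())
        if match is None:
            raise ValueError(f"malformed triplet: {chunk!r}")
        note, second, third = match.groups()
        triplets.append((note, int(second), int(third)))
    return triplets


sample = "<G4>,<114>,<79>|<A4>,<119>,<81>|<F#3>,<146>,<0>"
parsed = parse_melody(sample)
print(parsed)  # [('G4', 114, 79), ('A4', 119, 81), ('F#3', 146, 0)]
```

Raising on a malformed chunk, rather than skipping it, makes corrupted melody files visible early.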

Training and fine-tuning

Assuming you have gone through the Quick Start guide, let's dive into the training and fine-tuning process!

conda activate MMGen_quickstart

CLMP

This section covers the training and fine-tuning process for CLMP.

 cd your_path/MMGen_train/modules/clmp

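Training

Before running the training script, review and update (crucial) the paths in Awesome-Music-Generation/MMGen_train/modules/clmp/training.sh as needed. This file contains the necessary training details.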

bash training.sh

Fine-tuning

Similarly, before fine-tuning, review and update (crucial) the paths in Awesome-Music-Generation/MMGen_train/modules/clmp/fine_tuning.sh.

bash fine_tuning.sh

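CLMP embedding extraction and FAISS index construction

After CLMP training or fine-tuning, you need to generate embeddings and build FAISS indices to enable efficient similarity search during the latent diffusion training stage. Follow this two-step process:

Generate the CLMP embeddings by adding the following flag to your training configuration: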

--collect-audio-melody-feature True

Run the training or fine-tuning script with this flag:

bash training.sh  # or fine_tuning.sh

The model will generate audio and melody feature embeddings in the following directory:

your_path/Awesome-Music-Generation/MMGen_train/modules/clmp/faiss_indexing/clmp_embeddings

Build the FAISS indices: navigate to the indexing directory and run the index construction script:

 cd your_path/Awesome-Music-Generation/MMGen_train/modules/clmp/faiss_indexing

# Update the path to the embeddings in this script before running it
python build_faiss_indices.py

The script will create the FAISS indices in:

your_path/Awesome-Music-Generation/MMGen_train/modules/clmp/faiss_indexing/faiss_indices
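
To make concrete what the similarity search retrieves, here is a brute-force stand-in written in plain Python. It is not FAISS and not the repository's indexing code: it simply returns the indices of the stored vectors closest to a query. The distance metric used by build_faiss_indices.py is not stated in this document, so Euclidean distance is an assumption, and `nearest_neighbors` and the toy vectors are hypothetical.

```python
import math


def nearest_neighbors(query, embeddings, k=1):
    """Return indices of the k embeddings closest to query (Euclidean distance)."""
    distances = [(math.dist(query, vec), idx) for idx, vec in enumerate(embeddings)]
    distances.sort()
    return [idx for _, idx in distances[:k]]


# Toy 2-D "melody embeddings"; real CLMP embeddings are much higher-dimensional.
melody_embeddings = [[0.0, 1.0], [1.0, 0.0], [0.9, 0.1]]
hits = nearest_neighbors([1.0, 0.2], melody_embeddings, k=2)
print(hits)  # [2, 1]
```

An index library such as FAISS answers the same kind of query, but is built to do so efficiently over very large collections instead of scanning every vector.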

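Diffusion module

Before training or fine-tuning the diffusion module, prepare the required files and replace the corresponding file paths in the scripts.

First, set the mode. In the script MMGen_train/train/latent_diffusion.py, set only_validation = True for evaluation, or only_validation = False for training.

Then, prepare the files required for the melody vector database, namely the .faiss and .npy files, which can be found on HuggingFace. Replace the paths to the .faiss and .npy files in the script MMGen_train/modules/latent_diffusion/ddpm.py: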

# Change the melody.npy and melody.faiss paths to your local paths
        melody_npy = np.load("MMGen/melody.npy")
        melody_builder = FaissDatasetBuilder(melody_npy)
        melody_builder.load_index("MMGen/melody.faiss")

After that, you can run the following command to train from scratch:

python3 MMGen_train/train/latent_diffusion.py -c MMGen_train/config/train.yaml

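For the training dataset, see the Dataset section.

Fine-tuning the pretrained model

You can also fine-tune from our pretrained model. The checkpoint is mg2-diffusion-checkpoint.ckpt, which can be found here.

Then, you can run the following command to fine-tune your own model: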

python3 MMGen_train/train/latent_diffusion.py -c MMGen_train/config/train.yaml --reload_from_ckpt data/checkpoints/mg2-diffusion-checkpoint.ckpt

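Note that MG² is not permitted for commercial use.

Todo list

Feel free to explore the repository and contribute!

Acknowledgements

We sincerely thank the developers of the following open-source codebases. These resources are invaluable sparks that ignite innovation and progress in the real world.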

https://github.com/compvis/stable-diffusion
https://github.com/haoheliu/audioldm-training-finetuning
https://github.com/laion-ai/clap
https://github.com/jik876/hifi-gan
https://github.com/facebookresearch/faiss
https://mtg.github.io/mtg-jamendo-dataset

This research was supported by the National Key Research and Development Program of China under Grant No. 2020YFC0832702, by the National Natural Science Foundation of China under Grant Nos. 7191010107002, 62376227, 61906159, 6233024, 62302400, and 62176014, and by a talent project of Southwestern University of Finance and Economics.

Citation

@article{wei2024melodyneedmusicgeneration,
      title={Melody Is All You Need For Music Generation},
      author={Shaopeng Wei and Manzhen Wei and Haoyu Wang and Yu Zhao and Gang Kou},
      year={2024},
      eprint={2409.20196},
      archivePrefix={arXiv},
      primaryClass={cs.SD},
      url={https://arxiv.org/abs/2409.20196},
}