A low-complexity speech enhancement framework for full-band audio (48 kHz) using deep filtering.
For PipeWire integration as a virtual noise suppression microphone, look here.
Run the demo (Linux only) with:
cargo +nightly run -p df-demo --features ui --bin df-demo --release

New DeepFilterNet demo: DeepFilterNet: Perceptually Motivated Real-Time Speech Enhancement
New multi-frame filtering paper: Deep Multi-Frame Filtering for Hearing Aids
實時版本和LADSPA插件
deep-filter audio-file.wav

DeepFilterNet2 paper: DeepFilterNet2: Towards Real-Time Speech Enhancement on Embedded Devices for Full-Band Audio
Original DeepFilterNet paper: DeepFilterNet: A Low Complexity Speech Enhancement Framework for Full-Band Audio based on Deep Filtering
Download a pre-compiled deep-filter binary from the release page. You can use deep-filter to suppress noise in .wav audio files. Currently, only WAV files with a sampling rate of 48 kHz are supported.
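Because deep-filter only accepts 48 kHz WAV input, it can be useful to check a file's sample rate before processing. A minimal sketch using only Python's standard library (the file name is hypothetical; here we also generate a tiny 48 kHz file so the check is self-contained):

```python
import wave

def wav_sample_rate(path: str) -> int:
    """Return the sample rate of a WAV file using only the standard library."""
    with wave.open(path, "rb") as f:
        return f.getframerate()

# Create a tiny silent 48 kHz mono WAV for demonstration purposes.
with wave.open("example_48k.wav", "wb") as f:
    f.setnchannels(1)      # mono
    f.setsampwidth(2)      # 16-bit PCM
    f.setframerate(48000)  # the rate deep-filter expects
    f.writeframes(b"\x00\x00" * 480)  # 10 ms of silence

assert wav_sample_rate("example_48k.wav") == 48000
```

Files at other rates would need to be resampled to 48 kHz (e.g. with an external tool) before running them through deep-filter.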
USAGE:
deep-filter [OPTIONS] [FILES]...
ARGS:
<FILES>...
OPTIONS:
-D, --compensate-delay
Compensate delay of STFT and model lookahead
-h, --help
Print help information
-m, --model <MODEL>
Path to model tar.gz. Defaults to DeepFilterNet2.
-o, --out-dir <OUT_DIR>
[default: out]
--pf
Enable postfilter
-v, --verbose
Logging verbosity
-V, --version
Print version information

If you want to use the PyTorch backend, e.g. for GPU processing, see the Python usage section below.
The framework supports Linux, macOS, and Windows. Training is only tested under Linux. The framework is structured as follows:
libDF contains the Rust code used for data loading and augmentation.
DeepFilterNet contains the DeepFilterNet code for training, evaluation, and visualization, as well as pretrained model weights.
pyDF contains a Python wrapper of the libDF STFT/ISTFT processing loop.
pyDF-data contains a Python wrapper of the libDF dataset functionality and provides a PyTorch data loader.
ladspa contains a LADSPA plugin for real-time noise suppression.
models contains pretrained models for use in DeepFilterNet (Python) or libDF/deep-filter (Rust).

Install the DeepFilterNet Python wheel via pip:
# Install cpu/cuda pytorch (>=1.9) dependency from pytorch.org, e.g.:
pip install torch torchaudio -f https://download.pytorch.org/whl/cpu/torch_stable.html
# Install DeepFilterNet
pip install deepfilternet
# Or install DeepFilterNet including data loading functionality for training (Linux only)
pip install deepfilternet[train]

To enhance noisy audio files using DeepFilterNet, run:
# Specify an output directory with --output-dir [OUTPUT_DIR]
deepFilter path/to/noisy_audio.wav

Install cargo via rustup. Using a conda or virtualenv environment is recommended. Please read the comments and only execute the commands that you need.
Installation of Python dependencies and libDF:
cd path/to/DeepFilterNet/ # cd into repository
# Recommended: Install or activate a python env
# Mandatory: Install cpu/cuda pytorch (>=1.8) dependency from pytorch.org, e.g.:
pip install torch torchaudio -f https://download.pytorch.org/whl/cpu/torch_stable.html
# Install build dependencies used to compile libdf and DeepFilterNet python wheels
pip install maturin poetry
# Install remaining DeepFilterNet python dependencies
# *Option A:* Install DeepFilterNet python wheel globally within your environment. Do this if you want
# to use this repo as-is and don't want to develop within this repository.
poetry -C DeepFilterNet install -E train -E eval
# *Option B:* If you want to develop within this repo, install only dependencies and work with the repository version
poetry -C DeepFilterNet install -E train -E eval --no-root
export PYTHONPATH=$PWD/DeepFilterNet # And set the python path correctly
# Build and install libdf python package required for enhance.py
maturin develop --release -m pyDF/Cargo.toml
# *Optional*: Install libdfdata python package with dataset and dataloading functionality for training
# Required build dependency: HDF5 headers (e.g. ubuntu: libhdf5-dev)
maturin develop --release -m pyDF-data/Cargo.toml
# If you have trouble with HDF5, you may try to build and link HDF5 statically:
maturin develop --release --features hdf5-static -m pyDF-data/Cargo.toml

To enhance noisy audio files using DeepFilterNet, run:
$ python DeepFilterNet/df/enhance.py --help
usage: enhance.py [-h] [--model-base-dir MODEL_BASE_DIR] [--pf] [--output-dir OUTPUT_DIR] [--log-level LOG_LEVEL] [--compensate-delay]
noisy_audio_files [noisy_audio_files ...]
positional arguments:
noisy_audio_files List of noisy audio files to enhance.
optional arguments:
-h, --help show this help message and exit
--model-base-dir MODEL_BASE_DIR, -m MODEL_BASE_DIR
Model directory containing checkpoints and config.
To load a pretrained model, you may just provide the model name, e.g. `DeepFilterNet`.
By default, the pretrained DeepFilterNet2 model is loaded.
--pf Post-filter that slightly over-attenuates very noisy sections.
--output-dir OUTPUT_DIR, -o OUTPUT_DIR
Directory in which the enhanced audio files will be stored.
--log-level LOG_LEVEL
Logger verbosity. Can be one of (debug, info, error, none)
--compensate-delay, -D
Add some padding to compensate for the delay introduced by the real-time STFT/ISTFT implementation.
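The delay-compensation option exists because the real-time STFT/ISTFT pipeline (plus model lookahead) shifts the output relative to the input. As a rough sketch of where part of that delay comes from (the window and hop sizes below are illustrative assumptions, not values taken from this document):

```python
# Illustrative only: the window/hop sizes below are assumptions, not
# necessarily the model's actual configuration.
sample_rate = 48_000
window = 960  # e.g. a 20 ms STFT window at 48 kHz
hop = 480     # e.g. a 10 ms hop

# Overlap-add reconstruction cannot emit a finished sample until a full
# window has been seen, so the STFT/ISTFT round trip alone delays the
# output by window - hop samples.
stft_delay_samples = window - hop
stft_delay_ms = 1000 * stft_delay_samples / sample_rate
print(stft_delay_samples, stft_delay_ms)  # 480 10.0
```

With `-D`/`--compensate-delay`, this shift (together with the model's lookahead) is compensated so the enhanced file lines up with the input.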
# Enhance audio with original DeepFilterNet
python DeepFilterNet/df/enhance.py -m DeepFilterNet path/to/noisy_audio.wav
# Enhance audio with DeepFilterNet2
python DeepFilterNet/df/enhance.py -m DeepFilterNet2 path/to/noisy_audio.wav

from df import enhance, init_df
model, df_state, _ = init_df()  # Load default model
enhanced_audio = enhance(model, df_state, noisy_audio)

See here for a full example.
The entry point is DeepFilterNet/df/train.py. It expects a data directory containing HDF5 datasets as well as a dataset configuration JSON file.
So, you first need to create your datasets in HDF5 format. Each dataset typically only holds noise, speech, or RIRs for either the training, validation, or test set.
# Install additional dependencies for dataset creation
pip install h5py librosa soundfile
# Go to DeepFilterNet python package
cd path/to/DeepFilterNet/DeepFilterNet
# Prepare text file (e.g. called training_set.txt) containing paths to .wav files
#
# usage: prepare_data.py [-h] [--num_workers NUM_WORKERS] [--max_freq MAX_FREQ] [--sr SR] [--dtype DTYPE]
# [--codec CODEC] [--mono] [--compression COMPRESSION]
# type audio_files hdf5_db
#
# where:
# type: One of `speech`, `noise`, `rir`
# audio_files: Text file containing paths to audio files to include in the dataset
# hdf5_db: Output HDF5 dataset.
python df/scripts/prepare_data.py --sr 48000 speech training_set.txt TRAIN_SET_SPEECH.hdf5

All datasets should be made available in one dataset folder for the train script.
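The training_set.txt file passed to prepare_data.py is just a newline-separated list of audio file paths. A small helper like the following can generate it (the directory name in the commented usage is hypothetical):

```python
from pathlib import Path

def write_file_list(audio_dir: str, out_txt: str) -> int:
    """Write one absolute .wav path per line; return the number of files listed."""
    paths = sorted(Path(audio_dir).rglob("*.wav"))
    Path(out_txt).write_text(
        "\n".join(str(p.resolve()) for p in paths) + "\n"
    )
    return len(paths)

# Hypothetical usage:
# write_file_list("data/clean_speech", "training_set.txt")
```

The same helper can be reused for the noise and RIR file lists before running prepare_data.py with type `noise` or `rir`.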
The dataset configuration file should contain 3 entries: "train", "valid", "test". Each of these contains a list of datasets (e.g. a speech, a noise, and an RIR dataset). You can use multiple speech or noise datasets. Optionally, a sampling factor may be specified that can be used to over/under-sample a dataset. Say you have a specific dataset with transient noises and want to increase the amount of non-stationary noise by oversampling. In most cases you want to set this factor to 1.
dataset.cfg
{
  "train": [
    ["TRAIN_SET_SPEECH.hdf5", 1.0],
    ["TRAIN_SET_NOISE.hdf5", 1.0],
    ["TRAIN_SET_RIR.hdf5", 1.0]
  ],
  "valid": [
    ["VALID_SET_SPEECH.hdf5", 1.0],
    ["VALID_SET_NOISE.hdf5", 1.0],
    ["VALID_SET_RIR.hdf5", 1.0]
  ],
  "test": [
    ["TEST_SET_SPEECH.hdf5", 1.0],
    ["TEST_SET_NOISE.hdf5", 1.0],
    ["TEST_SET_RIR.hdf5", 1.0]
  ]
}

Finally, start the training script. The training script may create a model base_dir if it does not yet exist; it is used for logging, some audio samples, model checkpoints, and the config. If no config file is found, a default config is created. For an example config file, see DeepFilterNet/pretrained_models/DeepFilterNet.
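A quick sanity check of dataset.cfg can catch structural mistakes before training starts. The sketch below just mirrors the structure described above using Python's standard json module; it is not part of the framework itself:

```python
import json

def check_dataset_cfg(text: str) -> None:
    """Validate the train/valid/test structure described above."""
    cfg = json.loads(text)
    for split in ("train", "valid", "test"):
        assert split in cfg, f"missing split: {split}"
        for entry in cfg[split]:
            path, factor = entry  # each entry is [hdf5_path, sampling_factor]
            assert path.endswith(".hdf5"), f"not an HDF5 dataset: {path}"
            assert factor > 0, f"sampling factor must be positive: {factor}"

example = '{"train": [["TRAIN_SET_SPEECH.hdf5", 1.0]], "valid": [], "test": []}'
check_dataset_cfg(example)  # no exception means the structure looks sane
```

Running such a check before launching a long training job is cheap insurance against a typo in a dataset path or a missing split.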
# usage: train.py [-h] [--debug] data_config_file data_dir base_dir
python df/train.py path/to/dataset.cfg path/to/data_dir/ path/to/base_dir/

To reproduce any metrics, we recommend using the Python implementation via pip install deepfilternet.
If you use this framework, please cite: DeepFilterNet: A Low Complexity Speech Enhancement Framework for Full-Band Audio based on Deep Filtering
@inproceedings{schroeter2022deepfilternet,
  title = {{DeepFilterNet}: A Low Complexity Speech Enhancement Framework for Full-Band Audio based on Deep Filtering},
  author = {Schröter, Hendrik and Escalante-B., Alberto N. and Rosenkranz, Tobias and Maier, Andreas},
  booktitle = {ICASSP 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)},
  year = {2022},
  organization = {IEEE}
}

If you use the DeepFilterNet2 model, please cite: DeepFilterNet2: Towards Real-Time Speech Enhancement on Embedded Devices for Full-Band Audio
@inproceedings{schroeter2022deepfilternet2,
  title = {{DeepFilterNet2}: Towards Real-Time Speech Enhancement on Embedded Devices for Full-Band Audio},
  author = {Schröter, Hendrik and Escalante-B., Alberto N. and Rosenkranz, Tobias and Maier, Andreas},
  booktitle = {17th International Workshop on Acoustic Signal Enhancement (IWAENC 2022)},
  year = {2022},
}

If you use the DeepFilterNet3 model, please cite: DeepFilterNet: Perceptually Motivated Real-Time Speech Enhancement
@inproceedings{schroeter2023deepfilternet3,
  title = {{DeepFilterNet}: Perceptually Motivated Real-Time Speech Enhancement},
  author = {Schröter, Hendrik and Rosenkranz, Tobias and Escalante-B., Alberto N. and Maier, Andreas},
  booktitle = {INTERSPEECH},
  year = {2023},
}

If you use the multi-frame beamforming algorithms, please cite: Deep Multi-Frame Filtering for Hearing Aids
@inproceedings{schroeter2023deep_mf,
  title = {Deep Multi-Frame Filtering for Hearing Aids},
  author = {Schröter, Hendrik and Rosenkranz, Tobias and Escalante-B., Alberto N. and Maier, Andreas},
  booktitle = {INTERSPEECH},
  year = {2023},
}

DeepFilterNet is free and open source! All code in this repository is dual-licensed under either of the following:
at your option. This means you can select the license you prefer!
Unless you explicitly state otherwise, any contribution intentionally submitted for inclusion in the work by you, as defined in the Apache-2.0 license, shall be dual-licensed as above, without any additional terms or conditions.