DeepFilterNet

A low-complexity speech enhancement framework for full-band audio (48 kHz) based on deep filtering.

For PipeWire integration as a virtual noise-suppression microphone, see the LADSPA plugin in the ladspa directory.

Demo

[Video: DeepFilterNet-Demo-New.mp4]

To run the demo (Linux only), use:

cargo +nightly run -p df-demo --features ui --bin df-demo --release

News

New DeepFilterNet demo: DeepFilterNet: Perceptually Motivated Real-Time Speech Enhancement
- Paper: https://arxiv.org/abs/2305.08227
- Video: https://youtu.be/eo7n96ywnye
New multi-frame filtering paper: Deep Multi-Frame Filtering for Hearing Aids
- Paper: https://arxiv.org/abs/2305.08225
Real-time version and a LADSPA plugin
- Pre-compiled binary, no Python dependencies. Usage: deep-filter audio-file.wav
- LADSPA plugin with PipeWire filter-chain integration for real-time noise reduction on your microphone.
DeepFilterNet2 paper: DeepFilterNet2: Towards Real-Time Speech Enhancement on Embedded Devices for Full-Band Audio
- Paper: https://arxiv.org/abs/2205.05474
- Samples: https://rikorose.github.io/deepfilternet2-samples/
- Demo: https://huggingface.co/spaces/hshr/deepfilternet2
Original DeepFilterNet paper: DeepFilterNet: A Low Complexity Speech Enhancement Framework for Full-Band Audio based on Deep Filtering
- Paper: https://arxiv.org/abs/2110.05588
- Samples: https://rikorose.github.io/deepfilternet-samples/
- Demo: https://huggingface.co/spaces/hshr/deepfilternet
- Video lecture: https://youtu.be/it90gbqky6k

Usage

deep-filter

Download a pre-compiled deep-filter binary from the release page. You can use deep-filter to suppress noise in noisy .wav audio files. Currently, only WAV files with a sampling rate of 48 kHz are supported.

USAGE:
    deep-filter [OPTIONS] [FILES]...

ARGS:
    <FILES>...

OPTIONS:
    -D, --compensate-delay
            Compensate delay of STFT and model lookahead
    -h, --help
            Print help information
    -m, --model <MODEL>
            Path to model tar.gz. Defaults to DeepFilterNet2.
    -o, --out-dir <OUT_DIR>
            [default: out]
    --pf
            Enable postfilter
    -v, --verbose
            Logging verbosity
    -V, --version
            Print version information
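
Because deep-filter currently accepts only 48 kHz WAV input, it can be useful to check a file's sampling rate before running the binary. The following is a minimal sketch that uses only Python's standard wave module, which reads PCM WAV files. The helper name is_48k_wav is hypothetical and not part of DeepFilterNet.

```python
import wave

REQUIRED_SR = 48_000  # sampling rate that deep-filter expects, per the text above


def is_48k_wav(path: str) -> bool:
    """Return True if the PCM WAV file at `path` is sampled at 48 kHz.

    Hypothetical helper for illustration; not part of DeepFilterNet.
    """
    with wave.open(path, "rb") as wav_file:
        return wav_file.getframerate() == REQUIRED_SR
```

Files for which this returns False would need to be resampled to 48 kHz with a tool of your choice before being passed to deep-filter.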

If you want to use the PyTorch backend, e.g. for GPU processing, see the Python usage further below.

DeepFilterNet Framework

This framework supports Linux, macOS and Windows. Training is only tested under Linux. The framework is structured as follows:

libDF contains Rust code used for data loading and augmentation.
DeepFilterNet contains the DeepFilterNet training, evaluation and visualization code as well as pretrained model weights.
pyDF contains a Python wrapper of the libDF STFT/ISTFT processing loop.
pyDF-data contains a Python wrapper of the libDF dataset functionality and provides a PyTorch data loader.
ladspa contains a LADSPA plugin for real-time noise suppression.
models contains pretrained models for use in DeepFilterNet (Python) or libDF/deep-filter (Rust).

DeepFilterNet Python: PyPI

Install the DeepFilterNet Python wheel via pip:

# Install cpu/cuda pytorch (>=1.9) dependency from pytorch.org, e.g.:
pip install torch torchaudio -f https://download.pytorch.org/whl/cpu/torch_stable.html
# Install DeepFilterNet
pip install deepfilternet
# Or install DeepFilterNet including data loading functionality for training (Linux only)
pip install deepfilternet[train]

To enhance noisy audio files using DeepFilterNet, run:

# Specify an output directory with --output-dir [OUTPUT_DIR]
deepFilter path/to/noisy_audio.wav

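Manual Installation

Install cargo via rustup. Using a conda or virtualenv environment is recommended. Please read the comments and execute only the commands that you need.

Installation of Python dependencies and libDF:

cd path/to/DeepFilterNet/  # cd into repository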
# Recommended: Install or activate a python env
# Mandatory: Install cpu/cuda pytorch (>=1.8) dependency from pytorch.org, e.g.:
pip install torch torchaudio -f https://download.pytorch.org/whl/cpu/torch_stable.html
# Install build dependencies used to compile libdf and DeepFilterNet python wheels
pip install maturin poetry

# Install remaining DeepFilterNet python dependencies
# *Option A:* Install DeepFilterNet python wheel globally within your environment. Do this if you want to use
# this repo as is, and don't want to develop within this repository.
poetry -C DeepFilterNet install -E train -E eval
# *Option B:* If you want to develop within this repo, install only dependencies and work with the repository version
poetry -C DeepFilterNet install -E train -E eval --no-root
export PYTHONPATH=$PWD/DeepFilterNet  # And set the python path correctly

# Build and install libdf python package required for enhance.py
maturin develop --release -m pyDF/Cargo.toml
# *Optional*: Install libdfdata python package with dataset and dataloading functionality for training
# Required build dependency: HDF5 headers (e.g. ubuntu: libhdf5-dev)
maturin develop --release -m pyDF-data/Cargo.toml
# If you have troubles with hdf5 you may try to build and link hdf5 statically:
maturin develop --release --features hdf5-static -m pyDF-data/Cargo.toml

Use DeepFilterNet from the command line

To enhance noisy audio files using DeepFilterNet, run:

$ python DeepFilterNet/df/enhance.py --help
usage: enhance.py [-h] [--model-base-dir MODEL_BASE_DIR] [--pf] [--output-dir OUTPUT_DIR] [--log-level LOG_LEVEL] [--compensate-delay]
                  noisy_audio_files [noisy_audio_files ...]

positional arguments:
  noisy_audio_files     List of noise files to mix with the clean speech file.

optional arguments:
  -h, --help            show this help message and exit
  --model-base-dir MODEL_BASE_DIR, -m MODEL_BASE_DIR
                        Model directory containing checkpoints and config.
                        To load a pretrained model, you may just provide the model name, e.g. `DeepFilterNet`.
                        By default, the pretrained DeepFilterNet2 model is loaded.
  --pf                  Post-filter that slightly over-attenuates very noisy sections.
  --output-dir OUTPUT_DIR, -o OUTPUT_DIR
                        Directory in which the enhanced audio files will be stored.
  --log-level LOG_LEVEL
                        Logger verbosity. Can be one of (debug, info, error, none)
  --compensate-delay, -D
                        Add some padding to compensate the delay introduced by the real-time STFT/ISTFT implementation.

# Enhance audio with original DeepFilterNet
python DeepFilterNet/df/enhance.py -m DeepFilterNet path/to/noisy_audio.wav

# Enhance audio with DeepFilterNet2
python DeepFilterNet/df/enhance.py -m DeepFilterNet2 path/to/noisy_audio.wav

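Use DeepFilterNet within your Python script

from df.enhance import enhance, init_df, load_audio, save_audio

model, df_state, _ = init_df()  # Load default model
# Load the noisy file at the sampling rate the model expects
noisy_audio, _ = load_audio("path/to/noisy_audio.wav", sr=df_state.sr())
enhanced_audio = enhance(model, df_state, noisy_audio)
save_audio("enhanced.wav", enhanced_audio, df_state.sr())

A full example script is available in the repository.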

Training

The entry point is DeepFilterNet/df/train.py. It expects a data directory containing HDF5 datasets as well as a dataset configuration JSON file.

So you first need to create your datasets in HDF5 format. Each dataset typically holds only the training, validation, or test set of noise, speech, or RIRs.

# Install additional dependencies for dataset creation
pip install h5py librosa soundfile
# Go to DeepFilterNet python package
cd path/to/DeepFilterNet/DeepFilterNet
# Prepare text file (e.g. called training_set.txt) containing paths to .wav files
#
# usage: prepare_data.py [-h] [--num_workers NUM_WORKERS] [--max_freq MAX_FREQ] [--sr SR] [--dtype DTYPE]
#                        [--codec CODEC] [--mono] [--compression COMPRESSION]
#                        type audio_files hdf5_db
#
# where:
#   type: One of `speech`, `noise`, `rir`
#   audio_files: Text file containing paths to audio files to include in the dataset
#   hdf5_db: Output HDF5 dataset.
python df/scripts/prepare_data.py --sr 48000 speech training_set.txt TRAIN_SET_SPEECH.hdf5
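
The text file passed as audio_files can be generated rather than written by hand. The following is a minimal sketch that collects all .wav files below a directory and writes their absolute paths to a list file. It assumes that prepare_data.py expects one path per line, which is not stated above, and the helper name write_file_list is hypothetical.

```python
from pathlib import Path


def write_file_list(audio_dir: str, list_path: str) -> int:
    """Write the paths of all .wav files below `audio_dir` to `list_path`.

    Paths are absolute, sorted, and written one per line (an assumption
    about the list format). Returns the number of paths written.
    Hypothetical helper; not part of DeepFilterNet.
    """
    wav_paths = sorted(p.resolve() for p in Path(audio_dir).rglob("*.wav"))
    Path(list_path).write_text(
        "".join(f"{p}\n" for p in wav_paths), encoding="utf-8"
    )
    return len(wav_paths)
```

For example, write_file_list("path/to/speech_wavs", "training_set.txt") would produce the training_set.txt used in the command above. Note that the "*.wav" pattern is case-sensitive on Linux.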

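All datasets must be made available to the training script in a single dataset folder.

The dataset configuration file should contain three entries: "train", "valid", and "test". Each of them contains a list of datasets (e.g. a speech, a noise, and a RIR dataset). You can use multiple speech or noise datasets. Optionally, you can specify a sampling factor that is used to over- or undersample a dataset. For example, if you have a specific dataset with transient noises, you can oversample it to increase the amount of non-stationary noise. In most cases you will want to set this factor to 1.

Dataset config example:

dataset.cfg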

{
  "train": [
    ["TRAIN_SET_SPEECH.hdf5", 1.0],
    ["TRAIN_SET_NOISE.hdf5", 1.0],
    ["TRAIN_SET_RIR.hdf5", 1.0]
  ],
  "valid": [
    ["VALID_SET_SPEECH.hdf5", 1.0],
    ["VALID_SET_NOISE.hdf5", 1.0],
    ["VALID_SET_RIR.hdf5", 1.0]
  ],
  "test": [
    ["TEST_SET_SPEECH.hdf5", 1.0],
    ["TEST_SET_NOISE.hdf5", 1.0],
    ["TEST_SET_RIR.hdf5", 1.0]
  ]
}
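
A configuration of this shape can also be generated programmatically, which makes it easier to add datasets or adjust sampling factors. The following is a minimal sketch using only the standard json module. The helper make_dataset_config and the file name TRAIN_SET_NOISE_TRANSIENT.hdf5 are hypothetical; the latter only illustrates oversampling a transient-noise dataset with a factor of 2.

```python
import json

REQUIRED_SPLITS = ("train", "valid", "test")


def make_dataset_config(splits):
    """Build a config dict from {split: [(hdf5_file_name, sampling_factor), ...]}.

    Raises ValueError if one of the three required splits is missing.
    Hypothetical helper; not part of DeepFilterNet.
    """
    missing = [name for name in REQUIRED_SPLITS if name not in splits]
    if missing:
        raise ValueError(f"missing splits: {missing}")
    return {
        name: [[file_name, float(factor)] for file_name, factor in splits[name]]
        for name in REQUIRED_SPLITS
    }


config = make_dataset_config(
    {
        "train": [
            ("TRAIN_SET_SPEECH.hdf5", 1.0),
            ("TRAIN_SET_NOISE.hdf5", 1.0),
            # Hypothetical transient-noise dataset, oversampled by a factor of 2
            ("TRAIN_SET_NOISE_TRANSIENT.hdf5", 2.0),
            ("TRAIN_SET_RIR.hdf5", 1.0),
        ],
        "valid": [("VALID_SET_SPEECH.hdf5", 1.0), ("VALID_SET_NOISE.hdf5", 1.0)],
        "test": [("TEST_SET_SPEECH.hdf5", 1.0), ("TEST_SET_NOISE.hdf5", 1.0)],
    }
)
config_json = json.dumps(config, indent=2)  # text to save as dataset.cfg
```

Writing config_json to a file named dataset.cfg yields a configuration in the same format as the example above.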

Finally, start the training script. If the model base_dir does not exist, the training script may create it; the directory is used for logging, some audio samples, model checkpoints, and the config. If no config file is found, a default config is created. See DeepFilterNet/pretrained_models/DeepFilterNet for an example config file.

# usage: train.py [-h] [--debug] data_config_file data_dir base_dir
python df/train.py path/to/dataset.cfg path/to/data_dir/ path/to/base_dir/
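
Since the training script expects every dataset in one data directory, a quick check before launching a long training run can catch missing files early. The following is a minimal sketch, assuming that each entry in the dataset configuration is a file name relative to data_dir. The helper name missing_datasets is hypothetical.

```python
import json
from pathlib import Path
from typing import List


def missing_datasets(config_path: str, data_dir: str) -> List[str]:
    """Return the sorted dataset file names listed in the config but absent from `data_dir`.

    Assumes each config entry is [file_name, sampling_factor] and that
    file names are relative to `data_dir`.
    Hypothetical helper; not part of DeepFilterNet.
    """
    config = json.loads(Path(config_path).read_text(encoding="utf-8"))
    names = {
        entry[0]
        for split in ("train", "valid", "test")
        for entry in config.get(split, [])
    }
    return sorted(name for name in names if not (Path(data_dir) / name).is_file())
```

An empty list from missing_datasets("path/to/dataset.cfg", "path/to/data_dir/") suggests that all referenced HDF5 files are in place.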

Citation Guide

To reproduce any metrics, we recommend using the Python implementation installed via pip install deepfilternet.

If you use this framework, please cite: DeepFilterNet: A Low Complexity Speech Enhancement Framework for Full-Band Audio based on Deep Filtering

@inproceedings{schroeter2022deepfilternet,
  title = {{DeepFilterNet}: A Low Complexity Speech Enhancement Framework for Full-Band Audio based on Deep Filtering},
  author = {Schröter, Hendrik and Escalante-B., Alberto N. and Rosenkranz, Tobias and Maier, Andreas},
  booktitle = {ICASSP 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)},
  year = {2022},
  organization = {IEEE}
}

If you use the DeepFilterNet2 model, please cite: DeepFilterNet2: Towards Real-Time Speech Enhancement on Embedded Devices for Full-Band Audio

@inproceedings{schroeter2022deepfilternet2,
  title = {{DeepFilterNet2}: Towards Real-Time Speech Enhancement on Embedded Devices for Full-Band Audio},
  author = {Schröter, Hendrik and Escalante-B., Alberto N. and Rosenkranz, Tobias and Maier, Andreas},
  booktitle = {17th International Workshop on Acoustic Signal Enhancement (IWAENC 2022)},
  year = {2022},
}

If you use the DeepFilterNet3 model, please cite: DeepFilterNet: Perceptually Motivated Real-Time Speech Enhancement

@inproceedings{schroeter2023deepfilternet3,
  title = {{DeepFilterNet}: Perceptually Motivated Real-Time Speech Enhancement},
  author = {Schröter, Hendrik and Rosenkranz, Tobias and Escalante-B., Alberto N. and Maier, Andreas},
  booktitle = {INTERSPEECH},
  year = {2023},
}

If you use the multi-frame beamforming algorithms, please cite: Deep Multi-Frame Filtering for Hearing Aids

@inproceedings{schroeter2023deep_mf,
  title = {Deep Multi-Frame Filtering for Hearing Aids},
  author = {Schröter, Hendrik and Rosenkranz, Tobias and Escalante-B., Alberto N. and Maier, Andreas},
  booktitle = {INTERSPEECH},
  year = {2023},
}