MelNet
1.0.0
Implementation of MelNet: A Generative Model for Audio in the Frequency Domain
pip install -r requirements.txt
YAML files for the provided datasets are under config/. For other datasets, fill out your own YAML file following the provided ones.
The dataset file extension is specified by data.extension within the YAML file.
python trainer.py -c [config YAML file path] -n [name of run] -t [tier number] -b [batch size] -s [TTS]
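The -s option in the training command is easy to misuse, so here is a minimal sketch of one way such a flag can end up True for any value. It assumes the option is declared with argparse and type=bool, which this README does not confirm; the parser and the --tts long name are hypothetical stand-ins, not the project's code. Python's bool() returns True for every non-empty string, so even the text "False" is parsed as True.

```python
import argparse

# Hypothetical stand-in for a training script's parser (not the project's code).
parser = argparse.ArgumentParser()
parser.add_argument("-s", "--tts", type=bool, default=False)

# argparse calls bool("False"); any non-empty string is truthy.
passed_false = parser.parse_args(["-s", "False"]).tts
passed_zero = parser.parse_args(["-s", "0"]).tts

# Leaving the flag out is the only way to keep the default.
omitted = parser.parse_args([]).tts

print(passed_false, passed_zero, omitted)
```

Under that assumption, the practical rule is to omit -s entirely unless a TTS tier is being trained.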
The -s flag is a boolean that determines whether to train a TTS tier. Since a TTS tier only differs at tier 1, this flag is ignored when [tier number] != 1. Warning: this flag is set to True no matter what value follows it, so omit it entirely unless you plan to use it.
Checkpoints must be stored under chkpt/.
A YAML file named inference.yaml must be provided under config/.
inference.yaml must specify the number of tiers, the names of the checkpoints, and whether the generation is conditional.
python inference.py -c [config YAML file path] -p [inference YAML file path] -t [timestep of generated mel spectrogram] -n [name of sample] -i [input sentence for conditional generation]
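Choosing a value for -t usually means converting a target duration into a number of spectrogram frames. The sketch below assumes each timestep is one STFT frame, so one second of audio spans about sample_rate / hop_length timesteps. The function names and the numbers (16000 Hz, hop length 200) are illustrative only and are not taken from the provided configs.

```python
def seconds_to_timesteps(seconds, sample_rate, hop_length):
    """Approximate number of spectrogram frames covering `seconds` of audio."""
    return round(seconds * sample_rate / hop_length)


def timesteps_to_seconds(timesteps, sample_rate, hop_length):
    """Approximate duration in seconds of `timesteps` spectrogram frames."""
    return timesteps * hop_length / sample_rate


if __name__ == "__main__":
    # Illustrative values; read the real ones from your config YAML file.
    sample_rate, hop_length = 16000, 200
    print(seconds_to_timesteps(5.0, sample_rate, hop_length))
    print(timesteps_to_seconds(400, sample_rate, hop_length))
```

The result is approximate because window padding at the edges of the signal can add or drop a frame.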
The ratio of timesteps to seconds is roughly [sample rate] : [hop length of FFT].
The -i flag is optional and only needed for conditional generation. Surround the sentence with "" and end it with a period (.).
MIT License