Explainer: OpenVINO Model Zoo WaveRNN (composite) - たれぱんのびぼーろく

An explanation of the wavernn (composite) model in the OpenVINO Model Zoo.

Overview

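A mel2wave WaveRNN vocoder¹.
It is the fatchord variant: it has a ResNet-based PreNet and outputs MoL (mixture of logistics) parameters.
The weights are fatchord's model pretrained on LJSpeech (ljspeech.wavernn.mol.800k.zip), distributed in ONNX format²,³.
It is called "composite" because it is split into wavernn_upsampler.onnx and wavernn_rnn.onnx⁴.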
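To make "outputs MoL parameters" concrete, here is a minimal sketch of turning one step of such an output into one waveform sample. It assumes a flat layout of 3*K values (K mixture logits, then K means, then K log-scales), which is the common convention for mixture-of-logistics vocoders. The function name `sample_mol`, the `log_scale_min` default, and that layout are illustrative assumptions, not taken from the model zoo code.

```python
import math
import random


def sample_mol(params, log_scale_min=-7.0, rng=random):
    """Draw one sample in [-1, 1] from mixture-of-logistics parameters.

    params: flat sequence of 3*K floats, assumed to be laid out as
            K mixture logits, K means, K log-scales (hypothetical layout).
    """
    if len(params) == 0 or len(params) % 3 != 0:
        raise ValueError("expected 3*K parameters")
    k = len(params) // 3
    logits = params[:k]
    means = params[k:2 * k]
    # Clamp log-scales from below so exp() never collapses to zero.
    log_scales = [max(s, log_scale_min) for s in params[2 * k:]]

    def gumbel():
        # Standard Gumbel noise; u is kept away from 0 and 1 for stability.
        u = rng.uniform(1e-5, 1.0 - 1e-5)
        return -math.log(-math.log(u))

    # Gumbel-max trick: equivalent to sampling an index from softmax(logits).
    idx = max(range(k), key=lambda i: logits[i] + gumbel())

    # Inverse-CDF sampling of the chosen logistic component.
    u = rng.uniform(1e-5, 1.0 - 1e-5)
    x = means[idx] + math.exp(log_scales[idx]) * (math.log(u) - math.log(1.0 - u))

    # Waveform samples are normalized to [-1, 1].
    return min(max(x, -1.0), 1.0)
```

In an autoregressive vocoder, a function like this would be called once per output sample, with the result fed back in as the next input.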

Code

The steps for reproducing the ONNX files are documented, and the repositories below hold the source code behind them.

openvinotoolkit/open_model_zoo - wavernn (composite): holds the ONNX conversion code and the git patch it requires
as-ideas/ForwardTacotron: bundles fatchord's WaveRNN as its vocoder
fatchord/WaveRNN: where the model originated, and the source of the pretrained PyTorch model

¹ “WaveRNN performs waveform regression from mel-spectrogram.” from the model zoo
² “The model was trained on LJSpeech dataset” from the model zoo
³ “We provide pre-trained models in ONNX format for user convenience.” from the model zoo
⁴ “the model is divided into two parts: wavernn_upsampler.onnx, wavernn_rnn.onnx.” from the model zoo