"Investigating Deep Neural Networks and their Interpretability in the Domain of Voice Conversion"

Pre-experiment

This table compares output quality when only target domain codes are used in G with output quality when both source and target domain codes are used. The samples are from the VCC2018 dataset.

Source Target Reference Converted Synthesised (source-and-target codes) Converted Synthesised (target codes)

Experiment 1

Example samples from training on the VCTK dataset.

Source Target Reference Converted Synthesised

Experiment 2

Example VCC2018 samples obtained with transfer learning from a model originally trained on the VCTK dataset.

Source Target Reference Converted Synthesised

Experiment 3

Example samples from networks in which different sets of layers were frozen.

Source Target Reference Converted Synthesised