Investigating Deep Neural Structures and their Interpretability in the Domain of Voice Conversion

Pre-experiment

This table compares the output quality when G is given only target domain codes against when it is given both source and target domain codes. These samples are from the VCC2018 dataset.

Source	Target Reference	Converted Synthesised (source-and-target codes)	Converted Synthesised (target codes)

Experiment 1

Example samples from a model trained on the VCTK dataset.

Source	Target Reference	Converted Synthesised

Experiment 2

Example VCC2018 samples obtained with transfer learning, where the model was originally trained on the VCTK dataset.

Source	Target Reference	Converted Synthesised

Experiment 3

Example samples from networks in which different layers were frozen.

Source	Target Reference	Converted Synthesised

"Investigating Deep Neural Networks and their Interpretability in the Domain of Voice Conversion"
