Project Overview

Diff-SVC is an open-source Singing Voice Conversion (SVC) project based on a diffusion model.

GitHub repository: https://github.com/prophesier/diff-svc
Author: prophesier

If you are posting content created with this project on social media, it's recommended to include the GitHub repository link in the description, or at least label it as AI/SVC/Diff-SVC somewhere.

It is your responsibility to follow any rules and local laws that may apply when creating and sharing models.

Pipeline Overview

In this guide, we will go through the 4 stages of the Diff-SVC pipeline: Dataset Preparation, Preprocessing, Training, and Inference.

24kHz or 44.1kHz

Some commands and configs differ slightly depending on the vocoder you choose. The 44.1kHz vocoder outputs audio at a higher sampling rate than the 24kHz one, which usually yields higher audio quality, so 44.1kHz is recommended for training new models.
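
As a rough illustration of why the sampling rate matters: a digital signal can only represent frequencies up to half its sampling rate (the Nyquist limit). The sketch below is not part of Diff-SVC; the function name `nyquist_hz` is hypothetical and used only for this example. It compares the highest representable frequency for the two vocoder sampling rates.

```python
def nyquist_hz(sample_rate_hz: float) -> float:
    """Return the highest frequency representable at the given sampling rate."""
    return sample_rate_hz / 2.0


# Compare the two sampling rates mentioned in this guide.
for rate in (24_000, 44_100):
    print(f"{rate} Hz sampling can represent content up to {nyquist_hz(rate):.0f} Hz")
```

A 24kHz signal therefore tops out at 12kHz, while a 44.1kHz signal reaches 22.05kHz, which covers roughly the full range of human hearing. Actual output quality also depends on the vocoder and the training data, so treat this as one contributing factor only.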

SVC & SVS

Singing Voice Conversion (SVC) aims to convert one singer's voice into another's, while Singing Voice Synthesis (SVS) aims to generate singing voices from music scores and lyrics.

Examples:
SVS:
  VOCALOID
  UTAU
  Synthesizer V
  DiffSinger
  ...
SVC:
  VOCALOID6 (VocaloChanger feature)
  Diff-SVC
  ...

Diff-SVC is an SVC project, so it takes vocal audio as input rather than MIDI and lyrics. Check out the partner project DiffSinger if you are looking for an open-source SVS project.

Relationship between Diff-SVC and DiffSinger

DiffSinger (paper & code) -> DiffSinger (OpenVPI's fork) -> Diff-SVC (by @prophesier)

Diff-SVC is based on DiffSinger, OpenVPI's maintained fork of DiffSinger, and soft-vc.

This project has NO connection with the similarly named paper DiffSVC. Please do not confuse them!


