Project Overview

  • Diff-SVC is an open-source Singing Voice Conversion (SVC) project based on a diffusion model.

GitHub repository: https://github.com/prophesier/diff-svc

Author: prophesier

If you post content created with this project on social media, it is recommended to include the GitHub repository link in the description, or at least to label the content as AI/SVC/Diff-SVC somewhere.

Pipeline Overview

Figure: Pipeline Overview (Simplified)

In this guide, we will go through the four stages of the Diff-SVC pipeline: Dataset Preparation, Preprocessing, Training, and Inference.

24kHz or 44.1kHz

Some commands and configs differ slightly depending on the vocoder you choose. The 44.1kHz vocoder outputs audio at a higher sampling rate than the 24kHz one, which usually yields higher audio quality, so 44.1kHz is recommended for training new models.
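One way to see why the higher sampling rate can help: by the Nyquist limit, audio sampled at a given rate can only represent frequencies up to half that rate. The short sketch below is only an illustration of that arithmetic; the helper name `nyquist_hz` is made up for this example and is not part of Diff-SVC.

```python
def nyquist_hz(sample_rate_hz: float) -> float:
    """Highest frequency representable at the given sampling rate (Nyquist limit)."""
    return sample_rate_hz / 2.0


# Compare the two vocoder output rates discussed above.
for rate in (24_000, 44_100):
    print(f"{rate} Hz output -> content up to {nyquist_hz(rate):.0f} Hz")
```

A 24kHz output therefore tops out at 12kHz, while a 44.1kHz output can carry content up to 22.05kHz, which covers roughly the full range of human hearing.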

SVC & SVS

Singing Voice Conversion (SVC) aims to convert one singer's voice into another's, while Singing Voice Synthesis (SVS) aims to generate singing voices from music scores and lyrics.

Examples:

  • SVS:

    • VOCALOID

    • UTAU

    • Synthesizer V

    • DiffSinger

    • ...

  • SVC:

    • VOCALOID6 (VOCALO CHANGER feature)

    • Diff-SVC

    • ...

Diff-SVC is an SVC project, so it takes vocal audio as input rather than MIDI and lyrics. If you are looking for an open-source SVS project, check the partner project DiffSinger.

Relationship between Diff-SVC and DiffSinger

DiffSinger (paper & code) -> DiffSinger (OpenVPI's fork) -> Diff-SVC (by @prophesier)

Diff-SVC is based on DiffSinger, the OpenVPI-maintained version of DiffSinger, and soft-vc.
