Have you tried zero-shot singing voice conversion? Does it only require modifying the spk_embed?