Voice conversion is a technique for modifying one speaker’s voice so that it sounds like another’s. With the latest machine learning algorithms, voice conversion technology is now ready for many real-world applications, such as personalized speech synthesis, spoofing attacks, and dubbing of soundtracks for movies and computer games. In this talk, I will introduce the fundamentals of voice conversion and recent advancements in the field through live demonstrations. I will present my technical contributions, which cover both sparse representation and deep learning solutions for high-quality voice conversion. I will also share my perspectives on the technical challenges and future directions.
Berrak Sisman obtained her B.Sc. and M.Sc. degrees from FMV Isik University, Turkey, in 2015 and 2016, respectively. Since 2016, she has been a Ph.D. candidate at the Department of Electrical and Computer Engineering, National University of Singapore (NUS), and a research scholar at the Institute for Infocomm Research, A*STAR. In 2018, she was an exchange student at the Nara Institute of Science and Technology (NAIST) and RIKEN, Japan. In 2019, she was a visiting scholar at the Centre for Speech Technology Research, University of Edinburgh.
Ms. Sisman’s research interests include speech synthesis, voice conversion, and speech information processing. She has published in leading journals and conferences, including IEEE/ACM Transactions on Audio, Speech, and Language Processing, ASRU, INTERSPEECH, and ICASSP. In 2019, she led the NUS team in the international ZeroSpeech 2019 competition (TTS without T). Ms. Sisman is expected to obtain her Ph.D. degree from NUS in AY2019/2020.