Berrak Sisman


Assistant Professor

Office #: 1.702-03
Pillar / Cluster: Information Systems Technology and Design
Research Areas: Artificial and Augmented Intelligence, Data Science


Dr. Berrak Sisman received her PhD in Electrical and Computer Engineering from the National University of Singapore (NUS), fully funded by the A*STAR Graduate Academy. She worked as a researcher and then as a postdoctoral research fellow at NUS (October 2019 – March 2020). She is currently an affiliated researcher at the Human Language Technology Lab at NUS, where she serves as team leader. She was a visiting researcher at Columbia University in the City of New York in 2020. During her PhD studies, she was attached to the RIKEN Advanced Intelligence Project (Japan) in 2018, and in 2019 she was a visiting scholar at the Centre for Speech Technology Research (CSTR) and an exchange PhD student at the University of Edinburgh, Scotland.

Dr. Berrak Sisman’s research interests include speech information processing, machine learning, speech synthesis, and voice conversion. She has published in leading journals and conferences, including IEEE/ACM Transactions on Audio, Speech and Language Processing, ASRU, INTERSPEECH, and ICASSP. In 2018, she received the APSIPA PhD Forum Best Presentation Award in Hawaii, United States, for her presentation ‘Limited Data Voice Conversion from Sparse Representation to GANs and WaveNet’. In 2019, she represented the NUS team in the ‘ZeroSpeech 2019: TTS without T’ challenge, where the team received top scores.

She has served as the Local Arrangements Co-chair of IEEE ASRU 2019, Chair of the Young Female Researchers Mentoring @ASRU2019, and Chair of the INTERSPEECH Student Events in 2018 and 2019. She is also the Publication Chair of IEEE ICASSP 2022. She serves as a reviewer for IEEE Signal Processing Letters, INTERSPEECH, and IEEE Transactions on Emerging Topics in Computational Intelligence.

Research Interests

Artificial and Augmented Intelligence, Machine Learning, Data Science, Speech Information Processing, Speech and Singing Voice Synthesis, Voice Conversion


Postdoc positions are available! Please apply by sending your CV and 1–2 representative publications. Candidates should hold a PhD degree and show evidence of a strong publication record and/or project experience in speech synthesis, ASR, and machine learning.

If you want to pursue a PhD with me, please apply to the SUTD PhD program. International candidates can also apply for the SINGA scholarship for a fully funded PhD position. For further information, please send me an email.



Education

  • PhD, Electrical and Computer Engineering, National University of Singapore (2016-2019)
    Supervisors: Prof. Haizhou Li (IEEE Fellow), Prof. Tan Kay Chen (IEEE Fellow)
  • PhD Exchange, The University of Edinburgh, Scotland (2019)
    Supervisor: Prof. Simon King
  • PhD Research Attachment, RIKEN Center for Advanced Intelligence Project, Japan (2018)
    Supervisor: Prof. Satoshi Nakamura (IEEE Fellow)

Awards & Recognition

  • A*STAR Singapore International Graduate Award (2016)
  • Best Presentation Award, APSIPA ASC 2018 PhD Forum, Hawaii, United States
  • Invitation to the Google Speech Summit, London (2018)
  • ISCA Grant 2018
  • Invitation from the RIKEN Center for Advanced Intelligence Project (2018)
  • Top scores in ZeroSpeech 2019 Challenge: TTS without T, INTERSPEECH 2019, Austria
    Team: RIKEN (Japan), National University of Singapore, Nara Institute of Science and Technology (Japan)

Selected Publications

  • B. Sisman, H. Li, ‘Generative Adversarial Networks for Singing Voice Conversion with and without Parallel Data’, Speaker Odyssey 2020, Tokyo, Japan
  • K. Zhou, B. Sisman, H. Li, ‘Transforming Spectrum and Prosody for Emotional Voice Conversion with Non-Parallel Training Data’, Speaker Odyssey 2020, Tokyo, Japan (arXiv:2002.00198)
  • R. Liu, B. Sisman, J. Li, F. Bao, G. Gao, H. Li, ‘WaveTTS: Tacotron-based TTS with Joint Time-Frequency Domain Loss’, Speaker Odyssey 2020, Tokyo, Japan (arXiv:2002.00417)
  • R. Liu, B. Sisman, J. Li, F. Bao, G. Gao, H. Li, ‘Teacher-Student Training for Robust Tacotron-based TTS’, IEEE ICASSP 2020
  • B. Sisman, M. Zhang, M. Dong, H. Li, ‘On the Study of Generative Adversarial Networks for Cross-lingual Voice Conversion’, IEEE ASRU 2019
  • B. Sisman, M. Zhang, H. Li, ‘Group Sparse Representation with WaveNet Vocoder Adaptation for Spectrum and Prosody Conversion’, IEEE/ACM Transactions on Audio, Speech, and Language Processing, 2019 (DOI: 10.1109/TASLP.2019.2910637)
  • B. Sisman, K. Vijayan, M. Dong, H. Li, ‘SINGAN: Singing Voice Conversion with Generative Adversarial Networks’, APSIPA ASC 2019
  • B. Sisman, H. Li, ‘Singing Voice Conversion with Generative Adversarial Networks’, UKSpeech 2019
  • A. Tjandra, B. Sisman, M. Zhang, S. Sakti, H. Li, S. Nakamura, ‘VQVAE Unsupervised Unit Discovery and Multi-scale Code2Spec Inverter for Zerospeech Challenge 2019’, INTERSPEECH 2019
  • B. Sisman, M. Zhang, S. Sakti, H. Li, S. Nakamura, ‘Adaptive WaveNet Vocoder for Residual Compensation in GAN-based Voice Conversion’, IEEE SLT 2018
  • B. Sisman, H. Li, ‘Limited Data Voice Conversion from Sparse Representation to GANs and WaveNet’, APSIPA ASC 2018 PhD Forum [Best Presentation Award]
  • B. Sisman, M. Zhang, H. Li, ‘A Voice Conversion Framework with Tandem Feature Sparse Representation and Speaker-Adapted WaveNet Vocoder’, INTERSPEECH 2018
  • B. Sisman, H. Li, ‘Wavelet Analysis of Speaker Dependent and Independent Prosody for Voice Conversion’, INTERSPEECH 2018
  • B. Sisman, G. Lee, H. Li, ‘Phonetically Aware Exemplar-Based Prosody Transformation’, Speaker Odyssey 2018
  • M. Zhang, B. Sisman, S. S. Rallabandi, H. Li, L. Zhao, ‘Error Reduction Network for DBLSTM-based Voice Conversion’, APSIPA ASC 2018
  • J. Xiao, S. Yang, M. Zhang, B. Sisman, D. Huang, L. Xie, M. Dong, H. Li, ‘The I2R-NWPU-NUS Text-to-Speech System for Blizzard Challenge 2018’, INTERSPEECH Blizzard Challenge 2018 Workshop
  • X. Gao, B. Sisman, R. K. Das, K. Vijayan, ‘NUS-HLT Spoken Lyrics and Singing (SLS) Corpus’, International Conference on Orange Technologies (ICOT 2018)
  • B. Sisman, H. Li, K. C. Tan, ‘Transformation of Prosody in Voice Conversion’, APSIPA ASC 2017
  • B. Sisman, H. Li, K. C. Tan, ‘Sparse Representation of Phonetic Features for Voice Conversion with and without Parallel Data’, IEEE ASRU 2017
  • B. Sisman, G. Lee, H. Li, ‘On the Analysis and Evaluation of Prosody Conversion Techniques’, IALP 2017