Berrak Sisman


Assistant Professor

Email: 
Website: http://ece.nus.edu.sg/hlt/berrak/
Office #: 1.702-03
Pillar / Cluster: Information Systems Technology and Design
Research Areas: Artificial and Augmented Intelligence, Interactive Computing, Data Science

Biography

Dr. Berrak Sisman is a tenure-track Assistant Professor at the Singapore University of Technology and Design (SUTD). She is also an Affiliated Researcher at the Human Language Technology Lab at the National University of Singapore, where she serves as the team leader. She received her Ph.D. degree in Electrical and Computer Engineering from the National University of Singapore, Singapore, in 2020. Prior to joining SUTD, she was a Postdoctoral Research Fellow with the National University of Singapore and a Visiting Researcher with Columbia University, New York, United States. She was also a Visiting Scholar with The Centre for Speech Technology Research (CSTR), University of Edinburgh, in 2019, and was attached to the RIKEN Advanced Intelligence Project, Japan, in 2018.

Dr. Berrak Sisman’s research interests include speech information processing, machine learning, speech synthesis and voice conversion. She has published in leading journals and conferences, including IEEE/ACM Transactions on Audio, Speech and Language Processing, ASRU, INTERSPEECH, and ICASSP. She was awarded the APSIPA Ph.D. Forum Best Presentation Award in 2018 in Hawaii, United States, for her presentation titled “Limited Data Voice Conversion from Sparse Representation to GANs and WaveNet”. In 2019, she represented the NUS team in ‘ZeroSpeech 2019: TTS without T’, where the team received top scores.

She has served as the Local Arrangement Co-chair of IEEE ASRU 2019, Chair of Young Female Researchers Mentoring @ASRU2019, and Chair of the INTERSPEECH Student Events in 2018 and 2019. She is currently an Associate TC Member of IEEE SLTC and serves as a reviewer for IEEE Signal Processing Letters, IEEE/ACM Transactions on Audio, Speech, and Language Processing, ICASSP, and INTERSPEECH. She will serve as the Speech Synthesis Area Chair at INTERSPEECH 2021, and as the Publication Chair at ICASSP 2022.

Research Interests

Artificial and Augmented Intelligence, Machine Learning, Data Science, Speech Information Processing, Speech and Singing Voice Synthesis, Voice Conversion, Emotion

Education

PhD

  • Electrical and Computer Engineering, National University of Singapore (2016-2020)
    Supervisors: Prof. Haizhou Li (IEEE Fellow), Prof. Tan Kay Chen (IEEE Fellow)
  • Ph.D. Exchange, The University of Edinburgh, Scotland (2019)
    Supervisor: Prof. Simon King (IEEE Fellow)
  • Ph.D. Exchange, RIKEN Center for Advanced Intelligence Project, Japan (2018)
    Supervisor: Prof. Satoshi Nakamura (IEEE Fellow)

Team (Students & Postdocs)

  1. Dr. Rui Liu (Postdoc, SUTD Speech & ML Team)
  2. Ms. Zongyang Du (Ph.D. student, SUTD Speech & ML Team)
  3. Ms. Huiqing Lin (MSc student, SUTD Speech & ML Team)

I’m also mentoring the following PhD students from HLT@NUS: 1) Zhou Kun, 2) Junchen Lu, and 3) Sergey Nikonorov. Please check Google Scholar for our publications.

Awards, Recognition, Invited Talks

  • Tutorial presentation at Asia-Pacific Signal and Information Processing Association Annual Summit and Conference 2020
  • Top scores in ZeroSpeech 2019 Challenge: TTS without T, INTERSPEECH 2019, Austria
    Team: RIKEN (Japan), National University of Singapore, Nara Institute of Science and Technology (Japan)
  • Invitation by RIKEN Center for Advanced Intelligence Project (2018)
  • Best Presentation Award, APSIPA Ph.D. Forum 2018, Hawaii, United States
  • Invitation by Google Speech Summit in London (2018)
  • ISCA Grant 2018
  • A*STAR Singapore International Graduate Award (2016)

Selected Publications

  • B Sisman, J Yamagishi, S King, H Li, ‘An Overview of Voice Conversion and its Challenges: From Statistical Modeling to Deep Learning’, IEEE/ACM Transactions on Audio, Speech and Language Processing, November 2020.
  • R Liu, B Sisman, F Bao, G Gao, H Li, ‘Exploiting Morphological and Phonological Features to Improve Prosodic Phrasing for Mongolian Speech Synthesis’, IEEE/ACM Transactions on Audio, Speech and Language Processing, November 2020.
  • R Liu, B Sisman, F Bao, G Gao, H Li, ‘Modeling Prosodic Phrasing with Multi-Task Learning in Tacotron-based TTS’, IEEE Signal Processing Letters, September 2020.
  • K Zhou, B Sisman, H Li ‘VAW-GAN for Disentanglement and Recomposition of Emotional Elements in Speech’ accepted by IEEE Spoken Language Technology Workshop (SLT 2021).
  • M. Zhang, B. Sisman, L. Zhao, H. Li ‘DeepConversion: Voice conversion with limited parallel training data’ Speech Communication, July 2020.
  • B. Sisman, H. Li ‘Generative Adversarial Networks for Singing Voice Conversion with and without parallel data’ Speaker Odyssey 2020, Tokyo, Japan.
  • Z. Kun, B. Sisman, H. Li ‘Transforming Spectrum and Prosody for Emotional Voice Conversion with Non-Parallel Training Data’ Speaker Odyssey 2020, Tokyo, Japan.
  • R. Liu, B. Sisman, J Li, F Bao, G Gao, H Li, ‘WaveTTS: Tacotron-based TTS with Joint Time-Frequency Domain Loss’, Speaker Odyssey 2020, Tokyo, Japan.
  • R. Liu, B. Sisman, J Li, F Bao, G Gao, H Li, ‘Teacher-Student Training for Robust Tacotron-based TTS’ IEEE ICASSP 2020.
  • B. Sisman, M. Zhang, M. Dong, H. Li ‘On the Study of Generative Adversarial Networks for Cross-lingual Voice Conversion’ IEEE ASRU 2019.
  • B. Sisman, M. Zhang, H. Li, ‘Group Sparse Representation with WaveNet Vocoder Adaptation for Spectrum and Prosody Conversion’ IEEE/ACM Transactions on Audio, Speech, and Language Processing, 2019 (DOI: 10.1109/TASLP.2019.2910637)
  • B. Sisman, K. Vijayan, M. Dong, H. Li ‘SINGAN: Singing Voice Conversion with Generative Adversarial Networks’ APSIPA ASC 2019.
  • A. Tjandra, B. Sisman, M. Zhang, S. Sakriani, H. Li, S. Nakamura, ‘VQVAE Unsupervised Unit Discovery and Multi-scale Code2Spec Inverter for Zerospeech Challenge 2019’ INTERSPEECH 2019.
  • B. Sisman, M. Zhang, S. Sakriani, H. Li, S. Nakamura, ‘Adaptive WaveNet Vocoder for Residual Compensation in GAN-based Voice Conversion’, IEEE SLT 2018.
  • B. Sisman, H. Li, ‘Limited Data Voice Conversion from Sparse Representation to GANs and WaveNet’, APSIPA ASC 2018 PhD Forum [Best Presentation Award].
  • B. Sisman, M. Zhang, H. Li, ‘A Voice Conversion Framework with Tandem Feature Sparse Representation and Speaker-Adapted WaveNet Vocoder’, INTERSPEECH 2018
  • B. Sisman, H. Li, ‘Wavelet Analysis of Speaker Dependent and Independent Prosody for Voice Conversion’, INTERSPEECH 2018
  • B. Sisman, G. Lee, H. Li, ‘Phonetically Aware Exemplar-Based Prosody Transformation’, Speaker Odyssey 2018
  • B. Sisman, H. Li, K. C. Tan ‘Sparse Representation of Phonetic Features for Voice Conversion with and without Parallel Data’, IEEE ASRU 2017