Chang ZENG 曾畅 曾 暢 (ソウ チョウ)

Ph.D. Candidate

National Institute of Informatics & SOKENDAI

Biography

I am a Ph.D. candidate with five years of experience in speech signal processing, sequence-to-sequence (S2S) modeling, and deep learning. I have worked on speaker recognition in academia and speech recognition in industry. My research interests include speech and speaker recognition and generative speech AI such as text-to-speech (TTS) and voice conversion.

Download my resumé.

Interests
  • Artificial Intelligence
  • Speech Signal Processing
  • Singing Voice / Speech Synthesis
  • Speech Recognition
  • Language Processing
Education
  • PhD in Informatics

    National Institute of Informatics & SOKENDAI

  • MEng in Electrical Engineering and Information Systems (EEIS), 2020

    The University of Tokyo

  • BSc in Measurement and Control Technology and Instruments, 2016

    Tianjin University

Publications

(2024). HAM-TTS: Hierarchical Acoustic Modeling for Token-Based Zero-Shot Text-to-Speech with Model and Data Scaling. In ArXiv.

(2023). Cross-Modal Audio-Visual Co-Learning for Text-Independent Speaker Verification. In ICASSP 2023.

(2023). SSI-Net: A Multi-Stage Speech Signal Improvement System for ICASSP 2023 SSI Challenge. In ICASSP 2023.

(2022). Deep Spectro-temporal Artifacts for Detecting Synthesized Speech. In DDAM 2022 Workshop (ACM MM 2022).

(2022). Data Augmentation Using McAdams-Coefficient-Based Speaker Anonymization for Fake Audio Detection. In Interspeech 2022.

(2021). DeepLip: A Benchmark for Deep Learning-Based Audio-Visual Lip Biometrics. In ASRU 2021.

Projects

  • Generative AI: Multi-modality GenAI.
  • Speaker Recognition / Antispoofing: Speaker recognition and antispoofing project.

Skills

  • Python: 100%
  • C++: 80%
  • Audio Signal Processing: 100%
  • Audio Generative AI: 90%
  • Speech Recognition: 90%
  • PyTorch: 100%

Experience

RevComm Inc
Researcher
Sep 2023 – Mar 2024 Remote

Responsibilities include:

  • Speech signal processing
  • Speech recognition
  • Speech synthesis
  • Generative AI

Bombax XiaoIce Technology Co., Ltd
Avatar Researcher (intern)
Jul 2022 – Jul 2023 Remote

Responsibilities include:

  • Speech signal processing
  • Singing voice synthesis
  • Speech synthesis

National Institute of Informatics
Research Assistant
Jul 2021 – Aug 2023 Tokyo

Responsibilities include:

  • Speech signal processing
  • Speaker recognition
  • Antispoofing

Alibaba
Speech Recognition Researcher
Apr 2020 – Nov 2020 Hangzhou

Responsibilities include:

  • Speech signal processing
  • Speaker recognition
  • Speech recognition
  • Self-supervised learning
  • Spoken term detection

Alibaba
Speech Recognition Researcher (intern)
Jul 2019 – Oct 2019 Beijing

Responsibilities include:

  • Speech signal processing
  • Speaker recognition