<div><div>Hi Adapters,</div><div><br></div><div>I'm Zhiling, a senior undergraduate student, and I'll be giving my first presentation today. I just finished an internship in the TTS (Text-To-Speech) group at Bytedance AI Lab, and I'd like to share some of what I learned about TTS there.</div><div><br></div><div>Like all of you, my research interest is NLP. However, I had to learn TTS from scratch after the TTS group unexpectedly picked my internship application. Once I got past my limited NLP-only view, I found many connections and similarities between TTS and NLP. The TTS frontend is a typical example, where NLP acts as the backbone.</div><div><br></div><div>My topic is "TTS frontend: NLP as the backbone of audio synthesis".</div><div>First, I'll introduce some basics of TTS systems from an NLPer's perspective.</div><div>Then, I'll dig deep into the TTS frontend by introducing some recent work by the Bytedance TTS group.</div><div><br></div><div>Some of the ideas may be uncommon in the NLP community, but useful for it. Hopefully you'll draw some inspiration from my presentation.</div><div><br></div><div>Related papers:</div><div>https://arxiv.org/abs/1911.04111 | A unified sequence-to-sequence front-end model for Mandarin text-to-speech synthesis</div><div>https://arxiv.org/abs/1911.04128 | A hybrid text normalization system using multi-head self-attention for Mandarin</div><div><br></div><div><br></div><div>Time: Wed 4:30pm</div><div><br></div><div>Venue: SEIEE 3-414</div><div><br></div><div><br></div><div>Best regards,</div><div><br></div><div>Zhiling</div></div>