Tuesday, November 19 • 12:00 - 12:25
Introducing your robot lead singer (Demo 12.30 - 14.00)


In the last two years, speech and singing synthesis technology has changed beyond recognition. Creating seamless copies of voices is now a reality, and the manipulation of voice quality using synthesis techniques can produce dynamic audio content that is indistinguishable from natural spoken output. More specifically, this technology allows us to create artificial singing that is better than many human singers, to graft expressive techniques from one singer to another, and, using analysis-by-synthesis, to categorise and evaluate singing far beyond simple pitch estimation. In this talk we approach this expanding field in a modern context, give some examples, delve into the multi-faceted nature of singing user interfaces and the obstacles still to overcome, illustrate a novel avenue of singing modification, and discuss the future trajectory of this powerful technology from Text-to-Speech to music and audio engineering platforms.

Many people now enjoy producing original music by generating vocal tracks with software such as VOCALOID, UTAU and Realivox. However, limited output quality has meant that such systems rarely play a part in professional music production. Recent advances in speech synthesis technology will render this an issue of the past by offering extremely high-quality audio output indistinguishable from the natural voice. Yet how we interface with and use such technology as a tool to support the artistic musical process is still in its infancy.
Users of these packages are presented with a classic piano-roll interface to control the voice contained within, an environment inspired by MIDI-based synthesisers dating back to the early 1980s. Other singing synthesisers accept text input, musical score or MIDI; comprise a suite of DSP routines that take audio as input; and/or opt for the manipulation of a pre-recorded sample library. Despite all these options, however, currently available commercial singing synthesis generally struggles to offer the level of control over musical singing expression or style typically exploited by professional vocalists and composers.

The recent unveiling of CereVoice as a singing synthesiser demonstrates the ability to generate singing from modelled spoken data, producing anything from a comically crooning Donald Trump to a robot-human duet on "The Tonight Show Starring Jimmy Fallon". CereVoice's heritage as a mature Text-to-Speech technology, with emotional and characterful control over its parametric speech synthesis engine, offers novel insight into the ideal input that balances control and quality for its users. We exploit our unique position at the crossroads of speech technology, music information retrieval and audio DSP to illustrate our journey from speech to singing.
Voice technology is changing at breakneck speed, and how we apply and interface with it in the musical domain is at a cusp. It is the ADC community that will, in the end, dictate how these new techniques are incorporated into the music technology of the future.

There will be a demo during the lunch break 12.30 - 14.00 (Tuesday 19 Nov) 


Christopher Buchanan

Audio Development Engineer, CereProc
Chris Buchanan is Audio Development Engineer for CereProc. He graduated with Distinction from the Acoustics & Music Technology MSc degree at the University of Edinburgh in 2016, after 3 years as a signal processing geophysicist with French seismic imaging company CGG. He also holds...

Tuesday November 19, 2019 12:00 - 12:25 GMT
Queenhithe Room Puddle Dock, London EC4V 3DB