Equipment Corner July-August 2007

by Alyssa Quintero on Sun, 2007-07-01 09:16

Voice banking and speech synthesis technology

While there’s no easy way to prepare for losing your voice, voice banking enables people with ALS to capture their voices via recordings that can be used on Windows-based computers and augmentative and alternative communication (AAC) devices.

To start, you can record your voice on a Windows-based computer using the sound recorder accessory and saving your recordings in waveform (.wav) files. Many people record signature phrases like “I love you,” or stories for their children and grandchildren, songs and laughter.
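For readers comfortable with a little scripting, a saved .wav file can be checked with Python's standard `wave` module. The sketch below is only an illustration under stated assumptions: it writes a one-second test tone to a hypothetical file named `test_tone.wav` (a stand-in for a real recording, with an assumed sample rate) and then reads back its length.

```python
import math
import os
import struct
import tempfile
import wave

SAMPLE_RATE = 16000  # samples per second; an assumed value for this sketch


def write_test_tone(path, seconds=1.0, freq_hz=440.0):
    """Write a mono, 16-bit sine tone that stands in for a voice recording."""
    n_frames = int(SAMPLE_RATE * seconds)
    frames = bytearray()
    for i in range(n_frames):
        value = 0.3 * 32767 * math.sin(2 * math.pi * freq_hz * i / SAMPLE_RATE)
        frames += struct.pack("<h", int(value))  # little-endian 16-bit sample
    with wave.open(path, "wb") as wav_file:
        wav_file.setnchannels(1)   # mono
        wav_file.setsampwidth(2)   # 2 bytes = 16 bits per sample
        wav_file.setframerate(SAMPLE_RATE)
        wav_file.writeframes(bytes(frames))


def wav_duration_seconds(path):
    """Return the length of a .wav file in seconds."""
    with wave.open(path, "rb") as wav_file:
        return wav_file.getnframes() / wav_file.getframerate()


# Hypothetical file name, written to the system's temporary folder.
tone_path = os.path.join(tempfile.gettempdir(), "test_tone.wav")
write_test_tone(tone_path)
print(f"{wav_duration_seconds(tone_path):.1f} seconds")
```

The same `wav_duration_seconds` function can be pointed at any uncompressed .wav recording to confirm it saved properly.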

Advances in AAC technology allow some programs to use your prerecorded voice on your speech-generating device. You can import the audio files from your PC, or record your voice directly onto some AAC devices.

Speech-language pathologists (SLPs) suggest thinking about voice banking — in any form — before experiencing any detectable changes in speech.

Laurie Sterling, an SLP at the MDA/ALS Center at the Methodist Neurological Institute in Houston, explained, “It’s never too early to start [voice banking] following a diagnosis because it’s difficult to determine when there will be speech changes and to what degree.”

ModelTalker replicates your voice

The ModelTalker Speech Synthesis System’s voice-banking technology uses representative segments of a person’s recorded speech (from a syllable to a sentence in length) to create a unique synthetic voice, which can be used on a speech communication device.

People are attracted to this technology because it captures recognizable characteristics of the person’s original voice, explained H. Timothy Bunnell, the project’s principal investigator and head of the speech research lab at the Alfred I. duPont Hospital for Children in Wilmington, Del.

“We take natural speech that’s been recorded, chop it into small pieces that we can mix and match, and put them together in a variety of ways to produce utterances that were never recorded but that sound like they’re the same quality and from the speaker who did the original recordings,” Bunnell said.
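The mix-and-match idea Bunnell describes can be pictured with a toy sketch. This is not ModelTalker's actual method, which the article doesn't detail; the unit names and sample values below are invented for illustration, and a real system selects and blends its pieces far more carefully.

```python
# Hypothetical inventory: each speech unit maps to a short list of audio
# samples cut from the speaker's recordings. The numbers are placeholders.
UNIT_INVENTORY = {
    "hel": [0.1, 0.3, 0.2],
    "lo": [0.4, 0.1],
    "good": [0.2, 0.5, 0.3],
    "bye": [0.3, 0.0],
}


def synthesize(units, inventory):
    """Join recorded pieces end to end to form an utterance."""
    samples = []
    for unit in units:
        if unit not in inventory:
            raise KeyError(f"no recording available for unit {unit!r}")
        samples.extend(inventory[unit])
    return samples


# Pieces can be recombined into sequences that were never recorded as a whole.
print(synthesize(["hel", "lo"], UNIT_INVENTORY))
print(synthesize(["good", "bye"], UNIT_INVENTORY))
```

The more pieces the inventory holds, the more utterances can be assembled, which is why a large set of recordings is needed.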


Although the software’s still under development, Bunnell said it’s been used by a number of people with ALS. It can be downloaded at no cost. The software requires a Windows-based PC with audio capabilities and a head-mounted microphone.

ModelTalker requires recording about 1,600 utterances, which equals about 45 minutes of speech.
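As a quick back-of-the-envelope check on those two figures (both approximate), the average utterance works out to under two seconds:

```python
UTTERANCES = 1600    # approximate count cited above
TOTAL_MINUTES = 45   # approximate total speech cited above

average_seconds = TOTAL_MINUTES * 60 / UTTERANCES
print(f"{average_seconds:.1f} seconds per utterance on average")  # 1.7
```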

“We’re recording a fraction of the amount of speech that’s recorded for the commercial [AAC] systems, so it doesn’t sound as natural as they do,” Bunnell explained. “But it’s substantially better than the DECtalk voices that people used to listen to.”

Developed in the early 1980s by the Digital Equipment Corp., DECtalk turned text into human-like speech, albeit with a robotic quality.

Is that really you?


Don Taylor of Collierville, Tenn., who received a diagnosis of ALS in January 2005, hasn’t completely lost his voice, but he believes it’s only a matter of time.

When he learned about the ModelTalker project from his speech therapist nearly a year ago, Taylor immersed himself in a monthlong voice capture project using ModelTalker.

“A person’s voice is special and unique to his family and friends, and ALS robs you of this important trait,” said Taylor, 50, via e-mail. “I’m really pleased with my synthesized voice.”

Only family members can understand Taylor’s speech at this point, so he expects to be using his new voice full time.

Don Taylor is happy with his ModelTalker voice.

Taylor recorded 1,649 utterances using his laptop computer and a USB microphone. He’s worked with computers for more than 20 years, so he had an easy time with the software, but users also can e-mail questions to the lab (via the ModelTalker site).

He recommends working on the project when your voice is strong, typically first thing in the morning. For three hours each day, he recorded 50 to 100 utterances. Because he was experiencing speech problems when he began recording, he had to repeat each utterance about four times to achieve a good-quality recording.
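Taylor's numbers give a rough sense of the time commitment. The sketch below simply divides his 1,649 utterances by his daily pace of 50 to 100; the function name is made up for illustration, and anyone's actual pace will vary.

```python
import math

TOTAL_UTTERANCES = 1649  # Taylor's total, as reported in this article


def recording_days(total, per_day):
    """Whole days needed to finish `total` utterances at `per_day` each day."""
    return math.ceil(total / per_day)


for pace in (50, 100):
    days = recording_days(TOTAL_UTTERANCES, pace)
    print(f"{pace} utterances a day: {days} days")
```

At those rates the job takes between 17 and 33 recording days, which is in line with the monthlong project Taylor describes.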

InvTool, the system’s computer-assisted voice-recording software, uses performance meters to provide feedback on pitch, loudness and pronunciation, ensuring that the recordings remain as similar as possible. Uniformity is essential to making a high-quality synthetic voice.
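The article doesn't describe how InvTool's meters work internally. As a purely illustrative sketch of what checking uniformity can mean, the code below measures the loudness of several takes and flags any that stray too far from the rest. The take names, sample values and 3-decibel tolerance are all assumptions.

```python
import math
import statistics


def rms_db(samples):
    """Root-mean-square level in decibels relative to full scale (1.0)."""
    mean_square = sum(s * s for s in samples) / len(samples)
    if mean_square == 0:
        return float("-inf")  # pure silence
    return 10 * math.log10(mean_square)


def flag_outliers(takes, tolerance_db=3.0):
    """Return names of takes whose level differs from the median by more than the tolerance."""
    levels = {name: rms_db(samples) for name, samples in takes.items()}
    target = statistics.median(levels.values())
    return sorted(
        name for name, level in levels.items()
        if abs(level - target) > tolerance_db
    )


# Invented sample data: the third take is much quieter than the other two.
takes = {
    "take_1": [0.5, -0.5, 0.5, -0.5],
    "take_2": [0.45, -0.45, 0.45, -0.45],
    "take_3": [0.1, -0.1, 0.1, -0.1],
}
print(flag_outliers(takes))
```

A flagged take would be a candidate for re-recording so that all the pieces blend together smoothly.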

After Taylor uploaded his recordings to the lab’s database, they were analyzed and converted into his synthetic voice. ModelTalker provides users with a link to download the voice files.

The synthetic voice can be used with any speech communication system that’s SAPI 5.0 compatible. SAPI, or Speech Application Programming Interface, was created by Microsoft so that programs running on the Windows operating system can use speech-recognition and speech-synthesis systems.

Alternatively, ModelTalker can be used as a stand-alone text-to-speech application that lets you type text into a window on an AAC device or computer, then hear it spoken using your synthetic voice.

Currently, Taylor’s using his ModelTalker voice on a laptop computer, along with EZ Keys and E-triloquist speech communication software. Both work with the Windows operating system, and E-triloquist can be downloaded at no cost to people with ALS.

“When I type words on my computer, people hear my voice instead of a fake computerlike voice, and that’s why ModelTalker is so valuable,” Taylor emphasized.

His wife, Hing, and three children are pleased he took the time to record his voice “because when it’s totally gone, the ModelTalker project will allow them to hear my real voice even if it’s coming out of a computer speaker.”

Before Taylor received his synthetic voice, he’d recorded over 100 phrases using the Windows Sound Recorder. His favorite recording is the “Happy Birthday” song.

He recommends that people with ALS record their laughter, too. You can’t record laughter using ModelTalker, but he said that “it’s super simple on any computer, and your family will appreciate it.”

It's not perfect, makes no guarantees

Because ModelTalker is experimental, beta-test software, there are no guarantees. Bunnell cautions users to have realistic expectations about their new synthetic voice, because it won’t sound exactly like their real voice.

To get a taste of the synthetic ModelTalker voices, visit the Web site to hear sample female and male voices — the original voice and its synthetic derivative. Most voices come from people with ALS.

Taylor said, “I have to admit that my synthesized voice is not perfect, but I should have started recording my voice with ModelTalker six months earlier when my voice was strong.”
