Fun With Speech Recognition in WPF
At last week’s CapArea.NET meeting, I demonstrated the built-in speech recognition of Windows Vista with a demo compass application. [source code]
I spoke the direction I wanted the needle to point, and the computer recognized the command and pointed the arrow. After a few commands, the computer told me to “stop bossing it around.”

It was simple, but it illustrated several points. One, speech can add value to your applications. Two, it’s easy to add. Three, it’s free.
Best of all, it’s fun.
First, you’ll need to add a reference to the System.Speech library. This is where all the speech recognition and speech synthesis classes live.

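Once your project has the reference, add the following using statements to your code-behind.

    using System.Speech.Recognition;
    using System.Speech.Synthesis;

Next, declare a field for the synthesizer and one for the recognizer, create both (for example in the window’s constructor), and subscribe to the recognizer’s SpeechRecognized event.

    // Fields on the window class
    private SpeechSynthesizer _speechSynthesizer;
    private SpeechRecognizer _speechRecognizer;

    // In the constructor: create both objects, hook the event, start listening
    this._speechSynthesizer = new SpeechSynthesizer();
    this._speechRecognizer = new SpeechRecognizer();

    this._speechRecognizer.SpeechRecognized += new EventHandler<SpeechRecognizedEventArgs>(_speechRecognizer_SpeechRecognized);
    this._speechRecognizer.Enabled = true;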
Here is the handler for the SpeechRecognized event.

    private void _speechRecognizer_SpeechRecognized(object sender, SpeechRecognizedEventArgs e)
    {
        // The phrase the recognizer heard
        string directionResult = e.Result.Text;

        // Set Window Title
        this.Title = directionResult;

        // Storyboards are keyed by direction name, so the recognized text doubles as the resource key
        Storyboard directionStoryboard = this.Resources[directionResult] as Storyboard;

        if (directionStoryboard != null)
        {
            directionStoryboard.Begin();
        }
        else
        {
            this.Title = "Not a storyboard";
        }
    }

It’s in that event that we get passed the results of the recognition inside the SpeechRecognizedEventArgs. I drop the recognized text into a string and set the Window’s Title property to display what the system interpreted the speech to be.
I then use the recognized string to look up the appropriate Storyboard in the window’s resources. I saved myself some time by cleverly naming them. ;)

    <Storyboard x:Key="South">
        <DoubleAnimationUsingKeyFrames BeginTime="00:00:00" Storyboard.TargetName="pthArrow" Storyboard.TargetProperty="(UIElement.RenderTransform).(TransformGroup.Children)[2].(RotateTransform.Angle)">
            <SplineDoubleKeyFrame KeyTime="00:00:00.7000000" Value="179.048"/>
        </DoubleAnimationUsingKeyFrames>
    </Storyboard>
    <Storyboard x:Key="West">
        <DoubleAnimationUsingKeyFrames BeginTime="00:00:00" Storyboard.TargetName="pthArrow" Storyboard.TargetProperty="(UIElement.RenderTransform).(TransformGroup.Children)[2].(RotateTransform.Angle)">
            <SplineDoubleKeyFrame KeyTime="00:00:00.7000000" Value="-89.818"/>
        </DoubleAnimationUsingKeyFrames>
    </Storyboard>

Don’t worry if the Storyboard syntax doesn’t make sense to you. I could talk about Silverlight and WPF animation all day, but here the focus is on speech, not XAML.
When you run the application, you may get the Speech Setup Tutorial if you’ve never run speech recognition before.
You don’t have to run through the tutorial, but I recommend you do, as it demonstrates the power of the engine built right into the OS.
The system also uses the tutorial to set up your microphone, adjust your settings and start learning your voice.
Once you get past the tutorial (it takes about 10 minutes), you’ll notice the speech recognition toolbar on your desktop.
Stop! Grammar Time
To increase the reliability of the sample app, I added a grammar to limit the number of possibilities the speech recognizer had to consider.
You want to do this to narrow down the potential results from millions of words to dozens. Narrowing the recognition pool increases the accuracy.
Grammars can get quite complex, and there is even a W3C standard (SRGS) for defining them.
However, since we’re dealing with a compass, we really only need eight points: the four directions (North, West, South, East) and the four in-between points.

    private Choices GetChoices()
    {
        Choices choices = new Choices();

        choices.Add("North");
        choices.Add("West");
        choices.Add("East");
        choices.Add("South");

        choices.Add("NorthWest");
        choices.Add("SouthWest");
        choices.Add("NorthEast");
        choices.Add("SouthEast");

        return choices;
    }

I use the following code to load the grammar into my recognizer.

    Choices choices = GetChoices();

    GrammarBuilder grammarBuilder = new GrammarBuilder(choices);
    Grammar grammarDirections = new Grammar(grammarBuilder);

    this._speechRecognizer.LoadGrammar(grammarDirections);

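If you want to try the grammar without the WPF window, the pieces above can be pulled together in a small console program. This is only a sketch under a few assumptions: the class name and the console harness are mine, not part of the demo app, and it needs a reference to System.Speech, a working microphone, and Windows speech recognition already set up. It passes all eight phrases to the Choices constructor in one call and prints the recognized text along with the confidence score the engine reports.

```csharp
using System;
using System.Speech.Recognition;

// Hypothetical console harness; the original demo is a WPF window.
internal static class CompassGrammarSketch
{
    private static void Main()
    {
        // The same eight phrases as GetChoices(), passed in a single call.
        Choices directions = new Choices(
            "North", "West", "East", "South",
            "NorthWest", "SouthWest", "NorthEast", "SouthEast");

        Grammar grammar = new Grammar(new GrammarBuilder(directions));
        grammar.Name = "Compass directions";

        using (SpeechRecognizer recognizer = new SpeechRecognizer())
        {
            recognizer.LoadGrammar(grammar);

            // Print what was heard and how confident the engine was (0.0 to 1.0).
            recognizer.SpeechRecognized += (sender, e) =>
                Console.WriteLine("{0} (confidence {1:F2})", e.Result.Text, e.Result.Confidence);

            recognizer.Enabled = true;

            Console.WriteLine("Say a direction, or press Enter to quit.");
            Console.ReadLine();
        }
    }
}
```

Watching the confidence values is a quick way to see how much the small grammar helps. If you see a lot of low scores, you could ignore results below a threshold of your choosing before starting a storyboard.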
Talk to Me
The code to make the computer speak is even easier.
In fact, it can come down to one line of code (two if you count the call to the constructor):

    this._speechSynthesizer.Speak("Stop bossing me around!");

I wrote a blog post a little while back just on speech synthesis and its own demo app.
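The demo complains after “a few commands,” but the counting logic isn’t shown above, so here is one way it might look. Treat it as a sketch: BossinessTracker, its limit of three, and the console harness are all names and values I made up for illustration. The loop stands in for three recognized commands, and it needs a reference to System.Speech plus an audio output device.

```csharp
using System;
using System.Speech.Synthesis;

// Hypothetical helper: counts commands and decides when to complain.
internal sealed class BossinessTracker
{
    private readonly int _limit;
    private int _count;

    public BossinessTracker(int limit)
    {
        _limit = limit;
    }

    // Returns true on every _limit-th command, then starts counting again.
    public bool RegisterCommand()
    {
        _count++;
        if (_count < _limit)
        {
            return false;
        }

        _count = 0;
        return true;
    }
}

internal static class TalkBackSketch
{
    private static void Main()
    {
        // A limit of 3 is an assumed value, not taken from the demo app.
        BossinessTracker tracker = new BossinessTracker(3);

        using (SpeechSynthesizer synthesizer = new SpeechSynthesizer())
        {
            // Each iteration stands in for one recognized command.
            for (int command = 1; command <= 3; command++)
            {
                if (tracker.RegisterCommand())
                {
                    // Speak blocks until the phrase has finished playing.
                    synthesizer.Speak("Stop bossing me around!");
                }
            }
        }
    }
}
```

In the WPF app you would call RegisterCommand() from the SpeechRecognized handler instead of a loop. Because Speak is synchronous, SpeakAsync is probably the better fit there so the window stays responsive while the computer talks.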
Now you know that it’s quite easy to add a little bit of NUI (Natural User Interface) to your applications.

7 Comments
Wabbletini said
Nice app! Good to see voice is now easy to run in an application, a whole new world opens!
frank said
Glad I could help inspire. :)
Darren said
I can't run the very first line of code:
this._speechSynthesizer = new SpeechSynthesizer();
this._speechRecognizer = new SpeechRecognizer();
It gives me an error saying: "the program doesn't contain a definition for ..."
if you read this comment please email me the answer.
thanks. Very useful article here...
frank said
Make sure you reference the right DLL.
Vinod said
Source code link is broken
frank said
Vinod,
Thanks for the heads up.
I need to fix that.
dc said
Hi Frank,
The source code link is still broken.