Coffee Talk 2: Speech Recognition Through the Browser

This application shows how an interactive video (e.g., a web ad) can interpret user speech and branch on it, using no plug-ins other than Flash.

Click the green phone button to start. If your computer has a microphone (and it is switched on), the video will respond to your voice; watch the prompts at the bottom carefully. This is new technology and a little fragile, so if it doesn't work, reach out to us and we'll make it work.

In a sense, the voice interactions are "voice buttons", insofar as ordinary buttons could have been used instead. Rather than wait for a click, we stream the user's voice to our server for a short time, transcode it, and evaluate it against a simple, limited voice grammar in a speech engine (we are using LumenVox for this application). We then send the result back to the browser, all in real time. Compare it to the "telephone the browser" voice application, also on this blog, which uses actual telephony. Both are voice apps, but they feel quite different. Through-the-browser speech recognition is almost (but not quite) free, unlike telephone speech recognition, which always involves a per-minute charge.
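To make the "voice button" idea concrete, here is a minimal Python sketch of the last step only: taking a recognition result and deciding which branch of the video to play. It is not our production code and does not use the LumenVox API. The grammar contents, the `Hypothesis` type, the confidence threshold, and the clip names are all assumptions made up for illustration.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical limited grammar: each "voice button" lists the phrases that press it.
GRAMMAR = {
    "yes": {"yes", "yeah", "sure", "ok"},
    "no": {"no", "nope", "no thanks"},
}

# Hypothetical mapping from a pressed voice button to the next video clip.
BRANCHES = {"yes": "clip_accept", "no": "clip_decline"}


@dataclass
class Hypothesis:
    """Stand-in for whatever a speech engine returns (assumed shape)."""
    text: str
    confidence: float  # assumed to range from 0.0 to 1.0


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so matching is forgiving."""
    return " ".join(text.lower().split())


def match_voice_button(hyp: Hypothesis, min_confidence: float = 0.5) -> Optional[str]:
    """Return the name of the matched voice button, or None if nothing matched."""
    if hyp.confidence < min_confidence:
        return None  # too uncertain; treat as no input
    phrase = normalize(hyp.text)
    for button, phrases in GRAMMAR.items():
        if phrase in phrases:
            return button
    return None  # out-of-grammar speech


def next_clip(button: Optional[str]) -> str:
    """Pick the clip to play; re-prompt the user when nothing was recognized."""
    if button is None:
        return "clip_reprompt"
    return BRANCHES.get(button, "clip_reprompt")
```

For example, `next_clip(match_voice_button(Hypothesis("Yeah", 0.8)))` selects the accept clip, while a low-confidence or out-of-grammar result falls through to the re-prompt clip. Keeping the grammar this small is what lets speech behave like a button: there are only a handful of outcomes to branch on.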