I’ve been playing with a Raspberry Pi, a Logitech USB mic, USB-powered speakers and a USB SDR TV tuner stick, combined with Stephen Hickson’s fantastic voicecommand system. (The Pi is the rainbow-striped box at the back. The flat black thing in the middle is a powered USB hub, into which is plugged the black TV tuner stick; its aerial is positioned on the small jar. The Rubik’s Cube is just there for scale.)
First, speech is sent as raw audio to a Google API which returns text. It would be better to do this on the Pi, but my experiments (with Julius) have shown that Google (with their gargantuan computing grid) is much better in terms of both speed and accuracy. (Since the microphone has an on/off button, audio is only sent to Google when I so choose.)
Second, the text is compared with a list of known commands (see the voicecommand website for more details). If a match is found, the corresponding script is run. (This is how the ‘weather forecast’ and ‘have I got mail’ commands work.) If no match is found, the text is sent off to Wolfram Alpha, which returns a text answer.
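To make the matching step concrete, here is a minimal Python sketch of the idea. It is not how voicecommand itself is implemented: the `dispatch` function, the command table and the fallback callable are all hypothetical names of mine, and simple substring matching stands in for whatever matching voicecommand really does.

```python
from typing import Callable, Dict


def dispatch(
    text: str,
    commands: Dict[str, Callable[[], str]],
    fallback: Callable[[str], str],
) -> str:
    """Run the first known command whose phrase appears in the text.

    `commands` maps a trigger phrase to a function returning the text to
    speak. `fallback` gets the whole query when nothing matches (in the
    AnswerBox, that role is played by Wolfram Alpha).
    """
    # Lower-case and collapse whitespace so small recognition
    # differences do not break the match.
    normalized = " ".join(text.lower().split())
    for phrase, handler in commands.items():
        if phrase in normalized:
            return handler()
    return fallback(normalized)


if __name__ == "__main__":
    # Hypothetical handlers standing in for the real scripts.
    table = {
        "weather forecast": lambda: "Light rain this afternoon.",
        "have i got mail": lambda: "You have two new messages.",
    }
    print(dispatch("What is the weather forecast", table, lambda q: "No idea about: " + q))
```

Passing the fallback in as a function keeps the matching logic testable without any network access.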
Finally, the results from Wolfram Alpha, or the appropriate script, are sent off to another Google API to turn them into an audio file, which is then played out over the speakers. I have tried using espeak, but again, Google’s API currently does a better job.
The whole thing is reasonably fast, given everything that is involved. Occasional internet latency spikes delay responses from the script for 10 seconds or so, but in my experience they are rare.
The live aircraft information is received using software-defined radio (SDR): an RTL2838 TV tuner USB stick with the rtl-sdr and dump1090 software, which provides a nice JSON interface over HTTP. A Python script queries this interface on demand and works out the couple of aircraft nearest to my location, then gathers some supplementary information from the internet before reading out the response.
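The "nearest couple of aircraft" part can be sketched roughly as follows. This is an illustration rather than the actual script: I am assuming each aircraft arrives as a dictionary with `lat`, `lon` and `flight` keys (check the JSON your dump1090 build actually serves), and the function names and the home coordinates are made up for the example. Distance is the great-circle distance from the haversine formula.

```python
import math
from typing import Dict, List

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in kilometres between two lat/lon points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def nearest_aircraft(
    aircraft: List[Dict], home_lat: float, home_lon: float, count: int = 2
) -> List[Dict]:
    """Return the `count` aircraft closest to home, nearest first.

    Entries without a position are skipped, since a plane may be heard
    before it has reported where it is.
    """
    located = [a for a in aircraft if a.get("lat") is not None and a.get("lon") is not None]
    located.sort(key=lambda a: haversine_km(home_lat, home_lon, a["lat"], a["lon"]))
    return located[:count]


if __name__ == "__main__":
    # Made-up sample data in the assumed shape.
    sample = [
        {"flight": "FAR1", "lat": 52.5, "lon": 0.0},
        {"flight": "NEAR1", "lat": 51.6, "lon": -0.1},
        {"flight": "NOPOS"},
    ]
    for plane in nearest_aircraft(sample, 51.5, -0.1):
        print(plane["flight"])
```

In a real script the list would come from parsing dump1090's HTTP response with the `json` module instead of being hard-coded.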
The scripts that make all this work are available on GitHub.
Future plans include adding commands to play music, add items to a shopping list, read news headlines and much more. My four-year-old daughter’s most recent request was for the AnswerBox to gain wings and fly around the room on request. There’s probably a Python library for that. Hmmm….