Google Voice Recognition
From Openmoko
Convert an audio file into a text file via voice recognition.
NOTE: The shell script below can be used on any Linux operating system that meets the software requirements, because the speech recognition is not performed on the local machine.
Because the Freerunner is not powerful enough to perform voice recognition itself, the Google Voice API can be used to convert a recorded audio file into a text file. Be aware that the audio file is transmitted to Google and that the recognition is not performed on the Freerunner (FR). This means that your Freerunner needs Internet access to submit the audio file.
NOTE: Although the following script runs on your Freerunner, it is not standalone voice recognition software. The audio leaves the device, so you might not want to use this tool for private audio files.
To use the Google Voice API and the script, you need the following packages installed on your Freerunner: SoX, sed, and wget.
Install the packages from the repositories of your Freerunner distribution.
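Before running the script, it can help to confirm that the tools are actually present. The following is a minimal sketch, assuming the required tools are the sox, sed, and wget commands that the script invokes. The helper name missing_tools is hypothetical and not part of the original script.

```shell
#!/bin/sh
# Hypothetical helper: print every tool from the argument list
# that cannot be found in the current PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Assumption: these are the three commands the recognition script calls.
missing=$(missing_tools sox sed wget)

if [ -n "$missing" ]; then
    # $missing is left unquoted on purpose so the names print on one line.
    echo "Missing tools:" $missing
else
    echo "All required tools found"
fi
```

If a tool is reported as missing, install it from the repositories of your distribution before continuing.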
NOTE: The text returned for German audio files is entirely lowercase, so nouns still need to be capitalized. An ispell or aspell correction of message.txt might improve the recognized text.
The script googlevoice.sh can be tested on any Linux machine with SoX, sed, and wget installed. Modify the script according to your needs and the location of your audio files.
#!/bin/sh
# Convert message.wav to text with the Google Voice API.
echo "1 SoX Sound Exchange - Convert WAV to FLAC with 16000"
sox message.wav message.flac rate 16k
echo "2 Submit to Google Voice Recognition"
wget -q -U "Mozilla/5.0" --post-file message.flac --header="Content-Type: audio/x-flac; rate=16000" -O - "http://www.google.com/speech-api/v1/recognize?lang=de-de&client=chromium" > message.ret
echo "3 SED Extract recognized text"
cat message.ret | sed 's/.*utterance":"//' | sed 's/","confidence.*//' > message.txt
echo "4 Remove Temporary Files"
rm message.flac
rm message.ret
echo "5 Show Text "
cat message.txt
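Step 3 of the script can be tried without network access. The sketch below applies the same two sed substitutions to a sample response. The sample JSON is an assumption inferred from the patterns the script searches for (utterance":" and ","confidence) and not a captured server reply. The function name extract_utterance is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: apply the two substitutions from step 3 to stdin.
extract_utterance() {
    # 1st expression: drop everything up to and including  utterance":"
    # 2nd expression: drop everything from  ","confidence  to the end of the line
    sed -e 's/.*utterance":"//' -e 's/","confidence.*//'
}

# Assumed response shape, reconstructed from the sed patterns above.
sample='{"status":0,"id":"abc","hypotheses":[{"utterance":"hallo welt","confidence":0.87}]}'

printf '%s\n' "$sample" | extract_utterance
# prints: hallo welt
```

Because the extraction relies on plain text patterns, it only works while the response keeps the utterance and confidence fields next to each other on one line.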
The parameter lang=de-de tells the Google Voice API to expect an audio file in German. Replace lang=de-de with lang=en-us to submit an audio file in US English.
The script googlevoicepar.sh takes a command line parameter and can be used to process multiple input files for batch recognition. Call it with the basename of the audio file (the file name without the .wav extension), e.g. message0, message1, ..., as the first argument.
#!/bin/sh
# Usage: googlevoicepar.sh BASENAME   (reads BASENAME.wav, writes BASENAME.txt)
LANGUAGE="de-de"
echo "1 SoX Sound Exchange - Convert $1.wav to $1.flac with 16000"
sox "$1.wav" "$1.flac" rate 16k
echo "2 Submit to Google Voice Recognition $1.flac"
wget -q -U "Mozilla/5.0" --post-file "$1.flac" --header="Content-Type: audio/x-flac; rate=16000" -O - "http://www.google.com/speech-api/v1/recognize?lang=${LANGUAGE}&client=chromium" > "$1.ret"
echo "3 SED Extract recognized text"
cat "$1.ret" | sed 's/.*utterance":"//' | sed 's/","confidence.*//' > "$1.txt"
echo "4 Remove Temporary Files"
rm "$1.flac"
rm "$1.ret"
echo "5 Show Text "
cat "$1.txt"
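For batch recognition, the parameterized script can be driven by a small loop. This is only a sketch: it assumes googlevoicepar.sh is executable and located in the current directory. The RECOGNIZER variable and the batch_recognize function are hypothetical names introduced for illustration.

```shell
#!/bin/sh
# Assumption: the parameterized script lives in the current directory.
# Override RECOGNIZER to point at a different location.
RECOGNIZER="${RECOGNIZER:-./googlevoicepar.sh}"

# Hypothetical helper: call the recognizer once per existing .wav file,
# passing the file name without its .wav extension.
batch_recognize() {
    for wav in "$@"; do
        # Skip arguments that are not existing files
        # (e.g. an unmatched glob pattern).
        [ -f "$wav" ] || continue
        base="${wav%.wav}"
        "$RECOGNIZER" "$base"
    done
}
```

A call such as batch_recognize message*.wav would then process message0.wav, message1.wav, and so on, leaving one .txt file per recording.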
The Google API always requires Internet access. For the development of open-source standalone software on Linux, it might be useful to have an open-source web interface for collecting audio samples. Such samples could be used to improve the user-independent speech recognition profiles (HMM, hidden Markov models) for large vocabularies and different languages.