speech
You can use this Sphinx Knowledge Base Tool – VERSION 3.
This is a quick explanation of how to add new words for Spanish speech recognition. First of all, you need two files: the language model (file.lm) and the lexical model (file.dic), which is also called the 'pronouncing dictionary' or simply the 'dictionary'. For more information, you can check this link about how to build a new language model. For background on LM and DIC files and related topics, you can check this link.
To build the dictionary file (.dic), you can manually add the words you need, using this file as a reference. You only have to copy each word, together with its sequence of phones, into the .dic file that you want to extend. An example entry: **convocar k o n b o k a r**
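The manual copy step can also be scripted. The following is a minimal sketch, assuming each dictionary line holds a word followed by its whitespace-separated phones, as in the example entry above. The function names `parse_dic` and `merge_entries` are hypothetical helpers written for this illustration; they are not part of Sphinx.

```python
def parse_dic(text):
    """Parse dictionary text into {word: phones}.

    Assumes one entry per line: the word, then its phones, separated by whitespace.
    """
    entries = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) >= 2:  # skip blank or malformed lines
            entries[parts[0]] = " ".join(parts[1:])
    return entries


def merge_entries(target_text, reference_text, words):
    """Return the target dictionary text extended with `words` taken from the reference.

    Existing target entries are kept unchanged; new entries are appended at the end.
    """
    target = parse_dic(target_text)
    reference = parse_dic(reference_text)

    # Fail early if a requested word has no pronunciation anywhere.
    missing = [w for w in words if w not in target and w not in reference]
    if missing:
        raise KeyError("no pronunciation found for: " + ", ".join(missing))

    for word in words:
        if word not in target:
            target[word] = reference[word]

    return "".join(f"{word} {phones}\n" for word, phones in target.items())


if __name__ == "__main__":
    target = "hola o l a\n"
    reference = "convocar k o n b o k a r\nhola o l a\n"
    print(merge_entries(target, reference, ["convocar"]), end="")
```

To use it on real files, read the two dictionaries as text, call `merge_entries`, and write the result back to the .dic file you are extending. Keep a backup of the original file first.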
For the language model, you have to use the SRI Language Modeling Toolkit (SRILM). You can download it here.
If you have followed these steps correctly, the compiled SRILM binaries should be inside the following folder:
Prepare a reference text that will be used to generate the language model. In our case, we can use, for example, "follow-me-spanish.txt". To create the language model file, you can do this:
You now have the LM file. Remember to replace the old dictionary files located in /speech/share/speechRecognition/conf/dictionary/ with the new ones.
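Before replacing the files, it can help to check that every word in the reference text also has an entry in the dictionary, since the recognizer needs a pronunciation for each word in the language model. The following is a minimal sketch of such a check. It assumes that dictionary words are written in lowercase (as in the `convocar` entry) and that words in the reference text are separated by whitespace. The function name `words_missing_from_dictionary` is a hypothetical helper, not part of Sphinx or SRILM.

```python
def words_missing_from_dictionary(corpus_text, dic_text):
    """List corpus words that have no dictionary entry, in order of first appearance.

    Assumes lowercase dictionary words and whitespace-separated corpus tokens.
    """
    # The first field of each non-empty dictionary line is the word itself.
    known = {line.split()[0] for line in dic_text.splitlines() if line.split()}

    missing = []
    for token in corpus_text.lower().split():
        word = token.strip(".,;:!?¡¿\"'()")  # drop surrounding punctuation
        if word and word not in known and word not in missing:
            missing.append(word)
    return missing


if __name__ == "__main__":
    corpus = "Sígueme a la cocina.\nConvocar a todos\n"
    dictionary = "convocar k o n b o k a r\na a\n"
    for word in words_missing_from_dictionary(corpus, dictionary):
        print(word)
```

Any word the check reports needs to be added to the .dic file before the new files are copied into place.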