Kaarel Kaljurand edited this page Jun 13, 2013 · 3 revisions

Answers to some frequently asked questions, mostly about adapting Inimesed to support languages other than Estonian. In principle, three things need to be changed:

  • acoustic models
  • dictionary (grapheme-to-phoneme converter)
  • JSGF grammar

How do I create and use another acoustic model?

The directory assets/hmm/ should contain the acoustic models in the same format as the Estonian models from https://github.com/alumae/et-pocketsphinx-tutorial/tree/master/models/hmm

Just copy your models into this directory and recompile.

Developing acoustic models is complex and requires training data, but you might be able to find free existing models for your language. They might need to be retrained, though, to be fast enough on a mobile platform.

Read the Sphinx documentation or ask their mailing list for instructions on how to create acoustic models.

How do I create and use another language model?

Inimesed automatically creates a simple JSGF grammar from the device's address book. If you are developing a different application (e.g. one with more voice commands), you need to create the JSGF grammar yourself, or you might consider using an n-gram language model instead.

Inimesed creates a JSGF file based on the content of the contacts database here:

https://github.com/Kaljurand/Inimesed/blob/master/app/src/ee/ioc/phon/android/inimesed/ContactsGrammar.java
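To make the idea concrete, here is a minimal sketch of generating a JSGF grammar from a list of contact names. It is not the code in ContactsGrammar.java: the class name JsgfSketch, the method buildGrammar, and the rule name <contact> are hypothetical, and the sketch assumes the names contain only letters and spaces (tokens with other characters would need quoting in JSGF).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Hypothetical sketch; NOT the implementation used by Inimesed.
class JsgfSketch {

    // Builds a JSGF grammar whose single public rule accepts any one of the given names.
    // Assumption: names consist of letters and spaces only.
    static String buildGrammar(String grammarName, List<String> names) {
        List<String> alternatives = new ArrayList<String>();
        for (String name : names) {
            String phrase = name.trim().toLowerCase(Locale.ROOT);
            // Skip empty entries and duplicates (e.g. two contacts with the same name).
            if (!phrase.isEmpty() && !alternatives.contains(phrase)) {
                alternatives.add(phrase);
            }
        }
        if (alternatives.isEmpty()) {
            throw new IllegalArgumentException("no usable names for the grammar");
        }
        StringBuilder sb = new StringBuilder();
        sb.append("#JSGF V1.0;\n");
        sb.append("grammar ").append(grammarName).append(";\n");
        sb.append("public <contact> = ");
        for (int i = 0; i < alternatives.size(); i++) {
            if (i > 0) {
                sb.append(" | ");
            }
            sb.append(alternatives.get(i));
        }
        sb.append(";\n");
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.print(buildGrammar("contacts",
                Arrays.asList("Mari Tamm", "Jaan Kask", "mari tamm")));
    }
}
```

Every word that appears in the grammar also has to be present in the dictionary, otherwise the recognizer cannot use it.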

Note that any semantic interpretation, i.e. deciding which contact(s) the utterance maps to, takes place in Java after recognition.
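As an illustration of such a post-recognition step, the sketch below maps a recognized phrase back to one or more contacts. The class ContactLookupSketch, its methods, and the string contact ids are all hypothetical; the actual app may resolve contacts quite differently.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical sketch of resolving a recognized phrase to contacts; NOT Inimesed's code.
class ContactLookupSketch {

    // Normalized phrase -> ids of all contacts that share this phrase.
    private final Map<String, List<String>> byPhrase = new HashMap<String, List<String>>();

    // Registers a contact under the normalized form of its display name.
    void add(String displayName, String contactId) {
        String key = normalize(displayName);
        List<String> ids = byPhrase.get(key);
        if (ids == null) {
            ids = new ArrayList<String>();
            byPhrase.put(key, ids);
        }
        ids.add(contactId);
    }

    // Returns every contact id matching the recognizer's hypothesis (possibly none).
    List<String> resolve(String hypothesis) {
        List<String> ids = byPhrase.get(normalize(hypothesis));
        if (ids == null) {
            return Collections.<String>emptyList();
        }
        return Collections.unmodifiableList(ids);
    }

    // Lower-cases, trims, and collapses runs of whitespace to a single space.
    static String normalize(String s) {
        return s.trim().toLowerCase(Locale.ROOT).replaceAll("\\s+", " ");
    }
}
```

Because several contacts can share one spoken form, resolve returns a list; the caller decides what to do when it has more than one element.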

How do I create and use another dictionary (grapheme-to-phoneme converter)?

The dictionary is a one-to-one mapping (bijection) between grammar atoms and phonetic sequences, i.e. any ambiguity/synonymy handling is done in Java. Whenever the app is started, the dictionary and grammar files are recreated and stored on the SD card, from where Pocketsphinx picks them up.

The complexity of automatically creating a dictionary depends on the complexity of the language. For Estonian, a simple grapheme-to-phoneme translator can already give very good results. For English, whose spelling is much less regular, it's a different story.

The dictionary is built here:

https://github.com/Kaljurand/Inimesed/blob/master/app/src/ee/ioc/phon/android/inimesed/Persons.java

using the mapping to phonetic symbols described here:

https://github.com/Kaljurand/Inimesed/blob/master/app/src/ee/ioc/phon/android/inimesed/PhonMapper.java

The phonetic symbols in the dictionary have to agree with those used by the acoustic model, and the current PhonMapper is specific to Estonian. There is an open issue to generalize this (which would also make plugging in other languages easier), see: https://github.com/Kaljurand/Inimesed/issues/1
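For a language with regular spelling, a grapheme-to-phoneme converter can be little more than a lookup table. The sketch below shows the general shape and emits lines in the Pocketsphinx dictionary format (the word followed by its space-separated phones). The class DictSketch and its small phone table are purely illustrative assumptions; they are not the mapping in PhonMapper.java, and real phone symbols must be taken from the acoustic model you use.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical grapheme-to-phoneme sketch; NOT the mapping used by PhonMapper.
class DictSketch {

    // Illustrative letter -> phone table; the symbols are made up for this example
    // and must be replaced by the phone set of the actual acoustic model.
    private static final Map<Character, String> PHONES = new HashMap<Character, String>();
    static {
        PHONES.put('a', "a");
        PHONES.put('i', "i");
        PHONES.put('j', "j");
        PHONES.put('k', "k");
        PHONES.put('m', "m");
        PHONES.put('n', "n");
        PHONES.put('r', "r");
        PHONES.put('s', "s");
        PHONES.put('t', "t");
    }

    // Converts a word letter by letter; characters without a mapping are skipped.
    static String toPhones(String word) {
        StringBuilder sb = new StringBuilder();
        for (char c : word.toLowerCase(Locale.ROOT).toCharArray()) {
            String phone = PHONES.get(c);
            if (phone == null) {
                continue;
            }
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(phone);
        }
        return sb.toString();
    }

    // Produces one dictionary line: the lower-cased word, a space, then its phones.
    static String dictionaryLine(String word) {
        return word.toLowerCase(Locale.ROOT) + " " + toPhones(word);
    }

    public static void main(String[] args) {
        System.out.println(dictionaryLine("Mari"));
    }
}
```

A per-letter table like this only works when spelling and pronunciation correspond closely; for other languages a pronunciation lexicon or a trained grapheme-to-phoneme model would be needed instead.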

The app does not work. Why?

The reason can most likely be found in the logs.

Pocketsphinx's own log reveals problems with acoustic models, grammars, etc. It is stored on the SD card in the file:

Android/data/ee.ioc.phon.android.inimesed/files/pocketsphinx.log

(The SD card layout used by Inimesed is defined in https://github.com/Kaljurand/Inimesed/blob/master/app/src/ee/ioc/phon/android/inimesed/DataFiles.java.)

Note that the log file is deleted every time the app is destroyed, so Inimesed must be running for you to see it. Launch Inimesed and either use adb shell to locate the file, or press HOME on the device (which does not destroy the app) and use a file manager on the device to locate the file. You could also temporarily comment out the lines:

mDf.deleteJsgf();
mDf.deleteLogfile();
mDf.deleteRawLogDir();

in https://github.com/Kaljurand/Inimesed/blob/master/app/src/ee/ioc/phon/android/inimesed/InimesedActivity.java. This way the log is not deleted when the app is destroyed (onDestroy).

Inimesed also uses Android's logger, which can be switched on by setting DEBUG to true in https://github.com/Kaljurand/Inimesed/blob/master/app/src/ee/ioc/phon/android/inimesed/Log.java. You can then monitor the log messages using adb logcat.