list of supported words in the model
The purpose of this project is to minimize the complexity and setup required for Satellite clients in a Rhasspy environment. The solution is designed to work 100% offline and does not require an internet connection. It offers a YAML configuration file as the only method of configuration, with no need for a UI or web interface.
The project leverages the Tensorflow framework to provide a solution for generating and training custom wake words in any language. This feature is based on extensive research from Google's KWS project. The prototype will listen for the wake word and send audio bytes to a Rhasspy server. It will also respond to PlayBytes requests from the Rhasspy server, making it an ideal solution for integration into a home automation setup.
It is important to note that the prototype assumes the use of an MQTT broker and a pre-configured Rhasspy environment for handling audio requests. Additionally, the project assumes the use of HomeAssistant to handle all intents.
As this project is just getting started; I will refine and update the documentation over the next few months.
I've included a Dockerfile with a pre-trained model with the speech commands for testing out the interfaces and configurations. You can even run this from any system with a microphone and speakers for testing and evaluation.
From the root of the checkout run the following to build an image:
docker build . -t alice -f Docker/Dockerfile
docker run --name alice --device /dev/snd:/dev/snd -dit alice
docker attach alice
cd /root
./demo.sh
or you can just check it out and run: inside the Docker folder is a requirements.txt for the dependencies. You will need to copy the tflite into your data folder {alice_data/tflite}
python -m pip install 'alice_satellite @ git+https://github.com/UmbrellaCodr/alice_satellite@main'
python -m alice_satellite -v -h
python -m alice_satellite -h
usage: alice [-h] [--data DATA] [-d] [-v] [--audio_input AUDIO_INPUT] [--audio_output AUDIO_OUTPUT]
{list,config,analyze,listen,detect,satellite,train,predict,generate,morph,verify,info,mqtt,tts,transcribe} ...
positional arguments:
{list,config,analyze,listen,detect,satellite,train,predict,generate,morph,verify,info,mqtt,tts,transcribe}
supported sub commands
list list audio devices
config save a config file
analyze morph samples
listen requires a trained model
detect requires a trained model
satellite main mode, listens for wake word communicates to rhasspy
train train the model
predict classify wav file
generate generate samples
morph morph samples
verify verify samples
info dump data folder
mqtt mqtt util
tts text to speech
transcribe transcribe audio file
options:
-h, --help show this help message and exit
--data DATA set the default data location
-d, --debug maybe print something helpful
-v, --verbose log more
--audio_input AUDIO_INPUT
specify in device from {list}
--audio_output AUDIO_OUTPUT
specify out device from {list}
{alice_data} defaults to the current location where you are invoking the model from unless specified by --data
- list Will show all the in/out devices currently detected by the application
- config This will generate a config.yml into your {alice_data} folder
- analyze This uses whisper to validate generated samples match an alternative machine model
- listen This will process input from the microphone and compare it against the current model
- detect This is for debugging and validating detecting of a keyword
- satellite This is the main mode will describe down bellow
- train this is used to train a new tensorflow model with any generates samples
- generate this is a mode to generate and prepare samples for new keywords used by the train command
- morph takes your samples and shifts the audio around in the window for training
- verify allows you to listen to your recorded samples
- info Will show you a summary of the current tflite model and samples
- mqtt This allows you to subscribe and listen to mqtt topics on the network mostly used for debugging
- tts this allow you to pass a string and get an audio file back from Rhasspy
- transcribe this allows you to pass an audio file to Rhasspy to see if it detected an intent
Satellite mode allows you to setup an array of required words or a single word based on the parameters passed
This will indicate that both 2 and 3 need to be spoken within a 3 second window before having the wake word detected, for example "Hey Alice" where Hey is at index 2 and Alice is at index 3
-i 2 3
You can also pass -m 15 where if index 15 is matched it will enable wake word detection. You also have the ability to adjust the threshold by passing --threshold
python -m alice_satellite -v satellite -h
usage: alice satellite [-h] [-i INDEX [INDEX ...]] [-m MATCH] [-t THRESHOLD] [-w]
options:
-h, --help show this help message and exit
-i INDEX [INDEX ...], --index INDEX [INDEX ...]
index of words to required for wake word detection
-m MATCH, --match MATCH
single index to match for wake word detection
-t THRESHOLD, --threshold THRESHOLD
threshold for keyword match
-w, --whisper enable whisper transcription
. = no noise detected
* = noise detected
* yellow = one of the keywords detected under threshold
* green = one of the keywords detected
detected = when both keywords were presented in a 3 second window
+ = it is streaming audio to Rhasspy
P = Rhasspy sent us a sound to play
S = Rhasspy sent us a start/stop listening request
To configure the mqtt settings you need a configuration file you can create a default one by running the config command. {alice_data/config.yml}
We support all of the following mqtt settings here: mqtt settings
mqtt:
hostname: test.mosquitto.org
Example:
mqtt:
hostname: test.mosquitto.org
password: password
port: 1883
username: alice
sudo vim /lib/systemd/system/alice.service
[Unit]
Description=Alice Satellite
After=multi-user.target
[Service]
Type=simple
User=pi
ExecStart=/usr/bin/python3 -m alice_satellite --data /home/pi/alice_data satellite -i 2 3 -m 4
Restart=on-abort
[Install]
WantedBy=multi-user.target
sudo chmod 644 /lib/systemd/system/alice.service
sudo systemctl daemon-reload
sudo systemctl enable alice.service
sudo systemctl start alice.service
For every change that we do on the /lib/systemd/system folder we need to execute a daemon-reload (third line of previous code). If we want to check the status of our service, you can execute:
sudo systemctl status alice.service
Check status
sudo systemctl status alice.service
Start service
sudo systemctl start alice.service
Stop service
sudo systemctl stop alice.service
Check service's log
sudo journalctl -f -u alice.service