Objectives
- Voice Control of Media Front Ends and Automation Hub.
- Resilient to noise interference
- Low bandwidth
- Low latency
Ideas
Simple Offline Control
- Pocketsphinx with very limited vocabulary
- Every command is keyword triggered
- Quick timeouts
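A rough Python 3 sketch of how the keyword trigger and quick timeout could fit together. The trigger word, the command table, the action names and the 3 second timeout are all made up for illustration; the real vocabulary would be whatever the pocketsphinx dictionary is built from.

```python
import time

# Hypothetical trigger word, vocabulary and action names (not from any real config).
TRIGGER = "jasper"
COMMANDS = {
    "play": "kodi.play",
    "pause": "kodi.pause",
    "stop": "kodi.stop",
    "lights on": "openhab.lights_on",
}
TIMEOUT_S = 3.0  # assumed value for the "quick timeout"


def match_command(words, heard_at, now=None):
    """Return the action for a recognised phrase, or None.

    words    -- recognised words, e.g. ["jasper", "lights", "on"]
    heard_at -- time.monotonic() value when the trigger word was heard
    now      -- current time; defaults to time.monotonic()
    """
    now = time.monotonic() if now is None else now
    if not words or words[0] != TRIGGER:
        return None  # every command must start with the keyword
    if now - heard_at > TIMEOUT_S:
        return None  # too slow: go back to waiting for the trigger
    return COMMANDS.get(" ".join(words[1:]))
```

Anything outside the small table returns None, which keeps false positives from noise down at the cost of flexibility.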
Hybrid Pocketsphinx Google API
- Recognise trigger using pocketsphinx
- Acknowledge with beeps
- Need to manage mixer controls
- Pass commands to online STT engine (http://wit.ai)
- Process and control Kodi and openHAB
- Fall-back to Simple Offline Control
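The flow in the list above could be sketched in Python roughly as follows. The three callables are stand-ins, not real APIs: `listen` for audio capture, `passive_stt` for pocketsphinx, `online_stt` for the online engine. The sketch assumes a network failure shows up as an `OSError`.

```python
def hybrid_step(listen, passive_stt, online_stt, trigger="jasper", beep=lambda: None):
    """Run one trigger/command cycle.

    Returns None if the trigger was not heard, otherwise a tuple
    (source, text) where source is "online" or "offline".
    All arguments are hypothetical stand-ins supplied by the caller.
    """
    # 1. Recognise the trigger locally.
    if trigger not in passive_stt(listen()).split():
        return None
    # 2. Acknowledge so the user knows to speak the command.
    beep()
    command_audio = listen()
    # 3. Try the online engine first, fall back to offline recognition.
    try:
        return ("online", online_stt(command_audio))
    except OSError:
        return ("offline", passive_stt(command_audio))
```

The returned text would then be handed to whatever drives Kodi and openHAB; that part is left out here.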
Hardware
Software
Prerequisites
Support packages
sudo apt-get install alsa-utils python-pip python-yaml python-dateutil python-pyaudio
sudo pip install apscheduler # need newer versions, apt versions are too old
ALSA playback
sudo modprobe snd_usb_audio # USB mic, loads as card1 on RPi (after snd-bcm2835)
THIS DOES NOT WORK:
options snd-usb-audio index=0
options snd-bcm2835 index=1
Don't even bother trying to force index=1 for snd-bcm2835, it doesn't support the index parameter:
osmc@osmc:~$ /sbin/modinfo snd-bcm2835
filename: /lib/modules/4.3.3-3-osmc/kernel/sound/arm/snd-bcm2835.ko
alias: platform:bcm2835_alsa
license: GPL
description: Alsa driver for BCM2835 chip
author: Dom Cobley
srcversion: 46AE410DEA6D239DB70D2C9
alias: of:N*T*Cbrcm,bcm2835-audio*
depends: snd-pcm,snd
intree: Y
vermagic: 4.3.3-3-osmc preempt mod_unload modversions ARMv6
parm: force_bulk:Force use of vchiq bulk for audio (bool)
Let snd-bcm2835 be card0 and load snd-usb-audio as card1:
osmc@osmc:~$ cat /etc/modprobe.d/jasper.conf
options snd-usb-audio index=1
Then configure defaults in .asoundrc accordingly.
osmc@osmc:~$ arecord -l
**** List of CAPTURE Hardware Devices ****
card 1: CameraB409241 [USB Camera-B4.09.24.1], device 0: USB Audio [USB Audio]
Subdevices: 1/1
Subdevice #0: subdevice #0
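To check the card index from a script instead of by eye, the listing can be parsed. A minimal Python 3 sketch, using the output above as sample input; the function name is made up.

```python
import re


def capture_cards(arecord_output):
    """Map ALSA card index -> card id from `arecord -l` text."""
    cards = {}
    for line in arecord_output.splitlines():
        # Matches lines such as "card 1: CameraB409241 [USB Camera-...], device 0: ..."
        m = re.match(r"card (\d+): (\S+) \[", line)
        if m:
            cards[int(m.group(1))] = m.group(2)
    return cards


sample = """\
**** List of CAPTURE Hardware Devices ****
card 1: CameraB409241 [USB Camera-B4.09.24.1], device 0: USB Audio [USB Audio]
  Subdevices: 1/1
  Subdevice #0: subdevice #0
"""
print(capture_cards(sample))  # {1: 'CameraB409241'}
```

On the Pi itself the text would come from something like `subprocess.run(["arecord", "-l"], capture_output=True, text=True).stdout`.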
Audio configuration for PS3 Eye
The PS3 Eye is a camera with a 4-channel array mic.
Local ~/.asoundrc
## Suggested by http://julius.sourceforge.jp/forum/viewtopic.php?f=9&t=66
pcm.array {
    type hw
    card 0
}
pcm.array_gain {
    type softvol
    slave {
        pcm "array"
    }
    control {
        name "Mic Gain"
        count 2
    }
    min_dB -10.0
    max_dB 5.0
}
pcm.cap {
    type plug
    slave {
        pcm "array_gain"
        channels 4
    }
    route_policy sum
}
pcm.!default {
    type asym
    playback.pcm {
        type plug
        slave.pcm {
            @func getenv
            vars [ ALSAPCM ]
            default "hw:0,0"
        }
    }
    capture.pcm {
        type plug
        slave.pcm "cap"
    }
}
Jasper
Project : http://jasperproject.github.io/
Passive STT : pocketsphinx
Active STT : wit.ai
TTS : Flite
Integrates STT and TTS systems. Python-based.
Configuration
~/.jasper/profile.yml
...
stt_passive_engine: sphinx
stt_engine: witai
witai-stt:
  access_token: A0VERY0LONG0ALPHA0NUMERIC0STRING
tts_engine: flite-tts
flite-tts:
  voice: slt
...
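A small sanity check for the profile, as a sketch only. It works on the already-parsed profile as a plain dict (for example from `yaml.safe_load`) and only knows the keys shown above. The table of required options is an assumption drawn from this profile, not from Jasper's own validation.

```python
# engine name -> (config section, required option); assumed from the profile above
REQUIRED_SECTIONS = {
    "witai": ("witai-stt", "access_token"),
    "flite-tts": ("flite-tts", "voice"),
}


def profile_problems(profile):
    """Return a list of human-readable problems found in a profile dict."""
    problems = []
    for key in ("stt_passive_engine", "stt_engine", "tts_engine"):
        if key not in profile:
            problems.append("missing " + key)
    for key in ("stt_engine", "tts_engine"):
        engine = profile.get(key)
        if engine in REQUIRED_SECTIONS:
            section, option = REQUIRED_SECTIONS[engine]
            if option not in (profile.get(section) or {}):
                problems.append("%s needs %s.%s" % (engine, section, option))
    return problems
```

An empty list means the split passive/active setup is at least structurally complete.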
For split active and passive STT we need pocketsphinx and related packages.
RPi2 Installation
For RPi2 we can use packages from Debian experimental:
sudo su -c "echo 'deb http://ftp.debian.org/debian experimental main contrib non-free' > /etc/apt/sources.list.d/experimental.list"
sudo apt-get update
sudo apt-get -t experimental install cmuclmtk phonetisaurus m2m-aligner mitlm libfst-tools libfst1-plugins-base libfst-dev
RPi1 Installation
For RPi1 we can't use packages from Debian experimental, so we must build from source or install from elsewhere.
Install cognomen packages
add repo
sudo su -c "echo 'deb http://cognomen.co.uk/apt/debian jessie main' > /etc/apt/sources.list.d/cognomen.list"
import pgp key
gpg --keyserver keyserver.ubuntu.com --recv FC88E181D61C9391C4A49682CF36B219807AA92B && gpg --export --armor keymaster@cognomen.co.uk | sudo apt-key add -
update
sudo apt-get update
sudo apt-get install pocketsphinx pocketsphinx-hmm-en-hub4wsj python-pocketsphinx python-yaml phonetisaurus m2m-aligner mitlm libfst-tools libfst1-plugins-base libfst-dev cmuclmtk python-semantic
Building RPi1 dependencies from source
Trying to Cross Compile
Don't need crosstool-ng; can use the prebuilt raspberrypi-tools x86-32 Linaro cross compiler.
Naïve openfst cross-compile
export PATH=~/src/raspberrypi-tools/arm-bcm2708/gcc-linaro-arm-linux-gnueabihf-raspbian/bin:$PATH
./configure --host arm-linux-gnueabihf --enable-compact-fsts --enable-const-fsts --enable-far --enable-lookahead-fsts --enable-pdt
make -j 8
Cross-compilation works, but Debian packaging for the Raspberry Pi doesn't.
Build natively
apt-get source phonetisaurus m2m-aligner mitlm openfst
Then, in each unpacked source directory:
dpkg-buildpackage -us -uc -rfakeroot
Install to repo
On the system with the signing keys:
sshfs yuggoth:/ yuggoth-ssh
cd yuggoth-ssh/var/www/data/cognomen.co.uk/apt/debian
for i in *.deb
do
reprepro includedeb jessie "$i"
done
Other methods
wit.ai Standalone
Not used by jasper.
sudo apt-get install libsox2
wget https://github.com/wit-ai/witd/releases/download/v0.1/witd-armv6
chmod a+x witd-armv6
./witd-armv6
Voice Command for RPi
CMU Sphinx, PocketSphinx, KodiVC
sudo apt-get install build-essential sshfs automake libtool
RaspBMC/Kodi uses pulseaudio so use that for kodivc.
sudo apt-get install bison libpulse-dev
KodiVC: github
Google Voice API
V1 API probably doesn't work any more. V2 needs at least a new API key (limited to 50 calls per day).
Old script
From http://blog.oscarliang.net/raspberry-pi-voice-recognition-works-like-siri/ :
#!/bin/bash
echo "Recording... Press Ctrl+C to Stop."
arecord -D "plughw:1,0" -q -f cd -t wav | ffmpeg -loglevel panic -y -i - -ar 16000 -acodec flac file.flac > /dev/null 2>&1
echo "Processing..."
wget -q -U "Mozilla/5.0" --post-file file.flac --header "Content-Type: audio/x-flac; rate=16000" -O - "http://www.google.com/speech-api/v1/recognize?lang=en-us&client=chromium" | cut -d\" -f12 >stt.txt
echo -n "You Said: "
cat stt.txt
rm file.flac > /dev/null 2>&1
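The `cut -d\" -f12` step relies on the recognised text being the 12th quote-delimited field of the reply, which is brittle. A Python 3 sketch comparing it with proper JSON parsing. The sample reply is illustrative only: its shape is an assumption about what the retired V1 endpoint returned, not captured output.

```python
import json

# Illustrative reply only; field names and layout are assumed, not captured.
sample = '{"status":0,"id":"abc","hypotheses":[{"utterance":"hello world","confidence":0.9}]}'


def utterance_by_cut(body):
    """Mimic the script's cut step: 12th field when splitting on double quotes."""
    fields = body.split('"')
    return fields[11] if len(fields) > 11 else ""


def utterance_by_json(body):
    """Parse the reply as JSON and take the first hypothesis, if any."""
    hypotheses = json.loads(body).get("hypotheses", [])
    return hypotheses[0]["utterance"] if hypotheses else ""


print(utterance_by_cut(sample))   # hello world
print(utterance_by_json(sample))  # hello world
```

The two agree on this sample, but the field-counting version breaks as soon as the reply gains or reorders a key, while the JSON version does not.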