mirror of https://github.com/openhab/openhab-addons.git
synced 2025-01-10 15:11:59 +01:00

[rustpotterks] initial contribution (#12606)

* [rustpotterks] initial contribution

Signed-off-by: Miguel Álvarez <miguelwork92@gmail.com>

parent 48c14e613c
commit 11aa3207a6
@ -391,6 +391,7 @@
 /bundles/org.openhab.voice.picotts/ @FlorianSW
 /bundles/org.openhab.voice.pollytts/ @hillmanr
 /bundles/org.openhab.voice.porcupineks/ @GiviMAD
+/bundles/org.openhab.voice.rustpotterks/ @GiviMAD
 /bundles/org.openhab.voice.voicerss/ @JochenHiller @lolodomo
 /bundles/org.openhab.voice.voskstt/ @GiviMAD
 /bundles/org.openhab.voice.watsonstt/ @GiviMAD
@ -1956,6 +1956,11 @@
       <artifactId>org.openhab.voice.porcupineks</artifactId>
       <version>${project.version}</version>
     </dependency>
+    <dependency>
+      <groupId>org.openhab.addons.bundles</groupId>
+      <artifactId>org.openhab.voice.rustpotterks</artifactId>
+      <version>${project.version}</version>
+    </dependency>
     <dependency>
       <groupId>org.openhab.addons.bundles</groupId>
       <artifactId>org.openhab.voice.voicerss</artifactId>
20
bundles/org.openhab.voice.rustpotterks/NOTICE
Normal file
@ -0,0 +1,20 @@
This content is produced and maintained by the openHAB project.

* Project home: https://www.openhab.org

== Declared Project Licenses

This program and the accompanying materials are made available under the terms
of the Eclipse Public License 2.0 which is available at
https://www.eclipse.org/legal/epl-2.0/.

== Source Code

https://github.com/openhab/openhab-core

== Third-party Content

io.github.givimad: rustpotter-java
* License: Apache 2.0 License
* Project: https://github.com/GiviMAD/rustpotter
* Source: https://github.com/GiviMAD/rustpotter-java
74
bundles/org.openhab.voice.rustpotterks/README.md
Normal file
@ -0,0 +1,74 @@
# Rustpotter Keyword Spotter

This voice service allows you to use the open source library Rustpotter as your keyword spotter in openHAB.
[Rustpotter](https://github.com/GiviMAD/rustpotter) is a free and open-source keyword spotter written in Rust.

Rustpotter provides personal on-device wake word detection. You need to generate a model for your keyword using audio samples.

Important: No voice data captured by this service is uploaded to the cloud.
The voice data is processed offline, locally on your openHAB server by Rustpotter.

## Configuration

After installing, you can access the service options through the openHAB configuration page in the UI (**Settings / Other Services - Rustpotter Keyword Spotter**) and edit them:

* **Threshold** - Minimum score (in the range 0 to 1) that at least one wake word template must reach to trigger a detection. Defaults to 0.5.
* **Averaged Threshold** - Minimum score (in the range 0 to 1) that the audio must reach against a combination of the wake word templates; if it does not, the detection is aborted. This avoids comparing the current frame against each of the wake word templates, which saves CPU. Setting it to 0 disables this check. Defaults to 0.2.
* **Eager Mode** - Ends the detection as soon as a result is over the threshold, instead of waiting to see if the next frame has a higher score. Defaults to true.
* **Noise Detection Mode** - Uses the built-in noise detection to reduce computation in the absence of noise. Configures how hard it is for a frame to be considered noise (the required noise level). Defaults to disabled.
* **Noise Sensitivity** - Noise/silence ratio in the last second required to consider that noise is detected. Defaults to 0.5.
* **VAD Mode** - Uses a voice activity detector (VAD) to reduce computation in the absence of vocal sound. Defaults to disabled.
* **VAD Sensitivity** - Voice/silence ratio in the last second required to consider that voice is detected. Defaults to 0.5.
* **VAD Delay** - Seconds to disable the VAD after voice is detected. Defaults to 3.
* **Comparator Ref** - Configures the reference for the comparator used to match the samples. Defaults to 0.22.
* **Comparator Band Size** - Configures the band size for the comparator used to match the samples. Defaults to 6.


In case you would like to set up the service via a text file, create a new file in `$OPENHAB_ROOT/conf/services` named `rustpotterks.cfg`.

Its contents should look similar to:

```
org.openhab.voice.rustpotterks:threshold=0.5
org.openhab.voice.rustpotterks:averagedThreshold=0.2
org.openhab.voice.rustpotterks:comparatorRef=0.22
org.openhab.voice.rustpotterks:comparatorBandSize=6
org.openhab.voice.rustpotterks:eagerMode=true
org.openhab.voice.rustpotterks:noiseDetectionMode=hard
org.openhab.voice.rustpotterks:noiseSensitivity=0.5
org.openhab.voice.rustpotterks:vadMode=aggressive
org.openhab.voice.rustpotterks:vadSensitivity=0.5
org.openhab.voice.rustpotterks:vadDelay=3
```
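
The `noiseDetectionMode` and `vadMode` entries only take effect for a fixed set of values. As a rough illustration, the sketch below lists the values that the service code in this commit maps to a detector mode; any other value, including `disabled`, leaves the corresponding detector off. The class `RustpotterModeValues` and its methods are hypothetical helpers for checking a `.cfg` value and are not part of the add-on.

```java
import java.util.Set;

// Hypothetical helper, not part of the add-on: it only mirrors the mode
// strings handled by the service's getNoiseMode/getVADMode switch statements.
final class RustpotterModeValues {
    // Values that enable the noise detector.
    static final Set<String> NOISE_MODES = Set.of("easiest", "easy", "normal", "hard", "hardest");
    // Values that enable the voice activity detector.
    static final Set<String> VAD_MODES = Set.of("low-bitrate", "quality", "aggressive", "very-aggressive");

    private RustpotterModeValues() {
    }

    // True when the value turns the noise detector on.
    static boolean enablesNoiseDetection(String value) {
        return NOISE_MODES.contains(value);
    }

    // True when the value turns the voice activity detector on.
    static boolean enablesVad(String value) {
        return VAD_MODES.contains(value);
    }
}
```

Note that the comparison is case-sensitive, so a value such as `Hard` would be treated like `disabled`.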
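
To make the relationship between the **Threshold** and **Averaged Threshold** options more concrete, here is a minimal sketch of a two-stage check as the option descriptions suggest it. It is an illustration only and not Rustpotter's actual implementation: the class name, the method signature, the way scores are passed in and the use of `>=` for the comparison are all assumptions.

```java
import java.util.function.Supplier;

// Illustrative only: a two-stage gate in the spirit of the option descriptions.
// Names and comparison operators are assumptions, not Rustpotter's API.
final class ThresholdGateSketch {
    private ThresholdGateSketch() {
    }

    static boolean isDetection(float averagedScore, Supplier<float[]> templateScores, float threshold,
            float averagedThreshold) {
        // Stage 1 (optional): cheap check against the combined templates.
        // An averaged threshold of 0 disables this stage.
        if (averagedThreshold > 0f && averagedScore < averagedThreshold) {
            return false; // aborted before any per-template comparison runs
        }
        // Stage 2: the per-template scores are only computed when stage 1 passes.
        for (float score : templateScores.get()) {
            if (score >= threshold) {
                return true;
            }
        }
        return false;
    }
}
```

Passing the per-template scores as a `Supplier` reflects the point of the averaged threshold: when the first stage fails, the more expensive comparison is never evaluated.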
## Magic Word Configuration

The magic word to spot is gathered from your 'Voice' configuration.

You can generate your own wake word model by using the [Rustpotter CLI](https://github.com/GiviMAD/rustpotter-cli).

You can also download the models used as examples on the [rustpotter web demo](https://givimad.github.io/rustpotter-worklet-demo/) from [this folder](https://github.com/GiviMAD/rustpotter-worklet-demo/tree/main/static).

To use a wake word model, place the file under '\<openHAB userdata\>/rustpotter' and configure your magic word so that it matches the file name once spaces are replaced with '_' and the extension '.rpw' is added.
As an example, the file generated for the keyword "ok openhab" will be named 'ok_openhab.rpw'.

The service will only work if it's able to find the correct rpw for your magic word configuration.

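
The naming rule for model files can be sketched in a few lines of Java. The transformation itself (`replaceAll("\\s", "_")` plus the `.rpw` suffix, looked up in a `rustpotter` folder under the userdata directory) is the one used by the service code in this commit; the class `WakewordModelFiles`, its method names and the example userdata path are hypothetical and only serve the illustration.

```java
import java.nio.file.Path;

// Hypothetical helper that mirrors how the service derives the model file
// for a magic word; it is not part of the add-on.
final class WakewordModelFiles {
    private WakewordModelFiles() {
    }

    // Each whitespace character becomes '_' and the '.rpw' extension is appended.
    static String modelFileName(String keyword) {
        return keyword.replaceAll("\\s", "_") + ".rpw";
    }

    // The model is expected in the 'rustpotter' folder inside the userdata folder.
    static Path modelPath(String userDataFolder, String keyword) {
        return Path.of(userDataFolder, "rustpotter", modelFileName(keyword));
    }
}
```

For instance, `WakewordModelFiles.modelFileName("hey openhab")` yields `hey_openhab.rpw`. The keyword is used as typed, without any case conversion, so on a case-sensitive file system the magic word and the file name have to agree in case as well.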
## Default Keyword Spotter and Magic Word Configuration

You can set up your preferred default keyword spotter and default magic word in the UI:

* Go to **Settings**.
* Edit **System Services - Voice**.
* Set **Rustpotter Keyword Spotter** as **Default Keyword Spotter**.
* Choose your preferred **Magic Word** for your setup.
* Optionally, choose your **Listening Switch** item, which will be switched ON during the period when the dialog processor has spotted the keyword and is listening for commands.

In case you would like to set up these settings via a text file, you can edit the file `runtime.cfg` in `$OPENHAB_ROOT/conf/services` and set the following entries:

```
org.openhab.voice:defaultKS=rustpotterks
org.openhab.voice:keyword=hey openhab
org.openhab.voice:listeningItem=myItemForDialog
```
24
bundles/org.openhab.voice.rustpotterks/pom.xml
Normal file
@ -0,0 +1,24 @@
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns="http://maven.apache.org/POM/4.0.0"
  xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 https://maven.apache.org/xsd/maven-4.0.0.xsd">

  <modelVersion>4.0.0</modelVersion>

  <parent>
    <groupId>org.openhab.addons.bundles</groupId>
    <artifactId>org.openhab.addons.reactor.bundles</artifactId>
    <version>3.3.0-SNAPSHOT</version>
  </parent>

  <artifactId>org.openhab.voice.rustpotterks</artifactId>

  <name>openHAB Add-ons :: Bundles :: Voice :: Rustpotter Keyword Spotter</name>

  <dependencies>
    <dependency>
      <groupId>io.github.givimad</groupId>
      <artifactId>rustpotter-java</artifactId>
      <version>1.0.0</version>
    </dependency>
  </dependencies>
</project>
@ -0,0 +1,9 @@
<?xml version="1.0" encoding="UTF-8"?>
<features name="org.openhab.voice.rustpotterks-${project.version}" xmlns="http://karaf.apache.org/xmlns/features/v1.4.0">
  <repository>mvn:org.openhab.core.features.karaf/org.openhab.core.features.karaf.openhab-core/${ohc.version}/xml/features</repository>

  <feature name="openhab-voice-rustpotterks" description="Rustpotter Keyword Spotter" version="${project.version}">
    <feature>openhab-runtime-base</feature>
    <bundle start-level="80">mvn:org.openhab.addons.bundles/org.openhab.voice.rustpotterks/${project.version}</bundle>
  </feature>
</features>
@ -0,0 +1,72 @@
/**
 * Copyright (c) 2010-2022 Contributors to the openHAB project
 *
 * See the NOTICE file(s) distributed with this work for additional
 * information.
 *
 * This program and the accompanying materials are made available under the
 * terms of the Eclipse Public License 2.0 which is available at
 * http://www.eclipse.org/legal/epl-2.0
 *
 * SPDX-License-Identifier: EPL-2.0
 */
package org.openhab.voice.rustpotterks.internal;

import org.eclipse.jdt.annotation.NonNullByDefault;

/**
 * The {@link RustpotterKSConfiguration} class contains fields mapping service configuration parameters.
 *
 * @author Miguel Álvarez - Initial contribution
 */
@NonNullByDefault
public class RustpotterKSConfiguration {
    /**
     * Configures the detector threshold, the minimum score (in the range 0 to 1) that at least one wake word
     * template must obtain to trigger a detection. Defaults to 0.5.
     */
    public float threshold = 0.5f;
    /**
     * Configures the detector averaged threshold, the minimum score (in the range 0 to 1) that the audio must
     * obtain against a combination of the wake word templates; the detection is aborted if this is not the
     * case.
     * This avoids running the comparison of the current frame against each of the wake word templates,
     * which saves CPU.
     * If set to 0 this functionality is disabled.
     */
    public float averagedThreshold = 0.2f;
    /**
     * Terminate the detection as soon as one result is above the score,
     * instead of waiting to see if the next frame has a higher score.
     */
    public boolean eagerMode = true;
    /**
     * Use the built-in noise detection to reduce computation in the absence of noise.
     * Configures the difficulty to consider a frame as noise (the required noise level).
     */
    public String noiseDetectionMode = "disabled";
    /**
     * Noise/silence ratio in the last second to consider noise is detected. Defaults to 0.5.
     */
    public float noiseSensitivity = 0.5f;
    /**
     * Seconds to disable the vad detector after voice is detected. Defaults to 3.
     */
    public int vadDelay = 3;
    /**
     * Voice/silence ratio in the last second to consider voice is detected.
     */
    public float vadSensitivity = 0.5f;
    /**
     * Use a voice activity detector to reduce computation in the absence of vocal sound.
     */
    public String vadMode = "disabled";
    /**
     * Configures the reference for the comparator used to match the samples.
     */
    public float comparatorRef = 0.22f;
    /**
     * Configures the band-size for the comparator used to match the samples.
     */
    public int comparatorBandSize = 6;
}
@ -0,0 +1,45 @@
/**
 * Copyright (c) 2010-2022 Contributors to the openHAB project
 *
 * See the NOTICE file(s) distributed with this work for additional
 * information.
 *
 * This program and the accompanying materials are made available under the
 * terms of the Eclipse Public License 2.0 which is available at
 * http://www.eclipse.org/legal/epl-2.0
 *
 * SPDX-License-Identifier: EPL-2.0
 */
package org.openhab.voice.rustpotterks.internal;

import org.eclipse.jdt.annotation.NonNullByDefault;

/**
 * The {@link RustpotterKSConstants} class defines common constants, which are
 * used across the whole binding.
 *
 * @author Miguel Álvarez - Initial contribution
 */
@NonNullByDefault
public class RustpotterKSConstants {

    /**
     * Service name
     */
    public static final String SERVICE_NAME = "Rustpotter";

    /**
     * Service id
     */
    public static final String SERVICE_ID = "rustpotterks";

    /**
     * Service category
     */
    public static final String SERVICE_CATEGORY = "voice";

    /**
     * Service pid
     */
    public static final String SERVICE_PID = "org.openhab." + SERVICE_CATEGORY + "." + SERVICE_ID;
}
@ -0,0 +1,250 @@
/**
 * Copyright (c) 2010-2022 Contributors to the openHAB project
 *
 * See the NOTICE file(s) distributed with this work for additional
 * information.
 *
 * This program and the accompanying materials are made available under the
 * terms of the Eclipse Public License 2.0 which is available at
 * http://www.eclipse.org/legal/epl-2.0
 *
 * SPDX-License-Identifier: EPL-2.0
 */
package org.openhab.voice.rustpotterks.internal;

import static org.openhab.voice.rustpotterks.internal.RustpotterKSConstants.*;

import java.io.File;
import java.io.IOException;
import java.nio.file.Path;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.atomic.AtomicBoolean;

import org.eclipse.jdt.annotation.NonNullByDefault;
import org.eclipse.jdt.annotation.Nullable;
import org.openhab.core.OpenHAB;
import org.openhab.core.audio.AudioFormat;
import org.openhab.core.audio.AudioStream;
import org.openhab.core.common.ThreadPoolManager;
import org.openhab.core.config.core.ConfigurableService;
import org.openhab.core.config.core.Configuration;
import org.openhab.core.voice.KSErrorEvent;
import org.openhab.core.voice.KSException;
import org.openhab.core.voice.KSListener;
import org.openhab.core.voice.KSService;
import org.openhab.core.voice.KSServiceHandle;
import org.openhab.core.voice.KSpottedEvent;
import org.osgi.framework.Constants;
import org.osgi.service.component.ComponentContext;
import org.osgi.service.component.annotations.Activate;
import org.osgi.service.component.annotations.Component;
import org.osgi.service.component.annotations.Modified;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

import io.github.givimad.rustpotter_java.Endianness;
import io.github.givimad.rustpotter_java.NoiseDetectionMode;
import io.github.givimad.rustpotter_java.RustpotterJava;
import io.github.givimad.rustpotter_java.RustpotterJavaBuilder;
import io.github.givimad.rustpotter_java.VadMode;

/**
 * The {@link RustpotterKSService} is a keyword spotting implementation based on rustpotter.
 *
 * @author Miguel Álvarez - Initial contribution
 */
@NonNullByDefault
@Component(configurationPid = SERVICE_PID, property = Constants.SERVICE_PID + "=" + SERVICE_PID)
@ConfigurableService(category = SERVICE_CATEGORY, label = SERVICE_NAME
        + " Keyword Spotter", description_uri = SERVICE_CATEGORY + ":" + SERVICE_ID)
public class RustpotterKSService implements KSService {
    private static final String RUSTPOTTER_FOLDER = Path.of(OpenHAB.getUserDataFolder(), "rustpotter").toString();
    private final Logger logger = LoggerFactory.getLogger(RustpotterKSService.class);
    private final ScheduledExecutorService executor = ThreadPoolManager.getScheduledPool("OH-voice-rustpotterks");
    private RustpotterKSConfiguration config = new RustpotterKSConfiguration();
    static {
        Logger logger = LoggerFactory.getLogger(RustpotterKSService.class);
        File directory = new File(RUSTPOTTER_FOLDER);
        if (!directory.exists()) {
            if (directory.mkdir()) {
                logger.info("rustpotter dir created {}", RUSTPOTTER_FOLDER);
            }
        }
    }

    @Activate
    protected void activate(ComponentContext componentContext, Map<String, Object> config) {
        modified(config);
    }

    @Modified
    protected void modified(Map<String, Object> config) {
        this.config = new Configuration(config).as(RustpotterKSConfiguration.class);
    }

    @Override
    public String getId() {
        return SERVICE_ID;
    }

    @Override
    public String getLabel(@Nullable Locale locale) {
        return SERVICE_NAME;
    }

    @Override
    public Set<Locale> getSupportedLocales() {
        return Set.of();
    }

    @Override
    public Set<AudioFormat> getSupportedFormats() {
        return Set
                .of(new AudioFormat(AudioFormat.CONTAINER_WAVE, AudioFormat.CODEC_PCM_SIGNED, null, null, null, null));
    }

    @Override
    public KSServiceHandle spot(KSListener ksListener, AudioStream audioStream, Locale locale, String keyword)
            throws KSException {
        logger.debug("Loading library");
        try {
            RustpotterJava.loadLibrary();
        } catch (IOException e) {
            throw new KSException("Unable to load rustpotter lib: " + e.getMessage());
        }
        var audioFormat = audioStream.getFormat();
        var frequency = audioFormat.getFrequency();
        var bitDepth = audioFormat.getBitDepth();
        var channels = audioFormat.getChannels();
        var isBigEndian = audioFormat.isBigEndian();
        if (frequency == null || bitDepth == null || channels == null || isBigEndian == null) {
            throw new KSException(
                    "Missing stream metadata: frequency, bit depth, channels and endianness must be defined.");
        }
        var endianness = isBigEndian ? Endianness.BIG : Endianness.LITTLE;
        logger.debug("Audio wav spec: frequency '{}', bit depth '{}', channels '{}', '{}'", frequency, bitDepth,
                channels, audioFormat.isBigEndian() ? "big-endian" : "little-endian");
        RustpotterJava rustpotter = initRustpotter(frequency, bitDepth, channels, endianness);
        var modelName = keyword.replaceAll("\\s", "_") + ".rpw";
        var modelPath = Path.of(RUSTPOTTER_FOLDER, modelName);
        if (!modelPath.toFile().exists()) {
            throw new KSException("Missing model " + modelName);
        }
        try {
            rustpotter.addWakewordModelFile(modelPath.toString());
        } catch (Exception e) {
            throw new KSException("Unable to load wake word model: " + e.getMessage());
        }
        logger.debug("Model '{}' loaded", modelPath);
        AtomicBoolean aborted = new AtomicBoolean(false);
        executor.submit(() -> processAudioStream(rustpotter, ksListener, audioStream, aborted));
        return new KSServiceHandle() {
            @Override
            public void abort() {
                logger.debug("Stopping service");
                aborted.set(true);
            }
        };
    }

    private RustpotterJava initRustpotter(long frequency, int bitDepth, int channels, Endianness endianness) {
        var rustpotterBuilder = new RustpotterJavaBuilder();
        // audio configs
        rustpotterBuilder.setBitsPerSample(bitDepth);
        rustpotterBuilder.setSampleRate(frequency);
        rustpotterBuilder.setChannels(channels);
        rustpotterBuilder.setEndianness(endianness);
        // detector configs
        rustpotterBuilder.setThreshold(config.threshold);
        rustpotterBuilder.setAveragedThreshold(config.averagedThreshold);
        rustpotterBuilder.setComparatorRef(config.comparatorRef);
        rustpotterBuilder.setComparatorBandSize(config.comparatorBandSize);
        @Nullable
        VadMode vadMode = getVADMode(config.vadMode);
        if (vadMode != null) {
            rustpotterBuilder.setVADMode(vadMode);
            rustpotterBuilder.setVADSensitivity(config.vadSensitivity);
            rustpotterBuilder.setVADDelay(config.vadDelay);
        }
        @Nullable
        NoiseDetectionMode noiseDetectionMode = getNoiseMode(config.noiseDetectionMode);
        if (noiseDetectionMode != null) {
            rustpotterBuilder.setNoiseMode(noiseDetectionMode);
            rustpotterBuilder.setNoiseSensitivity(config.noiseSensitivity);
        }
        rustpotterBuilder.setEagerMode(config.eagerMode);
        // init the detector
        var rustpotter = rustpotterBuilder.build();
        rustpotterBuilder.delete();
        return rustpotter;
    }

    private void processAudioStream(RustpotterJava rustpotter, KSListener ksListener, AudioStream audioStream,
            AtomicBoolean aborted) {
        int numBytesRead;
        var bufferSize = (int) rustpotter.getBytesPerFrame();
        byte[] audioBuffer = new byte[bufferSize];
        int remaining = bufferSize;
        while (!aborted.get()) {
            try {
                numBytesRead = audioStream.read(audioBuffer, bufferSize - remaining, remaining);
                if (aborted.get() || numBytesRead == -1) { // read returns -1 once the stream has ended
                    break;
                }
                if (numBytesRead != remaining) {
                    remaining = remaining - numBytesRead;
                    Thread.sleep(100);
                    continue;
                }
                remaining = bufferSize;
                var result = rustpotter.processBuffer(audioBuffer);
                if (result.isPresent()) {
                    var detection = result.get();
                    logger.debug("keyword '{}' detected with score {}!", detection.getName(), detection.getScore());
                    detection.delete();
                    ksListener.ksEventReceived(new KSpottedEvent());
                }
            } catch (IOException | InterruptedException e) {
                String errorMessage = e.getMessage();
                ksListener.ksEventReceived(new KSErrorEvent(errorMessage != null ? errorMessage : "Unexpected error"));
            }
        }
        rustpotter.delete();
        logger.debug("rustpotter stopped");
    }

    private @Nullable VadMode getVADMode(String mode) {
        switch (mode) {
            case "low-bitrate":
                return VadMode.LOW_BITRATE;
            case "quality":
                return VadMode.QUALITY;
            case "aggressive":
                return VadMode.AGGRESSIVE;
            case "very-aggressive":
                return VadMode.VERY_AGGRESSIVE;
            default:
                return null;
        }
    }

    private @Nullable NoiseDetectionMode getNoiseMode(String mode) {
        switch (mode) {
            case "easiest":
                return NoiseDetectionMode.EASIEST;
            case "easy":
                return NoiseDetectionMode.EASY;
            case "normal":
                return NoiseDetectionMode.NORMAL;
            case "hard":
                return NoiseDetectionMode.HARD;
            case "hardest":
                return NoiseDetectionMode.HARDEST;
            default:
                return null;
        }
    }
}
@ -0,0 +1,97 @@
<?xml version="1.0" encoding="UTF-8"?>
<config-description:config-descriptions
  xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
  xmlns:config-description="https://openhab.org/schemas/config-description/v1.0.0"
  xsi:schemaLocation="https://openhab.org/schemas/config-description/v1.0.0
  https://openhab.org/schemas/config-description-1.0.0.xsd">
  <config-description uri="voice:rustpotterks">
    <parameter-group name="wakewordDetector">
      <label>Wakeword Detector</label>
      <description>Wakeword detection options.</description>
    </parameter-group>
    <parameter-group name="noiseDetector">
      <label>Noise Detector</label>
      <description>Optional noise detection options.</description>
    </parameter-group>
    <parameter-group name="vadDetector">
      <label>VAD Detector</label>
      <description>Optional voice activity detector options.</description>
    </parameter-group>
    <parameter name="threshold" type="decimal" min="0" max="1" groupName="wakewordDetector">
      <label>Threshold</label>
      <description>Configures the detector threshold, is the min score (in range 0. to 1.) that some of the wakeword
        templates should obtain to trigger a detection. Model defined value takes precedence if present.</description>
      <default>0.5</default>
    </parameter>
    <parameter name="averagedThreshold" type="decimal" min="0" max="1" groupName="wakewordDetector">
      <label>Averaged Threshold</label>
      <description>Configures the detector averaged threshold, is the min score (in range 0. to 1.) that the audio should
        obtain against a combination of the wake word templates, the detection will be aborted if this is not the case. This
        way it can prevent to run the comparison of the current frame against each of the wake word templates which saves
        cpu. If set to 0 this functionality is disabled.</description>
      <default>0.2</default>
    </parameter>
    <parameter name="comparatorRef" type="decimal" min="0" max="1" groupName="wakewordDetector">
      <label>Comparator Ref</label>
      <description>Configures the reference for the comparator used to match the samples.</description>
      <default>0.22</default>
      <advanced>true</advanced>
    </parameter>
    <parameter name="comparatorBandSize" type="integer" groupName="wakewordDetector">
      <label>Comparator Band Size</label>
      <description>Configures the band-size for the comparator used to match the samples.</description>
      <default>6</default>
      <advanced>true</advanced>
    </parameter>
    <parameter name="eagerMode" type="boolean" groupName="wakewordDetector">
      <label>Eager Mode</label>
      <description>Enables eager mode. End detection as soon as a result is over the score,
        instead of waiting to see if the
        next frame has a higher score.</description>
      <default>true</default>
    </parameter>
    <parameter name="noiseDetectionMode" type="text" groupName="noiseDetector">
      <label>Noise Detection Mode</label>
      <description>Use a noise detector to reduce computation
        in the absence of sound.
        Configures the difficulty to consider a frame
        as noise (the required noise level).</description>
      <default>disabled</default>
      <options>
        <option value="disabled">Disabled</option>
        <option value="easiest">Easiest</option>
        <option value="easy">Easy</option>
        <option value="normal">Normal</option>
        <option value="hard">Hard</option>
        <option value="hardest">Hardest</option>
      </options>
    </parameter>
    <parameter name="noiseSensitivity" type="decimal" min="0" max="1" groupName="noiseDetector">
      <label>Noise Sensitivity</label>
      <description>Noise/silence ratio in the last second to consider noise is detected.</description>
      <default>0.5</default>
    </parameter>
    <parameter name="vadMode" type="text" groupName="vadDetector">
      <label>VAD Mode</label>
      <description>Use a vad detector to reduce computation in the absence of vocal sound.</description>
      <default>disabled</default>
      <options>
        <option value="disabled">Disabled</option>
        <option value="low-bitrate">Low Bitrate</option>
        <option value="quality">Quality</option>
        <option value="aggressive">Aggressive</option>
        <option value="very-aggressive">Very Aggressive</option>
      </options>
    </parameter>
    <parameter name="vadSensitivity" type="decimal" min="0" max="1" groupName="vadDetector">
      <label>VAD Sensitivity</label>
      <description>Voice/silence ratio in the last second to consider voice is detected.</description>
      <default>0.5</default>
    </parameter>
    <parameter name="vadDelay" type="integer" groupName="vadDetector">
      <label>VAD Delay</label>
      <description>Seconds to disable the vad detector after voice is detected.</description>
      <default>3</default>
    </parameter>
  </config-description>
</config-description:config-descriptions>
@ -0,0 +1,41 @@
voice.config.rustpotterks.averagedThreshold.label = Averaged Threshold
voice.config.rustpotterks.averagedThreshold.description = Configures the detector averaged threshold, is the min score (in range 0. to 1.) that the audio should obtain against a combination of the wake word templates, the detection will be aborted if this is not the case. This way it can prevent to run the comparison of the current frame against each of the wake word templates which saves cpu. If set to 0 this functionality is disabled.
voice.config.rustpotterks.comparatorBandSize.label = Comparator Band Size
voice.config.rustpotterks.comparatorBandSize.description = Configures the band-size for the comparator used to match the samples.
voice.config.rustpotterks.comparatorRef.label = Comparator Ref
voice.config.rustpotterks.comparatorRef.description = Configures the reference for the comparator used to match the samples.
voice.config.rustpotterks.eagerMode.label = Eager Mode
voice.config.rustpotterks.eagerMode.description = Enables eager mode. End detection as soon as a result is over the score, instead of waiting to see if the next frame has a higher score.
voice.config.rustpotterks.group.noiseDetector.label = Noise Detector
voice.config.rustpotterks.group.noiseDetector.description = Optional noise detection options.
voice.config.rustpotterks.group.vadDetector.label = VAD Detector
voice.config.rustpotterks.group.vadDetector.description = Optional voice activity detector options.
voice.config.rustpotterks.group.wakewordDetector.label = Wakeword Detector
voice.config.rustpotterks.group.wakewordDetector.description = Wakeword detection options.
voice.config.rustpotterks.noiseDetectionMode.label = Noise Detection Mode
voice.config.rustpotterks.noiseDetectionMode.description = Use a noise detector to reduce computation in the absence of sound. Configures the difficulty to consider a frame as noise (the required noise level).
voice.config.rustpotterks.noiseDetectionMode.option.disabled = Disabled
voice.config.rustpotterks.noiseDetectionMode.option.easiest = Easiest
voice.config.rustpotterks.noiseDetectionMode.option.easy = Easy
voice.config.rustpotterks.noiseDetectionMode.option.normal = Normal
voice.config.rustpotterks.noiseDetectionMode.option.hard = Hard
voice.config.rustpotterks.noiseDetectionMode.option.hardest = Hardest
voice.config.rustpotterks.noiseSensitivity.label = Noise Sensitivity
voice.config.rustpotterks.noiseSensitivity.description = Noise/silence ratio in the last second to consider voice is detected.
voice.config.rustpotterks.threshold.label = Threshold
voice.config.rustpotterks.threshold.description = Configures the detector threshold, is the min score (in range 0. to 1.) that some of the wakeword templates should obtain to trigger a detection. Model defined value takes prevalence if present.
voice.config.rustpotterks.vadDelay.label = VAD Delay
voice.config.rustpotterks.vadDelay.description = Seconds to disable the vad detector after voice is detected.
voice.config.rustpotterks.vadMode.label = VAD Mode
voice.config.rustpotterks.vadMode.description = Use a vad detector to reduce computation in the absence of vocal sound.
voice.config.rustpotterks.vadMode.option.disabled = Disabled
voice.config.rustpotterks.vadMode.option.low-bitrate = Low Bitrate
voice.config.rustpotterks.vadMode.option.quality = Quality
voice.config.rustpotterks.vadMode.option.aggressive = Aggressive
voice.config.rustpotterks.vadMode.option.very-aggressive = Very Aggressive
voice.config.rustpotterks.vadSensitivity.label = VAD Sensitivity
voice.config.rustpotterks.vadSensitivity.description = Voice/silence ratio in the last second to consider voice is detected.

# service

service.voice.rustpotterks.label = Rustpotter Keyword Spotter
@ -411,6 +411,7 @@
    <module>org.openhab.voice.picotts</module>
    <module>org.openhab.voice.pollytts</module>
    <module>org.openhab.voice.porcupineks</module>
    <module>org.openhab.voice.rustpotterks</module>
    <module>org.openhab.voice.voicerss</module>
    <module>org.openhab.voice.voskstt</module>
    <module>org.openhab.voice.watsonstt</module>