mirror of
https://github.com/openhab/openhab-addons.git
synced 2025-01-25 14:55:55 +01:00
[rustpotterks] Upgrade to version 2 (#14615)
* [rustpotter] Use version 2

Signed-off-by: Miguel Álvarez <miguelwork92@gmail.com>
parent
1786bb0eec
commit
aa3229a97f
@ -5,6 +5,11 @@ This voice service allows you to use the open source library Rustpotter as your
|
||||
|
||||
Rustpotter provides personal on-device wake word detection. You need to generate a model for your keyword using audio samples.
|
||||
|
||||
You can test the library in your browser using these web pages:
|
||||
|
||||
- [The spot demo](https://givimad.github.io/rustpotter-worklet-demo/), which includes some example wakewords (using your own is recommended).
|
||||
- [The model creation demo](https://givimad.github.io/rustpotter-create-model-demo/), which allows you to record compatible WAV files and generate a wakeword file that you can test on the previous page.
|
||||
|
||||
Important: No voice data captured by this service is uploaded to the cloud.
|
||||
The voice data is processed offline, locally on your openHAB server by Rustpotter.
|
||||
|
||||
@ -12,17 +17,19 @@ The voice data is processed offline, locally on your openHAB server by Rustpotte
|
||||
|
||||
After installing, you can edit the service options on the openHAB configuration page in the UI (**Settings / Other Services - Rustpotter Keyword Spotter**):
|
||||
|
||||
* **Threshold** - Configures the detector threshold, is the min score (in range 0. to 1.) that some wake word template should obtain to trigger a detection. Defaults to 0.5.
|
||||
* **Averaged Threshold** - Configures the detector averaged threshold, is the min score (in range 0. to 1.) that the audio should obtain against a combination of the wake word templates, the detection will be aborted if this is not the case. This way it can prevent to run the comparison of the current frame against each of the wake word templates which saves cpu. If set to 0 this functionality is disabled.
|
||||
* **Eager mode** - Enables eager mode. End detection as soon as a result is over the score, instead of waiting to see if the next frame has a higher score.
|
||||
* **Noise Detection Mode** - Use build-in noise detection to reduce computation on absence of noise. Configures the difficulty to consider a frame as noise (the required noise level).
|
||||
* **Noise Detection Sensitivity** - Noise/silence ratio in the last second to consider noise is detected. Defaults to 0.5.
|
||||
* **VAD Mode** - Use a voice activity detector to reduce computation in the absence of vocal sound.
|
||||
* **VAD Sensitivity** - Voice/silence ratio in the last second to consider voice is detected.
|
||||
* **VAD Delay** - Seconds to disable the vad detector after voice is detected. Defaults to 3.
|
||||
* **Comparator Ref** - Configures the reference for the comparator used to match the samples.
|
||||
* **Comparator Band Size** - Configures the band-size for the comparator used to match the samples.
|
||||
|
||||
- **Threshold** - Configures the detector threshold: the minimum score (in the range 0 to 1) that at least one wake word template must obtain to trigger a detection. Defaults to 0.5.
|
||||
- **Averaged Threshold** - Configures the detector averaged threshold: the minimum score (in the range 0 to 1) that the audio must obtain against a combination of the wake word templates; otherwise the detection is aborted. This avoids comparing the current frame against each of the wake word templates, which saves CPU. If set to 0, this functionality is disabled.
|
||||
- **Score Mode** - Indicates how to calculate the final score.
|
||||
- **Min Scores** - Minimum number of positive scores to consider a partial detection as a detection.
|
||||
- **Comparator Ref** - Configures the reference for the comparator used to match the samples.
|
||||
- **Comparator Band Size** - Configures the band-size for the comparator used to match the samples.
|
||||
- **Gain Normalizer** - Enables an audio filter that attempts to bring the volume of the stream close to a reference level.
|
||||
- **Min Gain** - Min gain applied by the gain normalizer filter.
|
||||
- **Max Gain** - Max gain applied by the gain normalizer filter.
|
||||
- **Gain Ref** - The RMS reference used by the gain normalizer to calculate the gain applied. If unset, an estimate of the wakeword level is used.
|
||||
- **Band Pass** - Enables an audio filter that attenuates frequencies outside the low cutoff and high cutoff range.
|
||||
- **Low Cutoff** - Low cutoff for the band-pass filter.
|
||||
- **High Cutoff** - High cutoff for the band-pass filter.
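
To illustrate what a score mode selects, here is a minimal Java sketch that combines a set of per-template scores using `max`, `average`, or `median`. The class `ScoreModeSketch`, the method `combine`, and the even-count median rule are assumptions made for illustration; this is not Rustpotter's implementation, and the percentile modes (`p25` to `p95`) are left out. Unknown mode names fall back to `max`, mirroring the default branch of `getScoreMode` in this commit.

```java
import java.util.Arrays;

final class ScoreModeSketch {
    /** Combines per-template scores into one value; an illustration, not Rustpotter's code. */
    static double combine(String mode, double[] scores) {
        if (scores.length == 0) {
            throw new IllegalArgumentException("at least one score is required");
        }
        switch (mode) {
            case "average":
                return Arrays.stream(scores).average().getAsDouble();
            case "median": {
                double[] sorted = scores.clone();
                Arrays.sort(sorted);
                int mid = sorted.length / 2;
                // Assumption: an even count averages the two middle values.
                return sorted.length % 2 == 1 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2;
            }
            case "max":
            default:
                // Unknown names fall back to the maximum, like the service's default branch.
                return Arrays.stream(scores).max().getAsDouble();
        }
    }

    public static void main(String[] args) {
        double[] scores = { 0.25, 0.5, 1.0 };
        for (String mode : new String[] { "max", "average", "median" }) {
            System.out.println(mode + " -> " + combine(mode, scores));
        }
    }
}
```

A stricter mode such as `median` makes a single outlier template less likely to trigger a detection than `max` does, at the cost of requiring more templates to agree.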
|
||||
|
||||
If you would like to set up the service via a text file, create a new file named `rustpotterks.cfg` in `$OPENHAB_ROOT/conf/services`.
|
||||
|
||||
@ -31,21 +38,24 @@ Its contents should look similar to:
|
||||
```
org.openhab.voice.rustpotterks:threshold=0.5
org.openhab.voice.rustpotterks:averagedthreshold=0.2
org.openhab.voice.rustpotterks:scoreMode=max
org.openhab.voice.rustpotterks:minScores=5
org.openhab.voice.rustpotterks:comparatorRef=0.22
org.openhab.voice.rustpotterks:comparatorBandSize=6
org.openhab.voice.rustpotterks:eagerMode=true
org.openhab.voice.rustpotterks:noiseDetectionMode=hard
org.openhab.voice.rustpotterks:noiseDetectionSensitivity=0.5
org.openhab.voice.rustpotterks:vadMode=aggressive
org.openhab.voice.rustpotterks:vadSensitivity=0.5
org.openhab.voice.rustpotterks:vadDelay=3
org.openhab.voice.rustpotterks:comparatorBandSize=5
org.openhab.voice.rustpotterks:gainNormalizer=true
org.openhab.voice.rustpotterks:minGain=0.5
org.openhab.voice.rustpotterks:maxGain=1
org.openhab.voice.rustpotterks:gainRef=
org.openhab.voice.rustpotterks:bandPass=true
org.openhab.voice.rustpotterks:lowCutoff=80
org.openhab.voice.rustpotterks:highCutoff=400
```
|
||||
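
The `pid:key=value` layout of the file above can be read with a few lines of Java. The following is a minimal sketch, not how openHAB itself loads service files; the class `ServiceCfgSketch` and the method `parse` are hypothetical names. It splits each line by hand because `java.util.Properties` treats `:` as a key/value separator and would cut the key off at the PID.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class ServiceCfgSketch {
    /** Parses "pid:key=value" lines into a key -> value map for one PID; illustrative only. */
    static Map<String, String> parse(List<String> lines, String pid) {
        Map<String, String> values = new LinkedHashMap<>();
        String prefix = pid + ":";
        for (String raw : lines) {
            String line = raw.trim();
            // Skip blanks, comments, and entries that belong to other services.
            if (line.isEmpty() || line.startsWith("#") || !line.startsWith(prefix)) {
                continue;
            }
            int eq = line.indexOf('=');
            if (eq < 0) {
                continue;
            }
            String key = line.substring(prefix.length(), eq).trim();
            String value = line.substring(eq + 1).trim();
            values.put(key, value);
        }
        return values;
    }

    public static void main(String[] args) {
        List<String> lines = List.of(
                "org.openhab.voice.rustpotterks:threshold=0.5",
                "org.openhab.voice.rustpotterks:scoreMode=max",
                "org.openhab.voice.rustpotterks:gainRef=");
        Map<String, String> cfg = parse(lines, "org.openhab.voice.rustpotterks");
        // An empty value (as for gainRef) is kept as an empty string in this sketch.
        cfg.forEach((k, v) -> System.out.println(k + " = '" + v + "'"));
    }
}
```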
|
||||
## Magic Word Configuration
|
||||
|
||||
The magic word to spot is gathered from your 'Voice' configuration.
|
||||
|
||||
You can generate your own wake word model by using the [Rustpotter CLI](https://github.com/GiviMAD/rustpotter-cli).
|
||||
You can generate your own wakeword files using the [Rustpotter CLI](https://github.com/GiviMAD/rustpotter-cli).
|
||||
|
||||
You can also download the models used as examples on the [rustpotter web demo](https://givimad.github.io/rustpotter-worklet-demo/) from [this folder](https://github.com/GiviMAD/rustpotter-worklet-demo/tree/main/static).
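
The service derives the model file name from the magic word by replacing each whitespace character with `_` and appending `.rpw`, as the `RustpotterKSService` code in this commit shows. The sketch below reproduces that mapping so you can check what file name a given magic word expects. The class `ModelFileSketch` and the folder argument are hypothetical; the real folder comes from the `RUSTPOTTER_FOLDER` constant, whose value is not shown here.

```java
import java.nio.file.Path;

final class ModelFileSketch {
    /** Same mapping as the service: each whitespace character becomes '_', then ".rpw" is appended. */
    static String modelName(String keyword) {
        return keyword.replaceAll("\\s", "_") + ".rpw";
    }

    /** Resolves the model file inside a folder; the folder is a placeholder in this sketch. */
    static Path modelPath(String folder, String keyword) {
        return Path.of(folder, modelName(keyword));
    }

    public static void main(String[] args) {
        System.out.println(modelName("ok casa"));
        System.out.println(modelPath("rustpotter", "hey home"));
    }
}
```

Because every whitespace character is replaced individually, a magic word typed with two consecutive spaces expects a file name with two underscores.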
|
||||
|
||||
@ -59,11 +69,11 @@ The service will only work if it's able to find the correct rpw for your magic w
|
||||
|
||||
You can set up your preferred default keyword spotter and default magic word in the UI:
|
||||
|
||||
* Go to **Settings**.
|
||||
* Edit **System Services - Voice**.
|
||||
* Set **Rustpotter Keyword Spotter** as **Default Keyword Spotter**.
|
||||
* Choose your preferred **Magic Word** for your setup.
|
||||
* Choose optionally your **Listening Switch** item that will be switch ON during the period when the dialog processor has spotted the keyword and is listening for commands.
|
||||
- Go to **Settings**.
|
||||
- Edit **System Services - Voice**.
|
||||
- Set **Rustpotter Keyword Spotter** as **Default Keyword Spotter**.
|
||||
- Choose your preferred **Magic Word** for your setup.
|
||||
- Optionally, choose your **Listening Switch** item, which will be switched ON while the dialog processor has spotted the keyword and is listening for commands.
|
||||
|
||||
If you would like to configure these settings via a text file, edit the file `runtime.cfg` in `$OPENHAB_ROOT/conf/services` and set the following entries:
|
||||
|
||||
|
@ -18,7 +18,7 @@
|
||||
<dependency>
|
||||
<groupId>io.github.givimad</groupId>
|
||||
<artifactId>rustpotter-java</artifactId>
|
||||
<version>1.0.0</version>
|
||||
<version>2.0.0</version>
|
||||
</dependency>
|
||||
</dependencies>
|
||||
</project>
|
||||
|
@ -13,6 +13,7 @@
|
||||
package org.openhab.voice.rustpotterks.internal;
|
||||
|
||||
import org.eclipse.jdt.annotation.NonNullByDefault;
|
||||
import org.eclipse.jdt.annotation.Nullable;
|
||||
|
||||
/**
|
||||
* The {@link RustpotterKSConfiguration} class contains fields mapping thing configuration parameters.
|
||||
@ -36,31 +37,13 @@ public class RustpotterKSConfiguration {
|
||||
*/
|
||||
public float averagedThreshold = 0.2f;
|
||||
/**
|
||||
* Terminate the detection as soon as one result is above the score,
|
||||
* instead of wait to see if the next frame has a higher score.
|
||||
* Indicates how to calculate the final score.
|
||||
*/
|
||||
public boolean eagerMode = true;
|
||||
public String scoreMode = "max";
|
||||
/**
|
||||
* Use build-in noise detection to reduce computation on absence of noise.
|
||||
* Configures the difficulty to consider a frame as noise (the required noise level).
|
||||
* Minimum number of positive scores to consider a partial detection as a detection.
|
||||
*/
|
||||
public String noiseDetectionMode = "disabled";
|
||||
/**
|
||||
* Noise/silence ratio in the last second to consider noise is detected. Defaults to 0.5.
|
||||
*/
|
||||
public float noiseSensitivity = 0.5f;
|
||||
/**
|
||||
* Seconds to disable the vad detector after voice is detected. Defaults to 3.
|
||||
*/
|
||||
public int vadDelay = 3;
|
||||
/**
|
||||
* Voice/silence ratio in the last second to consider voice is detected.
|
||||
*/
|
||||
public float vadSensitivity = 0.5f;
|
||||
/**
|
||||
* Use a voice activity detector to reduce computation in the absence of vocal sound.
|
||||
*/
|
||||
public String vadMode = "disabled";
|
||||
public int minScores = 5;
|
||||
/**
|
||||
* Configures the reference for the comparator used to match the samples.
|
||||
*/
|
||||
@ -68,5 +51,35 @@ public class RustpotterKSConfiguration {
|
||||
/**
|
||||
* Configures the band-size for the comparator used to match the samples.
|
||||
*/
|
||||
public int comparatorBandSize = 6;
|
||||
public int comparatorBandSize = 5;
|
||||
/**
|
||||
* Enables an audio filter that attempts to approximate the volume of the stream to a reference level (RMS of the
|
||||
* samples is used as volume measure).
|
||||
*/
|
||||
public boolean gainNormalizer = false;
|
||||
/**
|
||||
* Min gain applied by the gain normalizer filter.
|
||||
*/
|
||||
public float minGain = 0.5f;
|
||||
/**
|
||||
* Max gain applied by the gain normalizer filter.
|
||||
*/
|
||||
public float maxGain = 1f;
|
||||
/**
|
||||
* Set the RMS reference used by the gain-normalizer to calculate the gain applied. If unset an estimation of the
|
||||
* wakeword level is used.
|
||||
*/
|
||||
public @Nullable Float gainRef = null;
|
||||
/**
|
||||
* Enables an audio filter that attenuates frequencies outside the low cutoff and high cutoff range.
|
||||
*/
|
||||
public boolean bandPass = false;
|
||||
/**
|
||||
* Low cutoff for the band-pass filter.
|
||||
*/
|
||||
public float lowCutoff = 80f;
|
||||
/**
|
||||
* High cutoff for the band-pass filter.
|
||||
*/
|
||||
public float highCutoff = 400f;
|
||||
}
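
The new filter fields interact: the gain bounds and the band-pass cutoffs only make sense in a certain order. The following sketch shows one way such values could be sanity-checked before use. It is not part of this commit; the class `ConfigCheckSketch`, the method `check`, and the ordering rules (`minGain <= maxGain`, `lowCutoff < highCutoff`) are assumptions, while the 0 to 1 threshold range follows the `min`/`max` attributes in the config description.

```java
import java.util.ArrayList;
import java.util.List;

final class ConfigCheckSketch {
    /** Returns a list of human-readable problems; an empty list means the values look consistent. */
    static List<String> check(float threshold, float minGain, float maxGain, float lowCutoff,
            float highCutoff) {
        List<String> problems = new ArrayList<>();
        if (threshold < 0f || threshold > 1f) {
            problems.add("threshold must be within 0..1");
        }
        // Assumption: the gain range must not be inverted.
        if (minGain > maxGain) {
            problems.add("minGain must not exceed maxGain");
        }
        // Assumption: a band-pass filter needs a non-empty pass band.
        if (lowCutoff >= highCutoff) {
            problems.add("lowCutoff must be below highCutoff");
        }
        return problems;
    }

    public static void main(String[] args) {
        // Defaults taken from the configuration class and README in this commit.
        System.out.println(check(0.5f, 0.5f, 1f, 80f, 400f));
        // Swapped cutoffs produce one reported problem.
        System.out.println(check(0.5f, 0.5f, 1f, 400f, 80f));
    }
}
```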
|
||||
|
@ -17,6 +17,7 @@ import static org.openhab.voice.rustpotterks.internal.RustpotterKSConstants.*;
|
||||
import java.io.File;
|
||||
import java.io.IOException;
|
||||
import java.nio.file.Path;
|
||||
import java.util.ArrayList;
|
||||
import java.util.Locale;
|
||||
import java.util.Map;
|
||||
import java.util.Set;
|
||||
@ -38,7 +39,6 @@ import org.openhab.core.voice.KSService;
|
||||
import org.openhab.core.voice.KSServiceHandle;
|
||||
import org.openhab.core.voice.KSpottedEvent;
|
||||
import org.osgi.framework.Constants;
|
||||
import org.osgi.service.component.ComponentContext;
|
||||
import org.osgi.service.component.annotations.Activate;
|
||||
import org.osgi.service.component.annotations.Component;
|
||||
import org.osgi.service.component.annotations.Modified;
|
||||
@ -46,10 +46,10 @@ import org.slf4j.Logger;
|
||||
import org.slf4j.LoggerFactory;
|
||||
|
||||
import io.github.givimad.rustpotter_java.Endianness;
|
||||
import io.github.givimad.rustpotter_java.NoiseDetectionMode;
|
||||
import io.github.givimad.rustpotter_java.RustpotterJava;
|
||||
import io.github.givimad.rustpotter_java.RustpotterJavaBuilder;
|
||||
import io.github.givimad.rustpotter_java.VadMode;
|
||||
import io.github.givimad.rustpotter_java.Rustpotter;
|
||||
import io.github.givimad.rustpotter_java.RustpotterBuilder;
|
||||
import io.github.givimad.rustpotter_java.SampleFormat;
|
||||
import io.github.givimad.rustpotter_java.ScoreMode;
|
||||
|
||||
/**
|
||||
* The {@link RustpotterKSService} is a keyword spotting implementation based on rustpotter.
|
||||
@ -76,7 +76,7 @@ public class RustpotterKSService implements KSService {
|
||||
}
|
||||
|
||||
@Activate
|
||||
protected void activate(ComponentContext componentContext, Map<String, Object> config) {
|
||||
protected void activate(Map<String, Object> config) {
|
||||
modified(config);
|
||||
}
|
||||
|
||||
@ -111,7 +111,7 @@ public class RustpotterKSService implements KSService {
|
||||
throws KSException {
|
||||
logger.debug("Loading library");
|
||||
try {
|
||||
RustpotterJava.loadLibrary();
|
||||
Rustpotter.loadLibrary();
|
||||
} catch (IOException e) {
|
||||
throw new KSException("Unable to load rustpotter lib: " + e.getMessage());
|
||||
}
|
||||
@ -126,8 +126,13 @@ public class RustpotterKSService implements KSService {
|
||||
}
|
||||
var endianness = isBigEndian ? Endianness.BIG : Endianness.LITTLE;
|
||||
logger.debug("Audio wav spec: frequency '{}', bit depth '{}', channels '{}', '{}'", frequency, bitDepth,
|
||||
channels, audioFormat.isBigEndian() ? "big-endian" : "little-endian");
|
||||
RustpotterJava rustpotter = initRustpotter(frequency, bitDepth, channels, endianness);
|
||||
channels, isBigEndian ? "big-endian" : "little-endian");
|
||||
Rustpotter rustpotter;
|
||||
try {
|
||||
rustpotter = initRustpotter(frequency, bitDepth, channels, endianness);
|
||||
} catch (Exception e) {
|
||||
throw new KSException("Unable to configure rustpotter: " + e.getMessage(), e);
|
||||
}
|
||||
var modelName = keyword.replaceAll("\\s", "_") + ".rpw";
|
||||
var modelPath = Path.of(RUSTPOTTER_FOLDER, modelName);
|
||||
if (!modelPath.toFile().exists()) {
|
||||
@ -141,48 +146,43 @@ public class RustpotterKSService implements KSService {
|
||||
logger.debug("Model '{}' loaded", modelPath);
|
||||
AtomicBoolean aborted = new AtomicBoolean(false);
|
||||
executor.submit(() -> processAudioStream(rustpotter, ksListener, audioStream, aborted));
|
||||
return new KSServiceHandle() {
|
||||
@Override
|
||||
public void abort() {
|
||||
logger.debug("Stopping service");
|
||||
aborted.set(true);
|
||||
}
|
||||
return () -> {
|
||||
logger.debug("Stopping service");
|
||||
aborted.set(true);
|
||||
};
|
||||
}
|
||||
|
||||
private RustpotterJava initRustpotter(long frequency, int bitDepth, int channels, Endianness endianness) {
|
||||
var rustpotterBuilder = new RustpotterJavaBuilder();
|
||||
private Rustpotter initRustpotter(long frequency, int bitDepth, int channels, Endianness endianness)
|
||||
throws Exception {
|
||||
var rustpotterBuilder = new RustpotterBuilder();
|
||||
// audio configs
|
||||
rustpotterBuilder.setBitsPerSample(bitDepth);
|
||||
rustpotterBuilder.setSampleRate(frequency);
|
||||
rustpotterBuilder.setChannels(channels);
|
||||
rustpotterBuilder.setSampleFormat(SampleFormat.INT);
|
||||
rustpotterBuilder.setEndianness(endianness);
|
||||
// detector configs
|
||||
rustpotterBuilder.setThreshold(config.threshold);
|
||||
rustpotterBuilder.setAveragedThreshold(config.averagedThreshold);
|
||||
rustpotterBuilder.setScoreMode(getScoreMode(config.scoreMode));
|
||||
rustpotterBuilder.setMinScores(config.minScores);
|
||||
rustpotterBuilder.setComparatorRef(config.comparatorRef);
|
||||
rustpotterBuilder.setComparatorBandSize(config.comparatorBandSize);
|
||||
@Nullable
|
||||
VadMode vadMode = getVADMode(config.vadMode);
|
||||
if (vadMode != null) {
|
||||
rustpotterBuilder.setVADMode(vadMode);
|
||||
rustpotterBuilder.setVADSensitivity(config.vadSensitivity);
|
||||
rustpotterBuilder.setVADDelay(config.vadDelay);
|
||||
}
|
||||
@Nullable
|
||||
NoiseDetectionMode noiseDetectionMode = getNoiseMode(config.noiseDetectionMode);
|
||||
if (noiseDetectionMode != null) {
|
||||
rustpotterBuilder.setNoiseMode(noiseDetectionMode);
|
||||
rustpotterBuilder.setNoiseSensitivity(config.noiseSensitivity);
|
||||
}
|
||||
rustpotterBuilder.setEagerMode(config.eagerMode);
|
||||
// filter configs
|
||||
rustpotterBuilder.setGainNormalizerEnabled(config.gainNormalizer);
|
||||
rustpotterBuilder.setMinGain(config.minGain);
|
||||
rustpotterBuilder.setMaxGain(config.maxGain);
|
||||
rustpotterBuilder.setGainRef(config.gainRef);
|
||||
rustpotterBuilder.setBandPassFilterEnabled(config.bandPass);
|
||||
rustpotterBuilder.setBandPassLowCutoff(config.lowCutoff);
|
||||
rustpotterBuilder.setBandPassHighCutoff(config.highCutoff);
|
||||
// init the detector
|
||||
var rustpotter = rustpotterBuilder.build();
|
||||
rustpotterBuilder.delete();
|
||||
return rustpotter;
|
||||
}
|
||||
|
||||
private void processAudioStream(RustpotterJava rustpotter, KSListener ksListener, AudioStream audioStream,
|
||||
private void processAudioStream(Rustpotter rustpotter, KSListener ksListener, AudioStream audioStream,
|
||||
AtomicBoolean aborted) {
|
||||
int numBytesRead;
|
||||
var bufferSize = (int) rustpotter.getBytesPerFrame();
|
||||
@ -200,10 +200,20 @@ public class RustpotterKSService implements KSService {
|
||||
continue;
|
||||
}
|
||||
remaining = bufferSize;
|
||||
var result = rustpotter.processBuffer(audioBuffer);
|
||||
var result = rustpotter.processBytes(audioBuffer);
|
||||
if (result.isPresent()) {
|
||||
var detection = result.get();
|
||||
logger.debug("keyword '{}' detected with score {}!", detection.getName(), detection.getScore());
|
||||
if (logger.isDebugEnabled()) {
|
||||
ArrayList<String> scores = new ArrayList<>();
|
||||
var scoreNames = detection.getScoreNames().split("\\|\\|");
|
||||
var scoreValues = detection.getScores();
|
||||
for (var i = 0; i < Integer.min(scoreNames.length, scoreValues.length); i++) {
|
||||
scores.add("'" + scoreNames[i] + "': " + scoreValues[i]);
|
||||
}
|
||||
logger.debug("Detected '{}' with: Score: {}, AvgScore: {}, Count: {}, Gain: {}, Scores: {}",
|
||||
detection.getName(), detection.getScore(), detection.getAvgScore(),
|
||||
detection.getCounter(), detection.getGain(), String.join(", ", scores));
|
||||
}
|
||||
detection.delete();
|
||||
ksListener.ksEventReceived(new KSpottedEvent());
|
||||
}
|
||||
@ -216,35 +226,27 @@ public class RustpotterKSService implements KSService {
|
||||
logger.debug("rustpotter stopped");
|
||||
}
|
||||
|
||||
private @Nullable VadMode getVADMode(String mode) {
|
||||
private ScoreMode getScoreMode(String mode) {
|
||||
switch (mode) {
|
||||
case "low-bitrate":
|
||||
return VadMode.LOW_BITRATE;
|
||||
case "quality":
|
||||
return VadMode.QUALITY;
|
||||
case "aggressive":
|
||||
return VadMode.AGGRESSIVE;
|
||||
case "very-aggressive":
|
||||
return VadMode.VERY_AGGRESSIVE;
|
||||
case "average":
|
||||
return ScoreMode.AVG;
|
||||
case "median":
|
||||
return ScoreMode.MEDIAN;
|
||||
case "p25":
|
||||
return ScoreMode.P25;
|
||||
case "p50":
|
||||
return ScoreMode.P50;
|
||||
case "p75":
|
||||
return ScoreMode.P75;
|
||||
case "p80":
|
||||
return ScoreMode.P80;
|
||||
case "p90":
|
||||
return ScoreMode.P90;
|
||||
case "p95":
|
||||
return ScoreMode.P95;
|
||||
case "max":
|
||||
default:
|
||||
return null;
|
||||
}
|
||||
}
|
||||
|
||||
private @Nullable NoiseDetectionMode getNoiseMode(String mode) {
|
||||
switch (mode) {
|
||||
case "easiest":
|
||||
return NoiseDetectionMode.EASIEST;
|
||||
case "easy":
|
||||
return NoiseDetectionMode.EASY;
|
||||
case "normal":
|
||||
return NoiseDetectionMode.NORMAL;
|
||||
case "hard":
|
||||
return NoiseDetectionMode.HARD;
|
||||
case "hardest":
|
||||
return NoiseDetectionMode.HARDEST;
|
||||
default:
|
||||
return null;
|
||||
return ScoreMode.MAX;
|
||||
}
|
||||
}
|
||||
}
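
The debug logging added in `processAudioStream` pairs the `||`-separated score names with the score values. The standalone sketch below isolates that step so it can be tried without the native library. The class `ScoreLabelsSketch` and the `float[]` type for the scores are assumptions; in the service the values come from `detection.getScoreNames()` and `detection.getScores()`.

```java
import java.util.ArrayList;
import java.util.List;

final class ScoreLabelsSketch {
    /** Pairs "a||b"-style names with their scores, stopping at the shorter of the two. */
    static List<String> label(String scoreNames, float[] scoreValues) {
        // The separator is a literal "||", so both pipes are escaped in the regex.
        String[] names = scoreNames.split("\\|\\|");
        List<String> labels = new ArrayList<>();
        for (int i = 0; i < Math.min(names.length, scoreValues.length); i++) {
            labels.add("'" + names[i] + "': " + scoreValues[i]);
        }
        return labels;
    }

    public static void main(String[] args) {
        List<String> labels = label("sample1.wav||sample2.wav", new float[] { 0.5f, 0.75f });
        System.out.println(String.join(", ", labels));
    }
}
```

Bounding the loop by the shorter length keeps the log statement safe if the names and values ever disagree in count.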
|
||||
|
@ -9,13 +9,9 @@
|
||||
<label>Wakeword Detector</label>
|
||||
<description>Wakeword detection options.</description>
|
||||
</parameter-group>
|
||||
<parameter-group name="noiseDetector">
|
||||
<label>Noise Detector</label>
|
||||
<description>Optional noise detection options.</description>
|
||||
</parameter-group>
|
||||
<parameter-group name="vadDetector">
|
||||
<label>VAD Detector</label>
|
||||
<description>Optional voice activity detector options.</description>
|
||||
<parameter-group name="filters">
|
||||
<label>Audio Filters</label>
|
||||
<description>Optional audio filter options.</description>
|
||||
</parameter-group>
|
||||
<parameter name="threshold" type="decimal" min="0" max="1" groupName="wakewordDetector">
|
||||
<label>Threshold</label>
|
||||
@ -31,6 +27,27 @@
|
||||
cpu. If set to 0 this functionality is disabled.</description>
|
||||
<default>0.2</default>
|
||||
</parameter>
|
||||
<parameter name="scoreMode" type="text" groupName="wakewordDetector">
|
||||
<label>Score Mode</label>
|
||||
<description>Indicates how to calculate the final score.</description>
|
||||
<default>max</default>
|
||||
<options>
|
||||
<option value="average">Average</option>
|
||||
<option value="max">Max</option>
|
||||
<option value="median">Median</option>
|
||||
<option value="p25">P25</option>
|
||||
<option value="p50">P50</option>
|
||||
<option value="p75">P75</option>
|
||||
<option value="p80">P80</option>
|
||||
<option value="p90">P90</option>
|
||||
<option value="p95">P95</option>
|
||||
</options>
|
||||
</parameter>
|
||||
<parameter name="minScores" type="integer" groupName="wakewordDetector">
|
||||
<label>Min Scores</label>
|
||||
<description>Minimum number of positive scores to consider a partial detection as a detection.</description>
|
||||
<default>5</default>
|
||||
</parameter>
|
||||
<parameter name="comparatorRef" type="decimal" min="0" max="1" groupName="wakewordDetector">
|
||||
<label>Comparator Ref</label>
|
||||
<description>Configures the reference for the comparator used to match the samples.</description>
|
||||
@ -40,58 +57,44 @@
|
||||
<parameter name="comparatorBandSize" type="integer" groupName="wakewordDetector">
|
||||
<label>Comparator Band Size</label>
|
||||
<description>Configures the band-size for the comparator used to match the samples.</description>
|
||||
<default>6</default>
|
||||
<default>5</default>
|
||||
<advanced>true</advanced>
|
||||
</parameter>
|
||||
<parameter name="eagerMode" type="boolean" groupName="wakewordDetector">
|
||||
<label>Eager Mode</label>
|
||||
<description>Enables eager mode. End detection as soon as a result is over the score, instead of waiting to
|
||||
see if the
|
||||
next frame has a higher score.</description>
|
||||
<default>true</default>
|
||||
<parameter name="gainNormalizer" type="boolean" groupName="filters">
|
||||
<label>Gain Normalizer</label>
|
||||
<description>Enables an audio filter that attempts to approximate the volume of the stream to a reference level (RMS
|
||||
of the samples is used as volume measure).</description>
|
||||
<default>false</default>
|
||||
</parameter>
|
||||
<parameter name="noiseDetectionMode" type="text" groupName="noiseDetector">
|
||||
<label>Noise Detection Mode</label>
|
||||
<description>Use a noise detector to reduce computation in the absence of sound. Configures the difficulty to
|
||||
consider
|
||||
a
|
||||
frame as noise (the required noise level).</description>
|
||||
<default>disabled</default>
|
||||
<options>
|
||||
<option value="disabled">Disabled</option>
|
||||
<option value="easiest">Easiest</option>
|
||||
<option value="easy">Easy</option>
|
||||
<option value="normal">Normal</option>
|
||||
<option value="hard">Hard</option>
|
||||
<option value="hardest">Hardest</option>
|
||||
</options>
|
||||
</parameter>
|
||||
<parameter name="noiseSensitivity" type="decimal" min="0" max="1" groupName="noiseDetector">
|
||||
<label>Noise Sensitivity</label>
|
||||
<description>Noise/silence ratio in the last second to consider voice is detected.</description>
|
||||
<parameter name="minGain" type="decimal" min="0.1" max="1" step="0.1" groupName="filters">
|
||||
<label>Min Gain</label>
|
||||
<description>Min gain applied by the gain normalizer filter.</description>
|
||||
<default>0.5</default>
|
||||
</parameter>
|
||||
<parameter name="vadMode" type="text" groupName="vadDetector">
|
||||
<label>VAD Mode</label>
|
||||
<description>Use a vad detector to reduce computation in the absence of vocal sound.</description>
|
||||
<default>disabled</default>
|
||||
<options>
|
||||
<option value="disabled">Disabled</option>
|
||||
<option value="low-bitrate">Low Bitrate</option>
|
||||
<option value="quality">Quality</option>
|
||||
<option value="aggressive">Aggressive</option>
|
||||
<option value="very-aggressive">Very Aggressive</option>
|
||||
</options>
|
||||
<parameter name="maxGain" type="decimal" min="0.1" max="1" step="0.1" groupName="filters">
|
||||
<label>Max Gain</label>
|
||||
<description>Max gain applied by the gain normalizer filter.</description>
|
||||
<default>1</default>
|
||||
</parameter>
|
||||
<parameter name="vadSensitivity" type="decimal" min="0" max="1" groupName="vadDetector">
|
||||
<label>VAD Sensitivity</label>
|
||||
<description>Voice/silence ratio in the last second to consider voice is detected.</description>
|
||||
<default>0.5</default>
|
||||
<parameter name="gainRef" type="decimal" min="0" max="1" step="0.001" groupName="filters">
|
||||
<label>Gain Ref</label>
|
||||
<description>Set the RMS reference used by the gain-normalizer to calculate the gain applied. If unset an estimation
|
||||
of the wakeword level is used.</description>
|
||||
</parameter>
|
||||
<parameter name="vadDelay" type="integer" groupName="vadDetector">
|
||||
<label>VAD Delay</label>
|
||||
<description>Seconds to disable the vad detector after voice is detected.</description>
|
||||
<default>3</default>
|
||||
<parameter name="bandPass" type="boolean" groupName="filters">
|
||||
<label>Band Pass</label>
|
||||
<description>Enables an audio filter that attenuates frequencies outside the low cutoff and high cutoff range.</description>
|
||||
<default>false</default>
|
||||
</parameter>
|
||||
<parameter name="lowCutoff" type="decimal" min="0" groupName="filters">
|
||||
<label>Low Cutoff</label>
|
||||
<description>Low cutoff for the band-pass filter.</description>
|
||||
<default>80</default>
|
||||
</parameter>
|
||||
<parameter name="highCutoff" type="decimal" min="0" groupName="filters">
|
||||
<label>High Cutoff</label>
|
||||
<description>High cutoff for the band-pass filter.</description>
|
||||
<default>400</default>
|
||||
</parameter>
|
||||
</config-description>
|
||||
</config-description:config-descriptions>
|
||||
|
@ -1,40 +1,42 @@
|
||||
voice.config.rustpotterks.averagedThreshold.label = Averaged Threshold
|
||||
voice.config.rustpotterks.averagedThreshold.description = Configures the detector averaged threshold, is the min score (in range 0. to 1.) that the audio should obtain against a combination of the wake word templates, the detection will be aborted if this is not the case. This way it can prevent to run the comparison of the current frame against each of the wake word templates which saves cpu. If set to 0 this functionality is disabled.
|
||||
voice.config.rustpotterks.bandPass.label = Band Pass
|
||||
voice.config.rustpotterks.bandPass.description = Enables an audio filter that attenuates frequencies outside the low cutoff and high cutoff range.
|
||||
voice.config.rustpotterks.comparatorBandSize.label = Comparator Band Size
|
||||
voice.config.rustpotterks.comparatorBandSize.description = Configures the band-size for the comparator used to match the samples.
|
||||
voice.config.rustpotterks.comparatorRef.label = Comparator Ref
|
||||
voice.config.rustpotterks.comparatorRef.description = Configures the reference for the comparator used to match the samples.
|
||||
voice.config.rustpotterks.eagerMode.label = Eager Mode
|
||||
voice.config.rustpotterks.eagerMode.description = Enables eager mode. End detection as soon as a result is over the score, instead of waiting to see if the next frame has a higher score.
|
||||
voice.config.rustpotterks.group.noiseDetector.label = Noise Detector
|
||||
voice.config.rustpotterks.group.noiseDetector.description = Optional noise detection options.
|
||||
voice.config.rustpotterks.group.vadDetector.label = VAD Detector
|
||||
voice.config.rustpotterks.group.vadDetector.description = Optional voice activity detector options.
|
||||
voice.config.rustpotterks.gainNormalizer.label = Gain Normalizer
|
||||
voice.config.rustpotterks.gainNormalizer.description = Enables an audio filter that attempts to approximate the volume of the stream to a reference level (RMS of the samples is used as volume measure).
|
||||
voice.config.rustpotterks.gainRef.label = Gain Ref
|
||||
voice.config.rustpotterks.gainRef.description = Set the RMS reference used by the gain-normalizer to calculate the gain applied. If unset an estimation of the wakeword level is used.
|
||||
voice.config.rustpotterks.group.filters.label = Audio Filters
|
||||
voice.config.rustpotterks.group.filters.description = Optional audio filter options.
|
||||
voice.config.rustpotterks.group.wakewordDetector.label = Wakeword Detector
|
||||
voice.config.rustpotterks.group.wakewordDetector.description = Wakeword detection options.
|
||||
voice.config.rustpotterks.noiseDetectionMode.label = Noise Detection Mode
|
||||
voice.config.rustpotterks.noiseDetectionMode.description = Use a noise detector to reduce computation in the absence of sound. Configures the difficulty to consider a frame as noise (the required noise level).
|
||||
voice.config.rustpotterks.noiseDetectionMode.option.disabled = Disabled
|
||||
voice.config.rustpotterks.noiseDetectionMode.option.easiest = Easiest
|
||||
voice.config.rustpotterks.noiseDetectionMode.option.easy = Easy
|
||||
voice.config.rustpotterks.noiseDetectionMode.option.normal = Normal
|
||||
voice.config.rustpotterks.noiseDetectionMode.option.hard = Hard
|
||||
voice.config.rustpotterks.noiseDetectionMode.option.hardest = Hardest
|
||||
voice.config.rustpotterks.noiseSensitivity.label = Noise Sensitivity
|
||||
voice.config.rustpotterks.noiseSensitivity.description = Noise/silence ratio in the last second to consider voice is detected.
|
||||
voice.config.rustpotterks.highCutoff.label = High Cutoff
|
||||
voice.config.rustpotterks.highCutoff.description = High cutoff for the band-pass filter.
|
||||
voice.config.rustpotterks.lowCutoff.label = Low Cutoff
|
||||
voice.config.rustpotterks.lowCutoff.description = Low cutoff for the band-pass filter.
|
||||
voice.config.rustpotterks.maxGain.label = Max Gain
|
||||
voice.config.rustpotterks.maxGain.description = Max gain applied by the gain normalizer filter.
|
||||
voice.config.rustpotterks.minGain.label = Min Gain
|
||||
voice.config.rustpotterks.minGain.description = Min gain applied by the gain normalizer filter.
|
||||
voice.config.rustpotterks.minScores.label = Min Scores
|
||||
voice.config.rustpotterks.minScores.description = Minimum number of positive scores to consider a partial detection as a detection.
|
||||
voice.config.rustpotterks.scoreMode.label = Score Mode
|
||||
voice.config.rustpotterks.scoreMode.description = Indicates how to calculate the final score.
|
||||
voice.config.rustpotterks.scoreMode.option.average = Average
|
||||
voice.config.rustpotterks.scoreMode.option.max = Max
|
||||
voice.config.rustpotterks.scoreMode.option.median = Median
|
||||
voice.config.rustpotterks.scoreMode.option.p25 = P25
|
||||
voice.config.rustpotterks.scoreMode.option.p50 = P50
|
||||
voice.config.rustpotterks.scoreMode.option.p75 = P75
|
||||
voice.config.rustpotterks.scoreMode.option.p80 = P80
|
||||
voice.config.rustpotterks.scoreMode.option.p90 = P90
|
||||
voice.config.rustpotterks.scoreMode.option.p95 = P95
|
||||
voice.config.rustpotterks.threshold.label = Threshold
|
||||
voice.config.rustpotterks.threshold.description = Configures the detector threshold: the minimum score (in the range 0 to 1) that at least one of the wakeword templates must obtain to trigger a detection. A model-defined value takes precedence if present.
|
||||
voice.config.rustpotterks.vadDelay.label = VAD Delay
|
||||
voice.config.rustpotterks.vadDelay.description = Seconds to disable the vad detector after voice is detected.
|
||||
voice.config.rustpotterks.vadMode.label = VAD Mode
|
||||
voice.config.rustpotterks.vadMode.description = Use a vad detector to reduce computation in the absence of vocal sound.
|
||||
voice.config.rustpotterks.vadMode.option.disabled = Disabled
|
||||
voice.config.rustpotterks.vadMode.option.low-bitrate = Low Bitrate
|
||||
voice.config.rustpotterks.vadMode.option.quality = Quality
|
||||
voice.config.rustpotterks.vadMode.option.aggressive = Aggressive
|
||||
voice.config.rustpotterks.vadMode.option.very-aggressive = Very Aggressive
|
||||
voice.config.rustpotterks.vadSensitivity.label = VAD Sensitivity
|
||||
voice.config.rustpotterks.vadSensitivity.description = Voice/silence ratio in the last second to consider voice is detected.
|
||||
|
||||
# service
|
||||
|
||||
|