[watsonstt] initial contribution (#12161)

* [watsonstt] initial contribution

Signed-off-by: Miguel Álvarez Díez <miguelwork92@gmail.com>
GiviMAD 2022-02-08 20:52:02 +01:00 committed by GitHub
parent 99cfb65aba
commit 9a086fd6e3
12 changed files with 681 additions and 0 deletions


@ -383,6 +383,7 @@
/bundles/org.openhab.voice.pollytts/ @hillmanr
/bundles/org.openhab.voice.porcupineks/ @GiviMAD
/bundles/org.openhab.voice.voicerss/ @JochenHiller @lolodomo
/bundles/org.openhab.voice.watsonstt/ @GiviMAD
/itests/org.openhab.binding.astro.tests/ @gerrieg
/itests/org.openhab.binding.avmfritz.tests/ @cweitkamp
/itests/org.openhab.binding.feed.tests/ @svilenvul


@ -1906,6 +1906,11 @@
<artifactId>org.openhab.voice.voicerss</artifactId>
<version>${project.version}</version>
</dependency>
<dependency>
<groupId>org.openhab.addons.bundles</groupId>
<artifactId>org.openhab.voice.watsonstt</artifactId>
<version>${project.version}</version>
</dependency>
</dependencies>
</project>


@ -0,0 +1,20 @@
This content is produced and maintained by the openHAB project.
* Project home: https://www.openhab.org
== Declared Project Licenses
This program and the accompanying materials are made available under the terms
of the Eclipse Public License 2.0 which is available at
https://www.eclipse.org/legal/epl-2.0/.
== Source Code
https://github.com/openhab/openhab-addons
== Third-party Content
com.ibm.watson: speech-to-text
* License: Apache 2.0 License
* Project: https://github.com/watson-developer-cloud/java-sdk
* Source: https://github.com/watson-developer-cloud/java-sdk/tree/master/speech-to-text


@ -0,0 +1,65 @@
# IBM Watson Speech-to-Text
The Watson STT service uses the non-free IBM Watson Speech-to-Text API to transcribe audio data to text.
Be aware that using this service may incur costs on your IBM account.
You can find pricing information on [this page](https://www.ibm.com/cloud/watson-speech-to-text/pricing).
## Obtaining Credentials
Before you can use this add-on, you must create a Speech-to-Text instance in the IBM Cloud service.
* Go to the [Speech-to-Text catalog page](https://cloud.ibm.com/catalog/services/speech-to-text) and create the instance in your desired region.
* After the instance is created, you should be able to view its URL and API key.
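Before configuring the add-on, you can verify the credentials by issuing an authenticated request against the instance's `/v1/models` REST endpoint, which IBM documents with `apikey` basic auth. The sketch below only builds such a request with the JDK's HTTP client; the instance URL and key are placeholders for the values shown in your IBM Cloud console:

```java
import java.net.URI;
import java.net.http.HttpRequest;
import java.nio.charset.StandardCharsets;
import java.util.Base64;

public class WatsonCredentialCheck {
    // Builds a GET <instanceUrl>/v1/models request using IBM Cloud's
    // "apikey:<key>" basic-auth convention. Both arguments are
    // placeholders for the values shown on your instance page.
    static HttpRequest buildModelsRequest(String instanceUrl, String apiKey) {
        String auth = Base64.getEncoder()
                .encodeToString(("apikey:" + apiKey).getBytes(StandardCharsets.UTF_8));
        return HttpRequest.newBuilder(URI.create(instanceUrl + "/v1/models"))
                .header("Authorization", "Basic " + auth)
                .GET()
                .build();
    }

    public static void main(String[] args) {
        HttpRequest request = buildModelsRequest(
                "https://api.us-east.speech-to-text.watson.cloud.ibm.com/instances/your-instance-id",
                "your-api-key");
        // Send it with java.net.http.HttpClient; an HTTP 200 response with a
        // JSON list of models means the credentials are valid.
        System.out.println(request.uri());
    }
}
```

Sending the request with `java.net.http.HttpClient.newHttpClient().send(...)` should return HTTP 200 when the key and URL are correct.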
## Configuration
### Authentication Configuration
Use your favorite configuration UI to edit **Settings / Other Services - IBM Watson Speech-to-Text** and set:
* **Api Key** - Api key for Speech-to-Text instance created on IBM Cloud.
* **Instance Url** - Url for Speech-to-Text instance created on IBM Cloud.
### Speech to Text Configuration
Use your favorite configuration UI to edit **Settings / Other Services - IBM Watson Speech-to-Text**:
* **Background Audio Suppression** - Use the parameter to suppress side conversations or background noise.
* **Speech Detector Sensitivity** - Use the parameter to suppress word insertions from music, coughing, and other non-speech events.
* **Inactivity Timeout** - The time in seconds after which, if only silence (no speech) is detected in the audio, the connection is closed.
* **Opt Out Logging** - By default, all IBM Watson™ services log requests and their results. Logging is done only to improve the services for future users. The logged data is not shared or made public.
* **No Results Message** - Message to be told when no transcription is produced.
* **Smart Formatting** - If true, the service converts dates, times, series of digits and numbers, phone numbers, currency values, and internet addresses into more readable, conventional representations. (Not available for all locales)
* **Redaction** - If true, the service redacts, or masks, numeric data from final transcripts. (Not available for all locales)
### Configuration via a text file
In case you would like to set up the service via a text file, create a new file named `watsonstt.cfg` in `$OPENHAB_ROOT/conf/services`.
Its contents should look similar to:
```
org.openhab.voice.watsonstt:apiKey=******
org.openhab.voice.watsonstt:instanceUrl=https://api.***.speech-to-text.watson.cloud.ibm.com/instances/*****
org.openhab.voice.watsonstt:backgroundAudioSuppression=0.5
org.openhab.voice.watsonstt:speechDetectorSensitivity=0.5
org.openhab.voice.watsonstt:inactivityTimeout=2
org.openhab.voice.watsonstt:optOutLogging=false
org.openhab.voice.watsonstt:smartFormatting=false
org.openhab.voice.watsonstt:redaction=false
org.openhab.voice.watsonstt:noResultsMessage="Sorry, I didn't understand you"
```
### Default Speech-to-Text Configuration
You can set up your preferred default Speech-to-Text service in the UI:
* Go to **Settings**.
* Edit **System Services - Voice**.
* Set **Watson** as **Speech-to-Text**.
In case you would like to set up these settings via a text file, you can edit the file `runtime.cfg` in `$OPENHAB_ROOT/conf/services` and set the following entry:
```
org.openhab.voice:defaultSTT=watsonstt
```
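Only locales that have a matching Watson broadband model are supported. Internally, the add-on derives the model name from the recognition locale's language tag, roughly as in this sketch (mirroring `WatsonSTTService` in this PR):

```java
import java.util.Locale;

public class WatsonModelName {
    // The service requests "<languageTag>_BroadbandModel",
    // e.g. "en-US_BroadbandModel" for the en-US locale.
    static String modelFor(Locale locale) {
        return locale.toLanguageTag() + "_BroadbandModel";
    }

    public static void main(String[] args) {
        System.out.println(modelFor(Locale.forLanguageTag("en-US"))); // en-US_BroadbandModel
    }
}
```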


@ -0,0 +1,70 @@
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns="http://maven.apache.org/POM/4.0.0"
xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 https://maven.apache.org/xsd/maven-4.0.0.xsd">
<modelVersion>4.0.0</modelVersion>
<parent>
<groupId>org.openhab.addons.bundles</groupId>
<artifactId>org.openhab.addons.reactor.bundles</artifactId>
<version>3.3.0-SNAPSHOT</version>
</parent>
<artifactId>org.openhab.voice.watsonstt</artifactId>
<name>openHAB Add-ons :: Bundles :: Voice :: IBM Watson Speech to Text</name>
<properties>
<bnd.importpackage>!android.*,!dalvik.*,!kotlin.*,sun.security.*;resolution:=optional,org.openjsse.*;resolution:=optional,org.conscrypt.*;resolution:=optional,org.bouncycastle.*;resolution:=optional,okhttp3.logging.*;resolution:=optional,com.google.gson.*;resolution:=optional,io.reactivex;resolution:=optional,okio.*;resolution:=optional,org.apache.commons.*;resolution:=optional,*</bnd.importpackage>
</properties>
<dependencies>
<dependency>
<groupId>com.ibm.watson</groupId>
<artifactId>speech-to-text</artifactId>
<version>9.3.1</version>
<scope>compile</scope>
</dependency>
<!-- sdk deps -->
<dependency>
<groupId>com.ibm.cloud</groupId>
<artifactId>sdk-core</artifactId>
<version>9.15.0</version>
<scope>compile</scope>
</dependency>
<dependency>
<groupId>com.ibm.watson</groupId>
<artifactId>common</artifactId>
<version>9.3.1</version>
<scope>compile</scope>
</dependency>
<dependency>
<groupId>com.squareup.okhttp3</groupId>
<artifactId>okhttp</artifactId>
<version>4.9.1</version>
<scope>compile</scope>
</dependency>
<dependency>
<groupId>com.squareup.okhttp3</groupId>
<artifactId>okhttp-urlconnection</artifactId>
<version>4.9.1</version>
<scope>compile</scope>
</dependency>
<dependency>
<groupId>org.jetbrains.kotlin</groupId>
<artifactId>kotlin-stdlib</artifactId>
<version>1.4.10</version>
<scope>compile</scope>
</dependency>
<dependency>
<groupId>com.squareup.okio</groupId>
<artifactId>okio</artifactId>
<version>2.8.0</version>
<scope>compile</scope>
</dependency>
<dependency>
<groupId>com.google.code.gson</groupId>
<artifactId>gson</artifactId>
<version>2.8.9</version>
<scope>compile</scope>
</dependency>
</dependencies>
</project>


@ -0,0 +1,9 @@
<?xml version="1.0" encoding="UTF-8"?>
<features name="org.openhab.voice.watsonstt-${project.version}" xmlns="http://karaf.apache.org/xmlns/features/v1.4.0">
<repository>mvn:org.openhab.core.features.karaf/org.openhab.core.features.karaf.openhab-core/${ohc.version}/xml/features</repository>
<feature name="openhab-voice-watsonstt" description="IBM Watson Speech-to-Text" version="${project.version}">
<feature>openhab-runtime-base</feature>
<bundle start-level="80">mvn:org.openhab.addons.bundles/org.openhab.voice.watsonstt/${project.version}</bundle>
</feature>
</features>


@ -0,0 +1,63 @@
/**
* Copyright (c) 2010-2022 Contributors to the openHAB project
*
* See the NOTICE file(s) distributed with this work for additional
* information.
*
* This program and the accompanying materials are made available under the
* terms of the Eclipse Public License 2.0 which is available at
* http://www.eclipse.org/legal/epl-2.0
*
* SPDX-License-Identifier: EPL-2.0
*/
package org.openhab.voice.watsonstt.internal;
import org.eclipse.jdt.annotation.NonNullByDefault;
/**
* The {@link WatsonSTTConfiguration} class contains fields mapping thing configuration parameters.
*
* @author Miguel Álvarez - Initial contribution
*/
@NonNullByDefault
public class WatsonSTTConfiguration {
/**
* Api key for Speech-to-Text instance created on IBM Cloud.
*/
public String apiKey = "";
/**
* Url for Speech-to-Text instance created on IBM Cloud.
*/
public String instanceUrl = "";
/**
* Use the parameter to suppress side conversations or background noise.
*/
public float backgroundAudioSuppression = 0f;
/**
* Use the parameter to suppress word insertions from music, coughing, and other non-speech events.
*/
public float speechDetectorSensitivity = 0.5f;
/**
* If true, the service converts dates, times, series of digits and numbers, phone numbers, currency values, and
* internet addresses into more readable, conventional representations.
*/
public boolean smartFormatting = false;
/**
* If true, the service redacts, or masks, numeric data from final transcripts.
*/
public boolean redaction = false;
/**
* The time in seconds after which, if only silence (no speech) is detected in the audio, the connection is closed.
*/
public int inactivityTimeout = 3;
/**
* Message to be told when no transcription is produced.
*/
public String noResultsMessage = "No results";
/**
* By default, all IBM Watson services log requests and their results. Logging is done only to improve the services
* for future users. The logged data is not shared or made public.
*/
public boolean optOutLogging = true;
}


@ -0,0 +1,43 @@
/**
* Copyright (c) 2010-2022 Contributors to the openHAB project
*
* See the NOTICE file(s) distributed with this work for additional
* information.
*
* This program and the accompanying materials are made available under the
* terms of the Eclipse Public License 2.0 which is available at
* http://www.eclipse.org/legal/epl-2.0
*
* SPDX-License-Identifier: EPL-2.0
*/
package org.openhab.voice.watsonstt.internal;
import org.eclipse.jdt.annotation.NonNullByDefault;
/**
* The {@link WatsonSTTConstants} class defines common constants, which are
* used across the whole binding.
*
* @author Miguel Álvarez - Initial contribution
*/
@NonNullByDefault
public class WatsonSTTConstants {
/**
* Service name
*/
public static final String SERVICE_NAME = "IBM Watson";
/**
* Service id
*/
public static final String SERVICE_ID = "watsonstt";
/**
* Service category
*/
public static final String SERVICE_CATEGORY = "voice";
/**
* Service pid
*/
public static final String SERVICE_PID = "org.openhab." + SERVICE_CATEGORY + "." + SERVICE_ID;
}


@ -0,0 +1,310 @@
/**
* Copyright (c) 2010-2022 Contributors to the openHAB project
*
* See the NOTICE file(s) distributed with this work for additional
* information.
*
* This program and the accompanying materials are made available under the
* terms of the Eclipse Public License 2.0 which is available at
* http://www.eclipse.org/legal/epl-2.0
*
* SPDX-License-Identifier: EPL-2.0
*/
package org.openhab.voice.watsonstt.internal;
import static org.openhab.voice.watsonstt.internal.WatsonSTTConstants.*;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.atomic.AtomicReference;
import java.util.stream.Collectors;
import javax.net.ssl.SSLPeerUnverifiedException;
import org.eclipse.jdt.annotation.NonNullByDefault;
import org.eclipse.jdt.annotation.Nullable;
import org.openhab.core.audio.AudioFormat;
import org.openhab.core.audio.AudioStream;
import org.openhab.core.common.ThreadPoolManager;
import org.openhab.core.config.core.ConfigurableService;
import org.openhab.core.config.core.Configuration;
import org.openhab.core.voice.RecognitionStartEvent;
import org.openhab.core.voice.RecognitionStopEvent;
import org.openhab.core.voice.STTException;
import org.openhab.core.voice.STTListener;
import org.openhab.core.voice.STTService;
import org.openhab.core.voice.STTServiceHandle;
import org.openhab.core.voice.SpeechRecognitionErrorEvent;
import org.openhab.core.voice.SpeechRecognitionEvent;
import org.osgi.framework.Constants;
import org.osgi.service.component.annotations.Activate;
import org.osgi.service.component.annotations.Component;
import org.osgi.service.component.annotations.Modified;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import com.ibm.cloud.sdk.core.http.HttpMediaType;
import com.ibm.cloud.sdk.core.security.IamAuthenticator;
import com.ibm.watson.speech_to_text.v1.SpeechToText;
import com.ibm.watson.speech_to_text.v1.model.RecognizeWithWebsocketsOptions;
import com.ibm.watson.speech_to_text.v1.model.SpeechRecognitionAlternative;
import com.ibm.watson.speech_to_text.v1.model.SpeechRecognitionResult;
import com.ibm.watson.speech_to_text.v1.model.SpeechRecognitionResults;
import com.ibm.watson.speech_to_text.v1.websocket.RecognizeCallback;
import okhttp3.WebSocket;
/**
* The {@link WatsonSTTService} allows using IBM Watson as the Speech-to-Text engine
*
* @author Miguel Álvarez - Initial contribution
*/
@NonNullByDefault
@Component(configurationPid = SERVICE_PID, property = Constants.SERVICE_PID + "=" + SERVICE_PID)
@ConfigurableService(category = SERVICE_CATEGORY, label = SERVICE_NAME
+ " Speech-to-Text", description_uri = SERVICE_CATEGORY + ":" + SERVICE_ID)
public class WatsonSTTService implements STTService {
private final Logger logger = LoggerFactory.getLogger(WatsonSTTService.class);
private final ScheduledExecutorService executor = ThreadPoolManager.getScheduledPool("OH-voice-watsonstt");
private final List<String> models = List.of("ar-AR_BroadbandModel", "de-DE_BroadbandModel", "en-AU_BroadbandModel",
"en-GB_BroadbandModel", "en-US_BroadbandModel", "es-AR_BroadbandModel", "es-CL_BroadbandModel",
"es-CO_BroadbandModel", "es-ES_BroadbandModel", "es-MX_BroadbandModel", "es-PE_BroadbandModel",
"fr-CA_BroadbandModel", "fr-FR_BroadbandModel", "it-IT_BroadbandModel", "ja-JP_BroadbandModel",
"ko-KR_BroadbandModel", "nl-NL_BroadbandModel", "pt-BR_BroadbandModel", "zh-CN_BroadbandModel");
private final Set<Locale> supportedLocales = models.stream().map(name -> name.split("_")[0])
.map(Locale::forLanguageTag).collect(Collectors.toSet());
private WatsonSTTConfiguration config = new WatsonSTTConfiguration();
@Activate
protected void activate(Map<String, Object> config) {
this.config = new Configuration(config).as(WatsonSTTConfiguration.class);
}
@Modified
protected void modified(Map<String, Object> config) {
this.config = new Configuration(config).as(WatsonSTTConfiguration.class);
}
@Override
public String getId() {
return SERVICE_ID;
}
@Override
public String getLabel(@Nullable Locale locale) {
return SERVICE_NAME;
}
@Override
public Set<Locale> getSupportedLocales() {
return supportedLocales;
}
@Override
public Set<AudioFormat> getSupportedFormats() {
return Set.of(AudioFormat.WAV, AudioFormat.OGG, new AudioFormat("OGG", "OPUS", null, null, null, null),
AudioFormat.MP3);
}
@Override
public STTServiceHandle recognize(STTListener sttListener, AudioStream audioStream, Locale locale, Set<String> set)
throws STTException {
if (config.apiKey.isBlank() || config.instanceUrl.isBlank()) {
throw new STTException("service is not correctly configured");
}
String contentType = getContentType(audioStream);
if (contentType == null) {
throw new STTException("Unsupported format, unable to resolve audio content type");
}
logger.debug("Content-Type: {}", contentType);
var speechToText = new SpeechToText(new IamAuthenticator.Builder().apikey(config.apiKey).build());
speechToText.setServiceUrl(config.instanceUrl);
if (config.optOutLogging) {
speechToText.setDefaultHeaders(Map.of("X-Watson-Learning-Opt-Out", "1"));
}
RecognizeWithWebsocketsOptions wsOptions = new RecognizeWithWebsocketsOptions.Builder().audio(audioStream)
.contentType(contentType).redaction(config.redaction).smartFormatting(config.smartFormatting)
.model(locale.toLanguageTag() + "_BroadbandModel").interimResults(true)
.backgroundAudioSuppression(config.backgroundAudioSuppression)
.speechDetectorSensitivity(config.speechDetectorSensitivity).inactivityTimeout(config.inactivityTimeout)
.build();
final AtomicReference<@Nullable WebSocket> socketRef = new AtomicReference<>();
var task = executor.submit(() -> {
int retries = 2;
while (retries > 0) {
try {
socketRef.set(speechToText.recognizeUsingWebSocket(wsOptions,
new TranscriptionListener(sttListener, config)));
break;
} catch (RuntimeException e) {
var cause = e.getCause();
if (cause instanceof SSLPeerUnverifiedException) {
logger.debug("Retrying on error: {}", cause.getMessage());
retries--;
} else {
var errorMessage = e.getMessage();
logger.warn("Aborting on error: {}", errorMessage);
sttListener.sttEventReceived(
new SpeechRecognitionErrorEvent(errorMessage != null ? errorMessage : "Unknown error"));
break;
}
}
}
});
return new STTServiceHandle() {
@Override
public void abort() {
var socket = socketRef.get();
if (socket != null) {
socket.close(1000, null);
socket.cancel();
try {
Thread.sleep(100);
} catch (InterruptedException ignored) {
}
}
task.cancel(true);
}
};
}
private @Nullable String getContentType(AudioStream audioStream) throws STTException {
AudioFormat format = audioStream.getFormat();
String container = format.getContainer();
String codec = format.getCodec();
if (container == null || codec == null) {
throw new STTException("Missing audio stream info");
}
Long frequency = format.getFrequency();
Integer bitDepth = format.getBitDepth();
switch (container) {
case AudioFormat.CONTAINER_WAVE:
if (AudioFormat.CODEC_PCM_SIGNED.equals(codec)) {
if (bitDepth == null || bitDepth != 16) {
return "audio/wav";
}
// rate is a required parameter for this type
if (frequency == null) {
return null;
}
StringBuilder contentTypeL16 = new StringBuilder(HttpMediaType.AUDIO_PCM).append(";rate=")
.append(frequency);
// these parameters are optional
Integer channels = format.getChannels();
if (channels != null) {
contentTypeL16.append(";channels=").append(channels);
}
Boolean bigEndian = format.isBigEndian();
if (bigEndian != null) {
contentTypeL16.append(";")
.append(bigEndian ? "endianness=big-endian" : "endianness=little-endian");
}
return contentTypeL16.toString();
}
break;
case AudioFormat.CONTAINER_OGG:
switch (codec) {
case AudioFormat.CODEC_VORBIS:
return "audio/ogg;codecs=vorbis";
case "OPUS":
return "audio/ogg;codecs=opus";
}
break;
case AudioFormat.CONTAINER_NONE:
if (AudioFormat.CODEC_MP3.equals(codec)) {
return "audio/mp3";
}
break;
}
return null;
}
private static class TranscriptionListener implements RecognizeCallback {
private final Logger logger = LoggerFactory.getLogger(TranscriptionListener.class);
private final StringBuilder transcriptBuilder = new StringBuilder();
private final STTListener sttListener;
private final WatsonSTTConfiguration config;
private float confidenceSum = 0f;
private int responseCount = 0;
private boolean disconnected = false;
public TranscriptionListener(STTListener sttListener, WatsonSTTConfiguration config) {
this.sttListener = sttListener;
this.config = config;
}
@Override
public void onTranscription(@Nullable SpeechRecognitionResults speechRecognitionResults) {
logger.debug("onTranscription");
if (speechRecognitionResults == null) {
return;
}
speechRecognitionResults.getResults().stream().filter(SpeechRecognitionResult::isXFinal).forEach(result -> {
SpeechRecognitionAlternative alternative = result.getAlternatives().stream().findFirst().orElse(null);
if (alternative == null) {
return;
}
logger.debug("onTranscription Final");
Double confidence = alternative.getConfidence();
transcriptBuilder.append(alternative.getTranscript());
confidenceSum += confidence != null ? confidence.floatValue() : 0f;
responseCount++;
});
}
@Override
public void onConnected() {
logger.debug("onConnected");
}
@Override
public void onError(@Nullable Exception e) {
var errorMessage = e != null ? e.getMessage() : null;
if (errorMessage != null && disconnected && errorMessage.contains("Socket closed")) {
logger.debug("Error ignored: {}", errorMessage);
return;
}
logger.warn("TranscriptionError: {}", errorMessage);
sttListener.sttEventReceived(
new SpeechRecognitionErrorEvent(errorMessage != null ? errorMessage : "Unknown error"));
}
@Override
public void onDisconnected() {
logger.debug("onDisconnected");
disconnected = true;
sttListener.sttEventReceived(new RecognitionStopEvent());
float averageConfidence = confidenceSum / (float) responseCount;
String transcript = transcriptBuilder.toString();
if (!transcript.isBlank()) {
sttListener.sttEventReceived(new SpeechRecognitionEvent(transcript, averageConfidence));
} else {
if (!config.noResultsMessage.isBlank()) {
sttListener.sttEventReceived(new SpeechRecognitionErrorEvent(config.noResultsMessage));
} else {
sttListener.sttEventReceived(new SpeechRecognitionErrorEvent("No results"));
}
}
}
@Override
public void onInactivityTimeout(@Nullable RuntimeException e) {
if (e != null) {
logger.debug("InactivityTimeout: {}", e.getMessage());
}
}
@Override
public void onListening() {
logger.debug("onListening");
sttListener.sttEventReceived(new RecognitionStartEvent());
}
@Override
public void onTranscriptionComplete() {
logger.debug("onTranscriptionComplete");
}
}
}


@ -0,0 +1,68 @@
<?xml version="1.0" encoding="UTF-8"?>
<config-description:config-descriptions
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns:config-description="https://openhab.org/schemas/config-description/v1.0.0"
xsi:schemaLocation="https://openhab.org/schemas/config-description/v1.0.0
https://openhab.org/schemas/config-description-1.0.0.xsd">
<config-description uri="voice:watsonstt">
<parameter-group name="authentication">
<label>Authentication</label>
<description>Information for connection to your Watson Speech-to-Text instance.</description>
</parameter-group>
<parameter-group name="stt">
<label>STT Configuration</label>
<description>Parameters for Watson Speech-to-Text API.</description>
</parameter-group>
<parameter name="apiKey" type="text" required="true" groupName="authentication">
<label>Api Key</label>
<description>Api key for Speech-to-Text instance created on IBM Cloud.</description>
</parameter>
<parameter name="instanceUrl" type="text" required="true" groupName="authentication">
<label>Instance Url</label>
<description>Url for Speech-to-Text instance created on IBM Cloud.</description>
</parameter>
<parameter name="backgroundAudioSuppression" type="decimal" min="0" max="1" step="0.1" groupName="stt">
<label>Background Audio Suppression</label>
<description>Use the parameter to suppress side conversations or background noise.</description>
<default>0</default>
</parameter>
<parameter name="speechDetectorSensitivity" type="decimal" min="0" max="1" step="0.1" groupName="stt">
<label>Speech Detector Sensitivity</label>
<description>Use the parameter to suppress word insertions from music, coughing, and other non-speech events.</description>
<default>0.5</default>
</parameter>
<parameter name="inactivityTimeout" type="integer" unit="s" groupName="stt">
<label>Inactivity Timeout</label>
<description>The time in seconds after which, if only silence (no speech) is detected in the audio, the connection is
closed.</description>
<default>3</default>
</parameter>
<parameter name="noResultsMessage" type="text" groupName="stt">
<label>No Results Message</label>
<description>Message to be told when no transcription is produced.</description>
<default>No results</default>
</parameter>
<parameter name="optOutLogging" type="boolean" groupName="stt">
<label>Opt Out Logging</label>
<description>By default, all IBM Watson™ services log requests and their results. Logging is done only to improve the
services for future users. The logged data is not shared or made public.</description>
<default>true</default>
</parameter>
<parameter name="smartFormatting" type="boolean" groupName="stt">
<label>Smart Formatting</label>
<description>If true, the service converts dates, times, series of digits and numbers, phone numbers, currency
values, and internet addresses into more readable, conventional representations. (Not available for all locales)</description>
<default>false</default>
<advanced>true</advanced>
</parameter>
<parameter name="redaction" type="boolean" groupName="stt">
<label>Redaction</label>
<description>If true, the service redacts, or masks, numeric data from final transcripts. (Not available for all
locales)</description>
<default>false</default>
<advanced>true</advanced>
</parameter>
</config-description>
</config-description:config-descriptions>


@ -0,0 +1,26 @@
voice.config.watsonstt.apiKey.label = Api Key
voice.config.watsonstt.apiKey.description = Api key for Speech-to-Text instance created on IBM Cloud.
voice.config.watsonstt.backgroundAudioSuppression.label = Background Audio Suppression
voice.config.watsonstt.backgroundAudioSuppression.description = Use the parameter to suppress side conversations or background noise.
voice.config.watsonstt.group.authentication.label = Authentication
voice.config.watsonstt.group.authentication.description = Information for connection to your Watson Speech-to-Text instance.
voice.config.watsonstt.group.stt.label = STT Configuration
voice.config.watsonstt.group.stt.description = Parameters for Watson Speech-to-Text API.
voice.config.watsonstt.inactivityTimeout.label = Inactivity Timeout
voice.config.watsonstt.inactivityTimeout.description = The time in seconds after which, if only silence (no speech) is detected in the audio, the connection is closed.
voice.config.watsonstt.instanceUrl.label = Instance Url
voice.config.watsonstt.instanceUrl.description = Url for Speech-to-Text instance created on IBM Cloud.
voice.config.watsonstt.noResultsMessage.label = No Results Message
voice.config.watsonstt.noResultsMessage.description = Message to be told when no transcription is produced.
voice.config.watsonstt.optOutLogging.label = Opt Out Logging
voice.config.watsonstt.optOutLogging.description = By default, all IBM Watson™ services log requests and their results. Logging is done only to improve the services for future users. The logged data is not shared or made public.
voice.config.watsonstt.redaction.label = Redaction
voice.config.watsonstt.redaction.description = If true, the service redacts, or masks, numeric data from final transcripts. (Not available for all locales)
voice.config.watsonstt.smartFormatting.label = Smart Formatting
voice.config.watsonstt.smartFormatting.description = If true, the service converts dates, times, series of digits and numbers, phone numbers, currency values, and internet addresses into more readable, conventional representations. (Not available for all locales)
voice.config.watsonstt.speechDetectorSensitivity.label = Speech Detector Sensitivity
voice.config.watsonstt.speechDetectorSensitivity.description = Use the parameter to suppress word insertions from music, coughing, and other non-speech events.
# service
service.voice.watsonstt.label = IBM Watson Speech-to-Text


@ -401,6 +401,7 @@
<module>org.openhab.voice.pollytts</module>
<module>org.openhab.voice.porcupineks</module>
<module>org.openhab.voice.voicerss</module>
<module>org.openhab.voice.watsonstt</module>
</modules>
<properties>