diff --git a/speech-to-text/src/main/java/com/ibm/watson/speech_to_text/v1/SpeechToText.java b/speech-to-text/src/main/java/com/ibm/watson/speech_to_text/v1/SpeechToText.java
index 55aa1febf0..f6fccac6c0 100644
--- a/speech-to-text/src/main/java/com/ibm/watson/speech_to_text/v1/SpeechToText.java
+++ b/speech-to-text/src/main/java/com/ibm/watson/speech_to_text/v1/SpeechToText.java
@@ -1,5 +1,5 @@
/*
- * (C) Copyright IBM Corp. 2016, 2024.
+ * (C) Copyright IBM Corp. 2024.
*
* Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
* the License. You may obtain a copy of the License at
@@ -113,8 +113,8 @@
*
* Effective **31 July 2023**, all previous-generation models will be removed from the service
* and the documentation. Most previous-generation models were deprecated on 15 March 2022. You must
- * migrate to the equivalent next-generation model by 31 July 2023. For more information, see
- * [Migrating to next-generation
+ * migrate to the equivalent large speech model or next-generation model by 31 July 2023. For more
+ * information, see [Migrating to large speech
* models](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-models-migrate).{:
* deprecated}
*
@@ -372,31 +372,37 @@ public ServiceCall **See also:** [Supported audio
* formats](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-audio-formats).
*
- * ### Next-generation models
+ * ### Large speech models and Next-generation models
*
- * The service supports next-generation `Multimedia` (16 kHz) and `Telephony` (8 kHz) models
- * for many languages. Next-generation models have higher throughput than the service's previous
- * generation of `Broadband` and `Narrowband` models. When you use next-generation models, the
- * service can return transcriptions more quickly and also provide noticeably better transcription
- * accuracy.
+ * The service supports large speech models and next-generation `Multimedia` (16 kHz) and
+ * `Telephony` (8 kHz) models for many languages. Large speech models and next-generation models
+ * have higher throughput than the service's previous generation of `Broadband` and `Narrowband`
+ * models. When you use large speech models and next-generation models, the service can return
+ * transcriptions more quickly and also provide noticeably better transcription accuracy.
*
- * You specify a next-generation model by using the `model` query parameter, as you do a
- * previous-generation model. Most next-generation models support the `low_latency` parameter, and
- * all next-generation models support the `character_insertion_bias` parameter. These parameters
- * are not available with previous-generation models.
+ * You specify a large speech model or next-generation model by using the `model` query
+ * parameter, as you do a previous-generation model. Only the next-generation models support the
+ * `low_latency` parameter, and all large speech models and next-generation models support the
+ * `character_insertion_bias` parameter. These parameters are not available with
+ * previous-generation models.
*
- * Next-generation models do not support all of the speech recognition parameters that are
- * available for use with previous-generation models. Next-generation models do not support the
- * following parameters: * `acoustic_customization_id` * `keywords` and `keywords_threshold` *
- * `processing_metrics` and `processing_metrics_interval` * `word_alternatives_threshold`
+ * Large speech models and next-generation models do not support all of the speech recognition
+ * parameters that are available for use with previous-generation models. Next-generation models
+ * do not support the following parameters: * `acoustic_customization_id` * `keywords` and
+ * `keywords_threshold` * `processing_metrics` and `processing_metrics_interval` *
+ * `word_alternatives_threshold`
*
* **Important:** Effective **31 July 2023**, all previous-generation models will be removed
* from the service and the documentation. Most previous-generation models were deprecated on 15
- * March 2022. You must migrate to the equivalent next-generation model by 31 July 2023. For more
- * information, see [Migrating to next-generation
+ * March 2022. You must migrate to the equivalent large speech model or next-generation model by
+ * 31 July 2023. For more information, see [Migrating to large speech
* models](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-models-migrate).
*
- * **See also:** * [Next-generation languages and
+ * **See also:** * [Large speech languages and
+ * models](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-models-large-speech-languages)
+ * * [Supported features for large speech
+ * models](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-models-large-speech-languages#models-lsm-supported-features)
+ * * [Next-generation languages and
* models](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-models-ng) * [Supported
* features for next-generation
* models](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-models-ng#models-ng-features)
@@ -439,6 +445,9 @@ public ServiceCall **See also:** [Supported audio
* formats](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-audio-formats).
*
- * ### Next-generation models
+ * ### Large speech models and Next-generation models
*
- * The service supports next-generation `Multimedia` (16 kHz) and `Telephony` (8 kHz) models
- * for many languages. Next-generation models have higher throughput than the service's previous
- * generation of `Broadband` and `Narrowband` models. When you use next-generation models, the
- * service can return transcriptions more quickly and also provide noticeably better transcription
- * accuracy.
+ * The service supports large speech models and next-generation `Multimedia` (16 kHz) and
+ * `Telephony` (8 kHz) models for many languages. Large speech models and next-generation models
+ * have higher throughput than the service's previous generation of `Broadband` and `Narrowband`
+ * models. When you use large speech models and next-generation models, the service can return
+ * transcriptions more quickly and also provide noticeably better transcription accuracy.
*
- * You specify a next-generation model by using the `model` query parameter, as you do a
- * previous-generation model. Most next-generation models support the `low_latency` parameter, and
- * all next-generation models support the `character_insertion_bias` parameter. These parameters
- * are not available with previous-generation models.
+ * You specify a large speech model or next-generation model by using the `model` query
+ * parameter, as you do a previous-generation model. Only the next-generation models support the
+ * `low_latency` parameter, and all large speech models and next-generation models support the
+ * `character_insertion_bias` parameter. These parameters are not available with
+ * previous-generation models.
*
- * Next-generation models do not support all of the speech recognition parameters that are
- * available for use with previous-generation models. Next-generation models do not support the
- * following parameters: * `acoustic_customization_id` * `keywords` and `keywords_threshold` *
- * `processing_metrics` and `processing_metrics_interval` * `word_alternatives_threshold`
+ * Large speech models and next-generation models do not support all of the speech recognition
+ * parameters that are available for use with previous-generation models. Next-generation models
+ * do not support the following parameters: * `acoustic_customization_id` * `keywords` and
+ * `keywords_threshold` * `processing_metrics` and `processing_metrics_interval` *
+ * `word_alternatives_threshold`
*
* **Important:** Effective **31 July 2023**, all previous-generation models will be removed
* from the service and the documentation. Most previous-generation models were deprecated on 15
- * March 2022. You must migrate to the equivalent next-generation model by 31 July 2023. For more
- * information, see [Migrating to next-generation
+ * March 2022. You must migrate to the equivalent large speech model or next-generation model by
+ * 31 July 2023. For more information, see [Migrating to large speech
* models](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-models-migrate).
*
- * **See also:** * [Next-generation languages and
+ * **See also:** * [Large speech languages and
+ * models](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-models-large-speech-languages)
+ * * [Supported features for large speech
+ * models](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-models-large-speech-languages#models-lsm-supported-features)
+ * * [Next-generation languages and
* models](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-models-ng) * [Supported
* features for next-generation
* models](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-models-ng#models-ng-features).
@@ -994,14 +1009,49 @@ public ServiceCall **Important:** Effective **31 July 2023**, all previous-generation models will be removed
* from the service and the documentation. Most previous-generation models were deprecated on 15
- * March 2022. You must migrate to the equivalent next-generation model by 31 July 2023. For more
- * information, see [Migrating to next-generation
+ * March 2022. You must migrate to the equivalent large speech model or next-generation model by
+ * 31 July 2023. For more information, see [Migrating to large speech
* models](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-models-migrate).
*
* **See also:** * [Create a custom language
* model](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-languageCreate#createModel-language)
* * [Language support for
- * customization](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-custom-support).
+ * customization](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-custom-support)
+ *
+ * ### Large speech models and Next-generation models
+ *
+ * The service supports large speech models and next-generation `Multimedia` (16 kHz) and
+ * `Telephony` (8 kHz) models for many languages. Large speech models and next-generation models
+ * have higher throughput than the service's previous generation of `Broadband` and `Narrowband`
+ * models. When you use large speech models and next-generation models, the service can return
+ * transcriptions more quickly and also provide noticeably better transcription accuracy.
+ *
+ * You specify a large speech model or next-generation model by using the `model` query
+ * parameter, as you do a previous-generation model. Only the next-generation models support the
+ * `low_latency` parameter, and all large speech models and next-generation models support the
+ * `character_insertion_bias` parameter. These parameters are not available with
+ * previous-generation models.
+ *
+ * Large speech models and next-generation models do not support all of the speech recognition
+ * parameters that are available for use with previous-generation models. Next-generation models
+ * do not support the following parameters: * `acoustic_customization_id` * `keywords` and
+ * `keywords_threshold` * `processing_metrics` and `processing_metrics_interval` *
+ * `word_alternatives_threshold`
+ *
+ * **Important:** Effective **31 July 2023**, all previous-generation models will be removed
+ * from the service and the documentation. Most previous-generation models were deprecated on 15
+ * March 2022. You must migrate to the equivalent large speech model or next-generation model by
+ * 31 July 2023. For more information, see [Migrating to large speech
+ * models](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-models-migrate).
+ *
+ * **See also:** * [Large speech languages and
+ * models](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-models-large-speech-languages)
+ * * [Supported features for large speech
+ * models](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-models-large-speech-languages#models-lsm-supported-features)
+ * * [Next-generation languages and
+ * models](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-models-ng) * [Supported
+ * features for next-generation
+ * models](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-models-ng#models-ng-features).
*
* @param createLanguageModelOptions the {@link CreateLanguageModelOptions} containing the options
* for the call
@@ -1403,6 +1453,10 @@ public ServiceCall _For custom models that are based on large speech models_, the service parses and extracts
+ * word sequences from one or multiple corpora files. The characters help the service learn and
+ * predict character sequences from audio.
+ *
* _For custom models that are based on previous-generation models_, the service auto-populates
* the model's words resource with words from the corpus that are not found in its base
* vocabulary. These words are referred to as out-of-vocabulary (OOV) words. After adding a
@@ -1429,11 +1483,11 @@ public ServiceCall **Note:** Acoustic model customization is supported only for use with previous-generation
- * models. It is not supported for next-generation models.
+ * models. It is not supported for large speech models and next-generation models.
*
* **Important:** Effective **31 July 2023**, all previous-generation models will be removed
* from the service and the documentation. Most previous-generation models were deprecated on 15
- * March 2022. You must migrate to the equivalent next-generation model by 31 July 2023. For more
- * information, see [Migrating to next-generation
+ * March 2022. You must migrate to the equivalent large speech model or next-generation model by
+ * 31 July 2023. For more information, see [Migrating to large speech
* models](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-models-migrate).
*
* **See also:** [Create a custom acoustic
@@ -2107,7 +2161,7 @@ public ServiceCall **Note:** Acoustic model customization is supported only for use with previous-generation
- * models. It is not supported for next-generation models.
+ * models. It is not supported for large speech models and next-generation models.
*
* **See also:** [Listing custom acoustic
* models](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-manageAcousticModels#listModels-acoustic).
@@ -2148,7 +2202,7 @@ public ServiceCall **Note:** Acoustic model customization is supported only for use with previous-generation
- * models. It is not supported for next-generation models.
+ * models. It is not supported for large speech models and next-generation models.
*
* **See also:** [Listing custom acoustic
* models](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-manageAcousticModels#listModels-acoustic).
@@ -2166,7 +2220,7 @@ public ServiceCall **Note:** Acoustic model customization is supported only for use with previous-generation
- * models. It is not supported for next-generation models.
+ * models. It is not supported for large speech models and next-generation models.
*
* **See also:** [Listing custom acoustic
* models](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-manageAcousticModels#listModels-acoustic).
@@ -2205,7 +2259,7 @@ public ServiceCall **Note:** Acoustic model customization is supported only for use with previous-generation
- * models. It is not supported for next-generation models.
+ * models. It is not supported for large speech models and next-generation models.
*
* **See also:** [Deleting a custom acoustic
* model](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-manageAcousticModels#deleteModel-acoustic).
@@ -2268,7 +2322,7 @@ public ServiceCall **Note:** Acoustic model customization is supported only for use with previous-generation
- * models. It is not supported for next-generation models.
+ * models. It is not supported for large speech models and next-generation models.
*
* **See also:** * [Train the custom acoustic
* model](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-acoustic#trainModel-acoustic)
@@ -2342,7 +2396,7 @@ public ServiceCall **Note:** Acoustic model customization is supported only for use with previous-generation
- * models. It is not supported for next-generation models.
+ * models. It is not supported for large speech models and next-generation models.
*
* **See also:** [Resetting a custom acoustic
* model](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-manageAcousticModels#resetModel-acoustic).
@@ -2400,7 +2454,7 @@ public ServiceCall **Note:** Acoustic model customization is supported only for use with previous-generation
- * models. It is not supported for next-generation models.
+ * models. It is not supported for large speech models and next-generation models.
*
* **See also:** [Upgrading a custom acoustic
* model](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-custom-upgrade#custom-upgrade-acoustic).
@@ -2450,7 +2504,7 @@ public ServiceCall **Note:** Acoustic model customization is supported only for use with previous-generation
- * models. It is not supported for next-generation models.
+ * models. It is not supported for large speech models and next-generation models.
*
* **See also:** [Listing audio resources for a custom acoustic
* model](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-manageAudio#listAudio).
@@ -2521,7 +2575,7 @@ public ServiceCall **Note:** Acoustic model customization is supported only for use with previous-generation
- * models. It is not supported for next-generation models.
+ * models. It is not supported for large speech models and next-generation models.
*
* **See also:** [Add audio to the custom acoustic
* model](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-acoustic#addAudio).
@@ -2631,7 +2685,7 @@ public ServiceCall **Note:** Acoustic model customization is supported only for use with previous-generation
- * models. It is not supported for next-generation models.
+ * models. It is not supported for large speech models and next-generation models.
*
* **See also:** [Listing audio resources for a custom acoustic
* model](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-manageAudio#listAudio).
@@ -2676,7 +2730,7 @@ public ServiceCall **Note:** Acoustic model customization is supported only for use with previous-generation
- * models. It is not supported for next-generation models.
+ * models. It is not supported for large speech models and next-generation models.
*
* **See also:** [Deleting an audio resource from a custom acoustic
* model](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-manageAudio#deleteAudio).
diff --git a/speech-to-text/src/main/java/com/ibm/watson/speech_to_text/v1/model/Corpus.java b/speech-to-text/src/main/java/com/ibm/watson/speech_to_text/v1/model/Corpus.java
index c67051d400..5c75be5491 100644
--- a/speech-to-text/src/main/java/com/ibm/watson/speech_to_text/v1/model/Corpus.java
+++ b/speech-to-text/src/main/java/com/ibm/watson/speech_to_text/v1/model/Corpus.java
@@ -1,5 +1,5 @@
/*
- * (C) Copyright IBM Corp. 2016, 2023.
+ * (C) Copyright IBM Corp. 2024.
*
* Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
* the License. You may obtain a copy of the License at
@@ -73,8 +73,9 @@ public Long getTotalWords() {
/**
* Gets the outOfVocabularyWords.
*
- * _For custom models that are based on previous-generation models_, the number of OOV words
- * extracted from the corpus. The value is `0` while the corpus is being processed.
+ * _For custom models that are based on large speech models and previous-generation models_,
+ * the number of OOV words extracted from the corpus. The value is `0` while the corpus is being
+ * processed.
*
* _For custom models that are based on next-generation models_, no OOV words are extracted
* from corpora, so the value is always `0`.
diff --git a/speech-to-text/src/main/java/com/ibm/watson/speech_to_text/v1/model/CreateJobOptions.java b/speech-to-text/src/main/java/com/ibm/watson/speech_to_text/v1/model/CreateJobOptions.java
index 6508138f67..45d4ad890b 100644
--- a/speech-to-text/src/main/java/com/ibm/watson/speech_to_text/v1/model/CreateJobOptions.java
+++ b/speech-to-text/src/main/java/com/ibm/watson/speech_to_text/v1/model/CreateJobOptions.java
@@ -1,5 +1,5 @@
/*
- * (C) Copyright IBM Corp. 2018, 2024.
+ * (C) Copyright IBM Corp. 2024.
*
* Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
* the License. You may obtain a copy of the License at
@@ -51,6 +51,8 @@ public interface Model {
String DE_DE_NARROWBANDMODEL = "de-DE_NarrowbandModel";
/** de-DE_Telephony. */
String DE_DE_TELEPHONY = "de-DE_Telephony";
+ /** en-AU. */
+ String EN_AU = "en-AU";
/** en-AU_BroadbandModel. */
String EN_AU_BROADBANDMODEL = "en-AU_BroadbandModel";
/** en-AU_Multimedia. */
@@ -59,8 +61,12 @@ public interface Model {
String EN_AU_NARROWBANDMODEL = "en-AU_NarrowbandModel";
/** en-AU_Telephony. */
String EN_AU_TELEPHONY = "en-AU_Telephony";
+ /** en-IN. */
+ String EN_IN = "en-IN";
/** en-IN_Telephony. */
String EN_IN_TELEPHONY = "en-IN_Telephony";
+ /** en-GB. */
+ String EN_GB = "en-GB";
/** en-GB_BroadbandModel. */
String EN_GB_BROADBANDMODEL = "en-GB_BroadbandModel";
/** en-GB_Multimedia. */
@@ -69,6 +75,8 @@ public interface Model {
String EN_GB_NARROWBANDMODEL = "en-GB_NarrowbandModel";
/** en-GB_Telephony. */
String EN_GB_TELEPHONY = "en-GB_Telephony";
+ /** en-US. */
+ String EN_US = "en-US";
/** en-US_BroadbandModel. */
String EN_US_BROADBANDMODEL = "en-US_BroadbandModel";
/** en-US_Multimedia. */
@@ -111,6 +119,8 @@ public interface Model {
String ES_PE_BROADBANDMODEL = "es-PE_BroadbandModel";
/** es-PE_NarrowbandModel. */
String ES_PE_NARROWBANDMODEL = "es-PE_NarrowbandModel";
+ /** fr-CA. */
+ String FR_CA = "fr-CA";
/** fr-CA_BroadbandModel. */
String FR_CA_BROADBANDMODEL = "fr-CA_BroadbandModel";
/** fr-CA_Multimedia. */
@@ -119,6 +129,8 @@ public interface Model {
String FR_CA_NARROWBANDMODEL = "fr-CA_NarrowbandModel";
/** fr-CA_Telephony. */
String FR_CA_TELEPHONY = "fr-CA_Telephony";
+ /** fr-FR. */
+ String FR_FR = "fr-FR";
/** fr-FR_BroadbandModel. */
String FR_FR_BROADBANDMODEL = "fr-FR_BroadbandModel";
/** fr-FR_Multimedia. */
@@ -137,6 +149,8 @@ public interface Model {
String IT_IT_MULTIMEDIA = "it-IT_Multimedia";
/** it-IT_Telephony. */
String IT_IT_TELEPHONY = "it-IT_Telephony";
+ /** ja-JP. */
+ String JA_JP = "ja-JP";
/** ja-JP_BroadbandModel. */
String JA_JP_BROADBANDMODEL = "ja-JP_BroadbandModel";
/** ja-JP_Multimedia. */
@@ -952,9 +966,9 @@ public String baseModelVersion() {
* custom language model compared to those from the base model for the current request.
*
* Specify a value between 0.0 and 1.0. Unless a different customization weight was specified
- * for the custom model when the model was trained, the default value is: * 0.3 for
- * previous-generation models * 0.2 for most next-generation models * 0.1 for next-generation
- * English and Japanese models
+ * for the custom model when the model was trained, the default value is: * 0.5 for large speech
+ * models * 0.3 for previous-generation models * 0.2 for most next-generation models * 0.1 for
+ * next-generation English and Japanese models
*
* A customization weight that you specify overrides a weight that was specified when the
* custom model was trained. The default value yields the best performance in general. Assign a
@@ -1117,8 +1131,8 @@ public Boolean smartFormatting() {
/**
* Gets the smartFormattingVersion.
*
- * Smart formatting version is for next-generation models and that is supported in US English,
- * Brazilian Portuguese, French and German languages.
+ * Smart formatting version for large speech models and next-generation models is supported in
+ * US English, Brazilian Portuguese, French, German, Spanish and French Canadian languages.
*
* @return the smartFormattingVersion
*/
@@ -1135,8 +1149,8 @@ public Long smartFormattingVersion() {
* of whether you specify `false` for the parameter. * _For previous-generation models,_ the
* parameter can be used with Australian English, US English, German, Japanese, Korean, and
* Spanish (both broadband and narrowband models) and UK English (narrowband model) transcription
- * only. * _For next-generation models,_ the parameter can be used with Czech, English
- * (Australian, Indian, UK, and US), German, Japanese, Korean, and Spanish transcription only.
+ * only. * _For large speech models and next-generation models,_ the parameter can be used with
+ * all available languages.
*
* See [Speaker
* labels](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-speaker-labels).
@@ -1310,8 +1324,8 @@ public Boolean splitTranscriptAtPhraseEnd() {
* The values increase on a monotonic curve. Specifying one or two decimal places of precision
* (for example, `0.55`) is typically more than sufficient.
*
- * The parameter is supported with all next-generation models and with most previous-generation
- * models. See [Speech detector
+ * The parameter is supported with all large speech models, next-generation models and with
+ * most previous-generation models. See [Speech detector
* sensitivity](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-detection#detection-parameters-sensitivity)
* and [Language model
* support](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-detection#detection-support).
@@ -1336,8 +1350,8 @@ public Float speechDetectorSensitivity() {
* The values increase on a monotonic curve. Specifying one or two decimal places of precision
* (for example, `0.55`) is typically more than sufficient.
*
- * The parameter is supported with all next-generation models and with most previous-generation
- * models. See [Background audio
+ * The parameter is supported with all large speech models, next-generation models and with
+ * most previous-generation models. See [Background audio
* suppression](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-detection#detection-parameters-suppression)
* and [Language model
* support](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-detection#detection-support).
@@ -1357,9 +1371,9 @@ public Float backgroundAudioSuppression() {
* parameter causes the models to produce results even more quickly, though the results might be
* less accurate when the parameter is used.
*
- * The parameter is not available for previous-generation `Broadband` and `Narrowband` models.
- * It is available for most next-generation models. * For a list of next-generation models that
- * support low latency, see [Supported next-generation language
+ * The parameter is not available for large speech models and previous-generation `Broadband`
+ * and `Narrowband` models. It is available for most next-generation models. * For a list of
+ * next-generation models that support low latency, see [Supported next-generation language
* models](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-models-ng#models-ng-supported).
* * For more information about the `low_latency` parameter, see [Low
* latency](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-interim#low-latency).
@@ -1373,9 +1387,10 @@ public Boolean lowLatency() {
/**
* Gets the characterInsertionBias.
*
- * For next-generation models, an indication of whether the service is biased to recognize
- * shorter or longer strings of characters when developing transcription hypotheses. By default,
- * the service is optimized to produce the best balance of strings of different lengths.
+ * For large speech models and next-generation models, an indication of whether the service is
+ * biased to recognize shorter or longer strings of characters when developing transcription
+ * hypotheses. By default, the service is optimized to produce the best balance of strings of
+ * different lengths.
*
* The default bias is 0.0. The allowable range of values is -1.0 to 1.0. * Negative values
* bias the service to favor hypotheses with shorter strings of characters. * Positive values bias
diff --git a/speech-to-text/src/main/java/com/ibm/watson/speech_to_text/v1/model/CreateLanguageModelOptions.java b/speech-to-text/src/main/java/com/ibm/watson/speech_to_text/v1/model/CreateLanguageModelOptions.java
index e6922f9a05..99cf429c92 100644
--- a/speech-to-text/src/main/java/com/ibm/watson/speech_to_text/v1/model/CreateLanguageModelOptions.java
+++ b/speech-to-text/src/main/java/com/ibm/watson/speech_to_text/v1/model/CreateLanguageModelOptions.java
@@ -1,5 +1,5 @@
/*
- * (C) Copyright IBM Corp. 2018, 2023.
+ * (C) Copyright IBM Corp. 2024.
*
* Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
* the License. You may obtain a copy of the License at
@@ -39,6 +39,8 @@ public interface BaseModelName {
String DE_DE_NARROWBANDMODEL = "de-DE_NarrowbandModel";
/** de-DE_Telephony. */
String DE_DE_TELEPHONY = "de-DE_Telephony";
+ /** en-AU. */
+ String EN_AU = "en-AU";
/** en-AU_BroadbandModel. */
String EN_AU_BROADBANDMODEL = "en-AU_BroadbandModel";
/** en-AU_Multimedia. */
@@ -47,6 +49,8 @@ public interface BaseModelName {
String EN_AU_NARROWBANDMODEL = "en-AU_NarrowbandModel";
/** en-AU_Telephony. */
String EN_AU_TELEPHONY = "en-AU_Telephony";
+ /** en-GB. */
+ String EN_GB = "en-GB";
/** en-GB_BroadbandModel. */
String EN_GB_BROADBANDMODEL = "en-GB_BroadbandModel";
/** en-GB_Multimedia. */
@@ -55,8 +59,12 @@ public interface BaseModelName {
String EN_GB_NARROWBANDMODEL = "en-GB_NarrowbandModel";
/** en-GB_Telephony. */
String EN_GB_TELEPHONY = "en-GB_Telephony";
+ /** en-IN. */
+ String EN_IN = "en-IN";
/** en-IN_Telephony. */
String EN_IN_TELEPHONY = "en-IN_Telephony";
+ /** en-US. */
+ String EN_US = "en-US";
/** en-US_BroadbandModel. */
String EN_US_BROADBANDMODEL = "en-US_BroadbandModel";
/** en-US_Multimedia. */
@@ -99,6 +107,8 @@ public interface BaseModelName {
String ES_PE_BROADBANDMODEL = "es-PE_BroadbandModel";
/** es-PE_NarrowbandModel. */
String ES_PE_NARROWBANDMODEL = "es-PE_NarrowbandModel";
+ /** fr-CA. */
+ String FR_CA = "fr-CA";
/** fr-CA_BroadbandModel. */
String FR_CA_BROADBANDMODEL = "fr-CA_BroadbandModel";
/** fr-CA_Multimedia. */
@@ -107,6 +117,8 @@ public interface BaseModelName {
String FR_CA_NARROWBANDMODEL = "fr-CA_NarrowbandModel";
/** fr-CA_Telephony. */
String FR_CA_TELEPHONY = "fr-CA_Telephony";
+ /** fr-FR. */
+ String FR_FR = "fr-FR";
/** fr-FR_BroadbandModel. */
String FR_FR_BROADBANDMODEL = "fr-FR_BroadbandModel";
/** fr-FR_Multimedia. */
@@ -125,6 +137,8 @@ public interface BaseModelName {
String IT_IT_MULTIMEDIA = "it-IT_Multimedia";
/** it-IT_Telephony. */
String IT_IT_TELEPHONY = "it-IT_Telephony";
+ /** ja-JP. */
+ String JA_JP = "ja-JP";
/** ja-JP_BroadbandModel. */
String JA_JP_BROADBANDMODEL = "ja-JP_BroadbandModel";
/** ja-JP_Multimedia. */
diff --git a/speech-to-text/src/main/java/com/ibm/watson/speech_to_text/v1/model/GetModelOptions.java b/speech-to-text/src/main/java/com/ibm/watson/speech_to_text/v1/model/GetModelOptions.java
index 58d75de720..88efe09bef 100644
--- a/speech-to-text/src/main/java/com/ibm/watson/speech_to_text/v1/model/GetModelOptions.java
+++ b/speech-to-text/src/main/java/com/ibm/watson/speech_to_text/v1/model/GetModelOptions.java
@@ -1,5 +1,5 @@
/*
- * (C) Copyright IBM Corp. 2018, 2023.
+ * (C) Copyright IBM Corp. 2024.
*
* Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
* the License. You may obtain a copy of the License at
@@ -36,6 +36,8 @@ public interface ModelId {
String DE_DE_NARROWBANDMODEL = "de-DE_NarrowbandModel";
/** de-DE_Telephony. */
String DE_DE_TELEPHONY = "de-DE_Telephony";
+ /** en-AU. */
+ String EN_AU = "en-AU";
/** en-AU_BroadbandModel. */
String EN_AU_BROADBANDMODEL = "en-AU_BroadbandModel";
/** en-AU_Multimedia. */
@@ -44,6 +46,8 @@ public interface ModelId {
String EN_AU_NARROWBANDMODEL = "en-AU_NarrowbandModel";
/** en-AU_Telephony. */
String EN_AU_TELEPHONY = "en-AU_Telephony";
+ /** en-GB. */
+ String EN_GB = "en-GB";
/** en-GB_BroadbandModel. */
String EN_GB_BROADBANDMODEL = "en-GB_BroadbandModel";
/** en-GB_Multimedia. */
@@ -52,8 +56,12 @@ public interface ModelId {
String EN_GB_NARROWBANDMODEL = "en-GB_NarrowbandModel";
/** en-GB_Telephony. */
String EN_GB_TELEPHONY = "en-GB_Telephony";
+ /** en-IN. */
+ String EN_IN = "en-IN";
/** en-IN_Telephony. */
String EN_IN_TELEPHONY = "en-IN_Telephony";
+ /** en-US. */
+ String EN_US = "en-US";
/** en-US_BroadbandModel. */
String EN_US_BROADBANDMODEL = "en-US_BroadbandModel";
/** en-US_Multimedia. */
@@ -96,6 +104,8 @@ public interface ModelId {
String ES_PE_BROADBANDMODEL = "es-PE_BroadbandModel";
/** es-PE_NarrowbandModel. */
String ES_PE_NARROWBANDMODEL = "es-PE_NarrowbandModel";
+ /** fr-CA. */
+ String FR_CA = "fr-CA";
/** fr-CA_BroadbandModel. */
String FR_CA_BROADBANDMODEL = "fr-CA_BroadbandModel";
/** fr-CA_Multimedia. */
@@ -104,6 +114,8 @@ public interface ModelId {
String FR_CA_NARROWBANDMODEL = "fr-CA_NarrowbandModel";
/** fr-CA_Telephony. */
String FR_CA_TELEPHONY = "fr-CA_Telephony";
+ /** fr-FR. */
+ String FR_FR = "fr-FR";
/** fr-FR_BroadbandModel. */
String FR_FR_BROADBANDMODEL = "fr-FR_BroadbandModel";
/** fr-FR_Multimedia. */
@@ -122,6 +134,8 @@ public interface ModelId {
String IT_IT_MULTIMEDIA = "it-IT_Multimedia";
/** it-IT_Telephony. */
String IT_IT_TELEPHONY = "it-IT_Telephony";
+ /** ja-JP. */
+ String JA_JP = "ja-JP";
/** ja-JP_BroadbandModel. */
String JA_JP_BROADBANDMODEL = "ja-JP_BroadbandModel";
/** ja-JP_Multimedia. */
diff --git a/speech-to-text/src/main/java/com/ibm/watson/speech_to_text/v1/model/RecognizeOptions.java b/speech-to-text/src/main/java/com/ibm/watson/speech_to_text/v1/model/RecognizeOptions.java
index 94476fbdb5..21119472b5 100644
--- a/speech-to-text/src/main/java/com/ibm/watson/speech_to_text/v1/model/RecognizeOptions.java
+++ b/speech-to-text/src/main/java/com/ibm/watson/speech_to_text/v1/model/RecognizeOptions.java
@@ -1,5 +1,5 @@
/*
- * (C) Copyright IBM Corp. 2016, 2024.
+ * (C) Copyright IBM Corp. 2024.
*
* Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
* the License. You may obtain a copy of the License at
@@ -51,6 +51,8 @@ public interface Model {
String DE_DE_NARROWBANDMODEL = "de-DE_NarrowbandModel";
/** de-DE_Telephony. */
String DE_DE_TELEPHONY = "de-DE_Telephony";
+ /** en-AU. */
+ String EN_AU = "en-AU";
/** en-AU_BroadbandModel. */
String EN_AU_BROADBANDMODEL = "en-AU_BroadbandModel";
/** en-AU_Multimedia. */
@@ -59,8 +61,12 @@ public interface Model {
String EN_AU_NARROWBANDMODEL = "en-AU_NarrowbandModel";
/** en-AU_Telephony. */
String EN_AU_TELEPHONY = "en-AU_Telephony";
+ /** en-IN. */
+ String EN_IN = "en-IN";
/** en-IN_Telephony. */
String EN_IN_TELEPHONY = "en-IN_Telephony";
+ /** en-GB. */
+ String EN_GB = "en-GB";
/** en-GB_BroadbandModel. */
String EN_GB_BROADBANDMODEL = "en-GB_BroadbandModel";
/** en-GB_Multimedia. */
@@ -69,6 +75,8 @@ public interface Model {
String EN_GB_NARROWBANDMODEL = "en-GB_NarrowbandModel";
/** en-GB_Telephony. */
String EN_GB_TELEPHONY = "en-GB_Telephony";
+ /** en-US. */
+ String EN_US = "en-US";
/** en-US_BroadbandModel. */
String EN_US_BROADBANDMODEL = "en-US_BroadbandModel";
/** en-US_Multimedia. */
@@ -111,6 +119,8 @@ public interface Model {
String ES_PE_BROADBANDMODEL = "es-PE_BroadbandModel";
/** es-PE_NarrowbandModel. */
String ES_PE_NARROWBANDMODEL = "es-PE_NarrowbandModel";
+ /** fr-CA. */
+ String FR_CA = "fr-CA";
/** fr-CA_BroadbandModel. */
String FR_CA_BROADBANDMODEL = "fr-CA_BroadbandModel";
/** fr-CA_Multimedia. */
@@ -119,6 +129,8 @@ public interface Model {
String FR_CA_NARROWBANDMODEL = "fr-CA_NarrowbandModel";
/** fr-CA_Telephony. */
String FR_CA_TELEPHONY = "fr-CA_Telephony";
+ /** fr-FR. */
+ String FR_FR = "fr-FR";
/** fr-FR_BroadbandModel. */
String FR_FR_BROADBANDMODEL = "fr-FR_BroadbandModel";
/** fr-FR_Multimedia. */
@@ -137,6 +149,8 @@ public interface Model {
String IT_IT_MULTIMEDIA = "it-IT_Multimedia";
/** it-IT_Telephony. */
String IT_IT_TELEPHONY = "it-IT_Telephony";
+ /** ja-JP. */
+ String JA_JP = "ja-JP";
/** ja-JP_BroadbandModel. */
String JA_JP_BROADBANDMODEL = "ja-JP_BroadbandModel";
/** ja-JP_Multimedia. */
@@ -184,6 +198,7 @@ public interface Model {
protected InputStream audio;
protected String contentType;
protected String model;
+ protected Boolean speechBeginEvent;
protected String languageCustomizationId;
protected String acousticCustomizationId;
protected String baseModelVersion;
@@ -214,6 +229,7 @@ public static class Builder {
private InputStream audio;
private String contentType;
private String model;
+ private Boolean speechBeginEvent;
private String languageCustomizationId;
private String acousticCustomizationId;
private String baseModelVersion;
@@ -248,6 +264,7 @@ private Builder(RecognizeOptions recognizeOptions) {
this.audio = recognizeOptions.audio;
this.contentType = recognizeOptions.contentType;
this.model = recognizeOptions.model;
+ this.speechBeginEvent = recognizeOptions.speechBeginEvent;
this.languageCustomizationId = recognizeOptions.languageCustomizationId;
this.acousticCustomizationId = recognizeOptions.acousticCustomizationId;
this.baseModelVersion = recognizeOptions.baseModelVersion;
@@ -343,6 +360,17 @@ public Builder model(String model) {
return this;
}
+ /**
+ * Set the speechBeginEvent.
+ *
+ * @param speechBeginEvent the speechBeginEvent
+ * @return the RecognizeOptions builder
+ */
+ public Builder speechBeginEvent(Boolean speechBeginEvent) {
+ this.speechBeginEvent = speechBeginEvent;
+ return this;
+ }
+
/**
* Set the languageCustomizationId.
*
@@ -627,6 +655,7 @@ protected RecognizeOptions(Builder builder) {
audio = builder.audio;
contentType = builder.contentType;
model = builder.model;
+ speechBeginEvent = builder.speechBeginEvent;
languageCustomizationId = builder.languageCustomizationId;
acousticCustomizationId = builder.acousticCustomizationId;
baseModelVersion = builder.baseModelVersion;
@@ -706,6 +735,22 @@ public String model() {
return model;
}
+ /**
+ * Gets the speechBeginEvent.
+ *
+ * If `true`, the service returns a `SpeechActivity` response object that contains the time at
+ * which speech activity is detected in the stream. This can be used in both standard and low
+ * latency mode. This feature lets client applications know that some words or speech have been
+ * detected and that the service is in the process of decoding. This can be used in lieu of interim
+ * results in standard mode. See [Using speech recognition
+ * parameters](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-service-features#features-parameters).
+ *
+ * @return the speechBeginEvent
+ */
+ public Boolean speechBeginEvent() {
+ return speechBeginEvent;
+ }
+
/**
* Gets the languageCustomizationId.
*
@@ -764,9 +809,9 @@ public String baseModelVersion() {
* custom language model compared to those from the base model for the current request.
*
* Specify a value between 0.0 and 1.0. Unless a different customization weight was specified
- * for the custom model when the model was trained, the default value is: * 0.3 for
- * previous-generation models * 0.2 for most next-generation models * 0.1 for next-generation
- * English and Japanese models
+ * for the custom model when the model was trained, the default value is: * 0.5 for large speech
+ * models * 0.3 for previous-generation models * 0.2 for most next-generation models * 0.1 for
+ * next-generation English and Japanese models
*
* A customization weight that you specify overrides a weight that was specified when the
* custom model was trained. The default value yields the best performance in general. Assign a
@@ -929,8 +974,8 @@ public Boolean smartFormatting() {
/**
* Gets the smartFormattingVersion.
*
- * Smart formatting version is for next-generation models and that is supported in US English,
- * Brazilian Portuguese, French and German languages.
+ * Smart formatting version for large speech models and next-generation models is supported in
+ * US English, Brazilian Portuguese, French, German, Spanish and French Canadian languages.
*
* @return the smartFormattingVersion
*/
@@ -947,8 +992,8 @@ public Long smartFormattingVersion() {
* of whether you specify `false` for the parameter. * _For previous-generation models,_ the
* parameter can be used with Australian English, US English, German, Japanese, Korean, and
* Spanish (both broadband and narrowband models) and UK English (narrowband model) transcription
- * only. * _For next-generation models,_ the parameter can be used with Czech, English
- * (Australian, Indian, UK, and US), German, Japanese, Korean, and Spanish transcription only.
+ * only. * _For large speech models and next-generation models,_ the parameter can be used with
+ * all available languages.
*
* See [Speaker
* labels](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-speaker-labels).
@@ -1080,8 +1125,8 @@ public Boolean splitTranscriptAtPhraseEnd() {
* The values increase on a monotonic curve. Specifying one or two decimal places of precision
* (for example, `0.55`) is typically more than sufficient.
*
- * The parameter is supported with all next-generation models and with most previous-generation
- * models. See [Speech detector
+ * The parameter is supported with all large speech models and next-generation models, and with
+ * most previous-generation models. See [Speech detector
* sensitivity](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-detection#detection-parameters-sensitivity)
* and [Language model
* support](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-detection#detection-support).
@@ -1106,8 +1151,8 @@ public Float speechDetectorSensitivity() {
* The values increase on a monotonic curve. Specifying one or two decimal places of precision
* (for example, `0.55`) is typically more than sufficient.
*
- * The parameter is supported with all next-generation models and with most previous-generation
- * models. See [Background audio
+ * The parameter is supported with all large speech models and next-generation models, and with
+ * most previous-generation models. See [Background audio
* suppression](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-detection#detection-parameters-suppression)
* and [Language model
* support](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-detection#detection-support).
@@ -1127,9 +1172,9 @@ public Float backgroundAudioSuppression() {
* parameter causes the models to produce results even more quickly, though the results might be
* less accurate when the parameter is used.
*
- * The parameter is not available for previous-generation `Broadband` and `Narrowband` models.
- * It is available for most next-generation models. * For a list of next-generation models that
- * support low latency, see [Supported next-generation language
+ * The parameter is not available for large speech models or previous-generation `Broadband`
+ * and `Narrowband` models. It is available for most next-generation models. * For a list of
+ * next-generation models that support low latency, see [Supported next-generation language
* models](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-models-ng#models-ng-supported).
* * For more information about the `low_latency` parameter, see [Low
* latency](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-interim#low-latency).
@@ -1143,9 +1188,10 @@ public Boolean lowLatency() {
/**
* Gets the characterInsertionBias.
*
- * For next-generation models, an indication of whether the service is biased to recognize
- * shorter or longer strings of characters when developing transcription hypotheses. By default,
- * the service is optimized to produce the best balance of strings of different lengths.
+ * For large speech models and next-generation models, an indication of whether the service is
+ * biased to recognize shorter or longer strings of characters when developing transcription
+ * hypotheses. By default, the service is optimized to produce the best balance of strings of
+ * different lengths.
*
* The default bias is 0.0. The allowable range of values is -1.0 to 1.0. * Negative values
* bias the service to favor hypotheses with shorter strings of characters. * Positive values bias
diff --git a/speech-to-text/src/main/java/com/ibm/watson/speech_to_text/v1/model/TrainLanguageModelOptions.java b/speech-to-text/src/main/java/com/ibm/watson/speech_to_text/v1/model/TrainLanguageModelOptions.java
index f32bd6de8d..53a61a7f92 100644
--- a/speech-to-text/src/main/java/com/ibm/watson/speech_to_text/v1/model/TrainLanguageModelOptions.java
+++ b/speech-to-text/src/main/java/com/ibm/watson/speech_to_text/v1/model/TrainLanguageModelOptions.java
@@ -1,5 +1,5 @@
/*
- * (C) Copyright IBM Corp. 2018, 2024.
+ * (C) Copyright IBM Corp. 2024.
*
* Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
* the License. You may obtain a copy of the License at
@@ -25,9 +25,9 @@ public class TrainLanguageModelOptions extends GenericModel {
* that were added or modified by the user directly. The model is not trained on new words
* extracted from corpora or grammars.
*
- * _For custom models that are based on next-generation models_, the service ignores the
- * parameter. The words resource contains only custom words that the user adds or modifies
- * directly, so the parameter is unnecessary.
+ * _For custom models that are based on large speech models and next-generation models_, the
+ * service ignores the `word_type_to_add` parameter. The words resource contains only custom words
+ * that the user adds or modifies directly, so the parameter is unnecessary.
*/
public interface WordTypeToAdd {
/** all. */
@@ -184,9 +184,9 @@ public String customizationId() {
* that were added or modified by the user directly. The model is not trained on new words
* extracted from corpora or grammars.
*
- * _For custom models that are based on next-generation models_, the service ignores the
- * parameter. The words resource contains only custom words that the user adds or modifies
- * directly, so the parameter is unnecessary.
+ * _For custom models that are based on large speech models and next-generation models_, the
+ * service ignores the `word_type_to_add` parameter. The words resource contains only custom words
+ * that the user adds or modifies directly, so the parameter is unnecessary.
*
* @return the wordTypeToAdd
*/
@@ -200,8 +200,8 @@ public String wordTypeToAdd() {
* Specifies a customization weight for the custom language model. The customization weight
* tells the service how much weight to give to words from the custom language model compared to
* those from the base model for speech recognition. Specify a value between 0.0 and 1.0. The
- * default value is: * 0.3 for previous-generation models * 0.2 for most next-generation models *
- * 0.1 for next-generation English and Japanese models
+ * default value is: * 0.5 for large speech models * 0.3 for previous-generation models * 0.2 for
+ * most next-generation models * 0.1 for next-generation English and Japanese models
*
* The default value yields the best performance in general. Assign a higher value if your
* audio makes frequent use of OOV words from the custom model. Use caution when setting the
diff --git a/speech-to-text/src/test/java/com/ibm/watson/speech_to_text/v1/SpeechToTextTest.java b/speech-to-text/src/test/java/com/ibm/watson/speech_to_text/v1/SpeechToTextTest.java
index 1c340b1772..d345abbb9d 100755
--- a/speech-to-text/src/test/java/com/ibm/watson/speech_to_text/v1/SpeechToTextTest.java
+++ b/speech-to-text/src/test/java/com/ibm/watson/speech_to_text/v1/SpeechToTextTest.java
@@ -1,5 +1,5 @@
/*
- * (C) Copyright IBM Corp. 2019, 2024.
+ * (C) Copyright IBM Corp. 2024.
*
* Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
* the License. You may obtain a copy of the License at
@@ -226,6 +226,7 @@ public void testRecognizeWOptions() throws Throwable {
.audio(TestUtilities.createMockStream("This is a mock file."))
.contentType("application/octet-stream")
.model("en-US_BroadbandModel")
+ .speechBeginEvent(false)
.languageCustomizationId("testString")
.acousticCustomizationId("testString")
.baseModelVersion("testString")
@@ -270,6 +271,7 @@ public void testRecognizeWOptions() throws Throwable {
Map