diff --git a/speech-to-text/src/main/java/com/ibm/watson/speech_to_text/v1/SpeechToText.java b/speech-to-text/src/main/java/com/ibm/watson/speech_to_text/v1/SpeechToText.java
index 55aa1febf0..f6fccac6c0 100644
--- a/speech-to-text/src/main/java/com/ibm/watson/speech_to_text/v1/SpeechToText.java
+++ b/speech-to-text/src/main/java/com/ibm/watson/speech_to_text/v1/SpeechToText.java
@@ -1,5 +1,5 @@
 /*
- * (C) Copyright IBM Corp. 2016, 2024.
+ * (C) Copyright IBM Corp. 2024.
  *
  * Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
  * the License. You may obtain a copy of the License at
@@ -113,8 +113,8 @@
    *
    *

Effective **31 July 2023**, all previous-generation models will be removed from the service * and the documentation. Most previous-generation models were deprecated on 15 March 2022. You must - * migrate to the equivalent next-generation model by 31 July 2023. For more information, see - * [Migrating to next-generation + * migrate to the equivalent large speech model or next-generation model by 31 July 2023. For more + * information, see [Migrating to large speech * models](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-models-migrate).{: * deprecated} * @@ -372,31 +372,37 @@ public ServiceCall getModel(GetModelOptions getModelOptions) { *

**See also:** [Supported audio * formats](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-audio-formats). * - *

### Next-generation models + *

### Large speech models and Next-generation models * - *

The service supports next-generation `Multimedia` (16 kHz) and `Telephony` (8 kHz) models - * for many languages. Next-generation models have higher throughput than the service's previous - * generation of `Broadband` and `Narrowband` models. When you use next-generation models, the - * service can return transcriptions more quickly and also provide noticeably better transcription - * accuracy. + *

The service supports large speech models and next-generation `Multimedia` (16 kHz) and + * `Telephony` (8 kHz) models for many languages. Large speech models and next-generation models + * have higher throughput than the service's previous generation of `Broadband` and `Narrowband` + * models. When you use large speech models and next-generation models, the service can return + * transcriptions more quickly and also provide noticeably better transcription accuracy. * - *

You specify a next-generation model by using the `model` query parameter, as you do a - * previous-generation model. Most next-generation models support the `low_latency` parameter, and - * all next-generation models support the `character_insertion_bias` parameter. These parameters - * are not available with previous-generation models. + *

You specify a large speech model or next-generation model by using the `model` query + * parameter, as you do a previous-generation model. Only the next-generation models support the + * `low_latency` parameter, and all large speech models and next-generation models support the + * `character_insertion_bias` parameter. These parameters are not available with + * previous-generation models. * - *

Next-generation models do not support all of the speech recognition parameters that are - * available for use with previous-generation models. Next-generation models do not support the - * following parameters: * `acoustic_customization_id` * `keywords` and `keywords_threshold` * - * `processing_metrics` and `processing_metrics_interval` * `word_alternatives_threshold` + *

Large speech models and next-generation models do not support all of the speech recognition + * parameters that are available for use with previous-generation models. Next-generation models + * do not support the following parameters: * `acoustic_customization_id` * `keywords` and + * `keywords_threshold` * `processing_metrics` and `processing_metrics_interval` * + * `word_alternatives_threshold` * *

**Important:** Effective **31 July 2023**, all previous-generation models will be removed * from the service and the documentation. Most previous-generation models were deprecated on 15 - * March 2022. You must migrate to the equivalent next-generation model by 31 July 2023. For more - * information, see [Migrating to next-generation + * March 2022. You must migrate to the equivalent large speech model or next-generation model by + * 31 July 2023. For more information, see [Migrating to large speech * models](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-models-migrate). * - *

**See also:** * [Next-generation languages and + *

**See also:** * [Large speech languages and
+   * models](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-models-large-speech-languages)
+   * * [Supported features for large speech
+   * models](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-models-large-speech-languages#models-lsm-supported-features)
+   * * [Next-generation languages and
    * models](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-models-ng) * [Supported
    * features for next-generation
    * models](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-models-ng#models-ng-features)
@@ -439,6 +445,9 @@ public ServiceCall recognize(RecognizeOptions recogniz
     if (recognizeOptions.model() != null) {
       builder.query("model", String.valueOf(recognizeOptions.model()));
     }
+    if (recognizeOptions.speechBeginEvent() != null) {
+      builder.query("speech_begin_event", String.valueOf(recognizeOptions.speechBeginEvent()));
+    }
     if (recognizeOptions.languageCustomizationId() != null) {
       builder.query(
           "language_customization_id", String.valueOf(recognizeOptions.languageCustomizationId()));
@@ -699,31 +708,37 @@ public ServiceCall unregisterCallback(UnregisterCallbackOptions unregister
    *

**See also:** [Supported audio * formats](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-audio-formats). * - *

### Next-generation models + *

### Large speech models and Next-generation models * - *

The service supports next-generation `Multimedia` (16 kHz) and `Telephony` (8 kHz) models - * for many languages. Next-generation models have higher throughput than the service's previous - * generation of `Broadband` and `Narrowband` models. When you use next-generation models, the - * service can return transcriptions more quickly and also provide noticeably better transcription - * accuracy. + *

The service supports large speech models and next-generation `Multimedia` (16 kHz) and + * `Telephony` (8 kHz) models for many languages. Large speech models and next-generation models + * have higher throughput than the service's previous generation of `Broadband` and `Narrowband` + * models. When you use large speech models and next-generation models, the service can return + * transcriptions more quickly and also provide noticeably better transcription accuracy. * - *

You specify a next-generation model by using the `model` query parameter, as you do a - * previous-generation model. Most next-generation models support the `low_latency` parameter, and - * all next-generation models support the `character_insertion_bias` parameter. These parameters - * are not available with previous-generation models. + *

You specify a large speech model or next-generation model by using the `model` query + * parameter, as you do a previous-generation model. Only the next-generation models support the + * `low_latency` parameter, and all large speech models and next-generation models support the + * `character_insertion_bias` parameter. These parameters are not available with + * previous-generation models. * - *

Next-generation models do not support all of the speech recognition parameters that are - * available for use with previous-generation models. Next-generation models do not support the - * following parameters: * `acoustic_customization_id` * `keywords` and `keywords_threshold` * - * `processing_metrics` and `processing_metrics_interval` * `word_alternatives_threshold` + *

Large speech models and next-generation models do not support all of the speech recognition + * parameters that are available for use with previous-generation models. Next-generation models + * do not support the following parameters: * `acoustic_customization_id` * `keywords` and + * `keywords_threshold` * `processing_metrics` and `processing_metrics_interval` * + * `word_alternatives_threshold` * *

**Important:** Effective **31 July 2023**, all previous-generation models will be removed * from the service and the documentation. Most previous-generation models were deprecated on 15 - * March 2022. You must migrate to the equivalent next-generation model by 31 July 2023. For more - * information, see [Migrating to next-generation + * March 2022. You must migrate to the equivalent large speech model or next-generation model by + * 31 July 2023. For more information, see [Migrating to large speech * models](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-models-migrate). * - *

**See also:** * [Next-generation languages and + *

**See also:** * [Large speech languages and + * models](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-models-large-speech-languages) + * * [Supported features for large speech + * models](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-models-large-speech-languages#models-lsm-supported-features) + * * [Next-generation languages and * models](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-models-ng) * [Supported * features for next-generation * models](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-models-ng#models-ng-features). @@ -994,14 +1009,49 @@ public ServiceCall deleteJob(DeleteJobOptions deleteJobOptions) { * *

**Important:** Effective **31 July 2023**, all previous-generation models will be removed * from the service and the documentation. Most previous-generation models were deprecated on 15 - * March 2022. You must migrate to the equivalent next-generation model by 31 July 2023. For more - * information, see [Migrating to next-generation + * March 2022. You must migrate to the equivalent large speech model or next-generation model by + * 31 July 2023. For more information, see [Migrating to large speech * models](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-models-migrate). * *

**See also:** * [Create a custom language * model](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-languageCreate#createModel-language) * * [Language support for - * customization](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-custom-support). + * customization](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-custom-support) + * + *

### Large speech models and Next-generation models + * + *

The service supports large speech models and next-generation `Multimedia` (16 kHz) and + * `Telephony` (8 kHz) models for many languages. Large speech models and next-generation models + * have higher throughput than the service's previous generation of `Broadband` and `Narrowband` + * models. When you use large speech models and next-generation models, the service can return + * transcriptions more quickly and also provide noticeably better transcription accuracy. + * + *

You specify a large speech model or next-generation model by using the `model` query + * parameter, as you do a previous-generation model. Only the next-generation models support the + * `low_latency` parameter, and all large speech models and next-generation models support the + * `character_insertion_bias` parameter. These parameters are not available with + * previous-generation models. + * + *

Large speech models and next-generation models do not support all of the speech recognition + * parameters that are available for use with previous-generation models. Next-generation models + * do not support the following parameters: * `acoustic_customization_id` * `keywords` and + * `keywords_threshold` * `processing_metrics` and `processing_metrics_interval` * + * `word_alternatives_threshold` + * + *

**Important:** Effective **31 July 2023**, all previous-generation models will be removed + * from the service and the documentation. Most previous-generation models were deprecated on 15 + * March 2022. You must migrate to the equivalent large speech model or next-generation model by + * 31 July 2023. For more information, see [Migrating to large speech + * models](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-models-migrate). + * + *

**See also:** * [Large speech languages and + * models](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-models-large-speech-languages) + * * [Supported features for large speech + * models](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-models-large-speech-languages#models-lsm-supported-features) + * * [Next-generation languages and + * models](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-models-ng) * [Supported + * features for next-generation + * models](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-models-ng#models-ng-features). * * @param createLanguageModelOptions the {@link CreateLanguageModelOptions} containing the options * for the call @@ -1403,6 +1453,10 @@ public ServiceCall listCorpora(ListCorporaOptions listCorporaOptions) { * until the service's analysis of the corpus for the current request completes. Use the [Get a * corpus](#getcorpus) method to check the status of the analysis. * + *

_For custom models that are based on large speech models_, the service parses and extracts + * word sequences from one or multiple corpora files. The characters help the service learn and + * predict character sequences from audio. + * *

_For custom models that are based on previous-generation models_, the service auto-populates * the model's words resource with words from the corpus that are not found in its base * vocabulary. These words are referred to as out-of-vocabulary (OOV) words. After adding a @@ -1429,11 +1483,11 @@ public ServiceCall listCorpora(ListCorporaOptions listCorporaOptions) { * model](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-languageCreate#addCorpus) * * [Working with corpora for previous-generation * models](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-corporaWords#workingCorpora) - * * [Working with corpora for next-generation + * * [Working with corpora for large speech models and next-generation * models](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-corporaWords-ng#workingCorpora-ng) * * [Validating a words resource for previous-generation * models](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-corporaWords#validateModel) - * * [Validating a words resource for next-generation + * * [Validating a words resource for large speech models and next-generation * models](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-corporaWords-ng#validateModel-ng). 
    *
    * @param addCorpusOptions the {@link AddCorpusOptions} containing the options for the call
@@ -1657,11 +1711,11 @@ public ServiceCall listWords(ListWordsOptions listWordsOptions) {
    * model](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-languageCreate#addWords)
    * * [Working with custom words for previous-generation
    * models](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-corporaWords#workingWords)
-   * * [Working with custom words for next-generation
+   * * [Working with custom words for large speech models and next-generation
    * models](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-corporaWords-ng#workingWords-ng)
    * * [Validating a words resource for previous-generation
    * models](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-corporaWords#validateModel)
-   * * [Validating a words resource for next-generation
+   * * [Validating a words resource for large speech models and next-generation
    * models](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-corporaWords-ng#validateModel-ng).
    *
    * @param addWordsOptions the {@link AddWordsOptions} containing the options for the call
@@ -1732,11 +1786,11 @@ public ServiceCall addWords(AddWordsOptions addWordsOptions) {
    * model](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-languageCreate#addWords)
    * * [Working with custom words for previous-generation
    * models](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-corporaWords#workingWords)
-   * * [Working with custom words for next-generation
+   * * [Working with custom words for large speech models and next-generation
    * models](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-corporaWords-ng#workingWords-ng)
    * * [Validating a words resource for previous-generation
    * models](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-corporaWords#validateModel)
-   * * [Validating a words resource for next-generation
+   * * [Validating a words resource for large speech models and next-generation
    * models](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-corporaWords-ng#validateModel-ng).
    *
    * @param addWordOptions the {@link AddWordOptions} containing the options for the call
@@ -2057,12 +2111,12 @@ public ServiceCall deleteGrammar(DeleteGrammarOptions deleteGrammarOptions
    * but you cannot create any more until your model count is below the limit.
    *
    *

**Note:** Acoustic model customization is supported only for use with previous-generation - * models. It is not supported for next-generation models. + * models. It is not supported for large speech models and next-generation models. * *

**Important:** Effective **31 July 2023**, all previous-generation models will be removed * from the service and the documentation. Most previous-generation models were deprecated on 15 - * March 2022. You must migrate to the equivalent next-generation model by 31 July 2023. For more - * information, see [Migrating to next-generation + * March 2022. You must migrate to the equivalent large speech model or next-generation model by + * 31 July 2023. For more information, see [Migrating to large speech * models](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-models-migrate). * *

**See also:** [Create a custom acoustic @@ -2107,7 +2161,7 @@ public ServiceCall createAcousticModel( * credentials for the instance of the service that owns a model to list information about it. * *

**Note:** Acoustic model customization is supported only for use with previous-generation - * models. It is not supported for next-generation models. + * models. It is not supported for large speech models and next-generation models. * *

**See also:** [Listing custom acoustic * models](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-manageAcousticModels#listModels-acoustic). @@ -2148,7 +2202,7 @@ public ServiceCall listAcousticModels( * credentials for the instance of the service that owns a model to list information about it. * *

**Note:** Acoustic model customization is supported only for use with previous-generation - * models. It is not supported for next-generation models. + * models. It is not supported for large speech models and next-generation models. * *

**See also:** [Listing custom acoustic * models](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-manageAcousticModels#listModels-acoustic). @@ -2166,7 +2220,7 @@ public ServiceCall listAcousticModels() { * instance of the service that owns a model to list information about it. * *

**Note:** Acoustic model customization is supported only for use with previous-generation - * models. It is not supported for next-generation models. + * models. It is not supported for large speech models and next-generation models. * *

**See also:** [Listing custom acoustic * models](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-manageAcousticModels#listModels-acoustic). @@ -2205,7 +2259,7 @@ public ServiceCall getAcousticModel( * use credentials for the instance of the service that owns a model to delete it. * *

**Note:** Acoustic model customization is supported only for use with previous-generation - * models. It is not supported for next-generation models. + * models. It is not supported for large speech models and next-generation models. * *

**See also:** [Deleting a custom acoustic * model](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-manageAcousticModels#deleteModel-acoustic). @@ -2268,7 +2322,7 @@ public ServiceCall deleteAcousticModel( * fully trained and available. * *

**Note:** Acoustic model customization is supported only for use with previous-generation - * models. It is not supported for next-generation models. + * models. It is not supported for large speech models and next-generation models. * *

**See also:** * [Train the custom acoustic * model](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-acoustic#trainModel-acoustic) @@ -2342,7 +2396,7 @@ public ServiceCall trainAcousticModel( * owns a model to reset it. * *

**Note:** Acoustic model customization is supported only for use with previous-generation - * models. It is not supported for next-generation models. + * models. It is not supported for large speech models and next-generation models. * *

**See also:** [Resetting a custom acoustic * model](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-manageAcousticModels#resetModel-acoustic). @@ -2400,7 +2454,7 @@ public ServiceCall resetAcousticModel(ResetAcousticModelOptions resetAcous * language model. * *

**Note:** Acoustic model customization is supported only for use with previous-generation - * models. It is not supported for next-generation models. + * models. It is not supported for large speech models and next-generation models. * *

**See also:** [Upgrading a custom acoustic * model](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-custom-upgrade#custom-upgrade-acoustic). @@ -2450,7 +2504,7 @@ public ServiceCall upgradeAcousticModel( * audio resources. * *

**Note:** Acoustic model customization is supported only for use with previous-generation - * models. It is not supported for next-generation models. + * models. It is not supported for large speech models and next-generation models. * *

**See also:** [Listing audio resources for a custom acoustic * model](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-manageAudio#listAudio). @@ -2521,7 +2575,7 @@ public ServiceCall listAudio(ListAudioOptions listAudioOptions) * becomes `ok`. * *

**Note:** Acoustic model customization is supported only for use with previous-generation - * models. It is not supported for next-generation models. + * models. It is not supported for large speech models and next-generation models. * *

**See also:** [Add audio to the custom acoustic * model](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-acoustic#addAudio). @@ -2631,7 +2685,7 @@ public ServiceCall addAudio(AddAudioOptions addAudioOptions) { * resources. * *

**Note:** Acoustic model customization is supported only for use with previous-generation - * models. It is not supported for next-generation models. + * models. It is not supported for large speech models and next-generation models. * *

**See also:** [Listing audio resources for a custom acoustic * model](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-manageAudio#listAudio). @@ -2676,7 +2730,7 @@ public ServiceCall getAudio(GetAudioOptions getAudioOptions) { * audio resources. * *

**Note:** Acoustic model customization is supported only for use with previous-generation - * models. It is not supported for next-generation models. + * models. It is not supported for large speech models and next-generation models. * *

**See also:** [Deleting an audio resource from a custom acoustic * model](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-manageAudio#deleteAudio). diff --git a/speech-to-text/src/main/java/com/ibm/watson/speech_to_text/v1/model/Corpus.java b/speech-to-text/src/main/java/com/ibm/watson/speech_to_text/v1/model/Corpus.java index c67051d400..5c75be5491 100644 --- a/speech-to-text/src/main/java/com/ibm/watson/speech_to_text/v1/model/Corpus.java +++ b/speech-to-text/src/main/java/com/ibm/watson/speech_to_text/v1/model/Corpus.java @@ -1,5 +1,5 @@ /* - * (C) Copyright IBM Corp. 2016, 2023. + * (C) Copyright IBM Corp. 2024. * * Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with * the License. You may obtain a copy of the License at @@ -73,8 +73,9 @@ public Long getTotalWords() { /** * Gets the outOfVocabularyWords. * - *

_For custom models that are based on previous-generation models_, the number of OOV words - * extracted from the corpus. The value is `0` while the corpus is being processed. + *

_For custom models that are based on large speech models and previous-generation models_, + * the number of OOV words extracted from the corpus. The value is `0` while the corpus is being + * processed. * *

_For custom models that are based on next-generation models_, no OOV words are extracted
    * from corpora, so the value is always `0`.
diff --git a/speech-to-text/src/main/java/com/ibm/watson/speech_to_text/v1/model/CreateJobOptions.java b/speech-to-text/src/main/java/com/ibm/watson/speech_to_text/v1/model/CreateJobOptions.java
index 6508138f67..45d4ad890b 100644
--- a/speech-to-text/src/main/java/com/ibm/watson/speech_to_text/v1/model/CreateJobOptions.java
+++ b/speech-to-text/src/main/java/com/ibm/watson/speech_to_text/v1/model/CreateJobOptions.java
@@ -1,5 +1,5 @@
 /*
- * (C) Copyright IBM Corp. 2018, 2024.
+ * (C) Copyright IBM Corp. 2024.
  *
  * Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
  * the License. You may obtain a copy of the License at
@@ -51,6 +51,8 @@ public interface Model {
     String DE_DE_NARROWBANDMODEL = "de-DE_NarrowbandModel";
     /** de-DE_Telephony. */
     String DE_DE_TELEPHONY = "de-DE_Telephony";
+    /** en-AU. */
+    String EN_AU = "en-AU";
     /** en-AU_BroadbandModel. */
     String EN_AU_BROADBANDMODEL = "en-AU_BroadbandModel";
     /** en-AU_Multimedia. */
@@ -59,8 +61,12 @@ public interface Model {
     String EN_AU_NARROWBANDMODEL = "en-AU_NarrowbandModel";
     /** en-AU_Telephony. */
     String EN_AU_TELEPHONY = "en-AU_Telephony";
+    /** en-IN. */
+    String EN_IN = "en-IN";
     /** en-IN_Telephony. */
     String EN_IN_TELEPHONY = "en-IN_Telephony";
+    /** en-GB. */
+    String EN_GB = "en-GB";
     /** en-GB_BroadbandModel. */
     String EN_GB_BROADBANDMODEL = "en-GB_BroadbandModel";
     /** en-GB_Multimedia. */
@@ -69,6 +75,8 @@ public interface Model {
     String EN_GB_NARROWBANDMODEL = "en-GB_NarrowbandModel";
     /** en-GB_Telephony. */
     String EN_GB_TELEPHONY = "en-GB_Telephony";
+    /** en-US. */
+    String EN_US = "en-US";
     /** en-US_BroadbandModel. */
     String EN_US_BROADBANDMODEL = "en-US_BroadbandModel";
     /** en-US_Multimedia. */
@@ -111,6 +119,8 @@ public interface Model {
     String ES_PE_BROADBANDMODEL = "es-PE_BroadbandModel";
     /** es-PE_NarrowbandModel. */
     String ES_PE_NARROWBANDMODEL = "es-PE_NarrowbandModel";
+    /** fr-CA. */
+    String FR_CA = "fr-CA";
     /** fr-CA_BroadbandModel. */
     String FR_CA_BROADBANDMODEL = "fr-CA_BroadbandModel";
     /** fr-CA_Multimedia. */
@@ -119,6 +129,8 @@ public interface Model {
     String FR_CA_NARROWBANDMODEL = "fr-CA_NarrowbandModel";
     /** fr-CA_Telephony. */
     String FR_CA_TELEPHONY = "fr-CA_Telephony";
+    /** fr-FR. */
+    String FR_FR = "fr-FR";
     /** fr-FR_BroadbandModel. */
     String FR_FR_BROADBANDMODEL = "fr-FR_BroadbandModel";
     /** fr-FR_Multimedia. */
@@ -137,6 +149,8 @@ public interface Model {
     String IT_IT_MULTIMEDIA = "it-IT_Multimedia";
     /** it-IT_Telephony. */
     String IT_IT_TELEPHONY = "it-IT_Telephony";
+    /** ja-JP. */
+    String JA_JP = "ja-JP";
     /** ja-JP_BroadbandModel. */
     String JA_JP_BROADBANDMODEL = "ja-JP_BroadbandModel";
     /** ja-JP_Multimedia. */
@@ -952,9 +966,9 @@ public String baseModelVersion() {
    * custom language model compared to those from the base model for the current request.
    *
    *

Specify a value between 0.0 and 1.0. Unless a different customization weight was specified - * for the custom model when the model was trained, the default value is: * 0.3 for - * previous-generation models * 0.2 for most next-generation models * 0.1 for next-generation - * English and Japanese models + * for the custom model when the model was trained, the default value is: * 0.5 for large speech + * models * 0.3 for previous-generation models * 0.2 for most next-generation models * 0.1 for + * next-generation English and Japanese models * *

A customization weight that you specify overrides a weight that was specified when the * custom model was trained. The default value yields the best performance in general. Assign a @@ -1117,8 +1131,8 @@ public Boolean smartFormatting() { /** * Gets the smartFormattingVersion. * - *

Smart formatting version is for next-generation models and that is supported in US English, - * Brazilian Portuguese, French and German languages. + *

Smart formatting version for large speech models and next-generation models is supported in + * US English, Brazilian Portuguese, French, German, Spanish and French Canadian languages. * * @return the smartFormattingVersion */ @@ -1135,8 +1149,8 @@ public Long smartFormattingVersion() { * of whether you specify `false` for the parameter. * _For previous-generation models,_ the * parameter can be used with Australian English, US English, German, Japanese, Korean, and * Spanish (both broadband and narrowband models) and UK English (narrowband model) transcription - * only. * _For next-generation models,_ the parameter can be used with Czech, English - * (Australian, Indian, UK, and US), German, Japanese, Korean, and Spanish transcription only. + * only. * _For large speech models and next-generation models,_ the parameter can be used with + * all available languages. * *

See [Speaker * labels](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-speaker-labels). @@ -1310,8 +1324,8 @@ public Boolean splitTranscriptAtPhraseEnd() { *

The values increase on a monotonic curve. Specifying one or two decimal places of precision * (for example, `0.55`) is typically more than sufficient. * - *

The parameter is supported with all next-generation models and with most previous-generation - * models. See [Speech detector + *

The parameter is supported with all large speech models, next-generation models and with + * most previous-generation models. See [Speech detector * sensitivity](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-detection#detection-parameters-sensitivity) * and [Language model * support](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-detection#detection-support). @@ -1336,8 +1350,8 @@ public Float speechDetectorSensitivity() { *

The values increase on a monotonic curve. Specifying one or two decimal places of precision * (for example, `0.55`) is typically more than sufficient. * - *

The parameter is supported with all next-generation models and with most previous-generation - * models. See [Background audio + *

The parameter is supported with all large speech models, next-generation models and with + * most previous-generation models. See [Background audio * suppression](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-detection#detection-parameters-suppression) * and [Language model * support](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-detection#detection-support). @@ -1357,9 +1371,9 @@ public Float backgroundAudioSuppression() { * parameter causes the models to produce results even more quickly, though the results might be * less accurate when the parameter is used. * - *

The parameter is not available for previous-generation `Broadband` and `Narrowband` models. - * It is available for most next-generation models. * For a list of next-generation models that - * support low latency, see [Supported next-generation language + *

The parameter is not available for large speech models and previous-generation `Broadband` + * and `Narrowband` models. It is available for most next-generation models. * For a list of + * next-generation models that support low latency, see [Supported next-generation language * models](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-models-ng#models-ng-supported). * * For more information about the `low_latency` parameter, see [Low * latency](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-interim#low-latency). @@ -1373,9 +1387,10 @@ public Boolean lowLatency() { /** * Gets the characterInsertionBias. * - *

For next-generation models, an indication of whether the service is biased to recognize - * shorter or longer strings of characters when developing transcription hypotheses. By default, - * the service is optimized to produce the best balance of strings of different lengths. + *

For large speech models and next-generation models, an indication of whether the service is + * biased to recognize shorter or longer strings of characters when developing transcription + * hypotheses. By default, the service is optimized to produce the best balance of strings of + * different lengths. * *

The default bias is 0.0. The allowable range of values is -1.0 to 1.0. * Negative values * bias the service to favor hypotheses with shorter strings of characters. * Positive values bias diff --git a/speech-to-text/src/main/java/com/ibm/watson/speech_to_text/v1/model/CreateLanguageModelOptions.java b/speech-to-text/src/main/java/com/ibm/watson/speech_to_text/v1/model/CreateLanguageModelOptions.java index e6922f9a05..99cf429c92 100644 --- a/speech-to-text/src/main/java/com/ibm/watson/speech_to_text/v1/model/CreateLanguageModelOptions.java +++ b/speech-to-text/src/main/java/com/ibm/watson/speech_to_text/v1/model/CreateLanguageModelOptions.java @@ -1,5 +1,5 @@ /* - * (C) Copyright IBM Corp. 2018, 2023. + * (C) Copyright IBM Corp. 2024. * * Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with * the License. You may obtain a copy of the License at @@ -39,6 +39,8 @@ public interface BaseModelName { String DE_DE_NARROWBANDMODEL = "de-DE_NarrowbandModel"; /** de-DE_Telephony. */ String DE_DE_TELEPHONY = "de-DE_Telephony"; + /** en-AU. */ + String EN_AU = "en-AU"; /** en-AU_BroadbandModel. */ String EN_AU_BROADBANDMODEL = "en-AU_BroadbandModel"; /** en-AU_Multimedia. */ @@ -47,6 +49,8 @@ public interface BaseModelName { String EN_AU_NARROWBANDMODEL = "en-AU_NarrowbandModel"; /** en-AU_Telephony. */ String EN_AU_TELEPHONY = "en-AU_Telephony"; + /** en-GB. */ + String EN_GB = "en-GB"; /** en-GB_BroadbandModel. */ String EN_GB_BROADBANDMODEL = "en-GB_BroadbandModel"; /** en-GB_Multimedia. */ @@ -55,8 +59,12 @@ public interface BaseModelName { String EN_GB_NARROWBANDMODEL = "en-GB_NarrowbandModel"; /** en-GB_Telephony. */ String EN_GB_TELEPHONY = "en-GB_Telephony"; + /** en-IN. */ + String EN_IN = "en-IN"; /** en-IN_Telephony. */ String EN_IN_TELEPHONY = "en-IN_Telephony"; + /** en-US. */ + String EN_US = "en-US"; /** en-US_BroadbandModel. 
*/ String EN_US_BROADBANDMODEL = "en-US_BroadbandModel"; /** en-US_Multimedia. */ @@ -99,6 +107,8 @@ public interface BaseModelName { String ES_PE_BROADBANDMODEL = "es-PE_BroadbandModel"; /** es-PE_NarrowbandModel. */ String ES_PE_NARROWBANDMODEL = "es-PE_NarrowbandModel"; + /** fr-CA. */ + String FR_CA = "fr-CA"; /** fr-CA_BroadbandModel. */ String FR_CA_BROADBANDMODEL = "fr-CA_BroadbandModel"; /** fr-CA_Multimedia. */ @@ -107,6 +117,8 @@ public interface BaseModelName { String FR_CA_NARROWBANDMODEL = "fr-CA_NarrowbandModel"; /** fr-CA_Telephony. */ String FR_CA_TELEPHONY = "fr-CA_Telephony"; + /** fr-FR. */ + String FR_FR = "fr-FR"; /** fr-FR_BroadbandModel. */ String FR_FR_BROADBANDMODEL = "fr-FR_BroadbandModel"; /** fr-FR_Multimedia. */ @@ -125,6 +137,8 @@ public interface BaseModelName { String IT_IT_MULTIMEDIA = "it-IT_Multimedia"; /** it-IT_Telephony. */ String IT_IT_TELEPHONY = "it-IT_Telephony"; + /** ja-JP. */ + String JA_JP = "ja-JP"; /** ja-JP_BroadbandModel. */ String JA_JP_BROADBANDMODEL = "ja-JP_BroadbandModel"; /** ja-JP_Multimedia. */ diff --git a/speech-to-text/src/main/java/com/ibm/watson/speech_to_text/v1/model/GetModelOptions.java b/speech-to-text/src/main/java/com/ibm/watson/speech_to_text/v1/model/GetModelOptions.java index 58d75de720..88efe09bef 100644 --- a/speech-to-text/src/main/java/com/ibm/watson/speech_to_text/v1/model/GetModelOptions.java +++ b/speech-to-text/src/main/java/com/ibm/watson/speech_to_text/v1/model/GetModelOptions.java @@ -1,5 +1,5 @@ /* - * (C) Copyright IBM Corp. 2018, 2023. + * (C) Copyright IBM Corp. 2024. * * Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with * the License. You may obtain a copy of the License at @@ -36,6 +36,8 @@ public interface ModelId { String DE_DE_NARROWBANDMODEL = "de-DE_NarrowbandModel"; /** de-DE_Telephony. */ String DE_DE_TELEPHONY = "de-DE_Telephony"; + /** en-AU. */ + String EN_AU = "en-AU"; /** en-AU_BroadbandModel. 
*/ String EN_AU_BROADBANDMODEL = "en-AU_BroadbandModel"; /** en-AU_Multimedia. */ @@ -44,6 +46,8 @@ public interface ModelId { String EN_AU_NARROWBANDMODEL = "en-AU_NarrowbandModel"; /** en-AU_Telephony. */ String EN_AU_TELEPHONY = "en-AU_Telephony"; + /** en-GB. */ + String EN_GB = "en-GB"; /** en-GB_BroadbandModel. */ String EN_GB_BROADBANDMODEL = "en-GB_BroadbandModel"; /** en-GB_Multimedia. */ @@ -52,8 +56,12 @@ public interface ModelId { String EN_GB_NARROWBANDMODEL = "en-GB_NarrowbandModel"; /** en-GB_Telephony. */ String EN_GB_TELEPHONY = "en-GB_Telephony"; + /** en-IN. */ + String EN_IN = "en-IN"; /** en-IN_Telephony. */ String EN_IN_TELEPHONY = "en-IN_Telephony"; + /** en-US. */ + String EN_US = "en-US"; /** en-US_BroadbandModel. */ String EN_US_BROADBANDMODEL = "en-US_BroadbandModel"; /** en-US_Multimedia. */ @@ -96,6 +104,8 @@ public interface ModelId { String ES_PE_BROADBANDMODEL = "es-PE_BroadbandModel"; /** es-PE_NarrowbandModel. */ String ES_PE_NARROWBANDMODEL = "es-PE_NarrowbandModel"; + /** fr-CA. */ + String FR_CA = "fr-CA"; /** fr-CA_BroadbandModel. */ String FR_CA_BROADBANDMODEL = "fr-CA_BroadbandModel"; /** fr-CA_Multimedia. */ @@ -104,6 +114,8 @@ public interface ModelId { String FR_CA_NARROWBANDMODEL = "fr-CA_NarrowbandModel"; /** fr-CA_Telephony. */ String FR_CA_TELEPHONY = "fr-CA_Telephony"; + /** fr-FR. */ + String FR_FR = "fr-FR"; /** fr-FR_BroadbandModel. */ String FR_FR_BROADBANDMODEL = "fr-FR_BroadbandModel"; /** fr-FR_Multimedia. */ @@ -122,6 +134,8 @@ public interface ModelId { String IT_IT_MULTIMEDIA = "it-IT_Multimedia"; /** it-IT_Telephony. */ String IT_IT_TELEPHONY = "it-IT_Telephony"; + /** ja-JP. */ + String JA_JP = "ja-JP"; /** ja-JP_BroadbandModel. */ String JA_JP_BROADBANDMODEL = "ja-JP_BroadbandModel"; /** ja-JP_Multimedia. 
*/ diff --git a/speech-to-text/src/main/java/com/ibm/watson/speech_to_text/v1/model/RecognizeOptions.java b/speech-to-text/src/main/java/com/ibm/watson/speech_to_text/v1/model/RecognizeOptions.java index 94476fbdb5..21119472b5 100644 --- a/speech-to-text/src/main/java/com/ibm/watson/speech_to_text/v1/model/RecognizeOptions.java +++ b/speech-to-text/src/main/java/com/ibm/watson/speech_to_text/v1/model/RecognizeOptions.java @@ -1,5 +1,5 @@ /* - * (C) Copyright IBM Corp. 2016, 2024. + * (C) Copyright IBM Corp. 2024. * * Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with * the License. You may obtain a copy of the License at @@ -51,6 +51,8 @@ public interface Model { String DE_DE_NARROWBANDMODEL = "de-DE_NarrowbandModel"; /** de-DE_Telephony. */ String DE_DE_TELEPHONY = "de-DE_Telephony"; + /** en-AU. */ + String EN_AU = "en-AU"; /** en-AU_BroadbandModel. */ String EN_AU_BROADBANDMODEL = "en-AU_BroadbandModel"; /** en-AU_Multimedia. */ @@ -59,8 +61,12 @@ public interface Model { String EN_AU_NARROWBANDMODEL = "en-AU_NarrowbandModel"; /** en-AU_Telephony. */ String EN_AU_TELEPHONY = "en-AU_Telephony"; + /** en-IN. */ + String EN_IN = "en-IN"; /** en-IN_Telephony. */ String EN_IN_TELEPHONY = "en-IN_Telephony"; + /** en-GB. */ + String EN_GB = "en-GB"; /** en-GB_BroadbandModel. */ String EN_GB_BROADBANDMODEL = "en-GB_BroadbandModel"; /** en-GB_Multimedia. */ @@ -69,6 +75,8 @@ public interface Model { String EN_GB_NARROWBANDMODEL = "en-GB_NarrowbandModel"; /** en-GB_Telephony. */ String EN_GB_TELEPHONY = "en-GB_Telephony"; + /** en-US. */ + String EN_US = "en-US"; /** en-US_BroadbandModel. */ String EN_US_BROADBANDMODEL = "en-US_BroadbandModel"; /** en-US_Multimedia. */ @@ -111,6 +119,8 @@ public interface Model { String ES_PE_BROADBANDMODEL = "es-PE_BroadbandModel"; /** es-PE_NarrowbandModel. */ String ES_PE_NARROWBANDMODEL = "es-PE_NarrowbandModel"; + /** fr-CA. 
*/ + String FR_CA = "fr-CA"; /** fr-CA_BroadbandModel. */ String FR_CA_BROADBANDMODEL = "fr-CA_BroadbandModel"; /** fr-CA_Multimedia. */ @@ -119,6 +129,8 @@ public interface Model { String FR_CA_NARROWBANDMODEL = "fr-CA_NarrowbandModel"; /** fr-CA_Telephony. */ String FR_CA_TELEPHONY = "fr-CA_Telephony"; + /** fr-FR. */ + String FR_FR = "fr-FR"; /** fr-FR_BroadbandModel. */ String FR_FR_BROADBANDMODEL = "fr-FR_BroadbandModel"; /** fr-FR_Multimedia. */ @@ -137,6 +149,8 @@ public interface Model { String IT_IT_MULTIMEDIA = "it-IT_Multimedia"; /** it-IT_Telephony. */ String IT_IT_TELEPHONY = "it-IT_Telephony"; + /** ja-JP. */ + String JA_JP = "ja-JP"; /** ja-JP_BroadbandModel. */ String JA_JP_BROADBANDMODEL = "ja-JP_BroadbandModel"; /** ja-JP_Multimedia. */ @@ -184,6 +198,7 @@ public interface Model { protected InputStream audio; protected String contentType; protected String model; + protected Boolean speechBeginEvent; protected String languageCustomizationId; protected String acousticCustomizationId; protected String baseModelVersion; @@ -214,6 +229,7 @@ public static class Builder { private InputStream audio; private String contentType; private String model; + private Boolean speechBeginEvent; private String languageCustomizationId; private String acousticCustomizationId; private String baseModelVersion; @@ -248,6 +264,7 @@ private Builder(RecognizeOptions recognizeOptions) { this.audio = recognizeOptions.audio; this.contentType = recognizeOptions.contentType; this.model = recognizeOptions.model; + this.speechBeginEvent = recognizeOptions.speechBeginEvent; this.languageCustomizationId = recognizeOptions.languageCustomizationId; this.acousticCustomizationId = recognizeOptions.acousticCustomizationId; this.baseModelVersion = recognizeOptions.baseModelVersion; @@ -343,6 +360,17 @@ public Builder model(String model) { return this; } + /** + * Set the speechBeginEvent. 
+ * + * @param speechBeginEvent the speechBeginEvent + * @return the RecognizeOptions builder + */ + public Builder speechBeginEvent(Boolean speechBeginEvent) { + this.speechBeginEvent = speechBeginEvent; + return this; + } + /** * Set the languageCustomizationId. * @@ -627,6 +655,7 @@ protected RecognizeOptions(Builder builder) { audio = builder.audio; contentType = builder.contentType; model = builder.model; + speechBeginEvent = builder.speechBeginEvent; languageCustomizationId = builder.languageCustomizationId; acousticCustomizationId = builder.acousticCustomizationId; baseModelVersion = builder.baseModelVersion; @@ -706,6 +735,22 @@ public String model() { return model; } + /** + * Gets the speechBeginEvent. + * + *

If `true`, the service returns a response object `SpeechActivity` which contains the time + * when a speech activity is detected in the stream. This can be used both in standard and low + * latency mode. This feature enables client applications to know that some words/speech has been + * detected and the service is in the process of decoding. This can be used in lieu of interim + * results in standard mode. See [Using speech recognition + * parameters](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-service-features#features-parameters). + * + * @return the speechBeginEvent + */ + public Boolean speechBeginEvent() { + return speechBeginEvent; + } + /** * Gets the languageCustomizationId. * @@ -764,9 +809,9 @@ public String baseModelVersion() { * custom language model compared to those from the base model for the current request. * *

Specify a value between 0.0 and 1.0. Unless a different customization weight was specified - * for the custom model when the model was trained, the default value is: * 0.3 for - * previous-generation models * 0.2 for most next-generation models * 0.1 for next-generation - * English and Japanese models + * for the custom model when the model was trained, the default value is: * 0.5 for large speech + * models * 0.3 for previous-generation models * 0.2 for most next-generation models * 0.1 for + * next-generation English and Japanese models * *

A customization weight that you specify overrides a weight that was specified when the * custom model was trained. The default value yields the best performance in general. Assign a @@ -929,8 +974,8 @@ public Boolean smartFormatting() { /** * Gets the smartFormattingVersion. * - *

Smart formatting version is for next-generation models and that is supported in US English, - * Brazilian Portuguese, French and German languages. + *

Smart formatting version for large speech models and next-generation models is supported in + * US English, Brazilian Portuguese, French, German, Spanish and French Canadian languages. * * @return the smartFormattingVersion */ @@ -947,8 +992,8 @@ public Long smartFormattingVersion() { * of whether you specify `false` for the parameter. * _For previous-generation models,_ the * parameter can be used with Australian English, US English, German, Japanese, Korean, and * Spanish (both broadband and narrowband models) and UK English (narrowband model) transcription - * only. * _For next-generation models,_ the parameter can be used with Czech, English - * (Australian, Indian, UK, and US), German, Japanese, Korean, and Spanish transcription only. + * only. * _For large speech models and next-generation models,_ the parameter can be used with + * all available languages. * *

See [Speaker * labels](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-speaker-labels). @@ -1080,8 +1125,8 @@ public Boolean splitTranscriptAtPhraseEnd() { *

The values increase on a monotonic curve. Specifying one or two decimal places of precision * (for example, `0.55`) is typically more than sufficient. * - *

The parameter is supported with all next-generation models and with most previous-generation - * models. See [Speech detector + *

The parameter is supported with all large speech models, next-generation models and with + * most previous-generation models. See [Speech detector * sensitivity](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-detection#detection-parameters-sensitivity) * and [Language model * support](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-detection#detection-support). @@ -1106,8 +1151,8 @@ public Float speechDetectorSensitivity() { *

The values increase on a monotonic curve. Specifying one or two decimal places of precision * (for example, `0.55`) is typically more than sufficient. * - *

The parameter is supported with all next-generation models and with most previous-generation - * models. See [Background audio + *

The parameter is supported with all large speech models, next-generation models and with + * most previous-generation models. See [Background audio * suppression](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-detection#detection-parameters-suppression) * and [Language model * support](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-detection#detection-support). @@ -1127,9 +1172,9 @@ public Float backgroundAudioSuppression() { * parameter causes the models to produce results even more quickly, though the results might be * less accurate when the parameter is used. * - *

The parameter is not available for previous-generation `Broadband` and `Narrowband` models. - * It is available for most next-generation models. * For a list of next-generation models that - * support low latency, see [Supported next-generation language + *

The parameter is not available for large speech models and previous-generation `Broadband` + * and `Narrowband` models. It is available for most next-generation models. * For a list of + * next-generation models that support low latency, see [Supported next-generation language * models](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-models-ng#models-ng-supported). * * For more information about the `low_latency` parameter, see [Low * latency](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-interim#low-latency). @@ -1143,9 +1188,10 @@ public Boolean lowLatency() { /** * Gets the characterInsertionBias. * - *

For next-generation models, an indication of whether the service is biased to recognize - * shorter or longer strings of characters when developing transcription hypotheses. By default, - * the service is optimized to produce the best balance of strings of different lengths. + *

For large speech models and next-generation models, an indication of whether the service is + * biased to recognize shorter or longer strings of characters when developing transcription + * hypotheses. By default, the service is optimized to produce the best balance of strings of + * different lengths. * *

The default bias is 0.0. The allowable range of values is -1.0 to 1.0. * Negative values * bias the service to favor hypotheses with shorter strings of characters. * Positive values bias diff --git a/speech-to-text/src/main/java/com/ibm/watson/speech_to_text/v1/model/TrainLanguageModelOptions.java b/speech-to-text/src/main/java/com/ibm/watson/speech_to_text/v1/model/TrainLanguageModelOptions.java index f32bd6de8d..53a61a7f92 100644 --- a/speech-to-text/src/main/java/com/ibm/watson/speech_to_text/v1/model/TrainLanguageModelOptions.java +++ b/speech-to-text/src/main/java/com/ibm/watson/speech_to_text/v1/model/TrainLanguageModelOptions.java @@ -1,5 +1,5 @@ /* - * (C) Copyright IBM Corp. 2018, 2024. + * (C) Copyright IBM Corp. 2024. * * Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with * the License. You may obtain a copy of the License at @@ -25,9 +25,9 @@ public class TrainLanguageModelOptions extends GenericModel { * that were added or modified by the user directly. The model is not trained on new words * extracted from corpora or grammars. * - *

_For custom models that are based on next-generation models_, the service ignores the - * parameter. The words resource contains only custom words that the user adds or modifies - * directly, so the parameter is unnecessary. + *

_For custom models that are based on large speech models and next-generation models_, the + * service ignores the `word_type_to_add` parameter. The words resource contains only custom words + * that the user adds or modifies directly, so the parameter is unnecessary. */ public interface WordTypeToAdd { /** all. */ @@ -184,9 +184,9 @@ public String customizationId() { * that were added or modified by the user directly. The model is not trained on new words * extracted from corpora or grammars. * - *

_For custom models that are based on next-generation models_, the service ignores the - * parameter. The words resource contains only custom words that the user adds or modifies - * directly, so the parameter is unnecessary. + *

_For custom models that are based on large speech models and next-generation models_, the + * service ignores the `word_type_to_add` parameter. The words resource contains only custom words + * that the user adds or modifies directly, so the parameter is unnecessary. * * @return the wordTypeToAdd */ @@ -200,8 +200,8 @@ public String wordTypeToAdd() { *

Specifies a customization weight for the custom language model. The customization weight * tells the service how much weight to give to words from the custom language model compared to * those from the base model for speech recognition. Specify a value between 0.0 and 1.0. The - * default value is: * 0.3 for previous-generation models * 0.2 for most next-generation models * - * 0.1 for next-generation English and Japanese models + * default value is: * 0.5 for large speech models * 0.3 for previous-generation models * 0.2 for + * most next-generation models * 0.1 for next-generation English and Japanese models * *

The default value yields the best performance in general. Assign a higher value if your * audio makes frequent use of OOV words from the custom model. Use caution when setting the diff --git a/speech-to-text/src/test/java/com/ibm/watson/speech_to_text/v1/SpeechToTextTest.java b/speech-to-text/src/test/java/com/ibm/watson/speech_to_text/v1/SpeechToTextTest.java index 1c340b1772..d345abbb9d 100755 --- a/speech-to-text/src/test/java/com/ibm/watson/speech_to_text/v1/SpeechToTextTest.java +++ b/speech-to-text/src/test/java/com/ibm/watson/speech_to_text/v1/SpeechToTextTest.java @@ -1,5 +1,5 @@ /* - * (C) Copyright IBM Corp. 2019, 2024. + * (C) Copyright IBM Corp. 2024. * * Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with * the License. You may obtain a copy of the License at @@ -226,6 +226,7 @@ public void testRecognizeWOptions() throws Throwable { .audio(TestUtilities.createMockStream("This is a mock file.")) .contentType("application/octet-stream") .model("en-US_BroadbandModel") + .speechBeginEvent(false) .languageCustomizationId("testString") .acousticCustomizationId("testString") .baseModelVersion("testString") @@ -270,6 +271,7 @@ public void testRecognizeWOptions() throws Throwable { Map query = TestUtilities.parseQueryString(request); assertNotNull(query); assertEquals(query.get("model"), "en-US_BroadbandModel"); + assertEquals(Boolean.valueOf(query.get("speech_begin_event")), Boolean.valueOf(false)); assertEquals(query.get("language_customization_id"), "testString"); assertEquals(query.get("acoustic_customization_id"), "testString"); assertEquals(query.get("base_model_version"), "testString"); diff --git a/speech-to-text/src/test/java/com/ibm/watson/speech_to_text/v1/model/RecognizeOptionsTest.java b/speech-to-text/src/test/java/com/ibm/watson/speech_to_text/v1/model/RecognizeOptionsTest.java index f146f693b1..9a1da897a5 100644 --- 
a/speech-to-text/src/test/java/com/ibm/watson/speech_to_text/v1/model/RecognizeOptionsTest.java +++ b/speech-to-text/src/test/java/com/ibm/watson/speech_to_text/v1/model/RecognizeOptionsTest.java @@ -1,5 +1,5 @@ /* - * (C) Copyright IBM Corp. 2020, 2024. + * (C) Copyright IBM Corp. 2024. * * Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with * the License. You may obtain a copy of the License at @@ -36,6 +36,7 @@ public void testRecognizeOptions() throws Throwable { .audio(TestUtilities.createMockStream("This is a mock file.")) .contentType("application/octet-stream") .model("en-US_BroadbandModel") + .speechBeginEvent(false) .languageCustomizationId("testString") .acousticCustomizationId("testString") .baseModelVersion("testString") @@ -66,6 +67,7 @@ public void testRecognizeOptions() throws Throwable { IOUtils.toString(TestUtilities.createMockStream("This is a mock file."))); assertEquals(recognizeOptionsModel.contentType(), "application/octet-stream"); assertEquals(recognizeOptionsModel.model(), "en-US_BroadbandModel"); + assertEquals(recognizeOptionsModel.speechBeginEvent(), Boolean.valueOf(false)); assertEquals(recognizeOptionsModel.languageCustomizationId(), "testString"); assertEquals(recognizeOptionsModel.acousticCustomizationId(), "testString"); assertEquals(recognizeOptionsModel.baseModelVersion(), "testString");
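Review note: the updated Javadoc for the customization weight lists four defaults: 0.5 for large speech models, 0.3 for previous-generation models, 0.2 for most next-generation models, and 0.1 for next-generation English and Japanese models. The sketch below restates that table as code and is not part of the SDK. The class `CustomizationWeightDefaults` and the method `defaultWeight` are hypothetical. Classifying a model by its ID suffix is an assumption inferred from the constants in this diff: a bare language ID such as `en-US` is treated as a large speech model, `_BroadbandModel` and `_NarrowbandModel` as previous-generation, and `_Multimedia` and `_Telephony` as next-generation. The Javadoc says "most" next-generation models, so the service may differ for individual models.

```java
// Hypothetical helper, not part of the Watson SDK: it only restates the
// default customization weights listed in the Javadoc of this change.
final class CustomizationWeightDefaults {
    private CustomizationWeightDefaults() {}

    /**
     * Returns the documented default customization weight for a model ID.
     * ASSUMPTION: the model generation is inferred from the ID suffix.
     */
    static double defaultWeight(String modelId) {
        // Previous-generation models: Broadband / Narrowband.
        if (modelId.endsWith("_BroadbandModel") || modelId.endsWith("_NarrowbandModel")) {
            return 0.3;
        }
        // Next-generation models: Multimedia / Telephony.
        if (modelId.endsWith("_Multimedia") || modelId.endsWith("_Telephony")) {
            boolean englishOrJapanese = modelId.startsWith("en-") || modelId.startsWith("ja-");
            return englishOrJapanese ? 0.1 : 0.2;
        }
        // ASSUMPTION: any other ID (for example the bare "en-US" added in
        // this diff) is treated as a large speech model.
        return 0.5;
    }

    public static void main(String[] args) {
        String[] ids = {"en-US", "en-US_BroadbandModel", "de-DE_Telephony", "ja-JP_Multimedia"};
        for (String id : ids) {
            System.out.println(id + " -> " + defaultWeight(id));
        }
    }
}
```

A caller that sets `customizationWeight` explicitly on `RecognizeOptions` or `TrainLanguageModelOptions` overrides these defaults, as the Javadoc in this change states.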