NameError: name 'copy' is not defined when setting timestamp = True #9820
Unanswered
AbdelrhmanElnenaey asked this question in Q&A
the problem does not appear if i use |
I am getting this error:
```
NameError Traceback (most recent call last)
in <cell line: 16>()
14
15 # specify flag `return_hypotheses=True``
---> 16 hypotheses = asr_model.transcribe(["/content/audio_sample_20.wav"], return_hypotheses=True)
17
18 # if hypotheses form a tuple (from RNNT), extract just "best" hypotheses
4 frames
/usr/local/lib/python3.10/dist-packages/torch/utils/_contextlib.py in decorate_context(*args, **kwargs)
113 def decorate_context(*args, **kwargs):
114 with ctx_factory():
--> 115 return func(*args, **kwargs)
116
117 return decorate_context
/usr/local/lib/python3.10/dist-packages/nemo/collections/asr/models/rnnt_models.py in transcribe(self, paths2audio_files, batch_size, return_hypotheses, partial_hypothesis, num_workers, channel_selector, augmentor, verbose)
302 input_signal=test_batch[0].to(device), input_signal_length=test_batch[1].to(device)
303 )
--> 304 best_hyp, all_hyp = self.decoding.rnnt_decoder_predictions_tensor(
305 encoded,
306 encoded_len,
/usr/local/lib/python3.10/dist-packages/nemo/collections/asr/parts/submodules/rnnt_decoding.py in rnnt_decoder_predictions_tensor(self, encoder_output, encoded_lengths, return_hypotheses, partial_hypotheses)
487
488 else:
--> 489 hypotheses = self.decode_hypothesis(prediction_list) # type: List[str]
490
491 # If computing timestamps
/usr/local/lib/python3.10/dist-packages/nemo/collections/asr/parts/submodules/rnnt_decoding.py in decode_hypothesis(self, hypotheses_list)
1446 A list of strings.
1447 """
-> 1448 hypotheses = super().decode_hypothesis(hypotheses_list)
1449 if self.compute_langs:
1450 if isinstance(self.tokenizer, AggregateTokenizer):
/usr/local/lib/python3.10/dist-packages/nemo/collections/asr/parts/submodules/rnnt_decoding.py in decode_hypothesis(self, hypotheses_list)
538 # this is done so that `rnnt_decoder_predictions_tensor()` can process this hypothesis
539 # in order to compute exact time stamps.
--> 540 alignments = copy.deepcopy(hypotheses_list[ind].alignments)
541 token_repetitions = [1] * len(alignments) # preserve number of repetitions per token
542 hypothesis = (prediction, alignments, token_repetitions)
NameError: name 'copy' is not defined
```
when I try to set the timestamp option (`compute_timestamps`) in the decoding configuration to True.
Here is the code:
```
# import nemo_asr and instantiate asr_model as above
import nemo.collections.asr as nemo_asr
import copy

asr_model = nemo_asr.models.ASRModel.from_pretrained("stt_en_fastconformer_transducer_large")

# update decoding config to preserve alignments and compute timestamps
from omegaconf import OmegaConf, open_dict

decoding_cfg = asr_model.cfg.decoding
with open_dict(decoding_cfg):
    decoding_cfg.preserve_alignments = True
    decoding_cfg.compute_timestamps = True
asr_model.change_decoding_strategy(decoding_cfg)

# specify flag `return_hypotheses=True`
hypotheses = asr_model.transcribe(["/content/audio_sample_20.wav"], return_hypotheses=True)

# if hypotheses form a tuple (from RNNT), extract just "best" hypotheses
if type(hypotheses) == tuple and len(hypotheses) == 2:
    hypotheses = hypotheses[0]

timestamp_dict = hypotheses[0].timestep  # extract timesteps from hypothesis of first (and only) audio file
print("Hypothesis contains following timestep information :", list(timestamp_dict.keys()))

# For a FastConformer model, you can display the word timestamps as follows:
# 80ms is duration of a timestep at output of the Conformer
time_stride = 8 * asr_model.cfg.preprocessor.window_stride

word_timestamps = timestamp_dict['word']

for stamp in word_timestamps:
    start = stamp['start_offset'] * time_stride
    end = stamp['end_offset'] * time_stride
    word = stamp['char'] if 'char' in stamp else stamp['word']
```
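A note on why the `import copy` in the notebook does not seem to help: the traceback shows the `NameError` being raised inside `rnnt_decoding.py`, and a Python function looks up global names in the module where it was defined, not in the caller's namespace. The sketch below reproduces that behaviour with a hypothetical stand-in module (`fake_decoding` and `snapshot` are made-up names for illustration, not NeMo code) and shows that assigning the missing name onto the module object makes the lookup succeed.

```python
import copy
import types

# Hypothetical stand-in for a library module that calls copy.deepcopy()
# without importing copy itself (names here are illustrative only).
source = '''
def snapshot(items):
    return copy.deepcopy(items)
'''
fake_module = types.ModuleType("fake_decoding")
exec(source, fake_module.__dict__)


def call_snapshot():
    """Call the module function and report a NameError as a string."""
    try:
        return fake_module.snapshot([[1, 2], [3]])
    except NameError as err:
        return f"NameError: {err}"


# Fails even though this script has already imported copy,
# because snapshot() resolves globals in fake_module, not here.
before = call_snapshot()

# Inject the missing name into the module's own globals.
fake_module.copy = copy
after = call_snapshot()

print(before)
print(after)
```

If the same mechanism is at work here, assigning `copy` onto the imported `nemo.collections.asr.parts.submodules.rnnt_decoding` module before calling `transcribe` might serve as a stopgap. That is an untested assumption on my part, and it would also be worth checking whether a newer NeMo release adds the missing import.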