This is a really good question! I suppose you are referring to the bf16 and tf32 flags that we used in the training script, which are both set to True.
While both refer to floating-point data types, their usage is quite different. PyTorch lets us allocate tensors directly in data types such as fp16, fp32, and bf16. In contrast, tf32 (TensorFloat-32) is an internal computation format available on NVIDIA's Ampere-architecture GPUs and cannot be used as a tensor dtype in PyTorch code.
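A minimal sketch of the distinction (assuming a recent PyTorch build and a CUDA device; the tensor shapes are arbitrary):

```python
import torch

# fp16/bf16/fp32 are ordinary tensor dtypes we can allocate directly:
a = torch.randn(64, 64, dtype=torch.bfloat16, device="cuda")
b = torch.randn(64, 64, dtype=torch.float16, device="cuda")

# There is no torch.tf32 dtype. tf32 is a TensorCore execution mode,
# toggled globally through backend flags; fp32 matmuls and convolutions
# are then carried out in tf32 under the hood:
torch.backends.cuda.matmul.allow_tf32 = True
torch.backends.cudnn.allow_tf32 = True
```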
It's a special computational format/mode designed for the Tensor Cores on NVIDIA GPUs, speeding up matmuls and the like. It keeps an 8-bit exponent (the same numeric range as fp32) and a 10-bit mantissa (the same precision as fp16). When we are training or running inference in fp32, we can simply enable tf32 to gain a throughput improvement, and it also works alongside fp16 or bf16 mixed precision. There was also an interesting discussion on benchmarking the gains from tf32 with different data types in huggingface/transformers#14608 (comment).
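To make the "tf32 alongside bf16 mixed precision" point concrete, here is a sketch (again assuming a CUDA device; the model and shapes are placeholders):

```python
import torch

torch.backends.cuda.matmul.allow_tf32 = True  # fp32 ops take the tf32 fast path

model = torch.nn.Linear(1024, 1024).cuda()    # weights remain fp32
x = torch.randn(8, 1024, device="cuda")

# Under bf16 autocast, eligible ops run in bf16, while any remaining
# fp32 ops still benefit from the tf32 flag set above.
with torch.autocast(device_type="cuda", dtype=torch.bfloat16):
    out = model(x)
print(out.dtype)  # torch.bfloat16
```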
Basically, if you have an Ampere GPU such as an RTX 3080/3090 or an A100, you can simply enable tf32 to speed up your tasks. That's also why previous codebases such as Alpaca and Vicuna set both flags to True.
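With the HF Trainer, that comes down to something like the following sketch (the `output_dir` value is a placeholder):

```python
from transformers import TrainingArguments

# bf16=True enables bf16 mixed-precision training; tf32=True additionally
# lets the remaining fp32 operations use the Ampere tf32 mode.
args = TrainingArguments(
    output_dir="./output",  # placeholder
    bf16=True,
    tf32=True,
)
```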
Thanks for your brilliant idea and for open-sourcing it!
A question about the data types:
I notice that bf16 and tf32 are both set to True. I'm confused about what will happen. Can they be used together?