feat: add mixed-precision and agc to gradaccum optimizer #550
| Job | Run time |
|---|---|
| | 37s |
| | 1m 30s |
| | 1m 18s |
| | 12m 47s |
| | 1m 15s |
| | 1m 46s |
| | 11m 41s |
| | 1m 9s |
| | 1m 29s |
| | 1m 6s |
| | 12m 2s |
| | 1m 14s |
| | 12m 25s |
| | 12m 29s |
| | 13m 23s |
| | 7m 44s |
| | 7m 16s |
| | 7m 48s |
| | 7m 14s |
| | 7m 41s |
| | 7m 13s |
| | 13m 45s |
| | 13m 20s |
| | 13m 25s |
| | 12m 53s |
| | 27m 33s |
| | 26m 47s |
| Total | 3h 58m 50s |