
SINGA-502 Avoid moving data between host and gpu in SoftmaxCrossEntropy #577

Merged: 1 commit into apache:dev on Jan 23, 2020

Conversation

chrishkchris
Contributor

softmax_cross_entropy used to move data from the GPU to the host and then back to the GPU, so the whole function needed to be changed, for efficiency, for asynchronous execution, and to support operation buffering in the future. The updated operation:

class SoftMaxCrossEntropy(Operation):

    def __init__(self, t):
        super(SoftMaxCrossEntropy, self).__init__()
        self.t = t.data

    def forward(self, x):
        self.p = singa.SoftMax(x)
        # allocate the scalar loss on the same device as the softmax output
        loss = CTensor((1,), self.p.device())
        ret = singa.CrossEntropyFwd(self.p, self.t)
        loss.SetFloatValue(singa.SumAsFloat(ret) / x.shape()[0])
        return loss

Here SumAsFloat returns a C++ float value, and this value is written back to the GPU in the SetFloatValue function.
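For context, a minimal sketch of how this op is exercised end to end through the Python autograd wrapper; the tensor setup below is illustrative, loosely based on the scripts under examples/autograd, and is not part of this patch:

    import numpy as np
    from singa import autograd, device, tensor

    autograd.training = True
    dev = device.create_cuda_gpu()        # one CUDA device, e.g. the T4 used below

    x = tensor.Tensor((4, 10), dev)       # logits for a batch of 4 over 10 classes
    x.gaussian(0.0, 1.0)

    t_np = np.zeros((4, 10), dtype=np.float32)
    t_np[np.arange(4), 0] = 1.0           # one-hot targets, class 0 for every sample
    t = tensor.from_numpy(t_np)
    t.to_device(dev)

    loss = autograd.softmax_cross_entropy(x, t)
    # the scalar loss stays on the GPU; copy it out only when it is actually needed
    print(tensor.to_numpy(loss))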

@chrishkchris
Contributor Author

chrishkchris commented Jan 21, 2020

Some example runs on a T4 GPU:

ubuntu@ip-172-31-17-75:~/singa/examples/autograd$ python3 mlp.py
train_data_shape: (400, 2)
train_label_shape: (400, 2)
training loss =  0.6908968
training loss =  0.59333104
training loss =  0.5687339
training loss =  0.5405281
training loss =  0.46624574
training loss =  0.35966498
training loss =  0.28517348
training loss =  0.23358232
training loss =  0.19751398
training loss =  0.17123318
training loss =  0.15162155
ubuntu@ip-172-31-17-75:~/singa/examples/autograd$ python3 mnist_cnn.py
Starting Epoch 0:
Training loss = 585.281616, training accuracy = 0.791572
Evaluation accuracy = 0.940204, Elapsed Time = 2.901998s
Starting Epoch 1:
Training loss = 234.667984, training accuracy = 0.920758
Evaluation accuracy = 0.961839, Elapsed Time = 2.898471s
Starting Epoch 2:
Training loss = 168.530197, training accuracy = 0.943420
Evaluation accuracy = 0.970753, Elapsed Time = 2.903471s
Starting Epoch 3:
Training loss = 137.636353, training accuracy = 0.953875
Evaluation accuracy = 0.974860, Elapsed Time = 2.908554s
Starting Epoch 4:
Training loss = 118.752136, training accuracy = 0.959779
Evaluation accuracy = 0.971955, Elapsed Time = 2.916549s
Starting Epoch 5:
Training loss = 105.220406, training accuracy = 0.964131
Evaluation accuracy = 0.974359, Elapsed Time = 2.926336s
Starting Epoch 6:
Training loss = 95.145279, training accuracy = 0.968350
Evaluation accuracy = 0.980569, Elapsed Time = 2.918108s
Starting Epoch 7:
Training loss = 86.757538, training accuracy = 0.971251
Evaluation accuracy = 0.982572, Elapsed Time = 2.920413s
Starting Epoch 8:
Training loss = 81.706383, training accuracy = 0.972252
Evaluation accuracy = 0.984075, Elapsed Time = 2.924388s
Starting Epoch 9:
Training loss = 77.409966, training accuracy = 0.973769
Evaluation accuracy = 0.981270, Elapsed Time = 2.929271s
ubuntu@ip-172-31-17-75:~/singa/examples/autograd$ python3 resnet.py
Start intialization............
100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 100/100 [00:29<00:00,  3.45it/s]
Throughput = 110.30029570709767 per second
Total=0.2901170825958252, forward=0.09263657093048096, softmax=0.0016593122482299804, backward=0.19582119941711426, sgd=0.009393131732940674
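As a sanity check on these numbers (assuming the default batch size of 32 in resnet.py): one iteration takes about 0.2901 s, i.e. roughly 3.45 it/s as tqdm reports, and 32 / 0.2901 ≈ 110.3 images per second, matching the reported throughput.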

@nudles nudles merged commit b86add0 into apache:dev Jan 23, 2020