
SINGA-502 Avoid moving data between host and gpu in SoftmaxCrossEntropy #577

Merged: 1 commit into apache:dev on Jan 23, 2020

Conversation

chrishkchris
Contributor

softmax_cross_entropy used to move data from the GPU to the host and then back to the GPU, so the whole function needed to be changed, for efficiency, for asynchronous execution, and to support operation buffering in the future. The updated operation:

class SoftMaxCrossEntropy(Operation):

    def __init__(self, t):
        super(SoftMaxCrossEntropy, self).__init__()
        self.t = t.data

    def forward(self, x):
        self.p = singa.SoftMax(x)
        # allocate the scalar loss on the same device as the softmax output
        loss = CTensor((1,), self.p.device())
        ret = singa.CrossEntropyFwd(self.p, self.t)
        loss.SetFloatValue(singa.SumAsFloat(ret) / x.shape()[0])
        return loss

Here SumAsFloat returns a C++ float value, and this value is written back to the GPU in the SetFloatValue function.
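For context, a minimal sketch of how this op is exercised end to end through the Python autograd wrapper; the tensor setup below is illustrative, loosely based on the scripts under examples/autograd, and is not part of this patch:

    import numpy as np
    from singa import autograd, device, tensor

    autograd.training = True
    dev = device.create_cuda_gpu()        # one CUDA device, e.g. the T4 used below

    x = tensor.Tensor((4, 10), dev)       # logits for a batch of 4 over 10 classes
    x.gaussian(0.0, 1.0)

    t_np = np.zeros((4, 10), dtype=np.float32)
    t_np[np.arange(4), 0] = 1.0           # one-hot targets, class 0 for every sample
    t = tensor.from_numpy(t_np)
    t.to_device(dev)

    loss = autograd.softmax_cross_entropy(x, t)
    # the scalar loss stays on the GPU; copy it out only when it is actually needed
    print(tensor.to_numpy(loss))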

@chrishkchris
Contributor Author

chrishkchris commented Jan 21, 2020

Some example runs on a T4 GPU:

ubuntu@ip-172-31-17-75:~/singa/examples/autograd$ python3 mlp.py
train_data_shape: (400, 2)
train_label_shape: (400, 2)
training loss =  0.6908968
training loss =  0.59333104
training loss =  0.5687339
training loss =  0.5405281
training loss =  0.46624574
training loss =  0.35966498
training loss =  0.28517348
training loss =  0.23358232
training loss =  0.19751398
training loss =  0.17123318
training loss =  0.15162155
ubuntu@ip-172-31-17-75:~/singa/examples/autograd$ python3 mnist_cnn.py
Starting Epoch 0:
Training loss = 585.281616, training accuracy = 0.791572
Evaluation accuracy = 0.940204, Elapsed Time = 2.901998s
Starting Epoch 1:
Training loss = 234.667984, training accuracy = 0.920758
Evaluation accuracy = 0.961839, Elapsed Time = 2.898471s
Starting Epoch 2:
Training loss = 168.530197, training accuracy = 0.943420
Evaluation accuracy = 0.970753, Elapsed Time = 2.903471s
Starting Epoch 3:
Training loss = 137.636353, training accuracy = 0.953875
Evaluation accuracy = 0.974860, Elapsed Time = 2.908554s
Starting Epoch 4:
Training loss = 118.752136, training accuracy = 0.959779
Evaluation accuracy = 0.971955, Elapsed Time = 2.916549s
Starting Epoch 5:
Training loss = 105.220406, training accuracy = 0.964131
Evaluation accuracy = 0.974359, Elapsed Time = 2.926336s
Starting Epoch 6:
Training loss = 95.145279, training accuracy = 0.968350
Evaluation accuracy = 0.980569, Elapsed Time = 2.918108s
Starting Epoch 7:
Training loss = 86.757538, training accuracy = 0.971251
Evaluation accuracy = 0.982572, Elapsed Time = 2.920413s
Starting Epoch 8:
Training loss = 81.706383, training accuracy = 0.972252
Evaluation accuracy = 0.984075, Elapsed Time = 2.924388s
Starting Epoch 9:
Training loss = 77.409966, training accuracy = 0.973769
Evaluation accuracy = 0.981270, Elapsed Time = 2.929271s
ubuntu@ip-172-31-17-75:~/singa/examples/autograd$ python3 resnet.py
Start intialization............
100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 100/100 [00:29<00:00,  3.45it/s]
Throughput = 110.30029570709767 per second
Total=0.2901170825958252, forward=0.09263657093048096, softmax=0.0016593122482299804, backward=0.19582119941711426, sgd=0.009393131732940674
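As a sanity check on these numbers (assuming the default batch size of 32 in resnet.py): one iteration takes about 0.2901 s, i.e. roughly 3.45 it/s as tqdm reports, and 32 / 0.2901 ≈ 110.3 images per second, matching the reported throughput.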

@nudles nudles merged commit b86add0 into apache:dev Jan 23, 2020