Python implementation of softmax loss layer #4023

Closed
biprajiman opened this issue Apr 20, 2016 · 1 comment

biprajiman commented Apr 20, 2016

Hi,

I am trying to implement the softmax loss layer in Python to use with pycaffe. I followed the example of the Euclidean loss layer and wrote this simple code as a starting point:

----------------------------------------------------------------------------------------------------------------------------

import caffe
import numpy as np


class SoftmaxLossLayer(caffe.Layer):
    """Softmax cross-entropy loss layer, modeled on the pycaffe Euclidean loss example."""

    def setup(self, bottom, top):
        # check input pair: scores and labels
        if len(bottom) != 2:
            raise Exception("Need two inputs (scores and labels) to compute the loss.")

    def reshape(self, bottom, top):
        # check that the input batch sizes match
        if bottom[0].num != bottom[1].num:
            raise Exception("Inputs must have the same number of examples.")
        # gradient buffer has the shape of the score input
        self.diff = np.zeros_like(bottom[0].data, dtype=np.float32)
        # loss output is a scalar
        top[0].reshape(1)

    def forward(self, bottom, top):
        scores = bottom[0].data
        labels = np.array(bottom[1].data, dtype=np.uint16)
        # softmax over the class dimension
        exp_scores = np.exp(scores)
        probs = exp_scores / np.sum(exp_scores, axis=1, keepdims=True)
        # average negative log-likelihood of the correct classes
        correct_logprobs = -np.log(probs[np.arange(bottom[0].num), labels])
        data_loss = np.sum(correct_logprobs) / bottom[0].num

        self.diff[...] = probs
        top[0].data[...] = data_loss

    def backward(self, top, propagate_down, bottom):
        delta = self.diff
        labels = np.array(bottom[1].data, dtype=np.uint16)

        for i in range(2):
            if not propagate_down[i]:
                continue
            if i == 0:
                # gradient w.r.t. the scores: (probs - one_hot) / num
                delta[np.arange(bottom[0].num), labels] -= 1

            bottom[i].diff[...] = delta / bottom[0].num

------------------------------------------------------------------------------------------------------------------------------
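For context, this is roughly how such a layer gets attached to a net with pycaffe's NetSpec; the 'pyloss' module name and the dummy data inputs below are placeholders, not my actual setup:

------------------------------------------------------------------------------------------------------------------------------

import caffe
from caffe import layers as L

n = caffe.NetSpec()
# dummy inputs just so the snippet stands alone; in a real net these would
# come from a Data layer
n.data, n.label = L.DummyData(shape=[dict(dim=[64, 1, 28, 28]),
                                     dict(dim=[64])], ntop=2)
n.ip1 = L.InnerProduct(n.data, num_output=500)
n.relu1 = L.ReLU(n.ip1, in_place=True)
n.ip2 = L.InnerProduct(n.relu1, num_output=10)
# 'module' is the file (pyloss.py somewhere on the PYTHONPATH) and 'layer'
# is the class name; both are placeholders here
n.loss = L.Python(n.ip2, n.label,
                  python_param=dict(module='pyloss',
                                    layer='SoftmaxLossLayer'),
                  loss_weight=1)  # Python layers need an explicit loss weight
print(n.to_proto())

------------------------------------------------------------------------------------------------------------------------------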

The code works for a simple LeNet and the loss seems to be decreasing. I would be willing to bring this code up to standard and share it, but I need guidance on what I am missing (I read the C++ code, and mine is far from what the C++ version does) so that I can modify it to match the C++ implementation and make it more generic.
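As a quick standalone sanity check (plain numpy, outside of Caffe, and not part of the layer above), the analytic gradient can be compared against finite differences:

------------------------------------------------------------------------------------------------------------------------------

import numpy as np

def softmax_loss_forward(scores, labels):
    # same math as the layer's forward pass
    exp_scores = np.exp(scores)
    probs = exp_scores / np.sum(exp_scores, axis=1, keepdims=True)
    loss = -np.mean(np.log(probs[np.arange(len(labels)), labels]))
    return loss, probs

def softmax_loss_backward(probs, labels):
    # same math as the layer's backward pass: (probs - one_hot) / num
    grad = probs.copy()
    grad[np.arange(len(labels)), labels] -= 1
    return grad / len(labels)

rng = np.random.RandomState(0)
scores = rng.randn(4, 3)
labels = rng.randint(0, 3, size=4)
loss, probs = softmax_loss_forward(scores, labels)
analytic = softmax_loss_backward(probs, labels)

eps = 1e-5
numeric = np.zeros_like(scores)
for idx in np.ndindex(*scores.shape):
    bumped = scores.copy()
    bumped[idx] += eps
    loss_plus, _ = softmax_loss_forward(bumped, labels)
    bumped[idx] -= 2 * eps
    loss_minus, _ = softmax_loss_forward(bumped, labels)
    numeric[idx] = (loss_plus - loss_minus) / (2 * eps)

print(np.max(np.abs(numeric - analytic)))  # should be tiny (~1e-9)

------------------------------------------------------------------------------------------------------------------------------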

You may ask why I would go through this trouble: modifying Python code to create a new loss is easier for me than working through the C++ code, which could take a long time.

Thank you in advance for any help.

seanbell commented Apr 20, 2016

While a Python layer is nice for academic/learning purposes, there's no need for it in Caffe, since the C++ one is faster and uses the GPU.

Also note that your forward expression is numerically unstable; you should look into lectures explaining Softmax (e.g. http://cs231n.github.io/linear-classify/#softmax) to see how to fix it.
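For example, a common fix is to subtract the per-row maximum before exponentiating; the probabilities are unchanged, but the exponentials can no longer overflow (a minimal sketch, independent of Caffe):

------------------------------------------------------------------------------------------------------------------------------

import numpy as np

def stable_softmax(scores):
    # subtracting the row-wise max cancels in the normalization,
    # so the result is identical but exp() cannot overflow
    shifted = scores - np.max(scores, axis=1, keepdims=True)
    exp_scores = np.exp(shifted)
    return exp_scores / np.sum(exp_scores, axis=1, keepdims=True)

------------------------------------------------------------------------------------------------------------------------------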

I'm closing this since it's a modeling/usage question. Please continue the discussion on the mailing list.
From https://github.com/BVLC/caffe/blob/master/CONTRIBUTING.md:

Please do not post usage, installation, or modeling questions, or other requests for help to Issues.
Use the caffe-users list instead. This helps developers maintain a clear, uncluttered, and efficient view of the state of Caffe.
