Python implementation of softmax loss layer #4023

Closed
biprajiman opened this issue Apr 20, 2016 · 1 comment

biprajiman commented Apr 20, 2016

Hi,

I am trying to implement the softmax loss layer in Python to use with pycaffe. I followed the example of the Euclidean loss layer and wrote this simple code as a starting point:

----------------------------------------------------------------------------------------------------------------------------

import caffe
import numpy as np


class SoftmaxLossLayer(caffe.Layer):
    """Softmax cross-entropy loss layer, modeled on the pycaffe Euclidean loss example."""

    def setup(self, bottom, top):
        # check input pair: scores and labels
        if len(bottom) != 2:
            raise Exception("Need two inputs (scores and labels) to compute the loss.")

    def reshape(self, bottom, top):
        # check that the input batch sizes match
        if bottom[0].num != bottom[1].num:
            raise Exception("Inputs must have the same number of examples.")
        # gradient buffer has the shape of the score input
        self.diff = np.zeros_like(bottom[0].data, dtype=np.float32)
        # loss output is a scalar
        top[0].reshape(1)

    def forward(self, bottom, top):
        scores = bottom[0].data
        labels = np.array(bottom[1].data, dtype=np.uint16)
        # softmax over the class dimension
        exp_scores = np.exp(scores)
        probs = exp_scores / np.sum(exp_scores, axis=1, keepdims=True)
        # average negative log-likelihood of the correct classes
        correct_logprobs = -np.log(probs[np.arange(bottom[0].num), labels])
        data_loss = np.sum(correct_logprobs) / bottom[0].num

        self.diff[...] = probs
        top[0].data[...] = data_loss

    def backward(self, top, propagate_down, bottom):
        delta = self.diff
        labels = np.array(bottom[1].data, dtype=np.uint16)

        for i in range(2):
            if not propagate_down[i]:
                continue
            if i == 0:
                # gradient w.r.t. the scores: (probs - one_hot) / num
                delta[np.arange(bottom[0].num), labels] -= 1

            bottom[i].diff[...] = delta / bottom[0].num

------------------------------------------------------------------------------------------------------------------------------
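For context, this is roughly how such a layer gets attached to a net with pycaffe's NetSpec; the 'pyloss' module name and the dummy data inputs below are placeholders, not my actual setup:

------------------------------------------------------------------------------------------------------------------------------

import caffe
from caffe import layers as L

n = caffe.NetSpec()
# dummy inputs just so the snippet stands alone; in a real net these would
# come from a Data layer
n.data, n.label = L.DummyData(shape=[dict(dim=[64, 1, 28, 28]),
                                     dict(dim=[64])], ntop=2)
n.ip1 = L.InnerProduct(n.data, num_output=500)
n.relu1 = L.ReLU(n.ip1, in_place=True)
n.ip2 = L.InnerProduct(n.relu1, num_output=10)
# 'module' is the file (pyloss.py somewhere on the PYTHONPATH) and 'layer'
# is the class name; both are placeholders here
n.loss = L.Python(n.ip2, n.label,
                  python_param=dict(module='pyloss',
                                    layer='SoftmaxLossLayer'),
                  loss_weight=1)  # Python layers need an explicit loss weight
print(n.to_proto())

------------------------------------------------------------------------------------------------------------------------------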

The code works for a simple LeNet and the loss seems to be decreasing. I would be willing to bring this code up to standard and share it, but I need guidance on what I am missing (I read the C++ code, and mine is far from what the C++ version does) so that I can modify it to match the C++ implementation and make it more generic.
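As a quick standalone sanity check (plain numpy, outside of Caffe, and not part of the layer above), the analytic gradient can be compared against finite differences:

------------------------------------------------------------------------------------------------------------------------------

import numpy as np

def softmax_loss_forward(scores, labels):
    # same math as the layer's forward pass
    exp_scores = np.exp(scores)
    probs = exp_scores / np.sum(exp_scores, axis=1, keepdims=True)
    loss = -np.mean(np.log(probs[np.arange(len(labels)), labels]))
    return loss, probs

def softmax_loss_backward(probs, labels):
    # same math as the layer's backward pass: (probs - one_hot) / num
    grad = probs.copy()
    grad[np.arange(len(labels)), labels] -= 1
    return grad / len(labels)

rng = np.random.RandomState(0)
scores = rng.randn(4, 3)
labels = rng.randint(0, 3, size=4)
loss, probs = softmax_loss_forward(scores, labels)
analytic = softmax_loss_backward(probs, labels)

eps = 1e-5
numeric = np.zeros_like(scores)
for idx in np.ndindex(*scores.shape):
    bumped = scores.copy()
    bumped[idx] += eps
    loss_plus, _ = softmax_loss_forward(bumped, labels)
    bumped[idx] -= 2 * eps
    loss_minus, _ = softmax_loss_forward(bumped, labels)
    numeric[idx] = (loss_plus - loss_minus) / (2 * eps)

print(np.max(np.abs(numeric - analytic)))  # should be tiny (~1e-9)

------------------------------------------------------------------------------------------------------------------------------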

You may ask why I would go through this trouble: modifying Python code to create a new loss is easier for me than working through the C++ code, which could take a long time.

Thank you in advance for any help.

seanbell commented Apr 20, 2016

While a Python layer is nice for academic/learning purposes, there's no need for it in Caffe, since the C++ one is faster and uses the GPU.

Also note that your forward expression is numerically unstable; you should look into lectures explaining Softmax (e.g. http://cs231n.github.io/linear-classify/#softmax) to see how to fix it.
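For example, a common fix is to subtract the per-row maximum before exponentiating; the probabilities are unchanged, but the exponentials can no longer overflow (a minimal sketch, independent of Caffe):

------------------------------------------------------------------------------------------------------------------------------

import numpy as np

def stable_softmax(scores):
    # subtracting the row-wise max cancels in the normalization,
    # so the result is identical but exp() cannot overflow
    shifted = scores - np.max(scores, axis=1, keepdims=True)
    exp_scores = np.exp(shifted)
    return exp_scores / np.sum(exp_scores, axis=1, keepdims=True)

------------------------------------------------------------------------------------------------------------------------------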

I'm closing this since it's a modeling/usage question. Please continue the discussion on the mailing list.
From https://github.com/BVLC/caffe/blob/master/CONTRIBUTING.md:

Please do not post usage, installation, or modeling questions, or other requests for help to Issues.
Use the caffe-users list instead. This helps developers maintain a clear, uncluttered, and efficient view of the state of Caffe.
