Train crfasrnn with 2 classes #21

Closed
thuanvh opened this issue Jan 20, 2016 · 9 comments

Comments

@thuanvh

thuanvh commented Jan 20, 2016

Hi all,
I am trying to train crfasrnn on my own images. My classes are background and people.
But my training does not converge: the loss values do not decrease.
The prediction output of my network is only background (class 0) for all images.
I think I have made a mistake in my training setup.
Have you ever encountered a similar case?
Could you give me any suggestions?

Thank you,
Thuan

@martinkersner

Hi!

I think that the problem could be anywhere.

  • How do you create your dataset?
  • How do you initialize the weights?

Martin

@thuanvh
Author

thuanvh commented Jan 20, 2016

Hi @martinkersner

I use the trained crfasrnn network to segment my images into 2 classes (background and foreground).
My data is scaled to between -1 and 1. The input size is 250x250 instead of 500x500.
Then I train on my data with a customized train_val.prototxt.
I changed num_output of some convolution layers from 21 to 2.
I added a weight filler and a bias filler to each Convolution and Deconvolution layer:

    weight_filler {
      type: "xavier"
      std: 0.1
    }
    bias_filler {
      type: "constant"
      value: 0.2
    }

Here is my output:

I0120 17:01:35.764663  7444 solver.cpp:473] Iteration 10000, lr = 1e-010
I0120 17:04:47.148594  7444 solver.cpp:213] Iteration 10100, loss = 51121.4
I0120 17:04:47.159598  7444 solver.cpp:228]     Train net output #0: loss = 59859.5 (* 1 = 59859.5 loss)
I0120 17:04:47.161098  7444 solver.cpp:473] Iteration 10100, lr = 1e-010
I0120 17:08:00.796912  7444 solver.cpp:213] Iteration 10200, loss = 52975.5
I0120 17:08:00.798411  7444 solver.cpp:228]     Train net output #0: loss = 55693 (* 1 = 55693 loss)
I0120 17:08:00.799911  7444 solver.cpp:473] Iteration 10200, lr = 1e-010
I0120 17:11:11.810928  7444 solver.cpp:213] Iteration 10300, loss = 52339.4
I0120 17:11:11.825934  7444 solver.cpp:228]     Train net output #0: loss = 50822.4 (* 1 = 50822.4 loss)
I0120 17:11:11.834436  7444 solver.cpp:473] Iteration 10300, lr = 1e-010
I0120 17:14:22.996568  7444 solver.cpp:213] Iteration 10400, loss = 51125.5
I0120 17:14:22.998069  7444 solver.cpp:228]     Train net output #0: loss = 63236 (* 1 = 63236 loss)
I0120 17:14:23.001070  7444 solver.cpp:473] Iteration 10400, lr = 1e-010
I0120 17:17:36.604817  7444 solver.cpp:213] Iteration 10500, loss = 51211.2
I0120 17:17:36.608319  7444 solver.cpp:228]     Train net output #0: loss = 38913 (* 1 = 38913 loss)
I0120 17:17:36.613320  7444 solver.cpp:473] Iteration 10500, lr = 1e-010
I0120 17:20:48.551285  7444 solver.cpp:213] Iteration 10600, loss = 51850.9
I0120 17:20:48.580294  7444 solver.cpp:228]     Train net output #0: loss = 46278.5 (* 1 = 46278.5 loss)
I0120 17:20:48.581795  7444 solver.cpp:473] Iteration 10600, lr = 1e-010
I0120 17:23:59.443377  7444 solver.cpp:213] Iteration 10700, loss = 50562.2
I0120 17:23:59.444877  7444 solver.cpp:228]     Train net output #0: loss = 50714.7 (* 1 = 50714.7 loss)
I0120 17:23:59.445876  7444 solver.cpp:473] Iteration 10700, lr = 1e-010
I0120 17:27:09.559470  7444 solver.cpp:213] Iteration 10800, loss = 50792.1
I0120 17:27:09.560971  7444 solver.cpp:228]     Train net output #0: loss = 53939.8 (* 1 = 53939.8 loss)
I0120 17:27:09.562472  7444 solver.cpp:473] Iteration 10800, lr = 1e-010
I0120 17:30:19.621381  7444 solver.cpp:213] Iteration 10900, loss = 50344.1
I0120 17:30:19.622881  7444 solver.cpp:228]     Train net output #0: loss = 67609.6 (* 1 = 67609.6 loss)
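
To judge whether the loss in a log like the one above is really flat, the smoothed loss values can be pulled out of the solver lines and compared. The following is only a minimal sketch: `parse_losses` and `relative_change` are hypothetical helper names, and the regular expression assumes the `Iteration N, loss = X` line format shown in this log.

```python
import re

# Matches solver lines such as
# "... solver.cpp:213] Iteration 10100, loss = 51121.4"
# The "lr = ..." and "Train net output" lines do not match this pattern.
LOSS_RE = re.compile(r"Iteration (\d+), loss = ([0-9.eE+-]+)")


def parse_losses(log_text):
    """Return a list of (iteration, loss) pairs found in solver output."""
    return [(int(it), float(loss)) for it, loss in LOSS_RE.findall(log_text)]


def relative_change(pairs):
    """Relative change between the first and the last reported loss."""
    first, last = pairs[0][1], pairs[-1][1]
    return (last - first) / first


# First and last loss lines copied from the log above.
sample = """\
I0120 17:04:47.148594  7444 solver.cpp:213] Iteration 10100, loss = 51121.4
I0120 17:04:47.161098  7444 solver.cpp:473] Iteration 10100, lr = 1e-010
I0120 17:30:19.621381  7444 solver.cpp:213] Iteration 10900, loss = 50344.1
"""

pairs = parse_losses(sample)
print(pairs)
print("relative change: %.4f" % relative_change(pairs))
```

On the two lines copied from the log, the loss changes by roughly 1.5% over 800 iterations, which matches the description of a loss that barely moves.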

@martinkersner

"Data is scaled between -1 and 1." Which data do you mean, images or labels?

I don't understand why you scale your data to between -1 and 1. Even though this should not affect your training (I guess), common practice is to keep images in the range 0-255 (and subtract their mean during training). Labels are usually integers in the range 0 to N - 1, where N is the number of classes.

Because you only use weight fillers and not the weights already obtained by fcn-8 or crfasrnn, training will certainly take a long time. The constant (class 0) output of your network is likely caused by incorrect weight initialization.

You can check my repo https://github.com/martinkersner/train-CRF-RNN.

@thuanvh
Author

thuanvh commented Jan 21, 2016

I scale only the images. Labels are 0 and 1.

I tried to use the crfasrnn weights, as in this file:
train_val.prototxt.txt

I compared it with your file; it is the same except for the weight_filler. Don't you use fillers for the new convolution layers?

@martinkersner

I trained without fillers, just with the weights from crfasrnn and what you can see in solve.py. I should still try it with fillers, though, because after 35 thousand iterations it seems to me that I get worse results, even though the loss is slightly decreasing.

If you still have a problem with unchanging predictions, don't train for more than 500 iterations. I get pretty reasonable results even that early in training. Personally, I would guess that there is some problem with your training data.
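
One quick way to look for a problem in the training data is to validate the label masks before training. The sketch below is an illustration only: `label_report` is a hypothetical helper, and it assumes the masks are already loaded as integer NumPy arrays with class ids 0 to N - 1.

```python
import numpy as np


def label_report(label, num_classes):
    """Validate a label mask and return the fraction of pixels per class."""
    label = np.asarray(label)
    if not np.issubdtype(label.dtype, np.integer):
        raise ValueError("labels should be integers, got %s" % label.dtype)
    if label.min() < 0 or label.max() >= num_classes:
        raise ValueError(
            "label ids outside 0..%d: %s" % (num_classes - 1, np.unique(label))
        )
    counts = np.bincount(label.ravel(), minlength=num_classes)
    return counts / counts.sum()


# Two-class mask: 5 background pixels, 1 foreground pixel.
mask = np.array([[0, 0, 1],
                 [0, 0, 0]])
fractions = label_report(mask, num_classes=2)
print(fractions)

# A mask stored as 0/255 instead of 0/1 is rejected.
try:
    label_report(np.array([[0, 255]]), num_classes=2)
    rejected = False
except ValueError:
    rejected = True
print(rejected)
```

A mask whose foreground fraction is close to zero across the whole dataset, or whose ids fall outside the expected range, would be one possible explanation for a network that predicts only class 0.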

@thuanvh
Author

thuanvh commented Jan 21, 2016

After reviewing solve.py, I think the problem is that I did not initialize the weights of the Deconvolution layers as solve.py does.
I only used the caffe build for training and did not use solve.py. Now I will try to use it.
Thank you so much,
Thuan
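
For readers who land here with the same symptom: a common way to initialize Deconvolution layers, used in the FCN reference code, is to fill them with bilinear interpolation kernels. Whether solve.py in the linked repository does exactly this is an assumption; the sketch below only shows how such weights can be built in NumPy. `bilinear_weights` is a hypothetical helper name.

```python
import numpy as np


def upsample_filt(size):
    """Return a size x size bilinear interpolation kernel."""
    factor = (size + 1) // 2
    if size % 2 == 1:
        center = factor - 1
    else:
        center = factor - 0.5
    og = np.ogrid[:size, :size]
    return ((1 - abs(og[0] - center) / factor) *
            (1 - abs(og[1] - center) / factor))


def bilinear_weights(channels, size):
    """Build weights of shape (channels, channels, size, size) in which each
    channel is upsampled independently (zeros off the channel diagonal)."""
    weights = np.zeros((channels, channels, size, size), dtype=np.float32)
    idx = np.arange(channels)
    weights[idx, idx] = upsample_filt(size)
    return weights


kernel = upsample_filt(4)
weights = bilinear_weights(channels=2, size=4)
print(kernel)
print(weights.shape)
```

With pycaffe, an array like this would typically be copied into a layer through `net.params[layer_name][0].data[...]`, where `layer_name` is whatever the Deconvolution layer is called in your own prototxt.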

@tybxiaobao

@thuanvh Hi, have you solved your problem? And what accuracy do you get for two-class labeling (i.e. background and people)?

@thuanvh
Author

thuanvh commented Jan 26, 2016

The problem is solved. I am collecting data for training. I have no accuracy measurements yet.

@thuanvh thuanvh closed this as completed Oct 6, 2016
@Sam813

Sam813 commented May 28, 2018

@thuanvh I have the same problem. I know it has been a long time since this post, but may I ask how you solved it?
I have medical images (CT) and I want to use this for tumor segmentation, so I have just two classes: background and tumor.
After training, all the predictions from the network are black.

This was referenced May 31, 2018