How to classify 1-channel inputs with the python wrapper (LeNet deploy) #462
Comments
The reverse, actually: the Net is expecting an input blob of 1 x 28 x 28 images (1 channel, 28 pixels high, 28 pixels wide). The included The code automatically checks whatever the network architecture expects as input. |
Got it. Thanks so much! I'm up and running now. |
hi shelhamer, |
Hi,
In the main() routine, after setting mode to CPU I have:
where argv[1] is the net prototxt and argv[2] is the pre-trained data file of LeNet MNIST (lenet_iter_10000). When running it I get an error: Examining it in debug single-step mode, I see that the error occurs when stepping over the line I interpret this error as an indication that the first conv layer expects 3-channel data while the pre-trained data provides data for only 1 channel (which is how it was trained and what I need). Any advice? Thanks |
Caffe treats your input as a color image by default. You should set an images_in_color field in your predict prototxt file; please refer to caffe.proto or #538 for details. |
For those who run across this in the future, I got the network changed to 1 channel with the following diff:
|
Hi Russell91, your post saved me a lot, thanks! |
I stumbled upon the same problem and solved it without changing the files cited above. In my case, the images were actually saved with 3 channels, all of them with the same pixel values. So I just took one of the channels and reshaped it, as in the code below:
In [51]: img = caffe.io.load_image('image7.png', color=False)
In [52]: img.shape
Out[52]: (20, 20, 3)
In [53]: grayimg = img[:,:,0]
In [54]: grayimg.shape
Out[54]: (20, 20)
In [55]: gi = np.reshape(grayimg, (20,20,1))
In [56]: gi.shape
Out[56]: (20, 20, 1)
Without reshaping the grayscale image, the |
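The channel-dropping step in the session above can be tried without Caffe. This is a minimal NumPy sketch, assuming a 20x20 image whose three channels are identical; the random array is hypothetical stand-in data, not the `image7.png` from the comment, and the equality check is an added safeguard rather than part of the original approach.

```python
import numpy as np

# Hypothetical stand-in for a 20x20 image saved with 3 identical channels.
gray_values = np.random.rand(20, 20).astype(np.float32)
img = np.dstack([gray_values, gray_values, gray_values])  # shape (20, 20, 3)

# Dropping channels is only safe when they really carry the same values.
assert np.array_equal(img[:, :, 0], img[:, :, 1])
assert np.array_equal(img[:, :, 0], img[:, :, 2])

# Keep the first channel, then restore a trailing channel axis.
grayimg = img[:, :, 0]               # shape (20, 20)
gi = grayimg[:, :, np.newaxis]       # shape (20, 20, 1)
print(gi.shape)
```

Indexing with `np.newaxis` gives the same result as `np.reshape(grayimg, (20, 20, 1))` without hard-coding the image size.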
@boechat107 have you fixed the problem? I used the same strategy but encountered the same problem. |
Really, @RiweiChen? Are you sure of having made the changes on the In my case, the function was expecting to get the third number of |
Hi @boechat107, sure. I trained a network using grayscale images, and the input images are also grayscale. I followed the same strategy as you did and also got IndexError: tuple index out of range at inputs[0].shape[2], so I changed it to inputs[0].shape[1], but then I got another error at input_[ix]=caffe.io.resize_image(in_,self.image_dims). Can you tell me what I can do next to fix this problem? Thanks. chen |
Implement my solution above and you will be fine. Boechat's solution only works if your test images are RGB. |
Hi @Russell91, actually I have also tried your method before and recompiled the corresponding modified *.py code to .pyc; however, I still got the same error at the same place about: What can I do to fix this problem? |
Did you remember to add the line: |
@Russell91 Yes, I am sure of this.
|
If I do not change inputs[0].shape[2] to inputs[0].shape[1], I get the |
I got the same problem because I forgot these changes in pycaffe.py.
These fixed my problem. |
If you have trained a model on 1-channel grayscale images and want to classify another grayscale image, the following hack worked for me:
then just use the official classify.py. Here is the gist of Hope it will work for you too. |
@shelhamer What about adding an option to |
Hello, I have tried @uronce-cc's method, but it doesn't work. The funny thing is that my MNIST network works for images in PNG format, but I still get an error when the images are in JPG format. When I wanted to try @Russell91's method, I found that the new version of Caffe has changed the 'pycaffe.py' file a lot, so I don't know how to modify it. The error with JPG images is "ValueError: could not broadcast input array from shape (1,3,28,28) into shape (1,1,28,28)". Can someone help me solve it? Thanks in advance! |
Hi, I had the same problem. I followed *Russell91
|
@ToruHironaka Thanks for your response! But the current version of 'pycaffe.py' is quite different from the one mentioned by @Russell91, so I don't know how to modify it. |
@wang4249, the problem seems clear from the error message. Your jpg image has been loaded with 3 channels. When I'm coding in Python, I like to be sure of the structure of my data by using the debugger:
# CODE to load an image file...
import pdb; pdb.set_trace()
If you are using
img = caffe.io.load_image(filename)
img.shape
For your jpg image, the returned shape is probably I didn't look at the code of |
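The shape check suggested above can also be done programmatically before the data reaches the network. The sketch below assumes the loaded image is an H x W or H x W x K NumPy array and that the network wants N x K x H x W, as the shapes in the error message suggest; `to_blob` and `expected_channels` are hypothetical names used only for illustration, not part of pycaffe.

```python
import numpy as np

def to_blob(img, expected_channels):
    """Convert an H x W (x K) image to a 1 x K x H x W array, checking K.

    Hypothetical helper for illustration; not a pycaffe function.
    """
    if img.ndim == 2:
        # Plain H x W grayscale: add a trailing channel axis.
        img = img[:, :, np.newaxis]
    if img.ndim != 3:
        raise ValueError("expected a 2-D or 3-D image, got shape %s" % (img.shape,))
    channels = img.shape[2]
    if channels != expected_channels:
        raise ValueError("image has %d channel(s), network expects %d"
                         % (channels, expected_channels))
    # Move channels first, then add the batch axis.
    return img.transpose(2, 0, 1)[np.newaxis, ...]

gray = np.zeros((28, 28), dtype=np.float32)
rgb = np.zeros((28, 28, 3), dtype=np.float32)

print(to_blob(gray, 1).shape)   # (1, 1, 28, 28)
try:
    to_blob(rgb, 1)
except ValueError as err:
    print(err)                  # reports the 3-vs-1 channel mismatch
```

Failing early with an explicit channel count tends to be easier to read than the broadcast error raised deeper in the forward pass.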
@boechat107 Thank you very much for your response. I am a beginner with Python; I don't really know how to write Python code, I can only modify existing code. What I am doing now is trying to figure out whether the 'mnist' network works by modifying the 'classify.py' file. I know it's bad to put my code here, but I have spent 2 days looking for the error without making any progress.
Thanks very much for your help. |
@wang4249, as you said that you are learning Python, here are some suggestions:
|
@wang4249 what error have you encountered when trying my modification? |
@shelhamer
|
One way to get the error "IndexError: tuple index out of range" at python/caffe/classifier.py, line 63, in predict is to pass
image_list = [caffe.io.load_image(image_path, False)]
features = net.predict(image_list)
Maybe it will be helpful for someone :) |
I have some grayscale images and I am using the gray flag to make the db and mean files, so the current number of channels is 1. I am going to import my dataset in groups of 20 images as 20 channels. How can I change the number of channels from 1 to 20 in the train_val.prototxt file? Should I change the Caffe source files? |
@fahimeh62 I am trying to do something similar to what you are doing. I found #1494; that information might help. I am testing with K = 4 channels (RGB with alpha) in order to see how to increase the number of channels, but I have not been successful yet. |
@ToruHironaka Thanks for your prompt answer. I had a look at the page you suggested. Could you tell me where I should add these lines?
inputs = np.zeros((10, 5, 227, 227))
in_db = lmdb.open('input-lmdb', map_size=int(1e12))
In convert_imageset.cpp, in train_val.prototxt, or somewhere else? |
@ToruHironaka I am also curious how you changed your channels to 4 and then generated your LMDB from the 4-channel data. |
@fahimeh62 The script is written in Python, so you have to write your own Python script. I added an alpha channel to my 3-channel (RGB) PNG files using ImageMagick. Then I converted the RGBA PNG files into LMDB. However, it seems that either my alpha channel was automatically ignored or my Python script does not properly convert my 4-channel image files into LMDB. I am still working on this. |
I am trying to train Caffe using images which have 8 channels, and I am also facing problems creating the LMDB. |
@fahimeh62 & @mtrth The above code converts image files into LMDB, but I am still working on the multi-channel part. @mtrth I think you can just run build/tools/convert_imageset.bin to convert a 4-channel image into LMDB (my image was 4-channel RGBA; I think you are trying to combine 8 image files into 1 file by increasing the number of channels, which I am trying to do too), or use the Python code above. It should work. I am still learning and trying to convert multi-channel images into LMDB. I think Caffe seems to accept up to 10 channels, but I am not sure. |
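The idea of grouping several single-channel images into one multi-channel input, as asked about earlier in the thread, can be sketched with NumPy alone. The image size, the group size of 20, and the function name `group_as_channels` are assumptions for illustration; writing the result to LMDB is deliberately left out because the thread does not settle how to do that.

```python
import numpy as np

# Hypothetical data: 40 grayscale images of 28x28, each filled with its index.
height, width, group_size = 28, 28, 20
images = [np.full((height, width), i, dtype=np.uint8) for i in range(40)]

def group_as_channels(images, group_size):
    """Stack consecutive H x W images into an N x K x H x W array (K = group_size)."""
    if len(images) % group_size != 0:
        raise ValueError("number of images must be a multiple of group_size")
    groups = [
        np.stack(images[start:start + group_size], axis=0)   # K x H x W
        for start in range(0, len(images), group_size)
    ]
    return np.stack(groups, axis=0)                           # N x K x H x W

batch = group_as_channels(images, group_size)
print(batch.shape)   # (2, 20, 28, 28)
```

Each entry along the first axis is then one 20-channel sample, which matches the N x K x H x W layout of the `np.zeros((10, 5, 227, 227))` array quoted above.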
I modified the Python script mentioned in #1494. When I test it on an image by following the command mentioned in https://github.com/BVLC/caffe/blob/master/examples/detection.ipynb:
./python/detect.py --crop_mode=selective_search --pretrained_model=./examples/trial/trial_iter_10000.caffemodel --model_def=./examples/trial/trial.prototxt --gpu --raw_scale=255 _temp/det_input.txt temp/det_output.h5
Did anyone face similar issues? |
I fixed that error. python/detect.py was taking 'caffe/imagenet/ilsvrc_2012_mean.npy' as the default mean file, and it's for 3-channel images, so I created a mean file for my 8-channel images using a Python script and used that. |
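A mean file of the kind described above could be produced roughly as follows. This is only a sketch: the random data, the 32x32 size, and the output file name are hypothetical, and it assumes the mean is stored as a K x H x W array in `.npy` format, which is consistent with the default file name mentioned but is not confirmed in the thread.

```python
import os
import tempfile

import numpy as np

# Hypothetical training set: 100 images, 8 channels each, 32x32 pixels (N x K x H x W).
rng = np.random.RandomState(0)
images = rng.randint(0, 256, size=(100, 8, 32, 32)).astype(np.float32)

# Average over the image axis, leaving one mean value per channel and pixel.
mean = images.mean(axis=0)                       # shape (8, 32, 32)

# Save in .npy format so it can be read back with np.load.
mean_path = os.path.join(tempfile.mkdtemp(), "mean_8ch.npy")
np.save(mean_path, mean)

loaded = np.load(mean_path)
print(loaded.shape)   # (8, 32, 32)
```

The key point is that the channel count of the mean array has to match the channel count of the inputs; a 3-channel mean cannot be subtracted from 8-channel data.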
Greetings,
I'm evaluating Caffe for a commercial application. I have compiled Caffe and pycaffe and matcaffe and everything appears to be good: the installation passed all tests that are run using make runtest.
Now I want to use the python wrapper to do a simple image classification. I am trying to use a trained mnist network to do the classification (staying away from the imagenet example due to the commercial restriction associated with obtaining the pretrained network).
For my lenet_deploy.prototxt I changed the very beginning and end of the example lenet_test.prototxt. In the beginning of lenet_deploy.prototxt I have the following before the first convolutional layer for my non-RGB single test image of size [28,28]:
name: "LeNet-deploy"
input: "data"
input_dim: 1
input_dim: 1
input_dim: 28
input_dim: 28
At the end of lenet_deploy.prototxt I have:
layers {
name: "prob"
type: SOFTMAX
bottom: "ip2"
top: "prob"
}
For the trained model I simply used the lenet_iter_10000 result, which seems very good, and renamed it to lenet_pretrained. Now I want to classify my non-RGB image of size [28 28] by running the python wrapper with the options listed below:
python classify.py
--model_def='/usr/local/caffe/examples/mnist/lenet_deploy.prototxt'
--pretrained_model='/usr/local/caffe/examples/mnist/lenet_pretrained'
--gpu
--center_only
--images_dim='28,28'
--mean_file=''
/usr/local/caffe/examples/images/cat2.jpg
/usr/local/caffe/examples/mnist/looker
Launching the python script results in the network being read in, but it fails in pycaffe.py line 66 in _Net_forward:
self.blobs[in_].data[...] = blob
ValueError: could not broadcast input array from shape (1,3,28,28) into shape (1,1,28,28)
I interpret this error to mean that the code (or the pretrained network?) is expecting a 3 channel image (RGB or the like), but it is only seeing a single channel image. If that is the right interpretation how do I change this? I've spent a lot of time looking at the various posts on this site, but I haven't been able to get past this point. My apologies if I'm missing something super obvious here.
Many thanks for your time.
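The ValueError quoted in this post can be reproduced with NumPy alone, which helps show which side has 3 channels. In this sketch the zero array stands in for the network's input blob and the ones array for the loaded image; averaging the channels is just one simple way to get a single channel and is an assumption here, not what classify.py does.

```python
import numpy as np

# Stand-in for the network's input blob: 1 image, 1 channel, 28x28.
blob = np.zeros((1, 1, 28, 28), dtype=np.float32)

# Stand-in for an image that was loaded as color: 1 image, 3 channels, 28x28.
color_input = np.ones((1, 3, 28, 28), dtype=np.float32)

# Assigning the 3-channel input into the 1-channel blob fails.
try:
    blob[...] = color_input
    message = None
except ValueError as err:
    message = str(err)
print(message)

# Collapsing the channel axis to size 1 makes the shapes agree.
gray_input = color_input.mean(axis=1, keepdims=True)   # shape (1, 1, 28, 28)
blob[...] = gray_input
print(blob.shape)
```

In the message, the first shape is the array being written (the input) and the second is the destination (the blob), so it is the supplied image that has 3 channels while the network expects 1.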