
Python log parser improvements #2350

Merged
merged 1 commit into BVLC:master on May 30, 2015

Conversation

dgolden1
Contributor

Improvements to the Python log parser introduced in #1384.

Master-based PR version of #1547.

Highlights:

* Interface change: column order is now determined by a list of `OrderedDict` objects instead of `dict` objects, which obviates the need to pass around a tuple with the column orders.
* Outputs are now named according to their names in the network protobuffer; e.g., if your top is named `loss`, then the corresponding column header will also be `loss`; it is no longer renamed to, e.g., `TrainingLoss` or `TestLoss`.
* Fixed the bug/feature of the first version where the initial learning rate was always NaN.
* Added an optional parameter to specify the output table delimiter; it is still a comma by default.

You can use the MATLAB code from [this gist](https://gist.github.com/drdan14/d8b45999c4a1cbf7ad85) to verify that your results are the same before and after the changes introduced in this pull request. That code assumes that your `top` names are `accuracy` and `loss`, but you can modify the code if that's not true.

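Below is a minimal sketch of how the revised interface might be consumed, assuming `parse_log` returns the two lists of `OrderedDict` objects described above; the import path, log filename, and output filename are illustrative assumptions, not something this PR specifies:

```python
import csv

from parse_log import parse_log  # assumed import; the parser lives in tools/extra

# Both returned values are lists of OrderedDicts, so each row's keys()
# yields the columns in a stable order named after the network's top blobs.
train_dict_list, test_dict_list = parse_log('caffe.log')

# Write the training rows with a custom delimiter (the PR's new option);
# the header order comes straight from the first row's keys.
if train_dict_list:
    with open('train.tsv', 'w') as f:
        writer = csv.DictWriter(f, fieldnames=train_dict_list[0].keys(),
                                delimiter='\t')
        writer.writeheader()
        writer.writerows(train_dict_list)
```
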
@shelhamer
Member

Sounds improved all around. Thanks!

shelhamer added a commit that referenced this pull request May 30, 2015
@shelhamer shelhamer merged commit 50ab52c into BVLC:master May 30, 2015
@npit

npit commented Sep 7, 2015

I am using the log parser with this input:

I0907 13:57:03.033401  1494 solver.cpp:266] Iteration 0, Testing net (#0)
I0907 13:57:08.657651  1494 solver.cpp:315]     Test net output #0: accuracy = 0.0136348
I0907 13:57:08.657701  1494 solver.cpp:315]     Test net output #1: loss = 4.3099 (* 1 = 4.3099 loss)
[...]
I0907 14:03:02.604794  1494 solver.cpp:464] Iteration 200, lr = 1e-05
I0907 14:04:10.754405  1494 solver.cpp:266] Iteration 240, Testing net (#0)
I0907 14:04:16.212949  1494 solver.cpp:315]     Test net output #0: accuracy = 0.367476
I0907 14:04:16.212993  1494 solver.cpp:315]     Test net output #1: loss = 2.16981 (* 1 = 2.16981 loss)
I0907 14:04:16.574724  1494 solver.cpp:189] Iteration 240, loss = 2.2853
I0907 14:04:16.574767  1494 solver.cpp:204]     Train net output #0: loss = 2.36133 (* 1 = 2.36133 loss)
I0907 14:04:16.574777  1494 solver.cpp:464] Iteration 240, lr = 1e-05
I0907 14:05:14.511325  1494 solver.cpp:334] Snapshotting to /home/npit/DL/caffe/training/snap/snap__iter_280.caffemodel
I0907 14:05:15.126173  1494 solver.cpp:342] Snapshotting solver state to /home/npit/DL/caffe/training/snap/snap__iter_280.solverstate
I0907 14:05:15.756477  1494 solver.cpp:248] Iteration 280, loss = 2.02799
I0907 14:05:15.756515  1494 solver.cpp:266] Iteration 280, Testing net (#0)
I0907 14:05:20.456594  1494 solver.cpp:315]     Test net output #0: accuracy = 0.367809
I0907 14:05:20.456636  1494 solver.cpp:315]     Test net output #1: loss = 2.16945 (* 1 = 2.16945 loss)
I0907 14:05:20.456643  1494 solver.cpp:253] Optimization Done.
I0907 14:05:20.456648  1494 caffe.cpp:121] Optimization Done.

But this produces the following strange final values of 1.0 for the learning rate:
Train

NumIters,Seconds,LearningRate,loss
[...]
240.0,433.541407,1.0,2.36133

Test

NumIters,Seconds,LearningRate,accuracy,loss
[...]
240.0,433.179589,1.0,0.367476,2.16981
280.0,497.423234,1.0,0.367809,2.16945

Any ideas?
It turns out that the parser can't recognize scientific notation, so the learning rate of 1e-05 was parsed as 1.
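
For reference, here is a hedged reconstruction of the failure mode; the parser's actual patterns may differ, and commit e342e15 (mentioned in the next comment) is the real fix:

```python
import re

line = 'I0907 14:03:02.604794  1494 solver.cpp:464] Iteration 200, lr = 1e-05'

# A pattern that only consumes digits and dots stops at the 'e', so the
# scientific-notation value '1e-05' is captured as just '1':
naive = re.search(r'Iteration (\d+), lr = ([\.\d]+)', line)
print(naive.group(2))  # '1' -> the spurious LearningRate of 1.0

# Allowing an optional signed exponent recovers the full value:
fixed = re.search(r'Iteration (\d+), lr = ([-+]?[\d.]+(?:[eE][-+]?\d+)?)', line)
print(float(fixed.group(2)))  # 1e-05
```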

@dgolden1
Contributor Author

dgolden1 commented Sep 8, 2015

@npit that bug appears to have been fixed in e342e15

@npit

npit commented Nov 1, 2015

@drdan14 I am getting the following error when parsing a log without any test phase:

Traceback (most recent call last):
  File "/home/npit/DeepLearning/caffe/cudnn/tools/extra/updated_parser.py",
    main()
  File "/home/npit/DeepLearning/caffe/cudnn/tools/extra/updated_parser.py",
    test_dict_list, delimiter=args.delimiter)
  File "/home/npit/DeepLearning/caffe/cudnn/tools/extra/updated_parser.py",
    write_csv(test_filename, test_dict_list, delimiter, verbose)
  File "/home/npit/DeepLearning/caffe/cudnn/tools/extra/updated_parser.py",
    dict_writer = csv.DictWriter(f, fieldnames=dict_list[0].keys(),
IndexError: list index out of range

Training logs with a test phase are parsed ok.
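
The traceback shows `csv.DictWriter` being handed an empty `test_dict_list`: a log with no test phase yields no test rows, and indexing `dict_list[0]` then fails. A minimal sketch of the kind of guard that avoids the crash, with the `write_csv` signature assumed from the traceback:

```python
import csv

def write_csv(output_filename, dict_list, delimiter, verbose=False):
    """Write a list of OrderedDict rows to a delimited file.

    Bails out early when dict_list is empty, since indexing dict_list[0]
    for the field names would raise IndexError (e.g. a log with no test
    phase produces an empty test_dict_list).
    """
    if not dict_list:
        if verbose:
            print('Not writing %s: no data' % output_filename)
        return
    with open(output_filename, 'w') as f:
        writer = csv.DictWriter(f, fieldnames=dict_list[0].keys(),
                                delimiter=delimiter)
        writer.writeheader()
        writer.writerows(dict_list)
```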

@dgolden1
Contributor Author

dgolden1 commented Nov 2, 2015

@npit Please post a link to the training log that generates the error. And what is updated_parser.py?

@npit

npit commented Nov 5, 2015

@drdan14
updated_parser.py is just your parser. I had the old one too, so I renamed the new one.
A log example is at: http://pastebin.com/MXChmWeB

@dgolden1
Contributor Author

dgolden1 commented Nov 5, 2015

@npit see #3292

@npit

npit commented Nov 12, 2015

Thanks, @dgolden1!
