
Python log parser improvements #1547

Closed
wants to merge 167 commits

Conversation

@dgolden1 (Contributor) commented Dec 9, 2014

Improvements to the Python log parser introduced in #1384.

Highlights:

  • Interface change: column order is now determined by using a list of OrderedDict objects instead of dict objects, which obviates the need to pass around a tuple of column orders (see the sketch after this list).
  • The outputs are now named according to their names in the network protobuf; e.g., if your top is named loss, then the corresponding column header will also be loss; we no longer rename it to, e.g., TrainingLoss or TestLoss.
  • Fixed the bug/feature of the first version where the initial learning rate was always NaN.
  • Added an optional parameter to specify the output table delimiter; it's still a comma by default.
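
As a minimal sketch of the OrderedDict-based interface (the row contents, the column names, and the write_csv helper below are illustrative assumptions, not the exact code in this pull request):

    import csv
    from collections import OrderedDict

    # Hypothetical rows as an OrderedDict-based parser might produce them:
    # the key order doubles as the CSV column order, so no separate tuple of
    # field names has to be passed around.
    train_dict_list = [
        OrderedDict([('NumIters', 0), ('Seconds', 0.0), ('LearningRate', 0.01), ('loss', 2.3026)]),
        OrderedDict([('NumIters', 100), ('Seconds', 4.2), ('LearningRate', 0.01), ('loss', 1.8713)]),
    ]

    def write_csv(output_filename, dict_list, delimiter=','):
        """Write a list of OrderedDicts to CSV; the column order comes from the dict keys."""
        with open(output_filename, 'w') as f:
            writer = csv.DictWriter(f, fieldnames=dict_list[0].keys(), delimiter=delimiter)
            writer.writeheader()
            writer.writerows(dict_list)

    write_csv('train.csv', train_dict_list)        # comma-delimited by default
    write_csv('train.tsv', train_dict_list, '\t')  # optional delimiter parameter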

You can use the MATLAB code from this gist to verify that your results are the same before and after the changes introduced in this pull request. That code assumes that your top names are accuracy and loss; modify the code if that's not the case.

One caveat: we expect that the output layer names consist only of alphanumeric characters and the underscore, and are therefore captured with the \w token in the Python regular expression syntax. If other characters are allowed in layer names, we'll have to change the \w token to something else.

Update: any top name that contains no whitespace is OK; top names are now captured with \S instead of \w.
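
To illustrate the difference (a minimal sketch; the example log line and the exact patterns are assumptions rather than the parser's actual code): \w only matches alphanumerics and underscores, whereas \S accepts any top name that contains no whitespace.

    import re

    line = 'Train net output #0: my-top/1 = 0.25'

    # \w+ only matches [A-Za-z0-9_], so a top name like "my-top/1" is not captured.
    print(re.search(r'Train net output #(\d+): (\w+) = ([\.\d]+)', line))  # None

    # \S+ accepts any whitespace-free top name.
    m = re.search(r'Train net output #(\d+): (\S+) = ([\.\d]+)', line)
    print(m.group(2), m.group(3))  # my-top/1 0.25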

@dgolden1 (Contributor Author) commented Dec 9, 2014

@sguada have a look and please confirm that I addressed all of your comments

@jamesguoxin

Hi drdan14,

Do you know how to generate the .log file for your log parser in Caffe? I'm stuck here without a clue. Usually we run the executable as "./build/tools/caffe train --solver=/path/to/my_solver.prototxt"; do I need to add a flag to create the log file? Thank you very much for your help!

@dgolden1 (Contributor Author)

The log file is written to /tmp by default, or you can add the --log_dir parameter to choose the directory.

Please ask these sorts of questions in the caffe users Google group at https://groups.google.com/forum/m/#!forum/caffe-users

@jamesguoxin

Thanks a lot!

@jamesguoxin

Hi drdan14,

I used your Python parser today and I think I found a bug. In the parse_log(path_to_log) function, when you return train_dict_list and test_dict_list, the last values stored in train_row and test_row are missed, so we lose the last training and testing results. Thanks!

@dgolden1 (Contributor Author)

@jamesguoxin you're quite right, that is a bug. It's a function of the fact that parse_log.py only adds the current row to the list of dictionaries when it finds the next row; for the last line, there is no next row, so the current row doesn't get added.

I didn't notice this because I usually have so many lines in my log that I don't care about the last one. But I'll ponder a fix.
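
For illustration, a minimal sketch of the pattern being described and one possible fix (parse_train_rows and its patterns are hypothetical, not the actual parse_log.py code): a row is only appended when the line that starts the next row is seen, so the final in-progress row has to be flushed explicitly after the loop.

    import re

    regex_iteration = re.compile(r'Iteration (\d+)')
    regex_train_output = re.compile(r'Train net output #(\d+): (\S+) = ([\.\d]+)')

    def parse_train_rows(lines):
        """Collect one dict per iteration; flush the last row after the loop."""
        rows, row = [], None
        for line in lines:
            iter_match = regex_iteration.search(line)
            if iter_match:
                if row:  # a new row is starting, so push the previous one
                    rows.append(row)
                row = {'NumIters': int(iter_match.group(1))}
            output_match = regex_train_output.search(line)
            if output_match and row is not None:
                row[output_match.group(2)] = float(output_match.group(3))
        if row:  # the fix: don't drop the final row just because no later match follows
            rows.append(row)
        return rows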

@dgolden1 (Contributor Author)

@jamesguoxin fix pushed

@jamesguoxin

Great!

# Patterns that pull the loss, learning rate, iteration number, and named training outputs out of the log
re_output_loss = re.compile(r'output #\d+: loss = ([\.\d]+)')
re_lr = re.compile(r'lr = ([\.\d]+)')
regex_iteration = re.compile(r'Iteration (\d+)')
regex_train_output = re.compile(r'Train net output #(\d+): (\S+) = ([\.\d]+)')
@zhfe99

Maybe it could be
regex_train_output = re.compile(r'Train net output #(\d+): (\S+) = ([0-9]*\.?[0-9]+([eE][-+]?[0-9]+)?)')
in order to handle scientific notation when the loss is close to zero.

@dgolden1 (Contributor Author)

Update pushed, thanks for the suggestion, @zhfe99
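
For reference, a minimal sketch of a float pattern that also accepts scientific notation (an illustration only; float_pattern is a hypothetical name and this is not necessarily the exact pattern that was pushed):

    import re

    # A float, optionally in scientific notation, e.g. 0.25, 2.5e-05, 1e+3.
    float_pattern = r'[-+]?[0-9]*\.?[0-9]+(?:[eE][-+]?[0-9]+)?'
    regex_train_output = re.compile(r'Train net output #(\d+): (\S+) = (' + float_pattern + r')')

    for line in ['Train net output #0: loss = 2.5e-05',
                 'Train net output #1: accuracy = 0.9375']:
        m = regex_train_output.search(line)
        print(m.group(2), float(m.group(3)))  # loss 2.5e-05, then accuracy 0.9375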

shelhamer and others added 21 commits January 24, 2015 18:27
* A sample code was added.
* `slice_dim` and `slice_point` attributes were explained.
[docs] brief explanation of SLICE layer's attributes
Correct 'epochs' to 'iterations'
Next: release candidate
set the right rpath for tools and examples respectively

thanks for the report @mees!
[build] fix dynamic linking of tools
… was overwritten with symlink created at build time and installed with install(DIRECTORY ...)
lukeyeager and others added 26 commits March 24, 2015 17:42
2.7.0 isn't really necessary - 2.3.0 is sufficient. This is the version
available on Ubuntu 14.04 via apt-get, and seems to be a reasonable lowest
common denominator in general.

http://pillow.readthedocs.org/installation.html#old-versions
CUDNN_CONVOLUTION_FWD_PREFER_FASTEST requires a lot of GPU memory, which may
not always be available. Add a fallback path that uses
CUDNN_CONVOLUTION_FWD_ALGO_IMPLICIT_GEMM when the allocation fails.
Fallback to different cuDNN algorithm when under memory pressure; fix BVLC#2197
Remove scikit-learn dependency -- the need is noted in the relevant example
Previously, CUDA_VERSION would appear to be < 7 if there was no CUDA
installed, and that would generate the wrong C++ flags for compiling on
recent OSX versions. Instead, skip the CUDA version check if CPU_ONLY is
set. This change only affects CPU_ONLY installations.
Fix Travis: no need to remove libm in new Miniconda
[build] check if CPU_ONLY is set when determining CUDA version
[example] change `import Image` for forward compatibility
[docs] fix brew command for OS X install
Output CSV format is unchanged

Interface is changed; fields are no longer passed around, since the field order is self-contained within the OrderedDict

Python 2.7 or higher is required for OrderedDict
E.g., if the top is called "accuracy", the output column name will also be "accuracy"

We no longer make any assumptions about the name of the top
Only assumption for top name now is that it contains no white space; any other character is OK
This fixes a bug where the last row was not getting pushed because, after the row had been built, output_match was never true again
Otherwise the values are strings, which is silly, since they're all numbers
@dgolden1 dgolden1 force-pushed the log-parser-python-improved branch from 4da07e8 to f74c649 on April 22, 2015
@dgolden1 (Contributor Author)

Replaced by master-based version at #2350

@dgolden1 dgolden1 closed this Apr 22, 2015