
Python log parser improvements #1547

Closed
wants to merge 167 commits

Conversation

@dgolden1 (Contributor) commented Dec 9, 2014

Improvements to the Python log parser introduced in #1384.

Highlights:

  • Interface change: column order is now determined by using a list of OrderedDict objects instead of dict objects, which obviates the need to pass around a tuple of column orders (see the sketch after this list).
  • The outputs are now named according to their names in the network protobuf; e.g., if your top is named loss, then the corresponding column header will also be loss; we no longer rename it to, e.g., TrainingLoss or TestLoss.
  • Fixed the bug/feature of the first version where the initial learning rate was always NaN.
  • Added an optional parameter to specify the output table delimiter; it's still a comma by default.
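
As a minimal sketch of the OrderedDict-based interface (the row contents, the column names, and the write_csv helper below are illustrative assumptions, not the exact code in this pull request):

    import csv
    from collections import OrderedDict

    # Hypothetical rows as an OrderedDict-based parser might produce them:
    # the key order doubles as the CSV column order, so no separate tuple of
    # field names has to be passed around.
    train_dict_list = [
        OrderedDict([('NumIters', 0), ('Seconds', 0.0), ('LearningRate', 0.01), ('loss', 2.3026)]),
        OrderedDict([('NumIters', 100), ('Seconds', 4.2), ('LearningRate', 0.01), ('loss', 1.8713)]),
    ]

    def write_csv(output_filename, dict_list, delimiter=','):
        """Write a list of OrderedDicts to CSV; the column order comes from the dict keys."""
        with open(output_filename, 'w') as f:
            writer = csv.DictWriter(f, fieldnames=dict_list[0].keys(), delimiter=delimiter)
            writer.writeheader()
            writer.writerows(dict_list)

    write_csv('train.csv', train_dict_list)        # comma-delimited by default
    write_csv('train.tsv', train_dict_list, '\t')  # optional delimiter parameter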

You can use the MATLAB code from this gist to verify that your results are the same before and after the changes introduced in this pull request. That code assumes that your top names are accuracy and loss; modify the code if that's not the case.

One caveat: we expect that the output layer names consist only of alphanumeric characters and the underscore, and are therefore captured with the \w token in the Python regular expression syntax. If other characters are allowed in layer names, we'll have to change the \w token to something else.

Update: any top name that contains no whitespace is OK; top names are now captured with \S instead of \w.
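
To illustrate the difference (a minimal sketch; the example log line and the exact patterns are assumptions rather than the parser's actual code): \w only matches alphanumerics and underscores, whereas \S accepts any top name that contains no whitespace.

    import re

    line = 'Train net output #0: my-top/1 = 0.25'

    # \w+ only matches [A-Za-z0-9_], so a top name like "my-top/1" is not captured.
    print(re.search(r'Train net output #(\d+): (\w+) = ([\.\d]+)', line))  # None

    # \S+ accepts any whitespace-free top name.
    m = re.search(r'Train net output #(\d+): (\S+) = ([\.\d]+)', line)
    print(m.group(2), m.group(3))  # my-top/1 0.25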

@dgolden1 (Contributor Author) commented Dec 9, 2014

@sguada have a look and please confirm that I addressed all of your comments

@jamesguoxin

Hi drdan14,

Do you know how to generate the .log file for your log parser in Caffe? I'm stuck here without a clue. Usually we run the executable as "./build/tools/caffe train --solver=/path/to/my_solver.prototxt"; do I need to add a flag to create the log file? Thank you very much for your help!

@dgolden1 (Contributor Author)

The log file is written to /tmp by default, or you can add the --log_dir parameter to choose the directory.

Please ask these sorts of questions in the caffe users Google group at https://groups.google.com/forum/m/#!forum/caffe-users

@jamesguoxin

Thanks a lot!

@jamesguoxin

Hi drdan14,

I used your Python parser today and I think I found a bug. In the parse_log(path_to_log) function, when you return train_dict_list and test_dict_list, the last values stored in train_row and test_row are missed, so we lose the last training and testing results. Thanks!

@dgolden1 (Contributor Author)

@jamesguoxin you're quite right, that is a bug. It's a function of the fact that parse_log.py only adds the current row to the list of dictionaries when it finds the next row; for the last line, there is no next row, so the current row doesn't get added.

I didn't notice this because I usually have so many lines in my log that I don't care about the last one. But I'll ponder a fix.
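
For illustration, a minimal sketch of the pattern being described and one possible fix (parse_train_rows and its patterns are hypothetical, not the actual parse_log.py code): a row is only appended when the line that starts the next row is seen, so the final in-progress row has to be flushed explicitly after the loop.

    import re

    regex_iteration = re.compile(r'Iteration (\d+)')
    regex_train_output = re.compile(r'Train net output #(\d+): (\S+) = ([\.\d]+)')

    def parse_train_rows(lines):
        """Collect one dict per iteration; flush the last row after the loop."""
        rows, row = [], None
        for line in lines:
            iter_match = regex_iteration.search(line)
            if iter_match:
                if row:  # a new row is starting, so push the previous one
                    rows.append(row)
                row = {'NumIters': int(iter_match.group(1))}
            output_match = regex_train_output.search(line)
            if output_match and row is not None:
                row[output_match.group(2)] = float(output_match.group(3))
        if row:  # the fix: don't drop the final row just because no later match follows
            rows.append(row)
        return rows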

@dgolden1 (Contributor Author)

@jamesguoxin fix pushed

@jamesguoxin

Great!

# Patterns that pull the loss, learning rate, iteration number, and named training outputs out of the log
re_output_loss = re.compile(r'output #\d+: loss = ([\.\d]+)')
re_lr = re.compile(r'lr = ([\.\d]+)')
regex_iteration = re.compile(r'Iteration (\d+)')
regex_train_output = re.compile(r'Train net output #(\d+): (\S+) = ([\.\d]+)')
@zhfe99

Maybe it could be
regex_train_output = re.compile(r'Train net output #(\d+): (\S+) = ([0-9]*\.?[0-9]+([eE][-+]?[0-9]+)?)')
in order to handle scientific notation when the loss is close to zero.

@dgolden1 (Contributor Author)

Update pushed, thanks for the suggestion, @zhfe99
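
For reference, a minimal sketch of a float pattern that also accepts scientific notation (an illustration only; float_pattern is a hypothetical name and this is not necessarily the exact pattern that was pushed):

    import re

    # A float, optionally in scientific notation, e.g. 0.25, 2.5e-05, 1e+3.
    float_pattern = r'[-+]?[0-9]*\.?[0-9]+(?:[eE][-+]?[0-9]+)?'
    regex_train_output = re.compile(r'Train net output #(\d+): (\S+) = (' + float_pattern + r')')

    for line in ['Train net output #0: loss = 2.5e-05',
                 'Train net output #1: accuracy = 0.9375']:
        m = regex_train_output.search(line)
        print(m.group(2), float(m.group(3)))  # loss 2.5e-05, then accuracy 0.9375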

shelhamer and others added 21 commits January 24, 2015 18:27
* A sample code was added.
* `slice_dim` and `slice_point` attributes were explained.
[docs] brief explanation of SLICE layer's attributes
Correct 'epochs' to 'iterations'
Next: release candidate
set the right rpath for tools and examples respectively

thanks for the report @mees!
[build] fix dynamic linking of tools
… was overwritten with symlink created at build time and installed with install(DIRECTORY ...)
lukeyeager and others added 26 commits March 24, 2015 17:42
2.7.0 isn't really necessary - 2.3.0 is sufficient. This is the version
available on Ubuntu 14.04 via apt-get, and seems to be a reasonable lowest
common denominator in general.

http://pillow.readthedocs.org/installation.html#old-versions
CUDNN_CONVOLUTION_FWD_PREFER_FASTEST requires a lot of GPU memory, which may
not always be available. Add a fallback path that uses
CUDNN_CONVOLUTION_FWD_ALGO_IMPLICIT_GEMM when the allocation fails.
Fallback to different cuDNN algorithm when under memory pressure; fix BVLC#2197
Remove scikit-learn dependency -- the need is noted in the relevant example
Previously, CUDA_VERSION would appear to be < 7 if there was no CUDA
installed, and that would generate the wrong C++ flags for compiling on
recent OSX versions. Instead, skip the CUDA version check if CPU_ONLY is
set. This change only affects CPU_ONLY installations.
Fix Travis: no need to remove libm in new Miniconda
[build] check if CPU_ONLY is set when determining CUDA version
[example] change `import Image` for forward compatibility
[docs] fix brew command for OS X install
Output CSV format is unchanged

Interface is changed; fields are no longer passed around, since the field order is self-contained within the OrderedDict

Python 2.7 or higher is required for OrderedDict
E.g., if the top is called "accuracy", the output column name will also be "accuracy"

We no longer make any assumptions about the name of the top
Only assumption for top name now is that it contains no white space; any other character is OK
This fixes a bug where the last row was not getting pushed because, after the row had been built, output_match was never true again
Otherwise the values are strings, which is silly, since they're all numbers
@dgolden1 dgolden1 force-pushed the log-parser-python-improved branch from 4da07e8 to f74c649 on April 22, 2015
@dgolden1 (Contributor Author)

Replaced by master-based version at #2350

@dgolden1 dgolden1 closed this Apr 22, 2015