Tensorboard smoothing is biased to initial values #610
Comments
Hi Alex—if I recall correctly, @dandelionmane responded to this in the original Google-internal issue. Would you mind posting the response here?
What I wrote on the Google-internal bug:
Here's the discussion from the last time we changed the smoothing algorithm:
tensorflow/tensorflow#7891 (proposed)
tensorflow/tensorflow#8363 (merged)
@alexirpan would you take a stab at implementing the debiased smoothing, and show us how it would perform on a few realistic cases? I'm guessing that on the common case of rapid initial descent on loss curves it would look a lot better.
Closing this out since I believe it was fixed by #639.
The relevant code is in tensorboard/tensorboard/components/vz_line_chart/vz-line-chart.ts
The implementation of resmoothDataset computes an exponential moving average over the list of data points. However, for finite sequences the EMA is biased toward the initial values.
Let x_i be the raw data values, and y_i be the exponential moving average, defined as
y_0 = x_0
y_i = smooth * y_{i-1} + (1-smooth) * x_i.
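As a concrete illustration of this recurrence, here is a minimal TypeScript sketch (not the actual resmoothDataset code; the function name and signature are made up for the example):

```typescript
// Minimal sketch of the smoothing recurrence described above.
// Not the actual resmoothDataset implementation; names are illustrative.
// `smooth` is the smoothing weight from the slider, assumed to be in [0, 1).
function emaSmooth(values: number[], smooth: number): number[] {
  const smoothed: number[] = [];
  let last = 0;
  for (let i = 0; i < values.length; i++) {
    if (i === 0) {
      last = values[0];                                 // y_0 = x_0
    } else {
      last = smooth * last + (1 - smooth) * values[i];  // y_i = smooth*y_{i-1} + (1-smooth)*x_i
    }
    smoothed.push(last);
  }
  return smoothed;
}
```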
For y_n, expanding the EMA gives
y_n = (1-smooth) * x_n + smooth * (1-smooth) * x_{n-1} + smooth^2 * (1-smooth) * x_{n-2} + ...
and so on. The weight on x_1 is smooth^{n-1} * (1-smooth), and the weight on x_0 is smooth^n.
Going from each x_i to x_{i-1}, the weight on its contribution gets multiplied by smooth, except going from x_1 to x_0, where it gets multiplied by smooth / (1 - smooth). If smooth = 0.95, this effectively makes x_0 matter 19 times as much to the final average as x_1 does.
To undo the bias, I believe there should be a way to do something similar to https://github.com/tensorflow/tensorflow/blob/master/tensorflow/python/training/moving_averages.py#L155, where you apply a correction based on how many data points you've seen so far.
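For illustration, here is a rough sketch of that debiasing idea in TypeScript (an assumption of how it could look, not a drop-in patch for vz-line-chart.ts; the names are made up): start the accumulator at zero and divide by the total weight accumulated so far.

```typescript
// Sketch of a debiased EMA, in the spirit of the zero-debias correction
// linked above. Names are illustrative; `smooth` is assumed to be in [0, 1).
function emaSmoothDebiased(values: number[], smooth: number): number[] {
  const smoothed: number[] = [];
  let last = 0;        // accumulator started at 0 instead of x_0
  let numAccum = 0;    // how many points have been folded in so far
  for (const x of values) {
    last = smooth * last + (1 - smooth) * x;
    numAccum++;
    // Total weight accumulated so far:
    // (1-smooth) * (1 + smooth + ... + smooth^{numAccum-1}) = 1 - smooth^numAccum.
    // Dividing by it normalizes the weights to sum to 1.
    const debiasWeight = 1 - Math.pow(smooth, numAccum);
    smoothed.push(last / debiasWeight);
  }
  return smoothed;
}
```

With smooth = 0.95, the second smoothed value then weights x_0 and x_1 almost equally (0.0475 vs. 0.05 before normalization) instead of weighting x_0 nineteen times more, and the first smoothed value is still exactly x_0.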