Fix calculation of curve fit weights #1224
Conversation
Thanks for this fix. I tried it on some example data with `yerr` set artificially low, and the fitting is much improved. The 90th-percentile clipping of outlier weights seems reasonable. I wonder if #1107 might be solved by this fix, although previously the infinite weights should have been handled by LMFIT, so I'm still not sure what's causing that random error.
The 90th-percentile clipping feels a bit ad hoc to me. Do we really need to use WLS (weighted least squares) for fitting in general? Could you give us some background on why we decided to use WLS instead of OLS (ordinary least squares)?
In the backend code this is not considered. However, in QE we can precisely compute error propagation during the data processing and formatting thanks to the […]. Another example is RB. That experiment uses a sample average over the seeds (rather than the sampling error), and the survival probability tends to diverge at shorter Clifford lengths due to variation in the total counts of the physical gates that are the error source, while it converges to a particular P1 at the tail. So I think weighted least squares can add more weight to the tail, yielding a better estimate of the exponent.
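To make the WLS-versus-OLS discussion concrete, here is a minimal sketch (not the actual qiskit-experiments code; the model and data below are made up for illustration) of the weighted residual that a least-squares optimizer minimizes. Each deviation is scaled by `1/yerr`, so points with small error bars, such as the long-Clifford-length tail in an RB decay, contribute disproportionately to the cost:

```python
import numpy as np

def weighted_residual(params, model, x, y, yerr):
    """Residual vector for weighted least squares.

    The optimizer minimizes sum(r**2); dividing by yerr means a point
    with a small error bar dominates the cost for the same deviation.
    With yerr uniformly equal to 1 this reduces to ordinary least squares.
    """
    return (y - model(x, *params)) / yerr

# Illustrative exponential-decay model, as in RB survival probabilities.
def decay(x, a, alpha, b):
    return a * alpha**x + b
```

With this form, a fixed model deviation of 0.01 at a point with `yerr = 0.001` contributes ten thousand times more to the squared residual than the same deviation at a point with `yerr = 0.1`, which is the overfitting mechanism this PR mitigates.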
Force-pushed from bc5aa54 to 88b8086 (Compare)
### Summary

This PR updates the calculation of the weights used to compute the residual in curve fitting.

### Details and comments

When the error bars of some data points are significantly small, those points become the dominant source of the residual being minimized. Other data points then contribute little to the fit, causing a local overfit to those points. This is fixed by clipping the weights to remove outliers.

(cherry picked from commit 6a06e74)
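The clipping approach described above can be sketched as follows. This is an illustrative reimplementation, not the actual code from this PR; the function name and the `percentile=90` default are assumptions based on the discussion:

```python
import numpy as np

def clipped_fit_weights(yerr, percentile=90):
    """Compute least-squares weights from error bars, clipping outliers.

    Weights are the usual 1/yerr**2 of weighted least squares. Any
    weight above the given percentile of the finite weights is clipped,
    so points with artificially small (or zero) error bars cannot
    dominate the residual.
    """
    yerr = np.asarray(yerr, dtype=float)
    # Guard against zero error bars, which would give infinite weight.
    with np.errstate(divide="ignore"):
        weights = 1.0 / yerr**2
    finite = weights[np.isfinite(weights)]
    cutoff = np.percentile(finite, percentile)
    return np.clip(weights, None, cutoff)
```

For example, with `yerr = [1.0, 1.0, 1.0, 1e-6]` the raw weights are `[1, 1, 1, 1e12]`; clipping caps the last weight near the 90th percentile of the distribution, so the low-error point no longer dominates the fit.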
This is an automatic backport of pull request #1224 done by [Mergify](https://mergify.com). Co-authored-by: Naoki Kanazawa <nkanazawa1989@gmail.com>
@itoko There was some related discussion in #417 and #939. I tend to agree with you that the weights are perhaps not doing enough good to be worth using. This PR's clipping of the weights to the 90th percentile might not be much different from using ordinary least squares (it would be interesting to compare). It might be worth continuing the discussion about weights and fit quality somewhere.

I think part of the issue this PR addresses is that our fit models are not infinitely accurate: there may be small deviations of the dependence on the independent variables from the analytic functions. When the weights are all about even, these deviations do not matter much for fitting parameters to within a few percent. Assuming a binomial distribution, though, we can get quite small errors and quite large weights, and then the curve-fitting routines are very punishing for small deviations from the model. Part of what makes me say this is that for a sample with good SNR, when it gives nearly all-0 or all-1 counts, I trust the result really is close to 0 or 1 with small uncertainty, yet I have seen that give a poor chi-squared for data sets that look quite reasonable by eye.