
Fix calculation of curve fit weights #1224

Merged

Conversation

nkanazawa1989
Collaborator

Summary

This PR updates the calculation of the weights used to compute residuals in the curve fitting.

Details and comments

When the error bars of some data points are significantly small, those points become the dominant source of the residual to minimize. Other data points then contribute little to the fit, causing a local overfit to certain points. This is fixed by clipping the weights to remove outliers.
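The fix can be sketched as follows. This is a minimal illustration of percentile-based weight clipping, not the actual qiskit-experiments implementation; the function name and signature are assumptions:

```python
import numpy as np

def clipped_weights(yerr, percentile=90):
    """Compute fit weights 1/yerr, clipping outliers above the given percentile.

    Hypothetical sketch: zero or near-zero error bars would otherwise produce
    huge (or infinite) weights that dominate the residual.
    """
    yerr = np.asarray(yerr, dtype=float)
    with np.errstate(divide="ignore"):
        raw = 1.0 / yerr  # raw weights: inverse of the error bars
    # Clip extreme weights so a few tiny error bars cannot dominate the fit.
    valid = raw[np.isfinite(raw)]
    cutoff = np.percentile(valid, percentile)
    return np.clip(raw, None, cutoff)
```

With this, a point whose error bar is exactly zero still participates in the fit, but with a weight no larger than the 90th percentile of the finite weights.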

@nkanazawa1989 nkanazawa1989 added labels: backport stable potential (the issue or PR might be minimal and/or important enough to backport to stable), Changelog: Bugfix (include in the "Fixed" section of the changelog) Jul 12, 2023
Collaborator

@coruscating coruscating left a comment


Thanks for this fix. I tried it on some example data with yerr set artificially low, and the fitting is much improved. The 90th-percentile clipping of outlier weights seems reasonable. I wonder if #1107 might be solved by this fix, although previously the infinite weights should have been handled by LMFIT, so I'm still not sure what's causing that random error.

@CLAassistant

CLAassistant commented Jul 18, 2023

CLA assistant check
All committers have signed the CLA.

@itoko
Contributor

itoko commented Jul 19, 2023

The 90th-percentile clipping feels a bit ad hoc to me. Do we really need to use WLS (weighted least squares) for fitting in general? Could you give us some background on why we decided to use WLS instead of OLS (ordinary least squares)?
(c.f. https://www.itl.nist.gov/div898/handbook/pmd/section1/pmd143.htm)

@nkanazawa1989
Collaborator Author

nkanazawa1989 commented Jul 19, 2023

In the backend code this is not considered. However, in qiskit-experiments we can precisely compute error propagation during data processing and formatting thanks to the uncertainties package, and we decided to use this error information in the fitting as well. This also impacts the chi-squared of the fit.

Another example is RB. This experiment uses a sample average over the seeds (rather than the sampling error), and the survival probability tends to diverge at shorter Clifford lengths due to variation in the total counts of the physical gates that are the error source, while it converges to a particular P1 at the tail. So I think weighted least squares can put more weight on the tail, yielding a better estimate of the exponent.
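A minimal sketch of how per-point uncertainties enter a weighted residual and the reduced chi-squared mentioned above. The function names here are hypothetical, not the library API:

```python
import numpy as np

def weighted_residual(params, x, y, sigma, model):
    # Dividing by sigma gives points with small error bars a larger weight.
    return (y - model(x, *params)) / sigma

def reduced_chisq(params, x, y, sigma, model):
    # Sum of squared weighted residuals per degree of freedom.
    r = weighted_residual(params, x, y, sigma, model)
    return float(np.sum(r**2) / (len(x) - len(params)))
```

With even sigma this reduces to ordinary least squares up to a constant factor, which is why the clipped weights behave closer to OLS.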

@nkanazawa1989 nkanazawa1989 added this pull request to the merge queue Aug 15, 2023
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Aug 15, 2023
@nkanazawa1989 nkanazawa1989 added this to the Release 0.6 milestone Aug 18, 2023
@coruscating coruscating added this pull request to the merge queue Aug 29, 2023
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Aug 29, 2023
@nkanazawa1989 nkanazawa1989 added this pull request to the merge queue Aug 31, 2023
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Aug 31, 2023
@nkanazawa1989 nkanazawa1989 force-pushed the fix/handling_of_zero_yerr branch from bc5aa54 to 88b8086 Compare August 31, 2023 06:39
@nkanazawa1989 nkanazawa1989 added this pull request to the merge queue Aug 31, 2023
Merged via the queue into qiskit-community:main with commit 6a06e74 Aug 31, 2023
mergify bot pushed a commit that referenced this pull request Aug 31, 2023
(cherry picked from commit 6a06e74)
wshanks pushed a commit that referenced this pull request Sep 5, 2023
This is an automatic backport of pull request #1224 done by
[Mergify](https://mergify.com).

Co-authored-by: Naoki Kanazawa <nkanazawa1989@gmail.com>
@wshanks
Collaborator

wshanks commented Sep 5, 2023

@itoko There was some related discussion in #417 and #939. I tend to agree with you that the weights are perhaps not doing enough good to be worth using. This PR's clipping of the weights to the 90th percentile might not be much different from using ordinary least squares (it would be interesting to compare). It might be worth continuing a discussion about weights and fit quality somewhere.

I think part of the issue this PR addresses is that our fit models are not infinitely accurate; there may be small deviations from the analytic functions in the dependence on the independent variables. When the weights are all about even, these deviations do not matter much for a fit to parameters within a few percent. Assuming the binomial distribution, though, we can get quite small errors and quite large weights, and then the curve fitting routines are very punishing for small deviations from the model. Part of what makes me say this is that for a sample with good SNR, when it gives nearly all-0 or all-1 counts, I trust that the result really is close to 0 or 1 with small uncertainty, but I have seen that give a poor chi-squared for data sets that look pretty reasonable by eye.
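The binomial point above can be illustrated numerically: probabilities near 0 or 1 give very small standard errors, hence very large 1/sigma weights. This assumes the standard binomial error sigma = sqrt(p(1-p)/shots) and an arbitrary shot count:

```python
import numpy as np

shots = 1000  # assumed shot count, for illustration only
p = np.array([0.5, 0.99, 0.999])  # measured survival probabilities

# Binomial standard error shrinks rapidly as p approaches 0 or 1 ...
sigma = np.sqrt(p * (1 - p) / shots)
# ... so the corresponding 1/sigma weights blow up at the extremes.
weights = 1.0 / sigma
```

Here the point at p = 0.999 gets a weight more than an order of magnitude larger than the point at p = 0.5, so a small model deviation there contributes a disproportionately large chi-squared.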

nkanazawa1989 added a commit to nkanazawa1989/qiskit-experiments that referenced this pull request Jan 10, 2024