Use limited ABI to reduce number of required wheels #97
Comments
This has progressed, and now works except for the polymer extension (calc_g_zs_cex.c). @richardsheridan, it has been suggested that we consider numba as an alternative to the compiled extension. Would you be opposed to that?
Hi Brian, good to hear from you! I think numba is a fine idea; I was already experimenting with it at the time on my private repo of self-consistent field theory code: sf_nscf. You can lift as much code as you need or just use it as an inspiration for what functions to write. This was back before you were allowed to allocate memory inside a jitted function, so they are a bit awkward looking, but profiling showed the jitted numba functions were just as fast as or better than the C extensions... I just didn't want to ask Paul to make numba a dependency just for my little addition. In that code, fastsum and compose are faster simply because I'm iterating naturally over a Fortran-ordered 2D array, I think. Some profiling might be in order to see if they still make a difference. Just @ me if you have questions or want a code review or whatever.
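The point about Fortran-ordered iteration has a simple memory-layout picture behind it: a 2D array lives in one flat buffer, and the "fast" loop direction is the one that walks adjacent memory. A plain-Python sketch of the two offset formulas (illustrative only, not code from sf_nscf or refl1d):

```python
# Illustrative sketch: flat-buffer offsets for C vs Fortran order.
def c_offset(i, j, nrows, ncols):
    # Row-major (C): consecutive j values are adjacent in memory.
    return i * ncols + j

def f_offset(i, j, nrows, ncols):
    # Column-major (Fortran): consecutive i values are adjacent in memory.
    return i + j * nrows

nrows, ncols = 3, 4
# Walking down a column (varying i) touches contiguous memory in
# Fortran order, so column-wise loops are the cache-friendly ones there.
assert [f_offset(i, 0, nrows, ncols) for i in range(nrows)] == [0, 1, 2]
# In C order the contiguous direction is along a row (varying j).
assert [c_offset(0, j, nrows, ncols) for j in range(ncols)] == [0, 1, 2, 3]
```

A loop whose inner index matches the contiguous direction stays in cache, which is one way a straightforward jitted loop can match a C extension.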
Hi, Richard - glad to hear back from you, too!
The way you've written the functions in sf_nscf, the numba dependency is optional. I think we can stick with that. Will your tests catch it if I don't implement this correctly?
Thanks - Brian
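One common way to keep the numba dependency optional is to fall back to a pass-through decorator when the import fails, so the same pure-Python functions run with or without numba. A minimal sketch of the general pattern (the `clip` function is a hypothetical example, not refl1d or sf_nscf code):

```python
# Optional-numba pattern: jit-compile when numba is available,
# otherwise run the plain-Python implementation unchanged.
try:
    from numba import njit
except ImportError:
    def njit(func=None, **kwargs):
        # Pass-through decorator when numba is unavailable.
        if func is None:
            return lambda f: f  # handles @njit(...) usage
        return func             # handles bare @njit usage

@njit
def clip(x, lo, hi):
    # Hypothetical example: bound x to the interval [lo, hi].
    if x < lo:
        return lo
    if x > hi:
        return hi
    return x
```

With this shape, the test suite exercises the same code paths either way, which matches the idea that the numba dependency can stay optional.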
The tests are quite sensitive numerically; they might catch it even if you implement it correctly 🤷
For example, you might be able to re-enable the SCFCache test by setting atol around 1e-8 and disabling rtol. The function is pretty nonlinear, so rtol can bite you for otherwise negligible changes.
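The atol/rtol distinction can be illustrated with the standard library's `math.isclose` (a generic sketch of the tolerance semantics, not the actual SCFCache test):

```python
import math

# Two values near zero that differ by a tiny absolute amount.
a, b = 1e-12, 3e-12

# A relative tolerance alone rejects them: the relative error is huge
# (b is 3x a), even though both are negligible on the data's scale.
assert not math.isclose(a, b, rel_tol=1e-6, abs_tol=0.0)

# An absolute tolerance of 1e-8 accepts the same pair, since
# |a - b| = 2e-12 is far below the threshold.
assert math.isclose(a, b, rel_tol=0.0, abs_tol=1e-8)
```

This is why disabling rtol in favor of a small atol can stabilize a test whose values pass near zero: relative error blows up there for changes that are physically negligible.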
@richardsheridan I didn't see fastsum being used anywhere, so I didn't include it. I didn't see any speed increase from compose when I timed it against your polymer test functions. I did some sensitivity testing on SCFCache, and yes, it is really nonlinear. Is that because of this particular parameter set, or is that generally true? The optimizers are stepping on the order of 1e-6, but I noticed significant differences when stepping your function by 1e-15. I fear that the fitting results aren't going to be reliable. I've posted my branch as PR #103
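The worry about 1e-15 steps has a concrete floating-point angle: doubles carry about 16 significant digits, so for a parameter of order 1 a step of 1e-15 spans only a handful of representable values, and any change in the function over that step is near the rounding floor. A generic illustration (not refl1d code):

```python
import math

# Spacing between adjacent doubles at 1.0 is ~2.2e-16.
eps = math.ulp(1.0)
assert 2e-16 < eps < 3e-16

# A step of 1e-15 at x = 1.0 is only ~4-5 ulps wide, so the step
# itself cannot even be represented exactly after the addition.
x, step = 1.0, 1e-15
assert (x + step) - x != step
# The realized step differs from the requested one by about one ulp.
assert abs(((x + step) - x) - step) < eps
```

So differences seen at a 1e-15 step size may reflect rounding rather than the model; the 1e-6 steps the optimizers take are far above that floor.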
Hi @pkienzle! I left a code review on the PR; I don't know if you use those much. Anyway, it was just minor points. That test was created at a particularly nonlinear point of parameter space to challenge the cache-walk algorithm. There is a point beyond which the layer is just a step function and the thing won't converge; I think it is up to the users to switch models at that point. The rest of the parameter space is better behaved. Which optimizers do you mean? I was using DREAM for my project. The other solvers seemed well behaved in the limited testing I did on them, but I never really challenged them.
Oh, and I have no complaint about leaving
This has been implemented (after converting the polymer extension code to numba) with d9ae39e |