QKeras, predict/trace, API enhancements #195
Conversation
* add comparison support for each layer's output
* clean up a bit
* take output of layer and activation separately
* API changes and add distribution plot
* new method for normalizing the difference
* mistakenly overwrote file when merge
* complete support for comparing outputs with trace, convert hls_model's outputs to numpy array
* change code according to comments
* use np.asarray and fix bug on compiling
* bash, not sh
* Fix to precision inference from qkeras config
…on properly. Propagate new Softmax to keras_to_hls.
QKeras updates
…respectively rely on are present.
Fix imports
I promised some more detail on the new Softmax implementation; here it is. I created a Jupyter notebook that you can see here: https://gist.github.com/thesps/c7e59cf5597804693b8663ddc9496b64 To summarise what I did:
So just to make it clear, I've isolated only the Softmax layer, and I'm running inference with the same data each time, extracted from Keras.
I don't think either is "perfect", but the new version is definitely more peaked around 0. Perhaps more useful are the ROC curves, where the difference is a bit more visible. And finally, the synthesis results (latency in clock cycles, resources in absolute numbers).
Generally, the new implementation will use, for N classes: N DSPs and ceil(N/2) + 1 BRAMs, assuming 18 bits are used for the tables. The other new feature is that the two look-up tables can use different precision, with the option exposed in the config file. For the results above I used …
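The scaling rule above is easy to turn into a quick back-of-the-envelope helper. This is my own sketch, not part of hls4ml, and it only encodes the rule of thumb quoted above (N DSPs, ceil(N/2) + 1 BRAMs at 18-bit table entries):

```python
import math

def softmax_resource_estimate(n_classes: int) -> dict:
    """Rough resource estimate for the new Softmax implementation,
    following the rule of thumb above: N DSPs and ceil(N/2) + 1 BRAMs
    (assuming 18-bit table entries). Illustrative only."""
    return {
        "dsp": n_classes,
        "bram": math.ceil(n_classes / 2) + 1,
    }

# e.g. a 10-class classifier
print(softmax_resource_estimate(10))  # {'dsp': 10, 'bram': 6}
```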
…ion for Softmax activation.
Add missing BatchNormalization include for ApplyAlpha layer
Absolute imports
…n/out types are different. Add exp_ and inv_ table_t to Softmax layer in config generator.
…into csim_integration
@thesps I'm attempting to utilize the QKeras support mentioned above, but no matter how I attempt the conversion (the old-fashioned way with the command line, or using the API) I get the same exception:
I can confirm that I am attempting the conversion while using the csim_integration branch. My guess is that the QKeras layers are not being registered here, but I'm not sure exactly how that block of code works.
Thanks @veyron8800. We think the … So, I'm going to merge this now, since it seems the "core" functionality is working well, and we will continue to work on these user-friendliness aspects as we go.
QKeras, predict/trace, API enhancements
This is a big one...
There are three significant new things in this PR: QKeras support, predict, and enhancements to the API. Both predict and the expanded API were essential for the QKeras support. In reverse order:
API
One can do much more by importing hls4ml than before. Previously this stopped, basically, at converting a config file into an HLSModel; now the interface is much richer.
All the 'old' ways of working still apply, so hls4ml convert -c ... and hls4ml build -cs -p ... from the command line still work. And the API still allows the 'old' way above: keras_to_hls with a config file returning an HLSModel
(but you can do more with the model now).

predict, trace

This is the reason the branch is called csim_integration. Now, with an HLSModel object (created as above with the API, for example), one can do:

y = hls_model.predict(X)

It's pretty self-explanatory, but awesome.
X is an array, just as you would use to predict with the Keras (or whatever) Python model, and the returned y is an array as well. Then you can compare your predictions vs. the original float model with ease, make ROC curves, etc.

Technically this is implemented by compiling the HLS project (together with the ap_fixed and other headers) into a shared object, and binding it to Python using ctypes. So what you get is "csim", but without running Vivado; in fact this works without an installation of Vivado on the system(!).
trace is the related capability to get the output of individual layers of the model, for lower-level debugging. This is useful, for example, when using custom precision throughout the model. It's used like:
trace = hls_model.trace(X)
The returned object is a dictionary. The names of the (HLS) model layers are used as keys, and each one points to the array captured at the output of that layer.

A utility has been added to the profiling module to get the equivalent for a Keras model:
trace = hls4ml.model.profiling.get_ymodel(keras_model, X)
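With both dictionaries in hand, a layer-by-layer comparison is straightforward. A sketch, using dummy arrays as stand-ins for the real traces (the dict-of-arrays shape is as described above):

```python
import numpy as np

def compare_traces(hls_trace, keras_trace):
    """Report the maximum absolute difference per layer between an HLS
    trace and a Keras trace (both dicts of layer name -> array)."""
    diffs = {}
    for name, y_hls in hls_trace.items():
        if name in keras_trace:
            y_ref = np.asarray(keras_trace[name])
            diffs[name] = float(np.max(np.abs(np.asarray(y_hls) - y_ref)))
    return diffs

# Dummy stand-ins for hls_model.trace(X) and get_ymodel(keras_model, X)
hls_trace = {"dense1": np.array([0.50, 0.25]), "relu1": np.array([0.50, 0.25])}
keras_trace = {"dense1": np.array([0.51, 0.25]), "relu1": np.array([0.50, 0.25])}
print(compare_traces(hls_trace, keras_trace))
```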
hls4ml does things like separating activations out of Dense layers into their own layers, and this utility does that too, to make comparisons easier.

QKeras support
We can now support QKeras models using most of the quantizers that QKeras has. Some more detail can be obtained from this presentation.
When a Keras model uses QDense, QConv, etc. layers, the quantizer is extracted and used to quantize the weights. The HLS data type is set according to the settings of the quantizer.
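To sketch the idea (this is illustrative pure Python, not the QKeras or hls4ml code): a quantized_bits-style quantizer snaps a float weight onto a fixed-point grid, and the HLS type's total and integer bit widths come straight from the quantizer's settings.

```python
def quantize_weight(w, bits=8, integer=0):
    """Sketch of a quantized_bits-style quantizer (not the QKeras code):
    round w onto the grid of a signed fixed-point type with the given
    total bits and integer bits, i.e. step 2^-(bits-1-integer),
    saturating at the representable range."""
    fractional = bits - 1 - integer          # remaining bits after sign + integer
    step = 2.0 ** -fractional
    lo = -(2.0 ** integer)
    hi = (2.0 ** integer) - step
    q = round(w / step) * step
    return min(max(q, lo), hi)

print(quantize_weight(0.37, bits=8, integer=0))   # 0.3671875
print(quantize_weight(1.5, bits=8, integer=0))    # saturates to 0.9921875
```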
The QKeras quantizer alpha parameter is handled with a new layer, ApplyAlpha (which uses BatchNorm for inference), to scale the weights after the Dense or Conv layer. The proper conversion is handled at the config-dictionary level and with new Optimizer passes. To get proper performance, users should use the hls4ml config file utility demonstrated above.

Other
There is a new implementation of the Softmax activation. The existing implementation had a few hard-coded constants, so it was difficult to configure with different data types, and it would occasionally misbehave and output the largest value for the smallest input (I think a saturation issue).
The new version is also more configurable, with the possibility of different types for the e^x and 1/x tables. I will follow up this PR with a comment benchmarking the performance, but I've found it to be smaller, faster and more accurate.
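The table-based scheme can be sketched in pure Python (this is not the HLS code; table sizes and input ranges here are illustrative, not the values hls4ml uses): e^x is read from one lookup table, and the reciprocal of the sum of exponentials from a second "inversion" table, so the two tables can be given different precisions independently.

```python
import math

# Build the two lookup tables. Sizes and ranges are illustrative.
TABLE_SIZE = 1024
EXP_RANGE = (-8.0, 0.0)      # exp table indexed by x - max(x), which is <= 0
INV_RANGE = (1.0, 64.0)      # inversion table indexed by the sum of exponentials

exp_table = [math.exp(EXP_RANGE[0] + (EXP_RANGE[1] - EXP_RANGE[0]) * i / TABLE_SIZE)
             for i in range(TABLE_SIZE)]
inv_table = [1.0 / (INV_RANGE[0] + (INV_RANGE[1] - INV_RANGE[0]) * i / TABLE_SIZE)
             for i in range(TABLE_SIZE)]

def _lookup(table, lo, hi, x):
    # Clamp to range and index the table, as a LUT in hardware would
    i = int((x - lo) / (hi - lo) * TABLE_SIZE)
    return table[min(max(i, 0), TABLE_SIZE - 1)]

def softmax_lut(xs):
    m = max(xs)  # subtract the max for numerical stability
    exps = [_lookup(exp_table, *EXP_RANGE, x - m) for x in xs]
    inv_sum = _lookup(inv_table, *INV_RANGE, sum(exps))
    return [e * inv_sum for e in exps]

probs = softmax_lut([1.0, 2.0, 3.0])
print([round(p, 2) for p in probs])
```

Replacing the division by a lookup into the inversion table is what keeps the DSP and latency cost low; the accuracy is then set by the two table precisions, which is exactly the knob the new config options expose.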