Translate assembly text templates into avo programs #529
Comments
Yeah, no biggie. We can do it when it suits you. Should be pretty straightforward. If you'd like I can help out, or do the initial translation from your templates. Will make things like temp register allocation and inlining functionality much easier. |
I plan to add |
@WojciechMula I started to convert the |
Great, I planned to work on this next week. I wanted to complete the already-opened PRs first. |
@WojciechMula https://gist.github.com/klauspost/8949f70d98dd94116392019f119087e5 Ended up doing it all, but didn't test it at all. Place in a subfolder called I think we should just do a standalone cpuid check for now. |
You can rearrange as you see fit. I don't want The output if you're curious. |
Made a few tweaks and added more documentation. Unfortunately it crashes. Tried to debug a bit, but I will have to leave it for a while, until I've caught up on other things. |
That looks great, thank you. Let me continue your work, I'll work more on this on Tuesday. |
@WojciechMula Got it to not crash, but it only outputs wrong values. I will see if I can get it to pass 🤞🏼 |
@WojciechMula Did you manage to spot my mistake? Otherwise I will take another look through the code. |
Sorry, didn't check it yesterday. I planned to look at that problem this evening. |
Sure. I will take another pass through it. |
@WojciechMula FINALLY! After hours of debugging I finally got it working. The gist above is updated. I have noted a few potential improvements - for now just getting it working is a big win. This will make experiments with register allocations a bit easier as well. |
That's great news! Speaking of improvements, I'm wondering if it's possible to front-pad the bit-stream data with four zero bytes. Then we would not need to test |
@klauspost You wrote "TODO: We should be able to extract bits with BEXTRQ". I tried it already, the code is not shorter and isn't faster.
The current solution:
|
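For readers unfamiliar with the instruction discussed above, here is a pure-Go model of what the 64-bit BMI1 `BEXTR` instruction computes. It is only a reference sketch of the instruction's semantics, not code from this project: the function name `bextr` and the sample values are illustrative assumptions. The control operand packs the start bit in bits 7:0 and the field length in bits 15:8.

```go
package main

import "fmt"

// bextr models the semantics of the x86 BMI1 BEXTR instruction on
// 64-bit operands. The name and this Go form are illustrative only.
// ctrl bits 7:0 hold the start bit, bits 15:8 hold the field length.
func bextr(src, ctrl uint64) uint64 {
	start := ctrl & 0xff
	length := (ctrl >> 8) & 0xff
	if start >= 64 {
		// Nothing of the source remains after the shift.
		return 0
	}
	v := src >> start
	if length >= 64 {
		// The field covers everything that is left.
		return v
	}
	// Keep the low `length` bits; a length of 0 yields 0.
	return v & (uint64(1)<<length - 1)
}

func main() {
	// Extract 8 bits starting at bit 4 of 0xABCD.
	ctrl := uint64(4) | uint64(8)<<8
	fmt.Printf("%#x\n", bextr(0xABCD, ctrl)) // prints 0xbc
}
```

Because the start and length travel in one register, using the instruction means building that control value first, in addition to the extraction itself.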
We will. Bad codes can cause considerable underrun, much beyond 4 bytes. The branch is fully predictable until it is taken, so potential performance improvement will be low-to-none. |
I am not concerned about further opts right now. Once we have the pipeline implemented we can look at the overall picture and do microopts, like looking at register allocs, etc. Having execute with history and decodeSync (doesn't need history AFAIR) would be a great start. |
@klauspost Did you have any problems with any of these lines |
@WojciechMula Oh, yeah, totally forgot:
|
I'll also translate Decompress4X from |
@WojciechMula I am not going to stop you :) |
@klauspost I also saw you add some TODO items to |
Did some small tweaks here: #537 - having to wait a bit before I can bench. |
👀 |
OK, now we have pure |
The assembler procedures are currently generated from text templates. As @klauspost suggested, it would be easier to maintain this quite complex code with avo (https://github.com/mmcloughlin/avo).