Switch translate() to the header file #6440
What caused the Trinket M0 |
I can experiment with this on other builds. Definitely conditionalize on LTO.
If you compiled |
This allows the compile stage to optimize most of the translate() function away and saves a ton of space (~40k on ESP). *However*, it requires us to wait for the qstr output before we compile the rest of our .o files. (Only qstr.o used to wait.) This isn't as good as the current setup with LTO though. Trinket M0 loses <1k with this setup. So, we should probably conditionalize this along with LTO.
I suspect LTO is duplicating copies of the compressed data because each compilation unit (.o) now has its own copy. So if the same error occurs in two files, there will now be two copies of it. I haven't proven this though.
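As a rough illustration of why moving a lookup into a header helps the compile stage, here is a minimal sketch. It is not CircuitPython's actual translate(); the names `demo_msg_id`, `demo_messages`, `demo_translate`, and `demo_ok_length` are hypothetical, and the table stands in for the generated translation data. The point is only that when the function body is visible at the call site and the argument is a compile-time constant, the optimizer can fold the lookup away.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical message ids; a stand-in for the generated translation data. */
enum demo_msg_id { DEMO_MSG_OK, DEMO_MSG_NO_MEMORY, DEMO_MSG_COUNT };

/* `static` gives every including .c file its own copy of this table,
 * which is the kind of per-object duplication suspected above. */
static const char *const demo_messages[DEMO_MSG_COUNT] = {
    "ok",
    "out of memory",
};

/* The body is visible to every caller, so a call with a constant id,
 * such as demo_translate(DEMO_MSG_OK), can be reduced at compile time
 * to a direct reference to the string. */
static inline const char *demo_translate(enum demo_msg_id id) {
    assert((unsigned)id < (unsigned)DEMO_MSG_COUNT);
    return demo_messages[id];
}

/* Example caller: with optimization enabled, a compiler is free to
 * replace this whole function with a constant. */
static inline size_t demo_ok_length(void) {
    return strlen(demo_translate(DEMO_MSG_OK));
}
```

The trade-off matches the one described above: every file that includes the header now depends on the generated data being available before it can be compiled.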
Ok, @dhalbert. This finally built. There is a lot of diff noise due to the header move. Let me know if you want me to clean it up. I considered folding the translate.h include into |
This is great work! One formatting oddity, and one change to improve the compile times.
The LTO builds are taking about the same amount of time, but the non-LTO builds are now much longer, and each translate build takes about 3x as long as before. Basically the non-LTO builds are now catching up to the LTO builds in slowness. But it does save a lot of space. The entire latest PR build took about 132 minutes, compared with about 75 minutes before.
I am thinking about two ways around the long build times:
I think the simplest thing would be to have the TRANSLATE_OBJECT version on by default and only do the header thing when we need the space (like small S3 builds).
This breaks the translation dependency to all of the other objects and therefore speeds up subsequent builds. Now, even when the big translate() function is inlined in the header, it only needs to be optimized once.
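One way to picture the switch between the two modes is the single-file sketch below. It is an assumption about the general shape, not the code in this PR: the macro `SKETCH_TRANSLATE_OBJECT` and the functions `sketch_lookup` and `sketch_translate` are hypothetical names. With the macro set to 1, callers see only a declaration and the body is compiled once in one object; with it set to 0, the body is inlined from the header into every caller.

```c
#include <stddef.h>

/* Hypothetical switch: 1 = one out-of-line definition in a single object,
 * 0 = inline definition visible to every caller. */
#ifndef SKETCH_TRANSLATE_OBJECT
#define SKETCH_TRANSLATE_OBJECT 1
#endif

/* ---- what would live in the header ---- */

/* Stand-in for the generated translation data. */
static const char *const sketch_table[] = { "ok", "out of memory" };

/* The "big" function body; returns NULL for an unknown id. */
static inline const char *sketch_lookup(unsigned id) {
    size_t count = sizeof(sketch_table) / sizeof(sketch_table[0]);
    return id < count ? sketch_table[id] : NULL;
}

#if SKETCH_TRANSLATE_OBJECT
/* Other objects only need this declaration, so they do not have to
 * wait for, or re-optimize, the lookup body. */
const char *sketch_translate(unsigned id);
#else
static inline const char *sketch_translate(unsigned id) {
    return sketch_lookup(id);
}
#endif

/* ---- what would live in the single translation .c file ---- */

#if SKETCH_TRANSLATE_OBJECT
/* The inlined body is optimized once, here. */
const char *sketch_translate(unsigned id) {
    return sketch_lookup(id);
}
#endif
```

Building with `-DSKETCH_TRANSLATE_OBJECT=0` selects the header-only variant, which is the mode one would reserve for builds that need the space.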
qstrdefs.generated.h no longer includes the translated strings. So, use the .po file directly.
I reran the latest build when nothing else was queued, and got a total runtime of 1h 16m 18s, which is only 10 minutes more than what I was getting in #6436. Thanks for spending the time on this: the space savings is great!
The single failure was a transient CI issue.