Switch translate() to the header file #6440

tannewt · 2022-05-26T23:48:41Z

This allows the compile stage to optimize most of the translate()
function away and saves a ton of space (~40k on ESP). However, it
requires us to wait for the qstr output before we compile the rest
of our .o files. (Only qstr.o used to wait.)

This isn't as good as the current setup with LTO though. Trinket M0
loses <1k with this setup.

So, we should probably conditionalize this along with LTO.
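
As a rough illustration of why moving the lookup into a header can help: when the body of translate() is visible at every call site and the argument is a string literal, an optimizing compiler can often evaluate the comparison chain at compile time and keep only the matching branch. The sketch below is not the code from this PR. `message_t`, `msg_invalid_pin`, `msg_out_of_memory`, and `translate_sketch()` are hypothetical names, and the plain text stands in for the compressed data.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a compressed message record. */
typedef struct {
    const char *text;
} message_t;

static const message_t msg_invalid_pin = { "Invalid pin" };
static const message_t msg_out_of_memory = { "Out of memory" };

/* Defined in the header so that every caller sees the body. With
 * optimization enabled and a string-literal argument, compilers such as
 * GCC and Clang can typically fold the strcmp() chain and keep only the
 * branch that matches. */
static inline const message_t *translate_sketch(const char *original) {
    assert(original != NULL);
    if (strcmp(original, "Invalid pin") == 0) {
        return &msg_invalid_pin;
    }
    if (strcmp(original, "Out of memory") == 0) {
        return &msg_out_of_memory;
    }
    return NULL; /* unknown message */
}
```

A call such as `translate_sketch("Invalid pin")` can then reduce to the address of one record, provided the compiler folds the comparisons.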

d-c-d · 2022-05-27T01:07:30Z

What caused the Trinket M0 to grow by 1k with this setup? (Duplication of strings?) I haven't studied the changes yet.

dhalbert · 2022-05-27T01:15:13Z

I can experiment with this on other builds. Definitely conditionalize on LTO.

dhalbert · 2022-05-27T02:03:43Z

If you compiled translate.c with -O3 or something like that, would it help the LTO builds (or the non-LTO builds)?


tannewt · 2022-05-27T17:57:53Z

> What caused the Trinket M0 to grow by 1k with this setup? (Duplication of strings?) Haven't studied the changes yet

I suspect LTO is duplicating copies of the compressed data because each compilation unit (.o) now has its own copy. So if the same error occurs in two files, there will now be two copies of it. I haven't proven this though.

tannewt · 2022-05-31T23:57:32Z

Ok, @dhalbert. This finally built. There is a lot of diff noise due to the header move. Let me know if you want me to clean it up. I considered folding the translate.h include into py/runtime.h, but I think the include-what-you-use philosophy would have you list it explicitly.

dhalbert

This is great work! One formatting oddity, and one change to improve the compile times.

ports/atmel-samd/common-hal/audiobusio/PDMIn.c

py/circuitpy_defns.mk

dhalbert · 2022-06-01T20:59:46Z

The LTO builds are taking about the same amount of time, but the non-LTO builds are now much longer, and each translation build takes about 3x as long as before. Basically, the non-LTO builds are now catching up to the LTO builds in slowness. But it does save a lot of space.

So the entire latest PR build was about 132 minutes, compared with about 75 minutes before.

atmel-samd (LTO), before and after:

espressif, before and after:

dhalbert · 2022-06-01T23:13:22Z

I am thinking about two ways around the long build times:

1. On a PR, do only representative builds: English, and a few large builds, like de, ja, ru or even fewer. So kind of like the windows-builds. Do the complete set only on merge.
2. More radical: give up on using real strings in translate(), and use message ids, e.g. translate(MSG_INVALID_Q_PIN) or whatever. I don't mean qstr symbol name parsing, they would be too long, but a central message table file. It could still be compiled per build, or if we can use id numbers, maybe it can just be a library to link in, with a separate section or whatever for each message, so there are no unused compressed message strings in a build.
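
The message-id idea could look roughly like the sketch below, which uses one central list expanded twice: once into an enum of ids and once into a table indexed by id. This is only an illustration of the proposal, not something the PR implements. Apart from `MSG_INVALID_Q_PIN`, which appears in the comment above, every name and message text here is hypothetical, and the per-message linker sections mentioned above are not shown.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical central message list. In a real build this might live in
 * one shared file and be generated per language. */
#define MESSAGE_LIST(X) \
    X(MSG_INVALID_Q_PIN, "Invalid Q pin") \
    X(MSG_OUT_OF_MEMORY, "Out of memory")

/* Expand the list into an enum of message ids. */
#define AS_ENUM(id, text) id,
typedef enum { MESSAGE_LIST(AS_ENUM) MSG_COUNT } message_id_t;

/* Expand the same list into a table indexed by id. */
#define AS_TEXT(id, text) [id] = text,
static const char *const message_table[MSG_COUNT] = { MESSAGE_LIST(AS_TEXT) };

/* Lookup is an array index, so no string comparison is needed. */
static inline const char *translate_by_id(message_id_t id) {
    assert((unsigned)id < (unsigned)MSG_COUNT);
    return message_table[id];
}
```

Because callers pass an id instead of the English text, nothing in the calling files depends on the translated strings themselves.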

tannewt · 2022-06-01T23:17:33Z

I think the simplest thing would be to have the TRANSLATE_OBJECT version on by default and only do the header thing when we need the space (like small S3 builds.)
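
One way to picture such a switch is a preprocessor flag that selects the linkage of the lookup function. The sketch below is an assumption about the general shape, not the PR's build logic: the macro `TRANSLATE_LINKAGE`, the function `translate_switch_sketch()`, and the returned placeholder text are hypothetical, and a real project would put the out-of-line definition in its own .c file instead of alongside the declaration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Default to the separate-object variant; a space-constrained build could
 * pass -DTRANSLATE_OBJECT=0 to get the header-inlined variant. */
#ifndef TRANSLATE_OBJECT
#define TRANSLATE_OBJECT 1
#endif

#if TRANSLATE_OBJECT
/* One out-of-line copy, compiled once. */
#define TRANSLATE_LINKAGE
#else
/* Body visible at every call site so the compiler can fold it. */
#define TRANSLATE_LINKAGE static inline
#endif

TRANSLATE_LINKAGE const char *translate_switch_sketch(const char *original) {
    assert(original != NULL);
    if (strcmp(original, "Invalid pin") == 0) {
        return "(translated) Invalid pin"; /* placeholder translation */
    }
    return original; /* fall back to the English text */
}
```

Both variants return the same result; only where the function body is compiled, and how often, changes.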


dhalbert

I reran the latest build when nothing else was queued, and got a total runtime of 1h 16m 18s, which is only 10 minutes more than what I was getting in #6436. Thanks for spending the time on this: the space savings is great!

dhalbert · 2022-06-06T03:50:42Z

The single failure was a transient CI issue.

tannewt requested a review from dhalbert May 26, 2022 23:48

Conditionalize LTO

9d10a3d

tannewt force-pushed the translate_header branch from acd34b5 to 9d10a3d May 27, 2022 20:02

tannewt added 4 commits May 27, 2022 15:39

Fix compiles

3cc46c7

Fix unix and pre-commit

8d55919

Separate translate object control from LTO

4d77633

Fix windows and two samd builds

7fc0aa5

tannewt marked this pull request as ready for review May 31, 2022 23:55

dhalbert requested changes Jun 1, 2022

ports/atmel-samd/common-hal/audiobusio/PDMIn.c (outdated, resolved)

py/circuitpy_defns.mk (outdated, resolved)

dhalbert reviewed Jun 1, 2022

py/circuitpy_defns.mk (resolved)

tannewt added 2 commits June 1, 2022 11:04

Fix PDMIn.c formatting

6d36988

Split partition from LTO enable

09c61ef

tannewt added 7 commits June 2, 2022 11:48

Move compressed strings into own object file

fd5ef00

This breaks the translation dependency to all of the other objects and therefore speeds up subsequent builds. Now, even when the big translate() function is inlined in the header, it only needs to be optimized once.
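
A minimal sketch of that split, under the assumption that the header carries only declarations while the data is defined in one separate source file. Both halves are shown in a single listing so that it compiles on its own. `message_ref_t`, `translation0`, `translation1`, and `translate_extern_sketch()` are hypothetical names, and plain text stands in for the compressed strings.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef struct {
    const char *text;
} message_ref_t;

/* "Header" half: declarations only. Files that include this do not need
 * to be rebuilt when the translated data changes. */
extern const message_ref_t translation0;
extern const message_ref_t translation1;

static inline const message_ref_t *translate_extern_sketch(const char *original) {
    assert(original != NULL);
    if (strcmp(original, "Invalid pin") == 0) {
        return &translation0;
    }
    if (strcmp(original, "Out of memory") == 0) {
        return &translation1;
    }
    return NULL; /* unknown message */
}

/* "Separate object" half: in a real layout this would be its own .c file,
 * compiled once per language and linked in. */
const message_ref_t translation0 = { "Invalid pin" };
const message_ref_t translation1 = { "Out of memory" };
```

The callers only reference symbols, so the linker resolves them against the single object that holds the data.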

Fix display resources build

36b4d49

qstrdefs.generated.h no longer includes the translated strings. So, use the .po file directly.

Fix mpy-cross and unix builds

0d257fc

Fix mpy-cross again

b690107

Move translation .o to PY_CORE_O

8ccb955

Merge remote-tracking branch 'adafruit/main' into translate_header

be67067

Shrink MatrixPortal M4 build

be6936c

dhalbert approved these changes Jun 6, 2022


dhalbert merged commit ac282b2 into adafruit:main Jun 6, 2022

dhalbert mentioned this pull request Jun 23, 2022

stm32: Enable link-time optimisation as a build option micropython/micropython#8733

Closed
