-
Notifications
You must be signed in to change notification settings - Fork 322
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
llext: install modules appropriately for testing #9013
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, needs more comments.
foreach(line IN LISTS uuids) | ||
string(REGEX REPLACE "^[ \t]*uuid *= \"([0-9A-F\-]*)\"" "\\1" uuid ${line}) | ||
file(APPEND ${PROJECT_BINARY_DIR}/llext.uuid "${uuid}\n") | ||
endforeach() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we comment the regex.
scripts/xtensa-build-zephyr.py
Outdated
@@ -899,6 +901,60 @@ def build_platforms(): | |||
symlinks=True, ignore_dangling_symlinks=True, dirs_exist_ok=True) | |||
|
|||
|
|||
def install_lib(platform, sof_output_dir, abs_build_dir): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would have a comment describing what this does.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need high-level documentation for this first of all, similar to what @ujfalusi did in thesofproject/sof-docs@f70388a4290 etc.
How are llext files supposed to be signed and by whom? With which keys? What (kernel?) code will load which files from where? What is the expected /lib/firmware/intel|nxp|other/sof/
tree structure? It does not have to be in sof-docs and it does not have to be long but it needs to be written down somewhere, reviewed and committed so everyone is on the same page.
Thanks to Peter we're finally starting to see the light at the end of a years-long, super low velocity /lib/firmware/
CI tunnel / technical debt where no one is really sure of which files are actually tested, when and how. Let's not immediately start digging a brand new, nearly identical tunnel for LLEXT and bury ourselves in more technical debt yet again.
Similar issue with signing: we need some high-level design document for signing too. After months of effort I finally got rid of all the technical debt there and got us to a much simpler place where signing and rimage configuration is NOT SOF-specific but performed 100% in Zephyr instead (so Zephyr can be... tested!) Yet this PR adds signing back into SOF? So what is the long term plan for signing LLEXT files?
Some short-term hacks in SOF can be OK but only as long as everyone is aligned on what the longer term destination is, otherwise it's pure chaos, test failures all the time and super low velocity.
More "practical" concerns:
This PR can very easily be split into smaller PRs with just fewer of these commits.
https://docs.zephyrproject.org/latest/contribute/contributor_expectations.html#defining-smaller-prs
Most notably: please don't "sneak" a west.yml update inside a big PR. For obvious reasons, some Zephyr updates have been very disruptive. So we want them to be as separate as possible and have at least one round of daily tests after west.yml updates.
scripts/xtensa-build-zephyr.py
Outdated
fdst = open(str(dst), 'wb') | ||
fllext = open(str(llext_output), 'rb') | ||
fman = open(str(llext_output) + '.xman', 'rb') | ||
except: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
pylint broad-exception-caught
Just remove this block. The only effect of catching an exception and exiting with nearly-empty error message is to hide the useful stacktrace and filenames. This is a build script; don't hide useful information.
If this may happen often AND you think that you can provide a very detailed, better error message than the default then catch ONLY the expected exception.
scripts/xtensa-build-zephyr.py
Outdated
llext_base = entry.name[:-6] | ||
llext_file = llext_base + '.llext' | ||
|
||
sof_lib_dir = pathlib.Path(sof_output_dir / '..' / 'sof-ipc4-lib' / platform) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You don't need to repeat pathlib.Path(
every time. This is useful to convert strings to pathlib.Path but there's no need to "convert" a pathlib.Path to a pathlib.Path. When it's already a pathlib.Path then this is enough:
sof_lib_dir = pathlib.Path(sof_output_dir / '..' / 'sof-ipc4-lib' / platform) | |
sof_lib_dir = sof_output_dir / '..' / 'sof-ipc4-lib' / platform |
sign_cmd = [str(rimage_executable), "-o", str(llext_output), | ||
"-e", "-c", str(rimage_cfg), | ||
"-k", str(signing_key), "-l", "-r", | ||
str(llext_input)] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can't hardcode rimage.path and arguments like that, the user has the ability to override those, see rimage_west_configuration()
. See top-level comment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Similar to CMake, I still can't see what is temporary versus not here. Please add the relevant TODOs. Even more important considering the high-level signing design sounds like "do whatever you want".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@marc-hb everything is temporary... I'm far from being a Python expert, so I expect improvements throughout this code - both during this review and after this is (hopefully soon) merged
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
everything is temporary...
Yes of course, we all die in the end. Yet there is a huge difference between:
- This code MAY change some day - or not, who knows?
- We SHOULD change this ASAP and move it to Zephyr, using new API X, Z and Z.
Seeing comment 2. stops people from wasting too much time with the corresponding code.
scripts/xtensa-build-zephyr.py
Outdated
|
||
fdst.close() | ||
fman.close() | ||
fllext.close() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
with open('file1.txt', 'r') as file1, open('file2.txt', 'r') as file2:
Or you can just leave them open, they will be automatically closed when the script exits.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Or you can just leave them open, they will be automatically closed when the script exits.
@marc-hb it's a loop over potentially many modules, would be dirty to leave them all open, particularly with their handle variables re-used
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You could have a single with
(and indentation level) for all these files at once.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@marc-hb yes, you proposed it above, but if any of them fails, we don't need to continue and should abort, whereas with
would "hide" the error?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
with
does not hide any error... Just insert a typo in a filename and you'll see.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not following quite a few things in this PR, see comments below.
set(EXTRA_LINKED_PARAMS) | ||
set(COPY_CMD ${CMAKE_OBJCOPY} -R .xt.* ${MODULE}_llext.so ${MODULE}_out.so) | ||
endif() | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you should split code, script changes. Not sure why they belong in the same patch?
Also there's no reason why smart_amp should be included here, this is a separate addition IMHO.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@plbossart smart-amp is currently the only component in "main," that can be built as a module. So while adding generic relocatable format support to SOF, we also add it to smart-amp, but this can be split, yes
foreach(line IN LISTS uuids) | ||
string(REGEX REPLACE "^[ \t]*uuid *= \"([0-9A-F\-]*)\"" "\\1" uuid ${line}) | ||
file(APPEND ${PROJECT_BINARY_DIR}/llext.uuid "${uuid}\n") | ||
endforeach() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why is this a smart_amp specific operation? Are we going to replicate this for each and every module?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@plbossart so far this CMakeLists.txt would be largely repeated for each new module, yes. But we plan to switch to using recently added Zephyr cmake support for LLEXT code ASAP
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But we plan to switch to using recently added Zephyr cmake support for LLEXT code ASAP
Corresponding FIXME now added in later commit, thanks! (I reviewed commits one by one and initially missed it)
${MODULE_LINKER_PARAMS} ${EXTRA_LINKED_PARAMS} -fPIC | ||
-o ${MODULE}_llext.so $<TARGET_OBJECTS:${MODULE}> | ||
${MODULE_LINKER_PARAMS} ${EXTRA_LINKER_PARAMS} -fPIC | ||
-o ${MODULE}_pre.so $<TARGET_OBJECTS:${MODULE}> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
problem statement/commit message:
a) why do we need to sign modules?
b) what are the rules for selecting the key used by different modules?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@plbossart here the same key is used as the one for the base firmware. In principle any key, accepted by the DSP can be used. Why - for security, for the same reason why we sign the base firmware
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@lyakh we do need the ability to specify that alternate key for modules that are not part of the base firmware.
Actually maybe it's more complicated even, e.g. if Intel releases modules then presumably they are signed with the same key than the base firmware, but a 3rd party would be required to use a separate key, so the key would need to be specified on a module-by-module basis.
It's not 9am and that already gives me a migraine.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
so the key would need to be specified on a module-by-module basis.
Right.
I don't mind having a "demo" with some hacks and hard-coding for early testing purposes but we need:
- a mutually agreed, written down design of how this should look in the future.
- Clearly marking temporary hacks with the appropriate TODOs and FIXMEs so readers don't waste hours and hours wondering why things are done this way and trying to reverse-engineer a design that does not exist (cause something is just a temporary hack)
Otherwise we have an instant maintenance nightmare.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
so the key would need to be specified on a module-by-module basis.
Right.
@plbossart @marc-hb would 3rd parties even use the xtensa-build-zephyr.sh script to build their modules?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@plbossart @marc-hb would 3rd parties even use the xtensa-build-zephyr.sh script to build their modules?
You tell us! Either in some documentation or in the source code. Then everyone can review and agree or disagree. At the end we're all on the same page.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@marc-hb I feel reaching a bit beyond the scope of this work, but let's say - some will and some won't. Those who won't - they'll work out their own way of doing this.
Those who will, they will integrate their modules into their local copy of sof.git in order to be able to use the script. They might or might not want to build both a base image for the distribution and any modules within the same build. If they do - I see no reason for them to use different keys. If they don't, i.e. if they only want to build their modules to be passed to an integrator, who will build a base image with their own key - in that case they anyway will execute a complete build, but will only use their modules from it and discard the rest. Either way I so far see no case when it would be essential for anyone to be able to run a single build while signing different parts of it with different keys. If for some unimaginable reason somebody has to do that, I'd say, that this is unsupported which unfortunately for them will mean, that they have to build multiple times - once with each key. And I really don't see this as an insurmountable problem.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ack - beyond scope of this PR. All will be signed with a single key initially.
add_subdirectory(${SOF_SAMPLES_PATH}/audio/smart_amp_llext | ||
${PROJECT_BINARY_DIR}/smart_amp_llext) | ||
add_subdirectory(${SOF_SAMPLES_PATH}/audio/smart_amp_test_llext | ||
${PROJECT_BINARY_DIR}/smart_amp_test_llext) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why the rename? was this a mistake that should be corrected separated, something done in a previous patch?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@plbossart previously this name wasn't a problem, since each module had its own supporting processing code, so names could be arbitrary. Now we need automation, so names must be unified as ${MODULE}_llext
@@ -899,6 +901,60 @@ def build_platforms(): | |||
symlinks=True, ignore_dangling_symlinks=True, dirs_exist_ok=True) | |||
|
|||
|
|||
def install_lib(platform, sof_output_dir, abs_build_dir): | |||
global signing_key |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should this be a module_signing_key? Is this different from the base firmware key? why is this global?
I just realized the current PR title "Install modules for CI testing" is a perfect demonstration of this problem. We should never install things "for CI testing". CI should never test things in its own, parallel reality. It has been but this must stop. CI should test the same thing that everybody else is testing, otherwise we're back to super low velocity with complete certainty. Long before extensive test coverage and expensive red tape, quality is simply about saying what you do and doing what you say. So what is desired for LLEXT here? We don't need a novel, we just need something that's short and readable: not buried in the implementation details. Something like thesofproject/sof-docs@f70388a4290 Again: some short-term solutions can be OK - as long as everyone is on the same page. Where is that page? |
@marc-hb thanks for the comments.
https://thesofproject.github.io/latest/getting_started/intel_debug/introduction.html#firmware-binary - note, it already has paths to loadable modules
In principle you can sign them with any key, logically you want one of those, accepted by the DSP. Here we use the same key as the one used to sign the base firmware. By whom - I don't know, I mean, if vendors supply their pre-built binary modules - presumably the integrator will have to sign them, because they have the keys, that the DSP will accept?
Sorry, not sure what your question is. This is already working, and documented above.
I previously submitted a PR to add SOF llext module signing to Zephyr zephyrproject-rtos/zephyr#66573 and it wasn't particularly popular. After this PR rimage will be able to also sign relocatable-type modules. Signing SOF llext modules now consists of two steps: pre-processing TOML and calling rimage to sign the module, using that configuration. TOML pre-processing is added to cmake by this PR. So in principle with this we now can just build the tree (and modules) and then just call rimage to sign each module.
6 isn't that many, and individual commits aren't huge, but of course this can be split.
ok, I'll push that separately as a dependency |
Part 1 #9024 has a lot of build failures so clearly smaller PRs are useful |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
6 isn't that many, and individual commits aren't huge, but of course this can be split.
The number of commits or number and lines gives a rough and somewhat useful indication but it's not just that.
https://thesofproject.github.io/latest/getting_started/intel_debug/introduction.html#firmware-binary - note, it already has paths to loadable modules
So yes, it has paths to "UUID.bin" (least useful names ever, LOL) and... that's it! I thought LLEXT was a major new feature yet that's all the design documentation it needs?
In principle you can sign them with any key, logically you want one of those, accepted by the DSP. Here we use the same key as the one used to sign the base firmware. By whom - I don't know, I mean, if vendors supply their pre-built binary modules - presumably the integrator will have to sign them, because they have the keys, that the DSP will accept?
This is exactly the type of design considerations that should be documented somewhere. You're not sure about some aspects of the design? Great: leave blanks and let's discuss that in some documentation pull request.
I previously submitted a PR to add SOF llext module signing to Zephyr zephyrproject-rtos/zephyr#66573 and it wasn't particularly popular.
This was 4 months ago and there was basically ZERO CMake code for .llext files at the time. Things have changed considerably since.
${MODULE_LINKER_PARAMS} ${EXTRA_LINKED_PARAMS} -fPIC | ||
-o ${MODULE}_llext.so $<TARGET_OBJECTS:${MODULE}> | ||
${MODULE_LINKER_PARAMS} ${EXTRA_LINKER_PARAMS} -fPIC | ||
-o ${MODULE}_pre.so $<TARGET_OBJECTS:${MODULE}> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
so the key would need to be specified on a module-by-module basis.
Right.
I don't mind having a "demo" with some hacks and hard-coding for early testing purposes but we need:
- a mutually agreed, written down design of how this should look in the future.
- Clearly marking temporary hacks with the appropriate TODOs and FIXMEs so readers don't waste hours and hours wondering why things are done this way and trying to reverse-engineer a design that does not exist (cause something is just a temporary hack)
Otherwise we have an instant maintenance nightmare.
|
I don't remember a request to submit PRs that don't compile. "Smaller PRs" does not mean "all at the same time"; rather the opposite. Also, draft PRs, demos and other work in progress can be any size. |
0x8000 is the manifest .text offset, use an existing macro instead of open-coding it. Signed-off-by: Guennadi Liakhovetski <guennadi.liakhovetski@linux.intel.com>
When an ELF section isn't found .get_section_by_name() raises an AttributeError exception, catch it specitically instead of catching any exception. Signed-off-by: Guennadi Liakhovetski <guennadi.liakhovetski@linux.intel.com>
When installing modular components we need to create UUID-based soft links. Create a separate file with a list of UUIDs to assist in that. Signed-off-by: Guennadi Liakhovetski <guennadi.liakhovetski@linux.intel.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see no showstoppers to merging this.
scripts/xtensa-build-zephyr.py
Outdated
for line in lines: | ||
symlink_or_copy(sof_lib_dir, llext_file, | ||
sof_lib_dir, line.strip() + '.bin') | ||
f_llext.close() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would be nice to have example output in the commit message to explain what the intention is.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@kv2019i sure, here's the output when printing a deployment tree during the build:
build-sof-staging
|-- sof
| +-- intel
| |-- sof-ipc4
...
| +-- sof-ipc4-lib
| +-- mtl
| +-- dbgkey
| |-- 167A961E-8AE4-11EA-89F1-000C29CE1635.bin -> smart_amp_test.llext
| |-- 39656EB2-3B71-4049-8D3F-F92CD5C43C09.bin -> mixin_mixout.llext
| |-- 3C56505A-24D7-418F-BDDC-C1F5A3AC2AE0.bin -> mixin_mixout.llext
| |-- 5150C0E6-27F9-4EC8-8351-C705B642D12F.bin -> eq_iir.llext
| |-- eq_iir.llext
| |-- mixin_mixout.llext
| +-- smart_amp_test.llext
The bottom 3 lines above are llext modules and the four lines above them are UUID links for driver loading
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
example output in the commit message
Maybe what would be even better is to switch .github/workflows/llext.yml
to invoke all this new code instead of west build
? This would not just show sample output on Github, but also provide test coverage immediately.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
switch
.github/workflows/llext.yml
to invoke all this new code
@marc-hb propose as a follow-up
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To be able to build and sign multiple modules within one build we need to unify name generation. For signing we invoke something like west sign -i smart_amp_test which means, that the west sign utility must be able to locate the module binary directory and recognise files in it using only that input file name. It will then create signed llext images inside those binary directories too, so that deployment scripts can find and recognise them there. We unify the naming as ${MODULE}_llext - for the binary directory, ${MODULE}.so - ELF file for signing in that directory ${MODULE}.llext - final signed extension llext.uuid - a file with a list of UUIDs, provided by this extension Signed-off-by: Guennadi Liakhovetski <guennadi.liakhovetski@linux.intel.com>
When building smart-amp-test as a module, we also need to preprocess its TOML configuration file. Add it to cmake. Signed-off-by: Guennadi Liakhovetski <guennadi.liakhovetski@linux.intel.com>
Set "load_type" to 2 in TOML configuration, when building for LLEXT. Signed-off-by: Guennadi Liakhovetski <guennadi.liakhovetski@linux.intel.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will take a longer look later today.
scripts/xtensa-build-zephyr.py
Outdated
|
||
def rimage_west_configuration(platform_dict, dest_dir): | ||
"""Configure rimage in a new file `dest_dir/westconfig.ini`, starting | ||
from the workspace .west/config. | ||
Returns the pathlib.Path to the new file. | ||
""" | ||
|
||
global platform_wconfig |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because non-constant, global variables are bad generally speaking? This breaks referential transparency, now there is a hidden dependency between rimage_west_configuration()
and install_lib()
. Also, the build cannot be parallel anymore.
We had bugs exactly like that: 6bedd8e
scripts/xtensa-build-zephyr.py
Outdated
rimage_executable = platform_wconfig.get("rimage.path") | ||
if rimage_executable is None: | ||
rimage_executable = platform_wconfig.get("rimage.path", None, | ||
west_config.ConfigFile.LOCAL) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To reduce the indentation and not make it look like this depends on the scandir
(which is not "needed" if rimage configuration fails)
Not important.
sign_cmd = [str(rimage_executable), "-o", str(llext_output), | ||
"-e", "-c", str(rimage_cfg), | ||
"-k", str(signing_key), "-l", "-r", | ||
str(llext_input)] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
everything is temporary...
Yes of course, we all die in the end. Yet there is a huge difference between:
- This code MAY change some day - or not, who knows?
- We SHOULD change this ASAP and move it to Zephyr, using new API X, Z and Z.
Seeing comment 2. stops people from wasting too much time with the corresponding code.
scripts/xtensa-build-zephyr.py
Outdated
|
||
fdst.close() | ||
fman.close() | ||
fllext.close() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You could have a single with
(and indentation level) for all these files at once.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for your patience!
Just a few codestyle issues left + only one logical issue holding back my approval: the link.unlink()
looks suspicious to me, see below.
foreach(line IN LISTS uuids) | ||
string(REGEX REPLACE "^[ \t]*uuid *= \"([0-9A-F\-]*)\"" "\\1" uuid ${line}) | ||
file(APPEND ${PROJECT_BINARY_DIR}/llext.uuid "${uuid}\n") | ||
endforeach() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But we plan to switch to using recently added Zephyr cmake support for LLEXT code ASAP
Corresponding FIXME now added in later commit, thanks! (I reviewed commits one by one and initially missed it)
scripts/xtensa-build-zephyr.py
Outdated
link.unlink() | ||
symlink_or_copy(sof_lib_dir, llext_file, | ||
sof_lib_dir, linkname) | ||
f_llext.close() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You don't need f_llext.close()
, the entire purpose of the with
is to give the .close()
for free (no matter where you leave the block)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@marc-hb ah, there you go - I didn't know leaving with
was releasing resources, thanks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
huh, windows Python version seems to have a problem with that... https://github.com/thesofproject/sof/actions/runs/8936988225/job/24548392500?pr=9013
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@marc-hb I removed parentheses for which I had to put the whole with
statement on 1 line and now windows are happy. I.e.
with open(x1) as f1, open(x2) as f2, open(x3) as f3:
and now the line is 150 characters long.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I use the "black" Python formatter to solve all problems like this one:
apt install black
cp scripts/xtensa-build-zephyr.py for_black.py
black for_black.py # reformats everything in place
... then open for_black.py
and look at how black
solved the problem.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
huh, windows Python version seems to have a problem with that... https://github.com/thesofproject/sof/actions/runs/8936988225/job/24548392500?pr=9013
Wow, I've never seen any Linux/Windows difference like this one, amazing. Maybe some CRLF or weird tab/space mix?!
Anyway it's gone now!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
huh, windows Python version seems to have a problem with that... https://github.com/thesofproject/sof/actions/runs/8936988225/job/24548392500?pr=9013
Wow, I've never seen any Linux/Windows difference like this one, amazing. Maybe some CRLF or weird tab/space mix?!
Anyway it's gone now!
@marc-hb it's gone now, because I put it all in one line, right?
scripts/xtensa-build-zephyr.py
Outdated
for line in lines: | ||
symlink_or_copy(sof_lib_dir, llext_file, | ||
sof_lib_dir, line.strip() + '.bin') | ||
f_llext.close() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
example output in the commit message
Maybe what would be even better is to switch .github/workflows/llext.yml
to invoke all this new code instead of west build
? This would not just show sample output on Github, but also provide test coverage immediately.
a47bea7
to
d6b568c
Compare
@wszypelt could you check if the error is related? The failure doesn't look like something clearly unrelated to me, but the previous version passed and I don't think I changed anything relevant for QB tests since then |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One hopefully last requested change - only because you just made a major change to avoid the global platform_wconfig - thanks!
.github/workflows/llext.yml
Outdated
--cmake-args=-DEXTRA_AFLAGS='-Werror -Wa,--fatal-warnings' \ | ||
--cmake-args=--warn-uninitialized \ | ||
--cmake-args=-DCONFIG_OUTPUT_DISASSEMBLY=y \ | ||
--cmake-args=-DCONFIG_OUTPUT_DISASSEMBLY_WITH_SOURCE=n \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You don't need --no-interactive
(no one does TBH)
Since this workflow does not save artefacts, DCONFIG_OUTPUT_DISASSEMBLY is only slowing things down (almost twice slower build time total on some systems)
.github/workflows/llext.yml
Outdated
@@ -14,7 +14,7 @@ jobs: | |||
strategy: | |||
fail-fast: false | |||
matrix: | |||
platform: [mtl, lnl] | |||
platforms: [mtl] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Create an empty, placeholder file sof/app/overlays/mtl/module_overlay.conf
and then use --overlay=sof/app/overlays/${{ matrix.platform }}/module_overlay.conf
Keep the field name singular: it's a field name not a vector name.
platforms: [mtl] | |
platform: [mtl, lnl] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@marc-hb ehm, I guess you meant
+ platform: [mtl]
right? Can do, sure, but TBH I find a plural form with an array logical enough, even if currently that array happens to only have one member, and it's copied from other files.
UPDATE: ah, you probably meant to drop this change, and instead of removing currently non-working LNL add an empty overlay for it... Mmh, I'd rather not do that. It would just make a wrong impression of LLEXT already working on LNL while it isn't yet.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
find a plural form with an array logical enough,
Plural is debatable here on that line, but a plural ${{ matrix.platforms }}
would be very misleading further down where it's referenced.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
instead of removing currently non-working LNL add an empty overlay for it...
Yes that's what I meant except for the "currently non-working LNL" part. LNL was there and "something" was working for LNL so it's not clear why you're removing it now. Nevermind.
Install loadable LLEXT modules into the deployment tree and create symbolic links for them. Signed-off-by: Guennadi Liakhovetski <guennadi.liakhovetski@linux.intel.com>
Switch the GitHub CI workflow to use xtensa-build-zephyr.py Signed-off-by: Guennadi Liakhovetski <guennadi.liakhovetski@linux.intel.com>
after rerun, all green :) |
llext_input = entry_path / (llext_base + '.so') | ||
llext_output = entry_path / llext_file | ||
|
||
sign_cmd = [str(platform_wconfig.get("rimage.path")), "-o", str(llext_output), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sign_cmd = [str(platform_wconfig.get("rimage.path")), "-o", str(llext_output), | |
sign_cmd = [platform_wconfig.get("rimage.path"), "-o", str(llext_output), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
addressed in de87535
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
.github/workflows/llext.yml
Outdated
@@ -14,7 +14,7 @@ jobs: | |||
strategy: | |||
fail-fast: false | |||
matrix: | |||
platform: [mtl, lnl] | |||
platforms: [mtl] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
find a plural form with an array logical enough,
Plural is debatable here on that line, but a plural ${{ matrix.platforms }}
would be very misleading further down where it's referenced.
.github/workflows/llext.yml
Outdated
@@ -14,7 +14,7 @@ jobs: | |||
strategy: | |||
fail-fast: false | |||
matrix: | |||
platform: [mtl, lnl] | |||
platforms: [mtl] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
instead of removing currently non-working LNL add an empty overlay for it...
Yes that's what I meant except for the "currently non-working LNL" part. LNL was there and "something" was working for LNL so it's not clear why you're removing it now. Nevermind.
install modules into the deployment tree for CI testing