intel_adsp/ace: power: pad the hpsram_mask passed to power_down #75285
soc/intel/intel_adsp/ace/power.c
Outdated
@@ -339,16 +339,16 @@ void pm_state_set(enum pm_state state, uint8_t substate_id)
 (void *)rom_entry;
 sys_cache_data_flush_range((void *)imr_layout, sizeof(*imr_layout));
 #endif /* CONFIG_ADSP_IMR_CONTEXT_SAVE */
-uint32_t hpsram_mask = 0;
+uint32_t hpsram_mask[CONFIG_DCACHE_LINE_SIZE] = { 0 };
You need to align this by the cache line size too to ensure it doesn't cross a boundary. And the kconfig is (I think) in units of bytes, not dwords. Also style: while I get that we have a kconfig option for that for use by portable code, in this case I think clarity and safety demand you use the actual hardware value, e.g.
__aligned(XCHAL_DCACHE_LINESIZE) uint32_t hpsram_mask[XCHAL_DCACHE_LINESIZE / sizeof(uint32_t)];
And a nitpick: no need to zero-fill it if you're only going to use one index out of the array that you're initializing yourself.
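To make the sizing point concrete, here is a minimal host-side sketch of the arithmetic. It assumes a 64-byte line (`LINE_SIZE` is a stand-in for `XCHAL_DCACHE_LINESIZE`, which comes from the Xtensa core headers), uses C11 `alignas` in place of Zephyr's `__aligned()`, and the two helper functions are hypothetical names used only for illustration:

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: 64-byte dcache line, standing in for XCHAL_DCACHE_LINESIZE. */
#define LINE_SIZE 64u

/* Number of 32-bit words that exactly fill one cache line (bytes / 4). */
#define LINE_WORDS (LINE_SIZE / sizeof(uint32_t))

/* Hypothetical helper: 1 if [p, p + len) stays inside a single cache line. */
static int fits_one_line(const void *p, size_t len)
{
	uintptr_t start = (uintptr_t)p;
	uintptr_t end = start + len - 1;

	return (start / LINE_SIZE) == (end / LINE_SIZE);
}

/* Hypothetical helper: 1 if an aligned, padded buffer owns exactly one line. */
static int owns_whole_line(void)
{
	alignas(LINE_SIZE) uint32_t hpsram_mask[LINE_WORDS];

	hpsram_mask[0] = 0; /* only index 0 is used, so no zero-fill needed */

	return sizeof(hpsram_mask) == LINE_SIZE &&
	       ((uintptr_t)hpsram_mask % LINE_SIZE) == 0 &&
	       fits_one_line(hpsram_mask, sizeof(hpsram_mask)) &&
	       hpsram_mask[0] == 0;
}
```

With these assumptions, an array declared with `LINE_SIZE` entries would be 256 bytes (four lines), while `LINE_WORDS` entries is exactly one line.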
Yep, I also tested this yesterday, only without the oversight of using an array of 64 32-bit entries: I used 64 bytes / 4 == 16 entries. And yes, it works when using an array and aligning it on a cache-line boundary. Using only one of the two, either the array or the alignment, didn't help. And we don't have a good explanation for this either, right? So this introduces a 60-byte overhead, but only when powering down, whereas my similarly "accidentally fixing" version with static has an overhead of 4 bytes, but for the whole lifetime of the firmware. So we can choose between the two.
Thanks @andyross @lyakh, corrected based on feedback.
I did further tests and it is indeed the stack setup of the calling function pm_state_set() that is the triggering condition. As the exact conditions are not known, adding both alignment and size padding makes sense. It also fits the call8 convention, where there is window save space both before and after the stack variables. With this change, we ensure any register saves/restores happen on different cache lines than the one we use for hp_sram.
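The line-sharing argument can be illustrated with plain address arithmetic. This is only a sketch: it assumes a 64-byte line, the helper names are hypothetical, and it does not model the actual Xtensa window spill layout (the exact trigger is still unknown):

```c
#include <assert.h>
#include <stdint.h>

#define LINE_SIZE 64u /* assumption: 64-byte dcache line */

/* Index of the cache line that contains addr. */
static uintptr_t line_of(uintptr_t addr)
{
	return addr / LINE_SIZE;
}

/*
 * Hypothetical helper: 1 if byte ranges [a, a + alen) and [b, b + blen)
 * touch at least one common cache line, 0 otherwise.
 */
static int share_line(uintptr_t a, uintptr_t alen, uintptr_t b, uintptr_t blen)
{
	return line_of(a) <= line_of(b + blen - 1) &&
	       line_of(b) <= line_of(a + alen - 1);
}
```

For example, with made-up addresses, a 4-byte mask at 0x1010 shares a line with a 16-byte save area at 0x1020, whereas a 64-byte mask aligned at 0x1000 shares no line with save areas placed directly below (0x0ff0) or above (0x1040) it.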
4958624 to 02a1deb
V2 pushed, marking as ready for review.
45d1980 to 00e36e0
V3:
Tested on actual devices at thesofproject/sof#9274
00e36e0 to 22dd1eb
V4:
The power_down() function will lock dcache for the hpsram_mask array. On some platforms, the dcache lock will fail if the array is on a cache line that can be used for window register context saves.

Work around this by aligning and padding the hpsram_mask to cacheline size.

Link: thesofproject/sof#9268
Signed-off-by: Kai Vehmanen <kai.vehmanen@linux.intel.com>
22dd1eb to d8fcc20
V5:
Looks as clean as can be expected. Still not a root cause, but "feels like the right way to do it".
#75108 merged instead, closing this one (it won't apply anymore).
The power_down() function will lock dcache for the hpsram_mask array. On some platforms, the dcache lock will fail if the array is on a cache line that can be used for window register context saves.
Work around this by padding the hpsram_mask to a full cacheline.
Link: thesofproject/sof#9268