High passive CPU utilization #274
Hey @ngocphamm! Thank you for this feedback! We had issues in the past with high passive CPU usage like this: #170. Under normal circumstances, AltTab should use 0% CPU or close to it, maybe spiking to 0.1% if there is OS activity it is observing.

Here is the cause, and why it's so hard to avoid: we use macOS Accessibility APIs to observe the system, with observations such as "a new window was created", "a window was resized", or "a window had its title changed". We use these events to build a history of the windows. However, these APIs are ancient, poorly documented, and full of bugs.

There are 2 levels of observing: processes and windows. For instance, the "window was created" event comes from observing a process, but the "window was destroyed" event comes from observing the window, not the process.

Here we try to subscribe: https://github.com/lwouis/alt-tab-macos/blob/master/src/api-wrappers/AXUIElement.swift#L102 Depending on the code we get in return, we either retry subscribing or we stop: https://github.com/lwouis/alt-tab-macos/blob/master/src/api-wrappers/AXUIElement.swift#L114

The issue is with processes that take a long time to become subscribable. One example is #182, which had a process that would literally never be ready; that is not normal, and I added a hardcoded exception for it. But there are legitimate cases, like launching the GIMP app. You can imagine Photoshop, Unity, or Maya too. These apps can take minutes to finish launching, and only then will the subscription finally succeed. Because of that, if we see a new process and fail to subscribe to its events, we must retry in a loop, which raises CPU usage.

There are ways to limit CPU usage, but they all have trade-offs:
alt-tab-macos/src/logic/Application.swift Line 88 in ba7b5d3
If we space the attempts further apart, say every 1 min, then an app can finish starting and still not show up in AltTab for up to 1 min.
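To make the trade-off above concrete, here is a minimal sketch that simulates a fixed-interval retry loop. It is not AltTab's code: `RetryOutcome`, `simulateRetries`, and the idea that an app becomes subscribable after a known number of seconds are all assumptions made for illustration.

```swift
// Hypothetical model, not AltTab's actual code: an app becomes subscribable
// `readyAfter` seconds after launch, and we retry every `retryInterval` seconds.
struct RetryOutcome: Equatable {
    let attempts: Int    // number of subscription attempts made (a proxy for CPU cost)
    let extraDelay: Int  // seconds between the app being ready and the attempt that succeeds
}

func simulateRetries(readyAfter: Int, retryInterval: Int) -> RetryOutcome {
    precondition(retryInterval > 0, "retryInterval must be positive")
    var elapsed = 0
    var attempts = 1 // the first attempt happens immediately, at t = 0
    while elapsed < readyAfter {
        elapsed += retryInterval
        attempts += 1
    }
    return RetryOutcome(attempts: attempts, extraDelay: elapsed - readyAfter)
}

// An app that needs 90 s to become ready:
let frequent = simulateRetries(readyAfter: 90, retryInterval: 1)  // 91 attempts, 0 s of extra delay
let sparse = simulateRetries(readyAfter: 90, retryInterval: 60)   // 3 attempts, 30 s of extra delay
```

Retrying often keeps the delay low at the cost of many attempts; retrying rarely does the opposite. A process that never becomes ready keeps either loop running forever.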
I would guess that on your machine, some process never finishes launching, and AltTab is looping on trying to subscribe to its accessibility events, never succeeding, and generating CPU usage indefinitely. The more processes like this, the more retry loops are running. You can verify that easily by going into Activity Monitor, selecting AltTab in the list, then clicking View > Sample Process. You should see a lot of time spent in the functions I shared above.

Now to get to the bottom of this, you need to run the app locally, and check the values of these arrays: I hope these explanations help. If you find out which processes are the issue, I can attempt to reproduce it on my machine by installing the apps, the way I did with this Octave.app app, and either hardcode an exception for the specific app, or find a potential universal solution. Once again, though, I don't think that's possible without a trade-off where we don't show some slow-to-start apps. Please let me know! :) |
Thanks for the very detailed explanation of this issue @lwouis! I tried to do the Sample Process, but unfortunately I don't quite understand the output of it. If you want I can send you the dump file but let me do the stuff below (after I understand what exactly I have to do there) first. What do you mean by this? How can I check the arrays?
|
Yeah, you can just share the sample output here. For the arrays, it's about getting the values of what's inside. You need to change the code and run the app locally to experiment like this. Do you have Xcode? Assuming you can run the project locally (see https://github.com/lwouis/alt-tab-macos/blob/master/docs/CONTRIBUTING.md), you can add a line like: debugPrint("ngocphamm", Applications.appsInSubscriptionRetryLoop, Windows.windowsInSubscriptionRetryLoop) on the line after: https://github.com/lwouis/alt-tab-macos/blob/master/src/ui/App.swift#L157 Then, after running the app, you will see logs starting with "ngocphamm", showing you the contents of the arrays. |
This is the link for the sample file https://send.firefox.com/download/381922b0eca6f63c/#wFCgumzYcnqbL_wyxIA9bQ And sorry, I don't even have Xcode installed. Would there be any way you can make a test/dev copy of the app with the debugging line enabled? By the way, I just restarted the laptop and it doesn't look like AltTab is using a lot of CPU anymore. Both % CPU and CPU Time reported are very low right now (% CPU was around 8-9% like I showed in the first post, and its CPU time was in the top 5 processes, just after kernel_task and WindowServer, which are normal, and MS Remote Desktop, which I use a lot for work). I will see if it comes back up after a while or not. |
Of course! Here is a build where I added the code I mentioned: AltTab.app.zip You can just launch the app, and open Console.app on your Mac. Then search for "ngocphamm"; any time you invoke the AltTab UI, it will print the 2 arrays containing the apps and windows that AltTab is repeatedly trying to subscribe to. You can then use the number at the beginning of each entry to identify in Activity Monitor which processes are the problem. Here is what Console.app shows for me (2 empty arrays, as all processes are successfully subscribed to on my machine):
Thank you! It seems you selected the "Sample Text" in the dropdown at the top, after sampling. I would need the "Percent of Parent" though to make a better assessment of what is taking the CPU time. Here is what it looks like if I run it right now, for instance: You can see a sampling of the method calls, and where time is spent. When your CPU was running high when AltTab was idle, I expect to see all the time spent in the |
Thanks a lot @lwouis! I've started running the custom build of the app. Right now it's not showing anything in the arrays and is using very little CPU as expected. I will keep it running and in case I notice the consistently high CPU usage, I will do the sample process again and let you know about that, and the arrays. Thanks for your help with this. It's truly appreciated! |
I think I'm seeing this behavior again @lwouis. Unfortunately, I am no longer running your custom version, so I can't see the debugging arrays 😞 I didn't see the behavior for a few days after running the custom version you sent me, so I updated to the latest version with a few nice fixes (thumbnail memory, and full arrow keys support). Please find the sample file here https://send.firefox.com/download/6b2341bec71c27d0/#7BRsY-L5kmy4rVZm2_AAQg If you need me to run the debugging version again, please let me know! |
The logs you shared are still in the wrong format 😅 See my instructions here. I still see a lot of retries, so even though I don't have the percentages, I'm assuming the issue is the AX subscription retry loop.
If you run the debug version now, does it also get to higher CPU? It's worth a try if you haven't. Basically, I need to know which processes are failing to get subscribed to. It's the only way I can imagine to fix it. As I said, I blacklisted a faulty process in Octave.app last time. Either I blacklist more, since in your case some other process is bugged, or we give up after N attempts, but then we risk missing out on legit slow processes like Gimp/Maya/etc. |
Sorry but I'm a bit confused. What should I do after I see this screen? I selected "Percent of Parent" and hit "Save..." button to get you the log file. Was that not what you wanted? I can run the debug version now, but it won't get this behavior probably until a few days later (what is happening is X days since the last update of AltTab), or maybe until the culprit app(s) got launched. I will run the debug version after I get to know how to properly get the sample file for you. |
Oh... then it means the export can't show the percentages, I guess. If you open the export file, you'll see there is no percentage, so it's hard to know what's taking CPU time. Anyway, on your screenshot I see that it's the Next step has to be to identify the processes that can't be subscribed to. I thought about it, and decided that it may be generally useful to include info about these subscription retry loops in the debug profile attached when you send feedback through the in-app form. I'll release an update that contains that change in a few minutes. This means that in the future, when you see AltTab using higher CPU than usual, you'll just have to click: And that will open a ticket here with info on the problem apps or windows that fail to get subscribed to 👍 |
Nice. I just updated to latest version 3.17.2, and I will send the feedback the next time I notice the CPU spikes. Thank you! |
I'm wondering if a 5 min timeout wouldn't be a pragmatic way to deal with this issue. After all, if some app needs more than 5 min to launch, it could be considered a bug on their side. I can still imagine viable use-cases though, like opening a huge Excel or Tableau file, or something that needs the network to start. The issue is that processes declare themselves as front, background, or daemon. We try to subscribe only to front processes, as they could spawn windows. However, if an app is incorrectly written and has a background process declared as front, and that process can never be subscribed to, we will loop infinitely, causing high CPU. It's really an issue on these third-party apps' end (see #182), but the user is caught in the middle, and I'm not sure which side we should lean on: time out and potentially have some apps not listed, or (the current way) try indefinitely but waste CPU if the third-party app is not letting us subscribe to it. |
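The front/background/daemon filtering described above can be sketched roughly as follows. This is an illustration only: `ProcessKind`, `ProcessSketch`, and `processesToObserve` are hypothetical names, not AltTab's types, and the kind is simply whatever the process declares about itself.

```swift
// Hypothetical stand-ins; names are illustrative, not AltTab's API.
enum ProcessKind { case front, background, daemon }

struct ProcessSketch {
    let pid: Int32
    let name: String
    let declaredKind: ProcessKind // what the process claims to be, which may be wrong
}

// Only processes that declare themselves as front are candidates for subscription,
// since those are the ones expected to spawn windows.
func processesToObserve(_ all: [ProcessSketch]) -> [ProcessSketch] {
    return all.filter { $0.declaredKind == .front }
}

let running = [
    ProcessSketch(pid: 101, name: "Finder", declaredKind: .front),
    ProcessSketch(pid: 202, name: "SomeDaemon", declaredKind: .daemon),
    // A helper that wrongly declares itself as front still passes the filter.
    ProcessSketch(pid: 303, name: "MisdeclaredHelper", declaredKind: .front),
]
let observed = processesToObserve(running).map { $0.name }
```

Because the filter trusts the declared kind, a mis-declared helper that can never be subscribed to ends up in the retry loop, which is the case a timeout would bound.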
IMO, timeout would be good for 99% of the users. Would be nice to have a configurable timeout value so anyone who really wants to wait can extend the timeout. |
@ngocphamm I think I'll add a timeout soon then. Before that, however, I would like to see on your system what's triggering the issue. Because by adding a timeout, we may be side-stepping a bug that's on AltTab's side, not on the third-party side. I wish I had more data from users seeing above 0.1% CPU usage when idle. |
Yup I will send a feedback with debug info as soon as I notice the high CPU usage! It might take days for it to get there, though, so please bear with me. |
Just reported! #302 |
@ngocphamm it's trying to subscribe to a window belonging to the "Location Menu" process. This seems like it's the OS process to track location. Maybe the report in #302 didn't reflect the actual issue. Could you try sending another one or two? If the CPU usage is still high, that is |
Sorry I just updated the app again seeing the new version fixed the issue when it didn't show all the windows from other spaces, so it doesn't use up CPU for now. Is it still okay submitting the report at this normal state? |
That's interesting. I would guess that this window should never have been made. To get this subscription loop going, it's supposed to first pass the test of being an actual window. Given that its parent process is "Location Menu", I doubt it's a real window. I'm curious how it was deemed valid earlier, thus the loop, but after a restart, it's not deemed valid.
I mean, do it, why not; it doesn't hurt, but if the CPU is low then it probably won't prove useful. Let's see it |
Also my guess on this is Location Menu is the menu bar icon that shows up whenever an app request access to my location (which I have a few doing so I think). This is the Open Files and Ports for that process (still running). Also, I have an app called Bartender to hide/show certain menu bar items, so maybe it's adding to the issue?
Yeah the app is always back to normal after a restart (of the app), by updating it. The issue comes back at some random times. Last time was after a few days. This time was under a day (updated the app yesterday too).
Just did #303 |
Yeah #303 is not very useful as there is not much CPU usage and no subscription retries. In #302, we see:
I don't get how this seemingly broken window got past that test, and how we started subscribing to it. |
Yeah this is not something that I know about... If this helps, according to System Preferences > Security & Privacy > Privacy tab, I'm having 5 apps that have access to location: Little Snitch Helper, Find My, Maps, Home, and Control Plane. Out of those 5, the location icon (that "indicates an app that has used your location within the last 24 hours") only shows for Maps. |
Could it be Control Plane, as I found a "fix" for a problem I've had with it since Catalina 18 days ago (by giving it access to location), and I reported this issue (with AltTab) 13 days ago 🤔 |
Here is the plan I suggest: let's wait, and please post another report if you see high CPU again. Let's see then if it's the same thing happening. In the meantime, I will also pay attention to other tickets that people periodically open to discuss things, and check if they have high CPU. That may give more data points. After a while, maybe 2-4 weeks, I'll add the timeout. At that point, if there are rare bugs, they will be hidden, which is not ideal, but at least AltTab will stop wasting CPU after the timeout, which is a big upside. |
You should also ignore any window that reports back 0 as its ID: https://developer.apple.com/documentation/coregraphics/kcgnullwindowid?language=objc The actual definition is: |
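As a rough illustration of the suggested filtering (not the definition referred to above, and not AltTab's code), windows reporting an ID of 0 can be dropped before any subscription is attempted. `WindowID`, `nullWindowID`, and `WindowCandidate` are stand-ins assumed here so the sketch runs without CoreGraphics.

```swift
// Stand-ins so the sketch is self-contained; in a real macOS app these roles
// would be played by CGWindowID and kCGNullWindowID from CoreGraphics.
typealias WindowID = UInt32
let nullWindowID: WindowID = 0 // the "0 as its ID" case described above

// Hypothetical window record, for illustration only.
struct WindowCandidate {
    let id: WindowID
    let title: String
}

// Keep only windows whose ID is not the null ID.
func validWindows(_ candidates: [WindowCandidate]) -> [WindowCandidate] {
    return candidates.filter { $0.id != nullWindowID }
}

let candidates = [
    WindowCandidate(id: 42, title: "Document"),
    WindowCandidate(id: 0, title: "Not a real window"),
    WindowCandidate(id: 7, title: "Settings"),
]
let kept = validWindows(candidates).map { $0.id }
```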
I'm investigating how to deal with these apps that won't be subscribed to. I can reproduce this scenario in a legitimate app, Gimp, which replies with an The other app I use to investigate is Octave.app. This one, after starting, gets stuck in a loop or something, and won't respond. This means that we infinitely retry and get the same I thought: is there any way we can differentiate these 2 processes? Even a correlation may be a good workaround, better than a hard timeout. Interestingly, Octave.app takes a long time to respond to the I tried to compare these values: Here is a list of big apps, and how they deal with the issue:
|
After thinking about this for a while, I decided to go with a timeout. Counting a number of attempts like Snapp does is incorrect though, as the time between attempts varies widely between apps (between 10ms or lower, and 6s I believe). I just check the clock, and after 2 min of trying, I stop trying. This means apps that take more than 2 min to launch / be ready will not have their windows listed in AltTab. It's really not ideal, but I believe these scenarios are less frequent than having zombie processes make AltTab use constant CPU. Other ideas I considered:
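The clock-based cutoff described above could look roughly like this. It is a sketch under assumptions, not the code from the actual fix: `RetryDeadline` is a hypothetical type, and timestamps are passed in as plain seconds so the logic can be checked without a real clock.

```swift
// Hypothetical sketch: decide whether to keep retrying based on elapsed time,
// not on how many attempts were made.
struct RetryDeadline {
    let startedAt: Double // seconds, e.g. read from a monotonic clock at the first attempt
    let timeout: Double   // seconds; 120 mirrors the 2 min mentioned above

    init(startedAt: Double, timeout: Double = 120) {
        self.startedAt = startedAt
        self.timeout = timeout
    }

    // True while we are still inside the retry window.
    func shouldRetry(now: Double) -> Bool {
        return now - startedAt < timeout
    }
}

let deadline = RetryDeadline(startedAt: 10)
let stillTrying = deadline.shouldRetry(now: 100) // 90 s elapsed, keep retrying
let givenUp = !deadline.shouldRetry(now: 130)    // 120 s elapsed, stop
```

Whether an app answers each attempt in 10 ms or in 6 s, the loop ends at the same wall-clock point, which a fixed attempt count would not guarantee.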
|
## [3.22.5](v3.22.4...v3.22.5) (2020-05-10)

### Bug Fixes

* implement a 2min timeout for unresponsive apps (closes [#274](#274)) ([7ab7c82](7ab7c82))
I had somehow never seen the case where such a filtering would be needed (I already filter based on many things such as role, subrole, title, etc). However, after working on #292 I discovered that specifically during login, some windows have ID 0, and should be ignored, thus I added your recommended filtering 👍 |
Just a question because I don't even know what defines "high" here. I just notice that AltTab is constantly among the top 3-5 processes that use CPU, and the percentage for it is also always around 8% like in the screenshot. Of course it will not when I play video, or run virtual machines, etc... but I'm wondering maybe something can be optimized here 🤔