Not clear how many swapchain images should application want #1137

Closed
krOoze opened this issue Dec 10, 2019 · 4 comments

@krOoze
Contributor

krOoze commented Dec 10, 2019

The current API makes it somewhat fundamentally unclear to applications how many swapchain images they should want.
That can be a bit of a problem, because too few could lead to bad perf, and too many is a waste of memory.

I just wonder if there is room for improvement here.

Things that seem to contribute to the problem:

  1. With minImageCount it is not clear whether the count includes some internal PE buffers the application might never get. This is somewhat important for guessing the performance behavior. Some people still operate under "double buffering", "triple buffering" terminology, which is hard to map here if it is unclear how many actual buffers there are in the first place.

  2. Swapchain can create more images anyway. I mean after all that boilerplate querying stuff, the swapchain creation will do whatever it wants anyway?

  3. Some drivers feel like they are inflating minImageCount above the count that is absolutely necessary, to coax apps into picking the "right" count. Other drivers might not be so "kind".

  4. minImageCount does not depend on presentation mode (and possibly other stuff), which could influence the nature of the value.

  5. It is not clear upfront if the driver subscribes to the vkAcquireNextImage semaphore mechanic, or if it blocks the call. (If it has blocking semantics, I would possibly want an extra image.)

  6. Only numImages - minImageCount + 1 are acquirable. But per spec it seems allowed to get more on some drivers. (And it is not clear upfront if driver will behave that way. Which in some cases could be good enough, and reduce the number of swapchain images needed.)

  7. I think vkQueuePresent is technically allowed to block. Which then makes asking for more than one image pointless. (And it is not clear upfront if driver will behave that way.)

There are only so many PEs. I wonder if a better abstraction could be made. Or perhaps an extension covering only the "sane" subset. Or at least a way for the app to query as much as possible about the behavior of the PE, to be able to start making some educated guesses without trial and error.
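
The formula from point 6 can be written down as a tiny helper. This is only a sketch of the arithmetic quoted above; `max_acquirable` and its parameter names are made up for illustration and are not part of the Vulkan API.

```c
#include <assert.h>
#include <stdint.h>

/* Number of images the application can hold acquired at the same time,
 * using the formula quoted in point 6: numImages - minImageCount + 1.
 * num_images:      images the swapchain actually ended up with
 * min_image_count: the minimum reported by the implementation
 * Both names are hypothetical; this is plain arithmetic, not a Vulkan call. */
static uint32_t max_acquirable(uint32_t num_images, uint32_t min_image_count)
{
    /* A swapchain never has fewer images than the reported minimum. */
    assert(min_image_count >= 1 && num_images >= min_image_count);
    return num_images - min_image_count + 1;
}
```

With a reported minimum of 2, a swapchain of 2 images gives one acquirable image at a time, and a swapchain of 3 images gives two. As point 6 notes, some drivers may allow more than this in practice.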

@oddhack
Contributor

oddhack commented Dec 16, 2019

@stonesthrow assigned to SI.

@cubanismo

  1. minImageCount is intended to be the minimum number of images the swapchain can have such that it will satisfy the needs of the presentation engine while allowing the app to acquire one image for use in rendering the current frame. A notable error in the spec is that this needs to satisfy the needs of all supported presentation modes currently, so it is likely not a true minimum for some modes in all cases. We're working on a fix for that though. This has been fudged since the spec was written (E.g., a minImageCount of "1" was deemed confusing for some reason, and hence disallowed IIRC), but hopefully that's still reasonably clear in the spec. If not, can you point out any specific contradictions?

  2. Yes, the implementation may always allocate more images if it believes this will result in better behavior for some reason, or if there are constraints it can't properly express via the minImageCount. This allows implementations to cover spec errors like the per-present-mode omission above.

  3. This isn't good driver behavior, and is arguably a bug. Implementations should ideally use [2] instead, but of course, applications aren't perfect either and some don't handle the case of an implementation allocating more images than they requested well, so some compromise is needed. Ideally, implementations would work with all such application vendors and/or use application-specific work-arounds, but driver vendors generally opt for pragmatism, and hence are loath to ship implementations that are technically more correct but break a significant number of applications in the field, sometimes to the detriment of a minority of applications that have followed the spec to both the letter and subtext and expect drivers which do the same.

  4. See above.

  5. Complain to driver vendors that block longer than necessary. vkAcquireNextImageKHR() should block only until it is possible to signal the asynchronous sync objects it returns in finite time, and that shouldn't need to be until the previous frame is completely done being displayed on well-designed operating systems and drivers.

  6. I agree this logic probably doesn't hold up, and perhaps some spec fixes are needed here. It's difficult to boil down all the presentation scenarios supported across all possible Vulkan implementations into simple equations like this.

  7. vkQueuePresentKHR() wasn't supposed to be allowed to block in any situation other than exhaustion of resources for any internal queue operations it requires (e.g., ran out of space in some internal command queuing mechanism in the VkQueue implementation itself). This is the entire reason behind the present-acquire split, and I find it unfortunate it isn't accurately reflected in the spec, but this opinion isn't entirely universal in the working group AFAIK. Regardless, it would be difficult or impossible to conformance test such behavior. Please complain to any implementation providers that block in vkQueuePresentKHR(). We have at least one known bug here in our current Windows implementation which we are in the process of addressing.

All the above taken into account, the intent is not that the spec dictate an ideal number of images to the application. No such universal logic was known (or is known now, that I'm aware of) such that we could bake hard advice into the spec via notes or requirements. Instead, we left it as a 3-step dance where the implementation could provide a minimum spec, the application could provide its minimum spec, and ultimately the implementation could again add additional images as necessary. For example, to deal with a poor platform that for whatever reason required blocking in vkAcquireNextImage(), the implementation could report an additional image itself or allocate one more image than the application requested. It's true the app doesn't have similar options due to lack of visibility, but the more complicated the mechanism becomes, the more likely it is to be misused on either side. Hence, I'm going to close this. If there are specific issues you'd like to spin out of the above 7 points/responses, please file separate targeted issues and/or propose spec or API changes.
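
A rough model of the 3-step dance described above, written without the Vulkan headers so it stands alone. Everything here is an assumption made for illustration: `surface_limits` only mirrors the `minImageCount` and `maxImageCount` fields of `VkSurfaceCapabilitiesKHR`, and `driver_extra` is a made-up knob standing in for whatever an implementation decides internally.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the two relevant VkSurfaceCapabilitiesKHR fields. */
typedef struct {
    uint32_t min_image_count; /* step 1: the implementation's minimum */
    uint32_t max_image_count; /* 0 means "no upper limit" in Vulkan */
} surface_limits;

/* Step 2: the application states its own minimum, kept within the limits. */
static uint32_t app_requested_count(surface_limits lim, uint32_t app_minimum)
{
    assert(lim.max_image_count == 0 ||
           lim.min_image_count <= lim.max_image_count);

    uint32_t n = app_minimum > lim.min_image_count ? app_minimum
                                                   : lim.min_image_count;
    if (lim.max_image_count != 0 && n > lim.max_image_count)
        n = lim.max_image_count;
    return n;
}

/* Step 3 (simulated): the implementation may add images on top of the
 * request. driver_extra is invented for this sketch; a real driver does
 * not expose such a value. */
static uint32_t swapchain_actual_count(uint32_t requested, uint32_t driver_extra)
{
    return requested + driver_extra;
}
```

In a real application the outcome of step 3 is only visible after creation, by asking `vkGetSwapchainImagesKHR` for the image count, so per-image resources are best sized from that returned count rather than from the requested one.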

@krOoze
Contributor Author

krOoze commented Jan 15, 2020

> the application could provide its minimum spec,

Ok, this is not clear from the spec.

For the usual app where acquire and present happen in 1:1 pairs, technically only minImageCount images are needed (i.e. only one image is acquired at a time). Though I still want vkAcquire to be as unimpeded as possible, so I ask for minImageCount+1 images.

I mean even vkcube is confused at this point. It just always asks for 3 images, but it only acquires and presents in 1:1 pairs. I have seen similar explanations in the Vulkan-Samples repo. vulkan-tutorial uses minImageCount + 1.
I mean when virtually no one knows or follows this, then that is a problem. The API is already abused. And the way it is, chances are it is abused on both sides (the implementation and the apps).
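
To make the two habits mentioned above concrete, here is a small side-by-side sketch. The function names are made up, and clamping both strategies to the reported range is an assumption of this sketch rather than a claim about what vkcube or vulkan-tutorial do line by line. The `max_count == 0` case follows the Vulkan convention that a `maxImageCount` of 0 means no upper limit.

```c
#include <assert.h>
#include <stdint.h>

/* Clamp a desired image count to [min_count, max_count].
 * max_count == 0 means the surface reports no upper limit. */
static uint32_t clamp_image_count(uint32_t desired, uint32_t min_count,
                                  uint32_t max_count)
{
    assert(max_count == 0 || min_count <= max_count);
    if (desired < min_count)
        desired = min_count;
    if (max_count != 0 && desired > max_count)
        desired = max_count;
    return desired;
}

/* "Always ask for 3" strategy, as described for vkcube above. */
static uint32_t request_fixed_three(uint32_t min_count, uint32_t max_count)
{
    return clamp_image_count(3, min_count, max_count);
}

/* "minImageCount + 1" strategy, as described for vulkan-tutorial above. */
static uint32_t request_min_plus_one(uint32_t min_count, uint32_t max_count)
{
    return clamp_image_count(min_count + 1, min_count, max_count);
}
```

The two strategies agree when the reported minimum is 2, but diverge as soon as a driver reports a different minimum: with a minimum of 4 and no upper limit, the fixed strategy ends up at 4 while the other asks for 5.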

@stonesthrow
Contributor

Linking to: #1158
