Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support for VK_EXT_descriptor_buffer #1776

Open
spnda opened this issue Nov 17, 2022 · 16 comments
Open

Support for VK_EXT_descriptor_buffer #1776

spnda opened this issue Nov 17, 2022 · 16 comments
Assignees

Comments

@spnda
Copy link
Collaborator

spnda commented Nov 17, 2022

VK_EXT_descriptor_buffer is a new extension that released today. It essentially allows the device to manage the memory backing descriptor sets. This is more or less exactly what MoltenVK is already doing, so implementing this extension should not be too hard.

In short, this extension allows to query the offset and size of a descriptor within a descriptor layout. We then have to write the BDAs and handles into a buffer (ideally DEVICE_LOCAL) with these offsets and sizes. This buffer is then bound using a BDA in a command buffer, with byte offsets specifiable later. This is effectively the same as argument buffers with Metal 3, where we also just manually write handles and pointers into memory. The only difference is that handles in Metal are always sized the same unlike in Vulkan, where one has to query the size from the driver. The SPIR-V descriptor interface remains the same.

I'd like to implement this extension in the coming days. @billhollings Could you please assign me?

@billhollings
Copy link
Contributor

billhollings commented Nov 17, 2022

I agree that VK_EXT_descriptor_buffer sounds like is a good mapping for Metal3 argument buffers.

I'd be happy to have you work on this. Thanks very much for volunteering! And as requested, I've assigned it to you.

The changes to argument buffers in Metal3 should also allow us to upgrade support for argument buffers in general, and in particular for descriptor indexing. With that in mind, it would definitely be preferable to integrate the two uses of Metal argument buffers, and not fragment their use into two paths.

For instance, if Metal3 is available on the platform, maybe always enable the use of argument buffers, and in a manner that allows it to support general descriptor sets (eg. one Metal argument buffer per descriptor set), descriptor indexing, and descriptor buffers, ideally all using a single implementation of Metal argument buffers in MoltenVK.

I haven't had time to look into this in any detail yet, so I may be asking for too much. But I'd like to see it as a design intention, as you look into VK_EXT_descriptor_buffer.

We can use this thread for contributors to discuss design ideas for doing all this.

PS: @spnda I've granted you write access to the repo now. That should allow you to assign to tasks, and ask for reviewers on PR's. Please leave PR merging with me, though.

@spnda
Copy link
Collaborator Author

spnda commented Nov 21, 2022

So, I've come across a few issues trying to implement this extension. It's been fairly simple because the extension effectively provides a MTLArgumentEncoder interface, but the following two problems are making it quite complicated.

First of all, Metal does not have a function to bind a buffer by its address. Therefore I would have to track which buffers have been created with VK_BUFFER_USAGE_RESOURCE_DESCRIPTOR_BUFFER_BIT_EXT and store those in a map or array. I would then in the encoding of vkCmdBindDescriptorBuffersEXT search that map/array by the buffer's address (which might have an offset) and then bind that buffer. Not too sure how I would implement this nicely, but this should be doable.

The major issue I've just stumbled upon is that by giving the application control of writing the argument buffers we have no idea which resources are bound. As far as I know, Metal requires resources to be made resident with useResource or useHeap, meaning I would need to traverse the bound descriptor buffer every time the offsets change with vkCmdSetDesctiptorBufferOffsetsEXT and read all the handles and pointers from it, then lookup the corresponding MTLTexture, MTLBuffer, MTLSamplerState and then call useResource for those handles. This would be a very complicated and error-prone thing to implement, would likely also be quite slow and most importantly defeat the purpose of this extension at the same time. I don't feel like we should support any extensions when we have to do something like this. Though in theory this would still be possible, so how would you feel about this?

I've pushed my changes until now onto a branch: https://github.com/spnda/MoltenVK/tree/EXT_descriptor_buffer

@zmarlon
Copy link

zmarlon commented Dec 11, 2022

In my opinion, it would make sense to make a feature request for Metal at Apple. Because I don't think it makes sense to implement the extension in such a way.

@psionic12
Copy link

psionic12 commented Jan 6, 2023

I also don't understand why vkCmdBindDescriptorBuffersEXT takes addresses instead of buffer handles, do you guys know why? I asked a question, but nobody answers for now.

@rcaridade145
Copy link

@corporateshark
Copy link

First of all, Metal does not have a function to bind a buffer by its address. Therefore I would have to track which buffers have been created with VK_BUFFER_USAGE_RESOURCE_DESCRIPTOR_BUFFER_BIT_EXT and store those in a map or array. I would then in the encoding of vkCmdBindDescriptorBuffersEXT search that map/array by the buffer's address (which might have an offset) and then bind that buffer. Not too sure how I would implement this nicely, but this should be doable.

Can gpuResourceID from Metal 3 help with this? https://developer.apple.com/documentation/metal/mtlrenderpipelinestate/3974100-gpuresourceid

@rcaridade145
Copy link

@SwampyApple
Copy link

In my opinion, it would make sense to make a feature request for Metal at Apple. Because I don't think it makes sense to implement the extension in such a way.

As I've mentioned in the general Metal 3 features discussion I also feel like this is a good step to take. The people over at CodeWeavers got in touch with Apple over some Rosetta 2 bugs which got fixed. They've also added a raw image mode for the Asahi Linux development, instead of needing Mach-O kernel files.

@rcaridade145
Copy link

@billhollings @spnda

Was looking for info that could help here. Came across this when useHeap /useResource.

https://forums.kodeco.com/t/insights-from-wwdc/142554

https://forums.kodeco.com/t/when-to-call-useresource/134089

@K0bin
Copy link

K0bin commented Jun 8, 2023

The way to make this work would be:

  • Switch MoltenVK to untracked + unretained mode
  • Manually handle barriers, either in the non-prefill mode that encodes on submission combined with lots of tracking or in prefill mode with MTLEvent spammed everywhere
  • Maintain a per device list of writable textures and MTLHeap allocations and add all of them to each encoder with useResource and useHeap

Same thing should probably be done for VK_EXT_descriptor_indexing too, so it can properly support PARTIALLY_BOUND and huge descriptor sets.

@QuantumDeveloper
Copy link

QuantumDeveloper commented Jul 8, 2023

Is there any chance that this extension will be implemented in the nearest future?

@billhollings
Copy link
Contributor

Is there any chance that this extension will be implemented in the nearest future?

Soon. It's in the upcoming development roadmap.

@spnda
Copy link
Collaborator Author

spnda commented Jul 10, 2023

@billhollings Are you going to take over on this PR? I'm still technically free to work on this. There are solutions to make the feature work fully, but they would be quite broad changes that would take time.

Specifically, we would most likely need to refactor VkDeviceMemory to not map to a MTLHeap and essentially always allocate, say, 1MB MTLHeap blocks and then sub-allocate VkDeviceMemory and VkBuffer from those. We then always have a useHeap call so that all resources are available, which would fully fix the residency issue and allow the the GPU to write descriptors.

The issue that the Vulkan API binds by an address plus offset can probably be easily solved. So for this extension there's a lot of things having to be changed, and I also don't know if you're fine with the MTLHeap change before I actually start implementing that.

@billhollings
Copy link
Contributor

billhollings commented Jul 10, 2023

Are you going to take over on this PR? I'm still technically free to work on this. There are solutions to make the feature work fully, but they would be quite broad changes that would take time.

I am intending to work on that as part of the roadmap (#1975). But as you can see, it's further down on the priority list. I am definitely happy to have you continue to work on it, instead.

I recommend you/we wait until I complete the first item on the roadmap, the Metal 3 argument buffers. As part of that, I am trying to improve how the resources are bound and made resident. I'm sure there will be further improvements when work proceeds on VK_EXT_descriptor_buffer here too, but that will give us a start.

@spnda
Copy link
Collaborator Author

spnda commented Jan 24, 2024

I recommend you/we wait until I complete the first item on the roadmap, the Metal 3 argument buffers. As part of that, I am trying to improve how the resources are bound and made resident. I'm sure there will be further improvements when work proceeds on VK_EXT_descriptor_buffer here too, but that will give us a start.

@billhollings Has there been any updates regarding this? I can imagine the descriptor capabilities to be drastically improved when defaulting to the Metal 3 argument buffers and only falling back to using Metal 2 or Metal 1 functionality if absolutely necessary. I've recently seen a lot of people confused (on Discord) about the environment variable to enable the use of argument buffers. I'm also still interested in supporting descriptor buffer, and I saw that the KHR_acceleration_structure PR already addressed one of my concerns, and I think handling residency of resources could also be somewhat addressed with a revamped descriptor structure.

@billhollings
Copy link
Contributor

Has there been any updates regarding this?

Still on my ToDo list. Hoping to get to it within a month.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

9 participants