REEF for NVIDIA GPUs #7
Hi! Really interesting work :) Would it be possible to have access to the version of REEF for NVIDIA GPUs that you mention in the paper? Do you plan to make the NVIDIA GPU version open source, or would it be possible for researchers to get access to a separate repository with that version of REEF?
Thank you!
Hi @anakli, I have to clarify that the NVIDIA version of REEF (REEF-N) only implements the task preemption mechanism based on queue cleaning and does not include all of the techniques in REEF. As such, it is not currently fully functional, and we do not plan to make it open source or provide access to a separate repository. That being said, we will soon be open-sourcing a preemption library extracted from REEF-N, which works on NVIDIA GPUs with CUDA. This library will help developers add preemption capabilities similar to REEF-N's to other inference systems.
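For readers unfamiliar with the idea, queue-cleaning preemption can be sketched roughly as follows on CUDA. This is only a minimal illustration under assumptions of mine, not the REEF-N API; `PreemptibleQueue`, `submit`, `launch_next`, and `preempt` are invented names:

```cpp
#include <cuda_runtime.h>
#include <deque>
#include <functional>
#include <mutex>

// A pending kernel launch, captured as a callable so any kernel and
// argument set fits in the queue.
using KernelLaunch = std::function<void(cudaStream_t)>;

class PreemptibleQueue {
public:
    explicit PreemptibleQueue(cudaStream_t stream) : stream_(stream) {}

    // Best-effort tasks buffer their launches on the host instead of
    // pushing everything into the CUDA stream at once.
    void submit(KernelLaunch launch) {
        std::lock_guard<std::mutex> lock(mu_);
        pending_.push_back(std::move(launch));
    }

    // Hand the next buffered kernel to the GPU runtime (called by a
    // scheduler thread that keeps only a few kernels in flight).
    bool launch_next() {
        std::lock_guard<std::mutex> lock(mu_);
        if (pending_.empty()) return false;
        pending_.front()(stream_);
        pending_.pop_front();
        return true;
    }

    // Preemption by queue cleaning: discard every launch that has not yet
    // been handed to CUDA, then drain the few kernels already in flight.
    void preempt() {
        {
            std::lock_guard<std::mutex> lock(mu_);
            pending_.clear();
        }
        cudaStreamSynchronize(stream_);
    }

private:
    cudaStream_t stream_;
    std::deque<KernelLaunch> pending_;
    std::mutex mu_;
};
```

The key property is that preemption latency is bounded by the kernels already handed to the CUDA stream, which is what motivates the buffering window discussed below.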
Thank you for the quick response! Do you have an expected timeline for when you plan to release the preemption library extracted from REEF-N? In the meantime, we can also prototype the approach described in Section 4.4 of the paper. I'm wondering about the following two parameters:

1. Is there a limit on the size of the vHQ, i.e., on the number of kernel slots it can hold?
2. How many GPU kernels should be kept within the GPU runtime at a time?

Thanks!
We plan to release it within the next two months. We are currently finalizing some additional features and organizing the code and documentation.
1. The vHQ is indeed implemented as a linked list, so there is no specific limit on its size; you can add as many kernel slots as you need.
2. The number of GPU kernels kept within the GPU runtime should depend on the workload's characteristics. There is typically a trade-off between execution latency and preemption latency, so we recommend keeping the number of buffered GPU kernels in the range of 4 to 16 as a reasonable compromise.
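To make the two answers concrete, here is a hedged C++/CUDA sketch of an unbounded, linked-list vHQ that keeps at most a fixed number of kernels inside the GPU runtime. All identifiers (`VHQ`, `KernelSlot`, `max_inflight`) are invented for illustration and are not REEF-N's actual interface:

```cpp
#include <cuda_runtime.h>
#include <functional>
#include <list>

struct KernelSlot {
    std::function<void(cudaStream_t)> launch;  // captured kernel launch
    bool submitted = false;                    // already in the GPU runtime?
};

class VHQ {
public:
    // 4-16 in-flight kernels is the trade-off range suggested above:
    // fewer => faster preemption, more => better execution latency.
    explicit VHQ(cudaStream_t stream, int max_inflight = 8)
        : stream_(stream), max_inflight_(max_inflight) {}

    // The linked list imposes no capacity limit: push as many slots as needed.
    void push(std::function<void(cudaStream_t)> launch) {
        slots_.push_back({std::move(launch), false});
        refill();
    }

    // Called when a submitted kernel completes (e.g. via a callback
    // registered with cudaLaunchHostFunc); a single stream completes in
    // FIFO order, so the front slot is the one that finished.
    void on_kernel_complete() {
        --inflight_;
        slots_.pop_front();
        refill();
    }

private:
    void refill() {
        for (auto& slot : slots_) {
            if (inflight_ >= max_inflight_) break;
            if (!slot.submitted) {
                slot.launch(stream_);
                slot.submitted = true;
                ++inflight_;
            }
        }
    }

    cudaStream_t stream_;
    int max_inflight_;
    int inflight_ = 0;
    std::list<KernelSlot> slots_;  // linked list => no fixed size limit
};
```

With a small `max_inflight`, a preemption like the one sketched earlier only has to wait for a handful of in-flight kernels; with a larger value, the stream stays busier between host round trips.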
Thank you!
@francis0407 I would like to ask whether the REEF-N mentioned above implements DKP. If not, is that because DKP is not implementable on NVIDIA GPUs? (I read through the paper and tried to implement DKP on NVIDIA, but I cannot judge the feasibility of the approach there myself, and most of the work in the paper is based on AMD GPUs, hence my doubts.) If DKP can be implemented on NVIDIA, I will try to implement it; if not, I would like to know what problems you encountered during the implementation.
I had a quick look at the LLVM User Guide for AMDGPU and the User Guide for the NVPTX Back-end. From my limited understanding, I suspect it won't work on NVIDIA GPUs.
Hi @ujay-zheng, we didn't implement DKP in REEF-N on NVIDIA GPUs. This is mainly because many optimizations in DKP need to modify the binary or assembly code of the GPU kernel. For example, when "calling" the candidate kernel inside the proxy kernel, we use a "jump" instruction instead of "call" to avoid register spilling. DKP can in principle be implemented on NVIDIA GPUs, but with a lot of engineering effort (i.e., hacking the CUDA SASS binary).
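For intuition, the dispatch structure of such a proxy kernel can be sketched in plain CUDA with device function pointers. This is only an illustration of the "call the candidate from a proxy" idea and deliberately omits the assembly-level jump rewrite that real DKP requires; all names below are invented:

```cpp
// Hedged sketch of proxy-kernel dispatch, as in dynamic kernel padding (DKP).
// A real implementation rewrites the kernel binary (jump, not call) to avoid
// register spilling; this only shows the dispatch structure.
#include <cuda_runtime.h>

typedef void (*CandidateFn)(float*, int);

// A "candidate kernel" compiled as a __device__ function so the proxy
// kernel can invoke it through a function pointer.
__device__ void vec_scale(float* data, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] *= 2.0f;
}

// Device-side pointer to the candidate; copied to the host at startup.
__device__ CandidateFn d_vec_scale_ptr = vec_scale;

// Proxy kernel: launched with padded (maximum) resources, it reads which
// candidate to run from device memory, so a newly arrived real-time kernel
// can be selected without a fresh launch.
__global__ void proxy_kernel(CandidateFn* table, int* choice,
                             float* data, int n) {
    table[*choice](data, n);  // an ordinary indirect call; this is where
                              // register spilling can occur without the
                              // jump-based binary rewrite
}

int main() {
    const int n = 1024;
    float* data;
    cudaMalloc(&data, n * sizeof(float));

    // Fetch the device function pointer and build a one-entry candidate table.
    CandidateFn h_fn;
    cudaMemcpyFromSymbol(&h_fn, d_vec_scale_ptr, sizeof(h_fn));
    CandidateFn* table;
    cudaMalloc(&table, sizeof(CandidateFn));
    cudaMemcpy(table, &h_fn, sizeof(h_fn), cudaMemcpyHostToDevice);

    int* choice;
    cudaMalloc(&choice, sizeof(int));
    cudaMemset(choice, 0, sizeof(int));  // select candidate 0

    proxy_kernel<<<(n + 255) / 256, 256>>>(table, choice, data, n);
    cudaDeviceSynchronize();
    return 0;
}
```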
OK, I got it, thank you!
@francis0407 How is REEF-N going? Has it been published yet?
Bump on this. I am interested in playing around with the device queue capacity restriction feature for NVIDIA GPUs. @francis0407