Can the MHANet run in real time #52
Comments
I did not get the chance to develop the model to run on a real-time system. It would need some more development, but I assume it's possible. You could reuse past keys and values for the attention mechanism to speed up processing, and choose a window of time-steps small enough that the model runs in real time on the target device. So I assume a few compromises would need to be made. A device with a GPU would also make things much easier. Maybe a paper like this could give you some ideas: https://arxiv.org/abs/2010.11395 I could be wrong, but I think it is very possible with some modifications. Aaron.
Yes, I also think it's possible for the model to run on a real-time system. There are two cases: a) a masked attention matrix with full history and 0 lookahead, which is a lower-triangular mask; b) a masked attention matrix with N frames of history and 0 lookahead, where N is the window size (e.g., N=3), which is a banded lower-triangular mask. Thanks.
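The two masks described in this comment can be written out explicitly. The following is a minimal sketch in plain Python, not code from the MHANet repository. The function names are hypothetical, and it assumes that the window of N frames includes the current frame (the comment does not say whether it does).

```python
def causal_mask(T):
    """Full history, 0 lookahead: frame i may attend to frames 0..i.

    Returns a T x T lower-triangular matrix of 0/1 entries.
    """
    return [[1 if j <= i else 0 for j in range(T)] for i in range(T)]


def windowed_causal_mask(T, N):
    """N frames of history, 0 lookahead.

    Assumption: the N-frame window includes the current frame,
    so frame i attends to frames i-N+1..i (clipped at 0).
    """
    return [[1 if i - N < j <= i else 0 for j in range(T)] for i in range(T)]


if __name__ == "__main__":
    print("a) full history, 0 lookahead")
    for row in causal_mask(5):
        print(row)
    print("b) N=3 history, 0 lookahead")
    for row in windowed_causal_mask(5, 3):
        print(row)
```

In case b) each row has at most N nonzero entries, so the per-frame cost of attention stays constant no matter how long the stream runs, which is what makes it attractive for real-time use.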
Sounds like an interesting problem to investigate :) I am sure it could work with some constraints. Consider things like reusing previously computed keys and values to speed up processing, as is done with language models to speed up decoding when generating text: https://github.com/huggingface/transformers/blob/820c46a707ddd033975bc3b0549eea200e64c7da/src/transformers/models/gpt2/modeling_gpt2.py#L984
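To illustrate the caching idea, here is a minimal sketch of frame-by-frame attention that keeps only the most recent keys and values. It is not MHANet or Hugging Face code. It assumes a single head, inputs that are already projected to queries, keys, and values, and a window that includes the current frame; `StreamingAttention` and the other names are hypothetical.

```python
import math
from collections import deque


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(q, keys, values):
    """Scaled dot-product attention for one query over the given keys/values."""
    scale = math.sqrt(len(q))
    scores = [sum(a * b for a, b in zip(q, k)) / scale for k in keys]
    weights = softmax(scores)
    dim = len(values[0])
    return [sum(w * v[c] for w, v in zip(weights, values)) for c in range(dim)]


class StreamingAttention:
    """Processes one frame per call, caching the last `window` keys and values."""

    def __init__(self, window):
        # deque(maxlen=...) drops the oldest entry automatically.
        self.keys = deque(maxlen=window)
        self.values = deque(maxlen=window)

    def step(self, q, k, v):
        # Only the new frame's key/value are added; older ones are reused.
        self.keys.append(k)
        self.values.append(v)
        return attend(q, list(self.keys), list(self.values))


def offline_windowed_attention(Q, K, V, window):
    """Reference: windowed causal attention computed over a whole utterance."""
    out = []
    for i, q in enumerate(Q):
        lo = max(0, i - window + 1)
        out.append(attend(q, K[lo:i + 1], V[lo:i + 1]))
    return out
```

Feeding frames one at a time through `StreamingAttention.step` gives the same outputs as `offline_windowed_attention` on the full sequence, so a model trained with a windowed causal mask could in principle be run incrementally. A real multi-layer, multi-head model would need one such cache per head and per layer.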
Thanks, I will read up on the relevant background.
Hi,
I am unsure whether MHANet can work in real time. From my understanding, the masked attention only covers the causal case, which may not be enough for real-time operation.
Best regards, looking forward to your reply.