Mixed precision in pySDC #511

Open · brownbaerchen opened this issue Jan 7, 2025 · 2 comments

@brownbaerchen (Contributor)

As GPUs offer ever more compute power at lower precision (down to four bits!) compared to double precision, we have to consider how to make efficient use of future machines with pySDC.
Reducing the precision in the implementation is simple, but the questions are:

  • Where can we reduce the floating point precision with minimal impact on the accuracy of the solution?
  • Where can we actually gain something by reduced floating point precision?

Note that the cost of individual arithmetic operations is bounded from below by the kernel launch overhead; see for instance the discussion in this article. Therefore, I doubt that we would gain much by simply switching everything to lower precision.
A good starting point would be to choose a problem with an iterative linear solver that is launched as a single kernel. Then we can start by doing only the implicit solves in single, half, or even four-bit precision and see what we gain; a sketch of this pattern follows below. We may have to increase the precision between SDC iterations, and we will probably need a fairly large problem resolution to see a difference.
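To make the structure concrete, here is a minimal sketch of the classic mixed-precision iterative-refinement pattern this suggests: the inner (implicit) solve runs in single precision while residuals and the accumulated solution stay in double precision. This is plain NumPy with made-up names, not pySDC code; on a GPU one would swap `numpy` for `cupy` and the direct solve for an iterative one.

```python
import numpy as np

def solve_mixed_precision(A, b, tol=1e-10, max_iter=50):
    """Solve A x = b with low-precision inner solves and double-precision residuals.

    Hypothetical illustration of mixed-precision iterative refinement,
    not pySDC API: the expensive solve runs in float32, while the residual
    and the accumulated solution stay in float64.
    """
    A32 = A.astype(np.float32)                # low-precision copy used by the inner solver
    x = np.zeros(b.shape, dtype=np.float64)
    b_norm = np.linalg.norm(b)
    for _ in range(max_iter):
        r = b - A @ x                         # residual in double precision
        if np.linalg.norm(r) <= tol * b_norm:
            break
        dx = np.linalg.solve(A32, r.astype(np.float32))  # cheap inner solve in float32
        x += dx.astype(np.float64)            # correction accumulated in double precision
    return x

# quick check on a well-conditioned random system
rng = np.random.default_rng(0)
A = np.eye(100) + 0.1 * rng.standard_normal((100, 100))
b = rng.standard_normal(100)
x = solve_mixed_precision(A, b)
print(np.linalg.norm(A @ x - b))              # should approach double-precision accuracy
```

For a well-conditioned system this recovers double-precision accuracy even though every inner solve is done in float32, which is exactly the kind of trade-off we would want to measure.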

The scope of this project appears well suited to a bachelor thesis, an internship, or the like. If you or anyone you know would find this interesting, please get in touch!
There is no need for a deep understanding of SDC or Python: basic proficiency with the latter and little fear of maths are sufficient. This would also be a nice opportunity to get to know GPU programming in Python.

@tlunet (Member) commented Jan 7, 2025

Can we also have the same consideration for CPUs? My guess is that low-precision computation is fast not only on GPUs, and considering inexact SDC, that would indeed make a lot of sense.

Other question: how easy is it to switch from single to double precision on the GPU during a computation? Is it so expensive that the low-precision computation has to run on one GPU and the high-precision computation on another? ...

@brownbaerchen (Contributor, Author)

> Can we also have the same consideration for CPUs? My guess is that low-precision computation is fast not only on GPUs, and considering inexact SDC, that would indeed make a lot of sense.

I am not sure how much faster CPUs are with low precision; if there is a significant gain, we can certainly try. On GPUs, single-precision throughput is often quoted as twice the flops of double precision.

> Other question: how easy is it to switch from single to double precision on the GPU during a computation? Is it so expensive that the low-precision computation has to run on one GPU and the high-precision computation on another? ...

I don't know how expensive the casting is. I would be surprised if it were more expensive than communicating between distinct GPUs, and off the top of my head I cannot think of a use case for distributing the work like that.
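For what it's worth, an on-device cast is just an elementwise copy kernel, so its cost is easy to measure directly. A rough micro-benchmark sketch with CuPy (assuming a CUDA GPU with cupy installed; the array size is arbitrary):

```python
import time
import cupy as cp

x64 = cp.random.standard_normal(2**24)    # ~128 MiB of float64 on the device
_ = x64.astype(cp.float32)                # warm-up so kernel compilation is not timed
cp.cuda.Device().synchronize()

start = time.perf_counter()
x32 = x64.astype(cp.float32)              # on-device cast: one elementwise copy kernel
cp.cuda.Device().synchronize()            # wait for the kernel before stopping the clock
print(f"cast float64 -> float32: {time.perf_counter() - start:.6f} s")
```

By contrast, moving the same array to another GPU would go over NVLink or PCIe, which should be considerably more expensive than the cast itself.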
