Benchmarking AD tools has come up a lot recently, and this seems like a good place to implement some benchmarks, in addition to "correctness" testing.
I was thinking that they should be micro-benchmarks, and that the benchmarks themselves shouldn't depend on any functionality outside of Base and the standard libraries, with the possible exception of things needed to test support for accelerators, e.g. CuArrays.jl. Equally, these could be supported by typing things sufficiently abstractly 🤷 (see the sketch below).
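To make that concrete, here's a minimal sketch of the kind of benchmark target I have in mind. The function name `f` and the sizes are just for illustration; the point is that the definition uses only Base and stdlib functionality, and is typed abstractly enough that the same code could be benchmarked on CPU arrays or, with CuArrays.jl loaded, on the GPU:

```julia
using LinearAlgebra

# Broadcasting + linear algebra, no dependencies outside Base / stdlib.
# AbstractMatrix / AbstractVector keep it generic over array backends.
f(A::AbstractMatrix, x::AbstractVector) = sum(tanh.(A * x))

# On the CPU:
A, x = randn(100, 100), randn(100)
f(A, x)

# With CuArrays.jl loaded, the same definition should run on the GPU:
# using CuArrays
# f(cu(A), cu(x))
```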
The first thing to do is figure out what people actually care about the performance of. For example, I really care about broadcasting and operations involving linear algebra, but not so much about control flow; I know that the Turing team has a different set of priorities, though. So perhaps if everyone could suggest what sorts of things they're interested in benchmarking, we can start to think about how to chop up the tests. For example, from the perspective of reverse-mode AD there's a distinction between control flow that depends on values and control flow that doesn't, so we should probably be testing that kind of thing (sketched below).
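A quick sketch of that distinction, with hypothetical function names of my own choosing. In the first function the loop structure is fixed by `n`, so the trace a reverse-mode tool records is the same for every input; in the second, the branch taken changes with `x`, which some tools handle very differently:

```julia
# Control flow that does NOT depend on the values being differentiated:
# the number of iterations is determined by `n`, not by `x`.
function fixed_loop(x, n)
    s = zero(eltype(x))
    for i in 1:n
        s += sin(x[mod1(i, length(x))])
    end
    return s
end

# Control flow that DOES depend on the values: which branch executes
# varies with the input, so the computational graph is input-dependent.
function value_dependent(x)
    s = zero(eltype(x))
    for xi in x
        s += xi > 0 ? sin(xi) : cos(xi)
    end
    return s
end
```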
cc @vchuravy @yebai @oxinabox