Generic Matrix kernel specializations #722
…relevant funcs in matrix and gengivenvals)
- Removed newly added <cassert> include from kernels, since not used there (and we don't want to use assertions anymore).
- Undid unnecessary move of Matrix code in EwBinaryObjSca.cpp.
- Undid the trick about std::accumulate() in FilterCol and FilterRow kernels.
  - The input matrix sel could be a view into a larger matrix, but std::accumulate() does not take the rowSkip of sel into account.
- Fixed indentation and a typo in the newly added code in Write.h.
- Some more minor things.
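The rowSkip issue from the commit message above can be illustrated with a minimal, self-contained sketch. This is not DAPHNE code: `View`, `sumContiguous`, and `sumWithRowSkip` are hypothetical names, and only the idea of a `rowSkip` stride between consecutive rows is taken from the commit message.

```cpp
#include <cstddef>
#include <numeric>

// Hypothetical stand-in for a dense view: `values` points at the first cell,
// and consecutive rows start `rowSkip` elements apart in the parent buffer.
struct View {
    const long *values;
    std::size_t numRows;
    std::size_t numCols;
    std::size_t rowSkip;
};

// Wrong for views: treats the cells as one contiguous range, so it also
// reads cells of the parent matrix that lie outside the view.
long sumContiguous(const View &v) {
    return std::accumulate(v.values, v.values + v.numRows * v.numCols, 0L);
}

// Accumulates row by row and advances by rowSkip between rows.
long sumWithRowSkip(const View &v) {
    long sum = 0;
    for (std::size_t r = 0; r < v.numRows; r++) {
        const long *row = v.values + r * v.rowSkip;
        sum = std::accumulate(row, row + v.numCols, sum);
    }
    return sum;
}
```

For a 3x1 view onto the first column of a 3x2 buffer, the two functions disagree, because the contiguous variant also picks up cells of the second column.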
Hi @AlexRTer, thanks for this contribution. The kernel specializations for the superclass `Matrix` are very useful as fallbacks whenever the specialization for a particular physical matrix representation is missing. We expect such situations to become more important as we add additional matrix representations.
The code in this PR looks very good to me (even though I mostly skimmed some of the kernels and test cases). I made a few little changes (see my commit).
Regarding your question on the cast in the `Order`-kernel: The concern is valid, as that cast indeed looks a bit fishy. I added a fix for it that avoids the cast by leveraging `if constexpr (...)` to make sure that the `ExtractRow`-kernel is instantiated/called only when `VTRes` and `VTArg` are the same. Otherwise, an exception is thrown.
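The pattern described here could look roughly like the following sketch. It assumes C++17 and uses hypothetical stand-ins (`copyRow` for a helper that requires matching value types, `extractSketch` for the calling kernel); it is not the code from the commit.

```cpp
#include <cstddef>
#include <stdexcept>
#include <type_traits>
#include <vector>

// Hypothetical helper that, like the ExtractRow-kernel mentioned above,
// requires source and destination to share one value type.
template <typename VT>
void copyRow(std::vector<VT> &res, const std::vector<VT> &arg) {
    res = arg;
}

// Hypothetical kernel: writes either the values of arg (needs VTRes == VTArg)
// or their positions (any VTRes that can hold an index).
template <typename VTRes, typename VTArg>
void extractSketch(std::vector<VTRes> &res, const std::vector<VTArg> &arg, bool returnIdx) {
    if (returnIdx) {
        res.clear();
        for (std::size_t i = 0; i < arg.size(); i++)
            res.push_back(static_cast<VTRes>(i));
    } else {
        // The discarded branch is never instantiated, so no cast is needed
        // and mismatching types do not cause a compile error.
        if constexpr (std::is_same_v<VTRes, VTArg>)
            copyRow(res, arg);
        else
            throw std::runtime_error("res and arg must have the same value type if returnIdx is false");
    }
}
```

With `VTRes = std::size_t` and `VTArg = double`, the call compiles for both values of `returnIdx`, and only the `returnIdx = false` path fails, at run time, with an exception.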
From my point of view, this PR is ready to be merged. But as you created it as a draft, let me know if you have any further comments or if I may merge it in.
Thanks also for cleaning up the existing kernels and their test cases a bit (e.g., consistent file structure, corrections of misformattings and misnamings, compaction of multiple |
Thank you for looking at the cast in the |
This PR adds implementations using the generic `Matrix` type for most kernels. They only use methods that are sure to be supported by `Matrix.h` based classes, i.e. `get`, `set`, `prepare-/finish-/append`. Hence, new classes based on `Matrix.h` do not necessarily have to implement specializations for all kernels in order to make use of them. The implementations do not assume any additional constraints (e.g. memory layout), which limits possibilities for optimization in some cases. Therefore, they do not replace specialized kernels entirely but save a great amount of time by providing basic utility "out of the box". Note that this PR only adds the necessary specialized kernels and does not make them accessible to the runtime yet.

Changes:
- `Matrix.h` specializations have been added to almost all local kernels, with the exception of e.g. neural-network-related kernels.
- Existing tests have been extended to cover these as well. Currently, the kernels always default to a `DenseMatrix` if `res` was given as a `nullptr`, so tests also use a `DenseMatrix` through `GenGivenVals`.
  - `DTEmpty` or `DTView` to generate these before they are cast to the appropriate type.

        using DTEmpty = DenseMatrix<typename DTArg::VT>;
        static_cast<DTArg *>(DataObjectFactory::create<DTEmpty>(0, 0, false));

  - `Filter*` also expect `sel` to be a `DenseMatrix`, in which cases `DTSel` is sometimes used.
  - `DTRes` type than `DTArg`, so `DT` might be set using `std::conditional`
- `DenseMatrix.h` has also been changed to handle empty matrices if `prepare-/finish-Append` is called on them. I suspect this might also be an issue with e.g. `CSRMatrix`, but I have not yet confirmed this.

To do:
- The cast (l. 327) in `Order.h` should be fine, though restrictive, as `VTRes = VTArg` is guaranteed in that case, but please confirm this. The problem lies in `extractRow` expecting `res` and `arg` to have the same value type, which is true if `returnIdx = false`. However, the compiler seems to also check whether the cast is legal for the case of `returnIdx = true`, in which the value types are not guaranteed to be the same, so a static cast does not work.
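As an illustration of the fallback idea in the description above (kernels written only against `get`/`set`), here is a minimal sketch. `MatrixSketch`, `RowMajorSketch`, and `transposeGeneric` are hypothetical names and not the interface of `Matrix.h`; the real class and its method signatures may differ.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical minimal interface, reduced to element-wise access.
template <typename VT>
class MatrixSketch {
public:
    virtual ~MatrixSketch() = default;
    virtual std::size_t getNumRows() const = 0;
    virtual std::size_t getNumCols() const = 0;
    virtual VT get(std::size_t r, std::size_t c) const = 0;
    virtual void set(std::size_t r, std::size_t c, VT v) = 0;
};

// One possible physical representation; the generic kernel below never
// relies on its row-major layout.
template <typename VT>
class RowMajorSketch : public MatrixSketch<VT> {
    std::size_t numRows, numCols;
    std::vector<VT> values;

public:
    RowMajorSketch(std::size_t r, std::size_t c) : numRows(r), numCols(c), values(r * c) {}
    std::size_t getNumRows() const override { return numRows; }
    std::size_t getNumCols() const override { return numCols; }
    VT get(std::size_t r, std::size_t c) const override { return values[r * numCols + c]; }
    void set(std::size_t r, std::size_t c, VT v) override { values[r * numCols + c] = v; }
};

// Generic fallback kernel: transposes through get/set only, so it works for
// any representation at the cost of one virtual call per cell.
// Assumes res was allocated with the transposed shape of arg.
template <typename VT>
void transposeGeneric(MatrixSketch<VT> &res, const MatrixSketch<VT> &arg) {
    for (std::size_t r = 0; r < arg.getNumRows(); r++)
        for (std::size_t c = 0; c < arg.getNumCols(); c++)
            res.set(c, r, arg.get(r, c));
}
```

A specialized kernel for a concrete representation could access the underlying buffer directly, which is the optimization potential the generic variant gives up.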