COOMatrix implementation #689

pxanthopoulos · 2024-04-17T15:58:41Z

We added the COOMatrix class implementation for COOrdinate based sparse matrices. We also extended kernels to support the new class (Aggregation, Element-wise matrix operations, Equality checking, Generation based on given values, Random generation and Transposition). For the implementation alone, we added a test file and for the kernels, we extended the existing test files.

… to support COOMatrix and corresponding tests Co-authored-by: fwtostatho <[email protected]> Co-authored-by: koutom111 <[email protected]>

- replace asserts with exceptions - add comments and briefs for clarity - fix colWidth formatting in print method and replace array allocation with vector - change print to use printValue utility function to handle all value types - move equality operator to COOMatrix.h like other datastructures - refactor AggAll kernel for CSR/COO to reduce code duplication. Add namespace for better usage in AggRow kernel - remove redundant getValues call in AggRow kernel - EwBinaryMat kernel: - fix a bug that would append a zero entry to sparse matrices if lhs and rhs entries added to zero - replace memcpy with fill_n for copy of identical row values - minor optimization in RandMatrix kernel to avoid potentially re-allocating resources in loop iteration - minor optimization and bug fix in Transpose kernel to avoid copying column indices for argsort and change to stable_sort to guarantee row index order stays valid - extend COOMatrixTest to verify proper ordering of coordinates - fix test errors from previous merge - minor formatting changes

AlexRTer

Hi @pxanthopoulos, @koutom111, and @fwtostatho, thank you all for this contribution.
The implementation of COOMatrix and some specialized kernels on them are a great addition to Daphne's support of sparse data types.

I've added some changes in the last commit, but I think the code already looks good overall.
Your idea of setting each column length to the longest string representation of the values within it when printing a COOMatrix is a very nice idea in my opinion and helps with readability.

This is an overview of the more noteable changes I made, please see the last commit for a more complete list.

COOMatrix.h:
- added a brief to explain the implementation and some class methods
- changed the indexing in print to correctly set colWidth
- switched to using the printValue utility function for handling different value type conversions
- moved the CheckEq kernel to the class definition (operator==) for consistency with other datastructures
Refactored the AggAll kernel as the current implementation is identical for CSR and COO format.
Fixed a bug in the EwBinaryMat kernel that would add a zero valued entry to the data buffers. This bug is present for CSRMatrix as well, but the fix should probably be a separate commit before this PR is merged.
Changed the implementation of an argsort in the Transpose kernel and switched to a stable_sort. Using an unstable sorting algorithm does not guarantee that row indices stay sorted using the new order of column indices.
Extended the test cases for COOMatrix to verify proper ordering of indices between operations.

There are some things that I believe should be addressed before the PR is merged. You do not need to work on them anymore unless you would like to of course.

The arrays that are used to hold values and indices in the COOMatrix implementation mark the last entry with a zero or size_t(-1) value to keep track of the space between the last added value and maxNumNonZeros. This space does not hold set/valid entries. In my opinion, this is not the optimal way to track this last entry. If switching to e.g. an std::vector for the data buffers is not an option, having a shared_ptr, shared between views of the same COOMatrix, that holds the index of the last set entry would be a possible solution. This would enable more bound checks, as it provides the same information as the current implementation in a more directly accessible way (a variable instead of a value inside the arrays that needs to be found first). Furthermore, this would simplify some of the existing loops because the loop body does not have to handle these special values separately anymore. To simplify updating this variable, adding 2 new methods setNumNonZeros(size_t nnz) and incNumNonZeros(int64_t inc) should help greatly.
Since this requires some bigger changes though, I would like to hear some feedback on the idea first. What is your opinion @pdamme ?
Reducing code duplication in the Aggregate kernels. Currently, the COOMatrix and CSRMatrix implementations for AggRow are mostly identical and differ only in the way the number of non-zeros in a row is accessed. By adding a getNumNonZerosRow method to CSRMatrix or changing the getNumNonZeros method of COOMatrix to accept a row index as an argument, these could be factored out like in the AggAll kernel.
As mentioned above, the last commit contains a bug fix for the EwBinaryMat kernel for both CSRMatrix and COOMatrix which is probably best addressed in a small, separate commit prior to merging this PR for better documentation.

Lastly, there are some possibilities for future issues that should not be addressed in this PR anymore:

In the current implementation of EwUnaryMat for COO -> Dense, every cell of the result DenseMatrix is written to individually. Given that with high sparsity most values are identical, this could be optimized to use memset between coordinates and iterate over the COOMatrix instead of the result DenseMatrix.
Similarly, for EwBinaryMat it is not necessary to iterate from 0 to numRows as it suffices to iterate over the row indices directly. This applies to the current implementation for CSRMatrix too.

Feel free to share your opinion on the changes I made or suggested and thank you again for your work.

AlexRTer · 2024-08-25T23:39:33Z

src/runtime/local/datastructures/COOMatrix.h

+        values.get()[0] = ValueType(0);
+        rowIdxs.get()[0] = size_t(-1);
+        colIdxs.get()[0] = size_t(-1);


This way of marking the current "head" of the array does not provide direct access to its length. Keeping the length in a separate value that is shared between views would make it more accessable (see review for more detail).

Suggested change

values.get()[0] = ValueType(0);

rowIdxs.get()[0] = size_t(-1);

colIdxs.get()[0] = size_t(-1);

// attribute before ctor

std::shared_ptr<size_t> numNonZeros;

// ctor

..., numNonZeros(std::make_shared<size_t>(0)), ...

pxanthopoulos · 2024-08-31T18:45:43Z

Hi @AlexRTer. Thank you for your kind comments and the fixes you made to our PR. We agree with your proposed changes. Especially the first bullet concerning the last entries of the data arrays is a very good idea. Unfortunately, due to curricular obligations, we currently do not have the time to address them.

Add COOMatrix implementation with corresponding tests, Extend kernels…

ac7b6ca

… to support COOMatrix and corresponding tests Co-authored-by: fwtostatho <[email protected]> Co-authored-by: koutom111 <[email protected]>

AlexRTer self-assigned this Jul 24, 2024

AlexRTer added 2 commits August 17, 2024 00:24

Merge branch 'main' into main

7a9dfc8

AlexRTer requested changes Aug 25, 2024

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

COOMatrix implementation #689

COOMatrix implementation #689

pxanthopoulos commented Apr 17, 2024

AlexRTer left a comment •

edited

Loading

AlexRTer Aug 25, 2024

pxanthopoulos commented Aug 31, 2024

COOMatrix implementation #689

Are you sure you want to change the base?

COOMatrix implementation #689

Conversation

pxanthopoulos commented Apr 17, 2024

AlexRTer left a comment • edited Loading

Choose a reason for hiding this comment

AlexRTer Aug 25, 2024

Choose a reason for hiding this comment

pxanthopoulos commented Aug 31, 2024

AlexRTer left a comment •

edited

Loading