All notable changes to HiOp are documented in this file.
- Checkpointing for quasi-Newton solver: new API for loading and saving checkpoints, user options for rollover restarts, and scalable I/O via axom::sidre.
- New function for advanced warmstarts.
- Updated spack build with build cache.
- Update modules for CI tests on LLNL LC by @nychiang in #679
- Update cmake build system to require RAJA when GPU compute mode is used by @nychiang in #676
- Moving limits options for NLP IPM solvers by @cnpetra in #681
- Removed deprecated ALG2 for cusparseCsr2cscEx2 by @cnpetra in #671
- Addressed fixed buffer size vulnerability for vsnprintf by @nychiang in #673
- Removed stringent -Wall and -Werror from release builds to avoid downstream compilation errors
Default C++ standard remains C++14
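
The fixed-buffer `vsnprintf` issue listed above is a common pattern: a message longer than a stack buffer is silently truncated. A minimal sketch of one usual remedy follows, assuming a hypothetical helper named `format_message` (this is not HiOp's actual code): query the required length first, then format into a buffer of exactly that size.

```cpp
#include <cassert>
#include <cstdarg>
#include <cstddef>
#include <cstdio>
#include <string>
#include <vector>

// Hypothetical helper for illustration only (not part of HiOp):
// formats a printf-style message into a std::string of the required size.
std::string format_message(const char* fmt, ...)
{
  assert(fmt != nullptr);

  va_list args;
  va_start(args, fmt);

  // vsnprintf consumes the va_list, so keep a copy for the second pass.
  va_list args_copy;
  va_copy(args_copy, args);

  // First pass: with a null buffer of size 0, vsnprintf only reports the
  // number of characters needed (excluding the terminating null).
  const int needed = std::vsnprintf(nullptr, 0, fmt, args);
  va_end(args);

  if(needed < 0) {
    // Encoding error: return an empty string rather than a partial message.
    va_end(args_copy);
    return std::string();
  }

  // Second pass: format into a buffer sized for the text plus the null.
  std::vector<char> buf(static_cast<std::size_t>(needed) + 1);
  std::vsnprintf(buf.data(), buf.size(), fmt, args_copy);
  va_end(args_copy);

  return std::string(buf.data(), static_cast<std::size_t>(needed));
}
```

The two-pass approach costs a second formatting call but removes the dependence on any compile-time buffer size.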
- Fix LLNL CI by @nychiang in #663
- C++17 support: compile -Wall/-Werror proof by @tepperly in #653
- Fix a bug in copying an empty matrix into a bigger matrix by @nychiang in #666
- Fix the approach used to update mu by @nychiang in #664
The interfaces of the various solvers have stabilized after HiOp was interfaced with multiple optimization front-ends (e.g., power grid ACOPF and SC-ACOPF problems and topology optimization) on both CPUs and GPUs. The PriDec solver reached exascale on Frontier after minor communication optimizations. The quasi-Newton interior-point solver received a couple of updates that increase robustness. The Newton interior-point solver can fully operate on GPUs with select GPU linear solvers (CUSOLVER-LU and Ginkgo).
- Instrumentation of RAJA sparse matrix class with execution spaces by @cnpetra in #589
- Fix Assignment Typo in hiopMatrixSparseCsrCuda.cpp by @pate7 in #612
- Use failure not failed in PNNL commit status posting by @cameronrutherford in #609
- rebuild modules on quartz by @nychiang in #619
- Use constraint violation in checkTermination by @nychiang in #617
- MPI communication optimization by @rothpc in #613
- fix memory leaks in inertia-free alg and condensed linsys by @nychiang in #622
- Update IPM algorithm for the dense solver by @nychiang in #616
- Use integer preprocessor macros for version information by @tepperly in #627
- use compound vec in bicg IR by @nychiang in #621
- Use bicg ir in the quasi-Newton solver by @nychiang in #620
- Add support to MPI in C/Fortran examples by @nychiang in #633
- Refactor CUSOLVER-LU module and interface by @pelesh in #634
- Add MPI unit test for DenseEx4 by @nychiang in #644
- Add more options to control NLP scaling by @nychiang in #649
- Development of the feasibility restoration in the quasi-Newton solver by @nychiang in #647
- GPU linear solver interface by @pelesh in #650
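
Integer version macros, as mentioned in the list above, let downstream code select features at preprocessing time, which string macros cannot do. A minimal sketch, assuming hypothetical macro names prefixed with `EXAMPLE_` (the names actually defined by HiOp's configuration header may differ):

```cpp
#include <cassert>

// Hypothetical macro names for illustration only. In a real build these
// would come from a generated configuration header.
#ifndef EXAMPLE_VERSION_MAJOR
#define EXAMPLE_VERSION_MAJOR 1
#define EXAMPLE_VERSION_MINOR 0
#define EXAMPLE_VERSION_PATCH 2
#endif

// Pack the three integer components into one comparable integer.
// Assumes minor and patch stay below 100.
#define EXAMPLE_VERSION_ENCODE(major, minor, patch) \
  ((major) * 10000 + (minor) * 100 + (patch))

#define EXAMPLE_VERSION \
  EXAMPLE_VERSION_ENCODE(EXAMPLE_VERSION_MAJOR, EXAMPLE_VERSION_MINOR, EXAMPLE_VERSION_PATCH)

// Integer macros can be compared directly by the preprocessor.
#if EXAMPLE_VERSION >= EXAMPLE_VERSION_ENCODE(1, 0, 0)
constexpr bool has_new_api = true;
#else
constexpr bool has_new_api = false;
#endif

// Runtime accessor for the packed version number.
inline int example_version()
{
  assert(EXAMPLE_VERSION_MINOR < 100 && EXAMPLE_VERSION_PATCH < 100);
  return EXAMPLE_VERSION;
}
```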
This release contains a series of comprehensive internal developments and software re-engineering that improve portability and performance on accelerator/GPU platforms. It introduces no changes to the user interface.
A new execution space abstraction is introduced to allow multiple hardware backends to run concurrently. The proposed design differentiates between "memory backend" and "execution policies" to allow using RAJA with Umpire-managed memory, RAJA with Cuda- or Hip-managed memory, RAJA with std memory, Cuda/Hip kernels with Cuda-/Hip- or Umpire-managed memory, etc.
- Execution spaces: support for memory backends and execution policies by @cnpetra in #543
- Build: Cuda without raja by @cnpetra in #579
- Update of RAJA-based dense matrix to support runtime execution spaces by @cnpetra in #580
- Reorganization of device namespace by @cnpetra in #582
- RAJA Vector int with ExecSpace by @cnpetra in #583
- Instrumentation of host vectors with execution spaces by @cnpetra in #584
- Remove copy from/to device methods in vector classes by @cnpetra in #587
- Add support for Raja with OpenMP into LLNL CI by @nychiang in #566
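
The separation between "memory backend" and "execution policy" described above can be illustrated with a small host-only sketch. All type names here (`MemBackendStd`, `ExecPolicySeq`, `ExecSpace`, `Vector`) are hypothetical and chosen for illustration; they are not HiOp's actual classes, and a real implementation would add device backends such as Umpire, CUDA, or HIP.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Hypothetical memory backend: plain host memory via malloc/free.
struct MemBackendStd {
  static void* allocate(std::size_t bytes) { return std::malloc(bytes); }
  static void deallocate(void* p) { std::free(p); }
};

// Hypothetical execution policy: sequential loop on the host.
struct ExecPolicySeq {
  template<class Body>
  static void for_all(std::size_t n, Body body)
  {
    for(std::size_t i = 0; i < n; ++i) {
      body(i);
    }
  }
};

// An execution space pairs one memory backend with one execution policy,
// so the two can be chosen independently.
template<class MemBackend, class ExecPolicy>
struct ExecSpace {
  using mem = MemBackend;
  using exec = ExecPolicy;
};

// A vector that allocates through the memory backend and runs its kernels
// through the execution policy of its execution space.
template<class Space>
class Vector {
public:
  explicit Vector(std::size_t n)
    : n_(n),
      data_(static_cast<double*>(Space::mem::allocate(n * sizeof(double))))
  {
    assert(data_ != nullptr || n == 0);
  }
  ~Vector() { Space::mem::deallocate(data_); }
  Vector(const Vector&) = delete;
  Vector& operator=(const Vector&) = delete;

  void set_to_constant(double c)
  {
    double* d = data_;  // capture the raw pointer, not `this`
    Space::exec::for_all(n_, [d, c](std::size_t i) { d[i] = c; });
  }

  // Direct access is valid here only because the backend is host memory.
  const double* data() const { return data_; }
  std::size_t size() const { return n_; }

private:
  std::size_t n_;
  double* data_;
};
```

With this layout, `Vector<ExecSpace<MemBackendStd, ExecPolicySeq>> v(4); v.set_to_constant(2.5);` fills a host vector, and swapping either template argument changes where memory lives or how loops run without touching the vector code.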
New vector classes using vendor-provided APIs were introduced, and the documentation was updated and improved.
- Development of `hiopVectorCuda` by @nychiang in #572
- Implementation of `hiopVectorHip` by @nychiang in #590
- Update user manual by @nychiang in #591
- Update the code comments in `hiopVector` classes by @nychiang in #592
- Refinement of triangular solver implementation for Ginkgo by @fritzgoebel in #585
- Refine the computation in normal equation system by @nychiang in #530
- Fix static culibos issue #567 by @nychiang in #568
- Fix segfault, remove nonsymmetric ginkgo solver by @fritzgoebel in #548
- Calculate the inverse objective scale correctly. by @tepperly in #570
- Fix `hiopVectorRajaPar::copyToStartingAt_w_pattern` by @nychiang in #569
- Gitlab pipeline refactor by @CameronRutherford in #597
- @tepperly made their first contribution in #570
Full Changelog: https://github.com/LLNL/hiop/compare/v0.7.1...v0.7.2
This minor release fixes a couple of issues found in the build system after the major release 0.7 of HiOp.
- Fortran interface and examples
- Bug fixing for sparse device linear solvers
- Implementation of CUDA CSR matrices
- Iterative refinement within CUSOLVER linear solver class
- Improved robustness and performance of mixed dense-sparse solver for AMD/HIP
This tag provides an initial integration with Ginkgo, fixes a couple of issues, and adds options for (outer) iterative refinement.
This version/tag provides a workaround for an issue in the HIP BLAS and updates the RAJA code to better operate with the newer versions of RAJA.
The salient features of v0.6.0 are
- the release of the primal decomposition (PriDec) solver for structured two-stage problems
- improved support for (NVIDIA) GPUs for solving sparse optimization problems via NVIDIA's cuSOLVER API and newly developed condensed optimization kernels.
Other notable capabilities include
- improved accuracy in the computations of the search directions via Krylov-based iterative refinement
- design of a matrix interface for sparse matrices in compressed sparse row format and (capable) CPU reference implementation
New algorithmic features related to the NLP solver(s) and associated linear algebra KKT systems
- soft feasibility restoration
- Relaxer of equality constraints at the NLP formulation level
- Krylov interfaces and implementation for CG and BiCGStab (ready for device computations)
- prototype of the condensed linear system and initial Krylov-based iterative refinement
- update of the Magma solver class for the latest Magma API
- elastic mode
This release also includes several bug fixes.
xSDK compliance
- fixed bugs in the IPM solver: gradient scaling on CUDA, unscaled objective in the user callbacks, lambda capture fix in axpy for ROCm
- exported sparse config in cmake
- added user options for the algorithm parameters in PriDec solver
Modified the computation of the scaling factor to use the user-specified initial point
The salient features of this major release are
- update of the interface to MAGMA and capability for running mixed dense-sparse (MDS) problems solely in the device memory space
- added interface to the PARDISO linear solver
- porting of the sparse linear algebra kernels to device via RAJA performance portability layer
- various optimizations and bug fixes for the RAJA-based dense linear algebra kernels
- Primal decomposition solver HiOp-PriDec available as a release candidate