Develop stream 2024-09-12 #462

Status: Open
Wants to merge 45 commits into base: develop

Commits (45)
9075421 Fixed overflow bug for large sizes in thrust::shuffle (Beanavil, Jul 24, 2024)
e9397cc Added definitions of execution space macros (Beanavil, Jul 24, 2024)
a7a5d20 Add missing overloads for thrust::pow (Beanavil, Jul 25, 2024)
7006599 Refactors thrust::unique_by_key to use cub::DeviceSelect::UniqueByKey (Beanavil, Jul 25, 2024)
e5dbdaa Fix a typo in thrust-config.cmake (Beanavil, Jul 25, 2024)
bd43018 Check that thrust::pair is trivially copyable (Beanavil, Jul 25, 2024)
b7b785e Remove double ignore in discard_iterator.h docs (Beanavil, Jul 25, 2024)
93b72cd Replace deprecated _VSTD macro with std (Beanavil, Jul 25, 2024)
f3e2676 Update mode example to use thrust::unique_count (Beanavil, Jul 25, 2024)
44d7369 Ensure that thrust fancy iterators are trivially_copy_constructible w… (Beanavil, Jul 25, 2024)
a32a67c Use checked allocators in CUB catch2 tests (Beanavil, Jul 30, 2024)
b741017 Refactors thrust::copy_if to use cub::DeviceSelect (Beanavil, Jul 25, 2024)
158fa53 Refactor thrust::[stable_]partition[_copy] to use cub::DevicePartition (Beanavil, Jul 25, 2024)
bc6c83b Fix include of <thrust/random.h> with NVC++ (Beanavil, Jul 25, 2024)
489c073 Cleanup diagnostic handling (Beanavil, Jul 25, 2024)
9f5a3ba Rework config.h (Beanavil, Jul 29, 2024)
1020a11 Bump version to 2.4.0 (Beanavil, Jul 25, 2024)
917c255 Fix issues with ambiguous calls to addressof in thrust::optional (Beanavil, Jul 25, 2024)
5af1ef7 Try harder to unwrap nested thrust::tuple_of_iterator_references, CUD… (Beanavil, Jul 29, 2024)
bd5228c Added missing element from thrust's tuple implementation (Beanavil, Jul 25, 2024)
099a901 Ensure that we can run reduce_by_key with const inputs (Beanavil, Jul 25, 2024)
9508470 Leave definitions of __host__ and __device__ (Beanavil, Jul 30, 2024)
6791366 Patched up CI because of CCCL2.4.0 tests' build failure (Beanavil, Jul 30, 2024)
9fe0b04 Updated tests and examples for __host__ __device__ use (Beanavil, Jul 31, 2024)
15a07b0 Updated CHANGELOG (Beanavil, Jul 31, 2024)
158a1e1 Added operator to transform_reduce benchmark (NB4444, Aug 1, 2024)
d0bf50f Added mem allocator in benchmarks (NB4444, Aug 1, 2024)
aa64ae7 Changes for review (NB4444, Aug 8, 2024)
0673125 ci: set up sccache (Snektron, Jul 26, 2024)
75c44cf Added helper functions for choosing between different custom reporter (NB4444, Aug 8, 2024)
a36adac Added json and csv custom reporter for benchmarks (NB4444, Aug 13, 2024)
e00ad3a Changes for review (NB4444, Aug 15, 2024)
1cc4c8b Added hipstdpar tests (Beanavil, Aug 8, 2024)
568f6f9 Relocated our ParallelSTL additions (Beanavil, Aug 8, 2024)
f584551 Fixed several naming issues (Beanavil, Aug 8, 2024)
387dbcf Added missing unimplemented algorithms (Beanavil, Aug 9, 2024)
8aff938 Split hipstdpar_lib.hpp (Beanavil, Aug 9, 2024)
e2d548f Added relevant information to README and CHANGELOG regarding HIPSTDPAR (Beanavil, Aug 23, 2024)
28da1b1 Clarified upstream LLVM offload support (Beanavil, Sep 13, 2024)
ed35b28 Emit error when HIPSTDPAR macros are not defined (Beanavil, Sep 20, 2024)
5023555 Move forwarding calls to rocPRIM to thrust's stubs (Beanavil, Sep 20, 2024)
e340947 Fix path to hipstdpar impl headers (Beanavil, Nov 4, 2024)
1f060d8 Prevent building hipstdpar tests when no compatible libstdc++ is present (Beanavil, Nov 4, 2024)
5fdf870 Disable TBB tests build (Beanavil, Nov 6, 2024)
5be02c2 Merge branch 'develop' into develop-upstream-12-9-2024 (NB4444, Nov 14, 2024)
114 changes: 79 additions & 35 deletions .clang-format
@@ -1,11 +1,11 @@
# Style file for MLSE Libraries based on the modified rocBLAS style

# Common settings
BasedOnStyle: WebKit
TabWidth: 4
IndentWidth: 4
BasedOnStyle: LLVM
TabWidth: 2
IndentWidth: 2
UseTab: Never
ColumnLimit: 100
ColumnLimit: 120

# Other languages JavaScript, Proto

@@ -20,14 +20,14 @@ Language: Cpp
# void formatted_code_again;

DisableFormat: false
Standard: Cpp11

AccessModifierOffset: -4
Standard: c++14
AccessModifierOffset: -2
AlignAfterOpenBracket: true
AlignConsecutiveAssignments: true
AlignConsecutiveDeclarations: true
AlignEscapedNewlinesLeft: true
AlignOperands: true
AllowAllArgumentsOnNextLine: true
AlignTrailingComments: false
AllowAllParametersOfDeclarationOnNextLine: true
AllowShortBlocksOnASingleLine: false
@@ -39,13 +39,26 @@ AlwaysBreakAfterDefinitionReturnType: false
AlwaysBreakAfterReturnType: None
AlwaysBreakBeforeMultilineStrings: false
AlwaysBreakTemplateDeclarations: true
AttributeMacros: [
'THRUST_DEVICE',
'THRUST_FORCEINLINE',
'THRUST_HOST_DEVICE',
'THRUST_HOST',
'_CCCL_DEVICE',
'_CCCL_FORCEINLINE',
'_CCCL_HOST_DEVICE',
'_CCCL_HOST',
'THRUST_RUNTIME_FUNCTION',
'THRUST_DETAIL_KERNEL_ATTRIBUTES',
]
BinPackArguments: false
BinPackParameters: false

# Configure each individual brace in BraceWrapping
BreakBeforeBraces: Custom
# Control of individual brace wrapping cases
BraceWrapping: {
AfterCaseLabel: 'false'
AfterClass: 'true'
AfterControlStatement: 'true'
AfterEnum : 'true'
@@ -56,52 +69,69 @@ BraceWrapping: {
BeforeCatch : 'true'
BeforeElse : 'true'
IndentBraces : 'false'
# AfterExternBlock : 'true'
SplitEmptyFunction: 'false'
SplitEmptyRecord: 'false'
}

#BreakAfterJavaFieldAnnotations: true
#BreakBeforeInheritanceComma: false
#BreakBeforeBinaryOperators: None
#BreakBeforeTernaryOperators: true
#BreakConstructorInitializersBeforeComma: true
#BreakStringLiterals: true
BreakBeforeConceptDeclarations: true
BreakBeforeBinaryOperators: NonAssignment
BreakBeforeTernaryOperators: true
BreakConstructorInitializers: BeforeComma
BreakInheritanceList: BeforeComma
EmptyLineAfterAccessModifier: Never
EmptyLineBeforeAccessModifier: Always

InsertBraces: true
InsertNewlineAtEOF: true
InsertTrailingCommas: Wrapped
IndentRequires: true
IndentPPDirectives: AfterHash
PackConstructorInitializers: Never
PenaltyBreakAssignment: 30
PenaltyBreakTemplateDeclaration: 0
PenaltyIndentedWhitespace: 2
RemoveSemicolon: false
SpaceAfterLogicalNot: false
SpaceAfterTemplateKeyword: true
SpaceBeforeCtorInitializerColon: true
SpaceBeforeInheritanceColon: true
SpaceBeforeRangeBasedForLoopColon: true


CommentPragmas: '^ IWYU pragma:'
#CompactNamespaces: false
CompactNamespaces: false
ConstructorInitializerAllOnOneLineOrOnePerLine: false
ConstructorInitializerIndentWidth: 4
ContinuationIndentWidth: 4
ContinuationIndentWidth: 2
Cpp11BracedListStyle: true
#SpaceBeforeCpp11BracedList: false
DerivePointerAlignment: false
SpaceBeforeCpp11BracedList: false
ExperimentalAutoDetectBinPacking: false
ForEachMacros: [ foreach, Q_FOREACH, BOOST_FOREACH ]
IndentCaseLabels: false
#FixNamespaceComments: true
IndentCaseLabels: true
FixNamespaceComments: true
IndentWrappedFunctionNames: false
KeepEmptyLinesAtTheStartOfBlocks: true
KeepEmptyLinesAtTheStartOfBlocks: false
MacroBlockBegin: ''
MacroBlockEnd: ''
#JavaScriptQuotes: Double
MaxEmptyLinesToKeep: 1
NamespaceIndentation: Inner
NamespaceIndentation: None
ObjCBlockIndentWidth: 4
#ObjCSpaceAfterProperty: true
#ObjCSpaceBeforeProtocolList: true
PenaltyBreakBeforeFirstCallParameter: 19
PenaltyBreakComment: 300
PenaltyBreakFirstLessLess: 120
PenaltyBreakString: 1000

PenaltyExcessCharacter: 1000000
PenaltyReturnTypeOnItsOwnLine: 60
PenaltyBreakBeforeFirstCallParameter: 50
PenaltyBreakComment: 0
PenaltyBreakFirstLessLess: 0
PenaltyBreakString: 70
PenaltyExcessCharacter: 100
PenaltyReturnTypeOnItsOwnLine: 90
PointerAlignment: Left
SpaceAfterCStyleCast: false
SpaceAfterCStyleCast: true
SpaceBeforeAssignmentOperators: true
SpaceBeforeParens: Never
SpaceBeforeParens: ControlStatements
SpaceInEmptyParentheses: false
SpacesBeforeTrailingComments: 1
SpacesInAngles: false
SpacesInAngles: Never
SpacesInContainerLiterals: true
SpacesInCStyleCastParentheses: false
SpacesInParentheses: false
@@ -110,11 +140,25 @@ SpacesInSquareBrackets: false
#SpaceBeforeInheritanceColon: true

#SortUsingDeclarations: true
SortIncludes: true
SortIncludes: CaseInsensitive

# Comments are for developers, they should arrange them
ReflowComments: false
ReflowComments: true

#IncludeBlocks: Preserve
#IndentPPDirectives: AfterHash

StatementMacros: [
'THRUST_EXEC_CHECK_DISABLE',
'THRUST_NAMESPACE_BEGIN',
'THRUST_NAMESPACE_END',
'THRUST_EXEC_CHECK_DISABLE',
'CUB_NAMESPACE_BEGIN',
'CUB_NAMESPACE_END',
'THRUST_NAMESPACE_BEGIN',
'THRUST_NAMESPACE_END',
'_LIBCUDACXX_BEGIN_NAMESPACE_STD',
'_LIBCUDACXX_END_NAMESPACE_STD',
]
TabWidth: 2
UseTab: Never
---
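Reviewer note: several keys introduced in this file are fairly recent clang-format additions. To the best of my knowledge `InsertBraces` arrived in clang-format 15, and `InsertNewlineAtEOF` and `RemoveSemicolon` in clang-format 16; older binaries stop with an error on keys they do not recognise. Below is a minimal sketch of a version guard a contributor could run before formatting. The helper names `clang_format_major` and `supports_new_style` are hypothetical, and the threshold of 16 is an assumption that should be checked against the clang-format release actually used by the project.

```shell
#!/bin/sh
# Hypothetical helpers, not part of this PR.

# Print the major version found in `clang-format --version` style output,
# e.g. "clang-format version 16.0.6" -> 16. Prints nothing if none is found.
clang_format_major() {
  printf '%s\n' "$1" | sed -n 's/.*version \([0-9][0-9]*\)\..*/\1/p'
}

# Succeed only if the reported major version is at least 16 (assumed threshold).
supports_new_style() {
  major=$(clang_format_major "$1")
  [ -n "$major" ] && [ "$major" -ge 16 ]
}

# Usage: warn when the local binary looks too old for this .clang-format.
if command -v clang-format >/dev/null 2>&1; then
  if supports_new_style "$(clang-format --version)"; then
    echo "clang-format looks new enough for this .clang-format"
  else
    echo "clang-format may reject some keys in this .clang-format" >&2
  fi
fi
```

Parsing the version string keeps the check independent of how the binary was packaged, since distributions prepend their own vendor prefix to the `--version` output.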
32 changes: 25 additions & 7 deletions .gitlab-ci.yml
@@ -12,6 +12,7 @@ include:
- /deps-rocm.yaml
- /deps-windows.yaml
- /deps-nvcc.yaml
- /deps-compiler-acceleration.yaml
- /gpus-rocm.yaml
- /gpus-nvcc.yaml
- /rules.yaml
@@ -46,17 +47,21 @@ copyright-date:
extends:
- .deps:rocm
- .deps:cmake-latest
- .deps:compiler-acceleration
before_script:
- !reference [".deps:rocm", before_script]
- !reference [".deps:cmake-latest", before_script]
- !reference [".deps:compiler-acceleration", before_script]

.cmake-minimum:
extends:
- .deps:rocm
- .deps:cmake-minimum
- .deps:compiler-acceleration
before_script:
- !reference [".deps:rocm", before_script]
- !reference [".deps:cmake-minimum", before_script]
- !reference [".deps:compiler-acceleration", before_script]

.install-rocprim:
script:
@@ -69,8 +74,11 @@ copyright-date:
-D CMAKE_CXX_COMPILER=hipcc
-D CMAKE_BUILD_TYPE=Release
-D BUILD_TEST=OFF
-D BUILD_HIPSTDPAR_TEST=OFF
-D BUILD_EXAMPLE=OFF
-D ROCM_DEP_ROCMCORE=OFF
-D CMAKE_C_COMPILER_LAUNCHER=phc_sccache_c
-D CMAKE_CXX_COMPILER_LAUNCHER=phc_sccache_cxx
-S $ROCPRIM_DIR
-B $ROCPRIM_DIR/build
- cd $ROCPRIM_DIR/build
@@ -91,7 +99,7 @@ copyright-date:
- !reference [.install-rocprim, script]
- | # Setup env vars for testing
rng_seed_count=0; prng_seeds="0";
if [[ $CI_COMMIT_BRANCH == "develop_stream" ]]; then
if [[ $CI_COMMIT_BRANCH == "develop_stream" ]]; then
rng_seed_count=3
prng_seeds="0, 1000"
fi
@@ -111,6 +119,9 @@ copyright-date:
-D AMDGPU_TEST_TARGETS=$GPU_TARGETS
-D RNG_SEED_COUNT=$rng_seed_count
-D PRNG_SEEDS=$prng_seeds
-D CMAKE_C_COMPILER_LAUNCHER=phc_sccache_c
-D CMAKE_CXX_COMPILER_LAUNCHER=phc_sccache_cxx
-D CMAKE_CUDA_COMPILER_LAUNCHER=phc_sccache_cuda
-S $CI_PROJECT_DIR
-B $CI_PROJECT_DIR/build
- cmake --build $CI_PROJECT_DIR/build
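Reviewer note: the seed setup in the job above picks a larger seed set only on the `develop_stream` branch. Here is a minimal standalone sketch of that selection, written for POSIX `sh` rather than the bash `[[ ]]` form used in the job. The function name `select_seeds` and the `count;seeds` output format are hypothetical and only serve the illustration.

```shell
#!/bin/sh
# Hypothetical standalone version of the seed selection in the CI job.
# Prints "<rng_seed_count>;<prng_seeds>" for the given branch name.
select_seeds() {
  branch=$1
  rng_seed_count=0
  prng_seeds="0"
  if [ "$branch" = "develop_stream" ]; then
    rng_seed_count=3
    prng_seeds="0, 1000"
  fi
  printf '%s;%s\n' "$rng_seed_count" "$prng_seeds"
}

select_seeds "develop_stream"   # prints: 3;0, 1000
select_seeds "feature/foo"      # prints: 0;0
```

Because the `develop_stream` value of `prng_seeds` contains a space, it needs to be quoted (`"$prng_seeds"`) wherever it must reach a command as a single argument.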
@@ -198,10 +209,10 @@ build:windows:
-D CMAKE_INSTALL_PREFIX:PATH="$ROCPRIM_DIR/build/install" *>&1
- \& cmake --build "$ROCPRIM_DIR/build" --target install *>&1
# Configure and build rocThrust
- \& cmake
-S "$CI_PROJECT_DIR"
-B "$CI_PROJECT_DIR/build"
-G Ninja
- \& cmake
-S "$CI_PROJECT_DIR"
-B "$CI_PROJECT_DIR/build"
-G Ninja
-D CMAKE_BUILD_TYPE=Release
-D GPU_TARGETS=$GPU_TARGET
-D BUILD_TEST=ON
@@ -327,10 +338,12 @@ test:rocm-windows-install:
- .deps:nvcc
- .gpus:nvcc-gpus
- .deps:cmake-latest
- .deps:compiler-acceleration
- .rules:manual
before_script:
- !reference [".deps:nvcc", before_script]
- !reference [".deps:cmake-latest", before_script]
- !reference [".deps:compiler-acceleration", before_script]

build:cuda-and-omp:
stage: build
@@ -340,7 +353,7 @@ build:cuda-and-omp:
tags:
- build
variables:
CCCL_GIT_BRANCH: v2.3.2
CCCL_GIT_BRANCH: v2.4.0
CCCL_DIR: ${CI_PROJECT_DIR}/cccl
needs: []
script:
@@ -349,16 +362,21 @@ build:cuda-and-omp:
- rm -R $CCCL_DIR/thrust/thrust
- cp -r $CI_PROJECT_DIR/thrust $CCCL_DIR/thrust
# Build tests and examples from CCCL Thrust
# CCCL 2.4.0 breaks compilation of tests. Compile examples only until we
# match v2.5.0.
- cmake
-G Ninja
-D CMAKE_BUILD_TYPE=Release
-D CMAKE_CUDA_ARCHITECTURES="$GPU_TARGETS"
-D THRUST_ENABLE_TESTING=ON
-D THRUST_ENABLE_TESTING=OFF
-D THRUST_ENABLE_EXAMPLES=ON
-D THRUST_ENABLE_BENCHMARKS=OFF
-D THRUST_ENABLE_MULTICONFIG=ON
-D THRUST_MULTICONFIG_ENABLE_SYSTEM_OMP=ON
-D THRUST_MULTICONFIG_ENABLE_SYSTEM_CUDA=ON
-D CMAKE_C_COMPILER_LAUNCHER=phc_sccache_c
-D CMAKE_CXX_COMPILER_LAUNCHER=phc_sccache_cxx
-D CMAKE_CUDA_COMPILER_LAUNCHER=phc_sccache_cuda
-B $CI_PROJECT_DIR/build
-S $CCCL_DIR/thrust
- cmake --build $CI_PROJECT_DIR/build
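Reviewer note: the `phc_sccache_c`, `phc_sccache_cxx` and `phc_sccache_cuda` launchers used in these jobs presumably come from the `.deps:compiler-acceleration` include, which is not shown in this diff. For a local build, a rough equivalent is to set CMake's `CMAKE_<LANG>_COMPILER_LAUNCHER` variables to a caching tool such as `sccache`, but only when that tool is installed. The helper name `launcher_flags` below is hypothetical, and this is a sketch under those assumptions rather than the project's own setup.

```shell
#!/bin/sh
# Hypothetical helper: emit CMake launcher flags, one per line, only if the
# given launcher executable (for example sccache) is found on PATH.
# Prints nothing when the tool is missing, so the build falls back to
# invoking the compilers directly.
launcher_flags() {
  tool=$1
  if command -v "$tool" >/dev/null 2>&1; then
    printf '%s\n' \
      "-DCMAKE_C_COMPILER_LAUNCHER=$tool" \
      "-DCMAKE_CXX_COMPILER_LAUNCHER=$tool"
  fi
}

# Usage sketch (source and build directories are placeholders):
#   cmake -S . -B build $(launcher_flags sccache)
```

Leaving the command substitution unquoted in the usage line is deliberate: each printed flag has to arrive at `cmake` as its own argument.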
5 changes: 4 additions & 1 deletion CHANGELOG.md
@@ -3,17 +3,19 @@
Documentation for rocThrust available at
[https://rocm.docs.amd.com/projects/rocThrust/en/latest/](https://rocm.docs.amd.com/projects/rocThrust/en/latest/).

## (Unreleased) rocThrust 3.2.0 for ROCm 6.4
## (Unreleased) rocThrust 3.3.0 for ROCm 6.4

### Additions
* Added regression tests to `rtest.py`. Regression tests are a subset of tests that caused hardware problems for past emulation environments.
* Can be run with `python rtest.py [--emulation|-e|--test|-t]=regression`
* Added smoke test options, which runs a subset of the unit tests and ensures that less than 2gb of VRAM will be used
* Smoke tests can be run using `[--emulation|-e|--test|-t]=smoke`
* Added `--emulation` option for `rtest.py`
* Merged changes from upstream CCCL/thrust 2.4.0

### Changes
* `--test|-t` is no longer a required flag for `rtest.py`. Instead, the user can use either `--emulation|-e` or `--test|-t`, but not both.
* Split the contents of HIPSTDPAR's forwarding header into several implementation headers.

## (Unreleased) rocThrust 3.2.0 for ROCm 6.3

@@ -39,6 +41,7 @@ Documentation for rocThrust available at

* Merged changes from upstream CCCL/thrust 2.2.0
* Updated the contents of `system/hip` and `test` with the upstream changes to `system/cuda` and `testing`
* Added HIPSTDPAR library as part of rocThrust.

### Changes

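Reviewer note: the CHANGELOG entry above says `rtest.py` accepts either `--emulation|-e` or `--test|-t`, but not both. One way to make that hard to get wrong in scripts is to build the command line from a single selector argument. The wrapper name `rtest_command` is hypothetical; the flag spellings and the `smoke` and `regression` suite names are taken from the CHANGELOG text, and the sketch only prints the command instead of running it.

```shell
#!/bin/sh
# Hypothetical wrapper: build an rtest.py command line from exactly one
# selector ("emulation" or "test") and a suite name, so the two flags can
# never be passed together.
rtest_command() {
  selector=$1
  suite=$2
  case $selector in
    emulation|test) ;;
    *) echo "selector must be 'emulation' or 'test'" >&2; return 1 ;;
  esac
  printf 'python rtest.py --%s=%s\n' "$selector" "$suite"
}

rtest_command emulation smoke     # prints: python rtest.py --emulation=smoke
rtest_command test regression     # prints: python rtest.py --test=regression
```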
5 changes: 3 additions & 2 deletions CMakeLists.txt
@@ -80,6 +80,7 @@ endif()
# Disable -Werror
option(DISABLE_WERROR "Disable building with Werror" ON)
option(BUILD_TEST "Build tests" OFF)
option(BUILD_HIPSTDPAR_TEST "Build hipstdpar tests" OFF)
option(BUILD_EXAMPLES "Build examples" OFF)
option(BUILD_BENCHMARKS "Build benchmarks" OFF)
option(DOWNLOAD_ROCPRIM "Download rocPRIM and do not search for rocPRIM package" OFF)
@@ -143,14 +144,14 @@ if(BUILD_TEST OR BUILD_BENCHMARKS)
endif()

# Tests
if(BUILD_TEST)
if(BUILD_TEST OR BUILD_HIPSTDPAR_TEST)
rocm_package_setup_client_component(tests)
if (ENABLE_UPSTREAM_TESTS)
enable_testing()
endif()
# We still want the testing to be compiled to catch some errors
#TODO: Get testing folder working with HIP on Windows
if (NOT WIN32)
if (NOT WIN32 AND BUILD_TEST)
add_subdirectory(testing)
endif()
enable_testing()
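Reviewer note: after this change the tests client component is set up when either `BUILD_TEST` or `BUILD_HIPSTDPAR_TEST` is on, while the `testing` subdirectory is still added only for non-Windows builds with `BUILD_TEST` on. The small model below restates those two conditions as a truth-table helper. The function name `test_plan` and its output strings are hypothetical, and it covers only the lines visible in this hunk, since the rest of the block is collapsed.

```shell
#!/bin/sh
# Hypothetical model of the conditions in the hunk above. Arguments are
# ON/OFF values for BUILD_TEST, BUILD_HIPSTDPAR_TEST and WIN32, in that order.
test_plan() {
  build_test=$1
  build_hipstdpar_test=$2
  win32=$3
  # Outer condition: if(BUILD_TEST OR BUILD_HIPSTDPAR_TEST)
  if [ "$build_test" = ON ] || [ "$build_hipstdpar_test" = ON ]; then
    echo "tests component set up"
    # Inner condition: if (NOT WIN32 AND BUILD_TEST)
    if [ "$win32" != ON ] && [ "$build_test" = ON ]; then
      echo "testing subdirectory added"
    fi
  fi
}

test_plan OFF ON OFF   # hipstdpar tests only: component set up, no testing subdirectory
test_plan ON OFF OFF   # regular tests on Linux: both lines are printed
```

A configure line exercising the new option alone would look like `cmake -D BUILD_HIPSTDPAR_TEST=ON -S . -B build`, with the source and build paths as placeholders.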