diff --git a/README.md b/README.md
index 87ba43d..24bc12e 100644
--- a/README.md
+++ b/README.md
@@ -5,49 +5,42 @@
 Thread pool-based implementation of [parallel standard library algorithms](https://en.cppreference.com/w/cpp/algorithm).

-Those algorithms are great, but compiler support is inconsistent.
-PoolSTL is a *supplement* to fill in the support gaps so we can use parallel algorithms now.
-
-It is not meant as a full implementation, only the basics are expected to be covered.
-
-Use this if:
-* you only need the basics
-* you must support a [compiler that lacks native support](https://en.cppreference.com/w/cpp/compiler_support/17) (see the "Parallel algorithms and execution policies" row)
-* you cannot link against TBB for whatever reason
-* the [Parallel STL](https://www.intel.com/content/www/us/en/developer/articles/guide/get-started-with-parallel-stl.html) is too heavy
+Those algorithms are great, but compiler support varies.
+PoolSTL is a *supplement* to fill in the support gaps, so we can use parallel algorithms now.
+It is not meant as a full implementation; only the basics are expected to be covered. Use this if:
+* you only need the basics.
+* you must support a [compiler lacking native support](https://en.cppreference.com/w/cpp/compiler_support/17) (see "Parallel algorithms and execution policies").
+* you cannot link against TBB for whatever reason.
+* the [Parallel STL](https://www.intel.com/content/www/us/en/developer/articles/guide/get-started-with-parallel-stl.html) is too heavy.

 Supports C++11 and higher, C++17 preferred. Tested in CI on GCC 7+, Clang/LLVM 5+, Apple Clang, MSVC.

 ## Implemented algorithms
-Algorithms are added on an as-needed basis. If you need one that is not present feel free to open an issue or submit a PR.
+Algorithms are added on an as-needed basis. If you need one that is missing, [open an issue](https://github.com/alugowski/poolSTL/issues) or contribute a PR.
 ### ``
-* [std::all_of](https://en.cppreference.com/w/cpp/algorithm/all_of), [std::any_of](https://en.cppreference.com/w/cpp/algorithm/any_of), [std::none_of](https://en.cppreference.com/w/cpp/algorithm/none_of)
-* [std::copy](https://en.cppreference.com/w/cpp/algorithm/copy)
-* [std::fill](https://en.cppreference.com/w/cpp/algorithm/fill)
-* [std::fill_n](https://en.cppreference.com/w/cpp/algorithm/fill_n)
-* [std::find](https://en.cppreference.com/w/cpp/algorithm/find)
-* [std::find_if](https://en.cppreference.com/w/cpp/algorithm/find_if)
-* [std::find_if_not](https://en.cppreference.com/w/cpp/algorithm/find_if_not)
-* [std::for_each](https://en.cppreference.com/w/cpp/algorithm/for_each)
-* [std::for_each_n](https://en.cppreference.com/w/cpp/algorithm/for_each_n)
-* [std::transform](https://en.cppreference.com/w/cpp/algorithm/transform)
+* [all_of](https://en.cppreference.com/w/cpp/algorithm/all_of), [any_of](https://en.cppreference.com/w/cpp/algorithm/any_of), [none_of](https://en.cppreference.com/w/cpp/algorithm/none_of)
+* [copy](https://en.cppreference.com/w/cpp/algorithm/copy)
+* [fill](https://en.cppreference.com/w/cpp/algorithm/fill), [fill_n](https://en.cppreference.com/w/cpp/algorithm/fill_n)
+* [find](https://en.cppreference.com/w/cpp/algorithm/find), [find_if](https://en.cppreference.com/w/cpp/algorithm/find_if), [find_if_not](https://en.cppreference.com/w/cpp/algorithm/find_if_not)
+* [for_each](https://en.cppreference.com/w/cpp/algorithm/for_each), [for_each_n](https://en.cppreference.com/w/cpp/algorithm/for_each_n)
+* [transform](https://en.cppreference.com/w/cpp/algorithm/transform)

 ### ``
-* [std::reduce](https://en.cppreference.com/w/cpp/algorithm/reduce)
-* [std::transform_reduce](https://en.cppreference.com/w/cpp/algorithm/transform_reduce) (C++17 only)
+* [reduce](https://en.cppreference.com/w/cpp/algorithm/reduce)
+* [transform_reduce](https://en.cppreference.com/w/cpp/algorithm/transform_reduce) (C++17 only)
+
+All in `std::` namespace.
-Note: All iterators must be random access.
+**Note:** All iterators must be random access.

 ## Usage

-PoolSTL defines `poolstl::par` and `poolstl::par_pool` execution policies. Pass either one of these as the first argument
+PoolSTL provides `poolstl::par` and `poolstl::par_pool` execution policies. Pass either one of these as the first argument
 to one of the supported algorithms and your code will be parallel.

-In other words, use `poolstl::par` as you would use `std::execution::par`.
-
-Complete example:
+In other words, use `poolstl::par` as you would use [`std::execution::par`](https://en.cppreference.com/w/cpp/algorithm/execution_policy_tag). Complete example:

 ```c++
 #include
 #include
@@ -64,21 +57,22 @@ int main() {

 ### Pool control

-Use `poolstl::par_pool` with your own [thread pool](https://github.com/alugowski/task-thread-pool) to have full control over thread count, thread startup/shutdown, etc.:
+The thread pool used by `poolstl::par` is managed internally by poolSTL. It is started on first use.
+
+For full control over thread count, startup/shutdown, etc., use your own [thread pool](https://github.com/alugowski/task-thread-pool)
+with `poolstl::par_pool`:

 ```c++
 task_thread_pool::task_thread_pool pool;
-std::for_each(poolstl::par_pool(pool), v.cbegin(), v.cbegin(), [](auto) {});
+std::reduce(poolstl::par_pool(pool), v.cbegin(), v.cend());
 ```

-The pool used by `poolstl::par` is managed internally by poolSTL. It is started on first use.
-
 ## Installation

 ### Single File

-You may download a single-file amalgamated `poolstl.hpp` from the [latest release](https://github.com/alugowski/poolSTL/releases) and simply copy into your project.
+Copy the single-file amalgamated `poolstl.hpp` from the [latest release](https://github.com/alugowski/poolSTL/releases) into your project.
 ### CMake
@@ -100,12 +94,46 @@ Alternatively copy or checkout the repo into your project and:
 add_subdirectory(poolSTL)
 ```

+# Benchmark
+
+See [benchmark/](benchmark) to compare poolSTL against the standard sequential implementation, and (if available) the
+native `std::execution::par` implementation.
+
+Results on an M1 Pro (6 power, 2 efficiency cores), with GCC 13:
+```
+-------------------------------------------------------------------------------------------------------
+Benchmark                                                         Time          CPU   Iterations
+-------------------------------------------------------------------------------------------------------
+all_of()/real_time                                             19.8 ms      19.8 ms           35
+all_of(poolstl::par)/real_time                                 3.87 ms     0.113 ms          175
+all_of(std::execution::par)/real_time                          3.84 ms      3.27 ms          198
+find_if()/needle_percentile:5/real_time                        1.01 ms      1.00 ms          708
+find_if()/needle_percentile:50/real_time                       9.91 ms      9.90 ms           71
+find_if()/needle_percentile:100/real_time                      19.8 ms      19.7 ms           35
+find_if(poolstl::par)/needle_percentile:5/real_time           0.391 ms     0.045 ms         1787
+find_if(poolstl::par)/needle_percentile:50/real_time           1.83 ms     0.081 ms          353
+find_if(poolstl::par)/needle_percentile:100/real_time          3.58 ms     0.085 ms          197
+find_if(std::execution::par)/needle_percentile:5/real_time    0.234 ms     0.227 ms         3051
+find_if(std::execution::par)/needle_percentile:50/real_time    1.87 ms      1.79 ms          377
+find_if(std::execution::par)/needle_percentile:100/real_time   3.91 ms      3.51 ms          177
+for_each()/real_time                                           94.8 ms      94.8 ms            7
+for_each(poolstl::par)/real_time                               20.2 ms     0.041 ms           37
+for_each(std::execution::par)/real_time                        17.1 ms      14.2 ms           45
+transform()/real_time                                          95.8 ms      95.8 ms            7
+transform(poolstl::par)/real_time                              20.8 ms     0.041 ms           38
+transform(std::execution::par)/real_time                       16.8 ms      14.3 ms           41
+reduce()/real_time                                             15.1 ms      15.1 ms           46
+reduce(poolstl::par)/real_time                                 4.21 ms     0.046 ms          165
+reduce(std::execution::par)/real_time                          3.55 ms      3.09 ms          199
+```
+
 # poolSTL as `std::execution::par` substitute

 **USE AT YOUR OWN RISK!**

-Two-line fix for missing compiler support. A no-op on compilers with support.
+Two-line hack for missing compiler support. A no-op on compilers with support.

-If `POOLSTL_STD_SUPPLEMENT` is defined and native support is not found then poolSTL will alias its `poolstl::par` as `std::execution::par`:
+If `POOLSTL_STD_SUPPLEMENT` is defined then poolSTL will check for native compiler support.
+If none is found, poolSTL will alias its `poolstl::par` as `std::execution::par`:
 ```c++
 #define POOLSTL_STD_SUPPLEMENT
diff --git a/include/poolstl/algorithm b/include/poolstl/algorithm
index b8777c0..9973eb8 100644
--- a/include/poolstl/algorithm
+++ b/include/poolstl/algorithm
@@ -81,7 +81,7 @@ namespace std {
                     extremum.compare_exchange_weak(old, k);
                 }
             }
-        }, 10); // use small tasks so later ones may exit early if item is already found
+        }, 8); // use small tasks so later ones may exit early if item is already found
         poolstl::internal::get_futures(futures);
         return extremum == n ? last : first + extremum;
     }
diff --git a/include/poolstl/poolstl.hpp b/include/poolstl/poolstl.hpp
index b861ef3..1aaf836 100644
--- a/include/poolstl/poolstl.hpp
+++ b/include/poolstl/poolstl.hpp
@@ -17,15 +17,14 @@
 #include "numeric"

 /*
- * Optionally alias poolstl::par as std::execution::par to enable poolSTL to fill in for missing compiler support.
+ * Optionally alias `poolstl::par` as `std::execution::par` to enable poolSTL to fill in for missing compiler support.
  *
  * USE AT YOUR OWN RISK!
  *
- * To do this define POOLSTL_STD_SUPPLEMENT before including poolstl.hpp.
+ * To use this, define POOLSTL_STD_SUPPLEMENT=1 before including poolstl.hpp.
  *
- * This aliasing will not happen if native support exists. If this autodetection fails for you:
- * - define POOLSTL_ALLOW_SUPPLEMENT=0 to disable
- * - define POOLSTL_FORCE_SUPPLEMENT to force enable (use with great care!)
+ * This aliasing will not happen if native support exists. If this autodetection fails for you,
+ * define POOLSTL_ALLOW_SUPPLEMENT=0 to disable this feature.
  */
 #ifndef POOLSTL_ALLOW_SUPPLEMENT
 #define POOLSTL_ALLOW_SUPPLEMENT 1
@@ -39,7 +38,7 @@
 #endif
 #endif

-#if !defined(__cpp_lib_parallel_algorithm) || defined(POOLSTL_FORCE_SUPPLEMENT)
+#if !defined(__cpp_lib_parallel_algorithm)
 namespace std {
     namespace execution {
         using ::poolstl::execution::parallel_policy;