Add collector accumulator (#390)
Co-authored-by: Hans Dembinski <[email protected]>
wiso and HDembinski authored Apr 25, 2024
1 parent cd3e111 commit 90867e2
Showing 46 changed files with 578 additions and 86 deletions.
2 changes: 1 addition & 1 deletion doc/changelog.qbk
@@ -20,7 +20,7 @@
* Added new `accumulators::fraction` to compute fractions, their variance, and confidence intervals
* Added interval computers for fractions: `utility::clopper_pearson`, `utility::wilson_interval`, `utility::jeffreys_interval`, `utility::wald_interval` which can compute intervals with arbitrary confidence level
* Added `utility::confidence_level` and `utility::deviation` types to pass confidence levels as probabilities or in multiples of standard deviation for all interval computers, respectively
* Fixed internal `sub_array` and `span` in C++20
* Fixed internal `static_vector` and `span` in C++20

[heading Boost 1.80]

1 change: 1 addition & 0 deletions doc/concepts/Accumulator.qbk
@@ -92,5 +92,6 @@ An [*Accumulator] is a functor which consumes the argument to update some intern
* [classref boost::histogram::accumulators::weighted_sum]
* [classref boost::histogram::accumulators::mean]
* [classref boost::histogram::accumulators::weighted_mean]
* [classref boost::histogram::accumulators::collector]

[endsect]
1 change: 1 addition & 0 deletions doc/guide.qbk
@@ -449,6 +449,7 @@ The library provides several accumulators:
* [classref boost::histogram::accumulators::weighted_mean weighted_mean] accepts a sample and a weight. It computes the weighted mean of the samples. [funcref boost::histogram::make_weighted_profile make_weighted_profile] uses this accumulator.
* [classref boost::histogram::accumulators::fraction fraction] accepts a boolean sample that represents success or failure of a binomial trial. It computes the fraction of successes. One can access the number of successes and failures, the fraction, the estimated variance of the fraction, and a confidence interval. The standard confidence interval is the Wilson score interval, but more interval computers are implemented in
`boost/histogram/utility`. Beware: one cannot pass `std::vector<bool>` to [classref boost::histogram::histogram histogram::fill], because it is not a contiguous sequence of boolean values, but any other container of booleans works and any sequence of values convertible to bool.
* [classref boost::histogram::accumulators::collector collector] consists of a collection of containers, one per bin. It accepts samples and sorts the sample value into the corresponding container. The memory consumption of this accumulator is unbounded, since it stores each input value. It is useful to compute custom estimators, in particular, those which require access to the full sample, like a kernel density estimate, or which do not have online update algorithms (for example, the median).

Users can easily write their own accumulators and plug them into the histogram, if they adhere to the [link histogram.concepts.Accumulator [*Accumulator] concept]. All accumulators from [@boost:/libs/accumulators/index.html Boost.Accumulators] that accept a single argument and no weights work out of the box. Other accumulators from Boost.Accumulators can be made to work by using them inside a wrapper class that implements the concept.
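The idea behind the collector bullet above, one container per bin plus an estimator that needs the full sample, can be sketched without Boost. Everything below is an illustrative assumption rather than the Boost.Histogram API: `collecting_histogram`, its `fill` signature, and the `median` helper are hypothetical names, and the binning is a plain regular axis.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a 1D histogram with regular bins whose cells
// store the raw samples. This is NOT the Boost.Histogram API.
struct collecting_histogram {
  double lo, hi;
  std::vector<std::vector<double>> bins;

  collecting_histogram(std::size_t n, double lo_, double hi_)
      : lo(lo_), hi(hi_), bins(n) {}

  // Store sample s in the bin that contains coordinate x.
  // Coordinates outside [lo, hi) are ignored in this sketch.
  void fill(double x, double s) {
    if (x < lo || x >= hi) return;
    auto i = static_cast<std::size_t>((x - lo) / (hi - lo) * bins.size());
    i = std::min(i, bins.size() - 1);  // guard against floating-point rounding
    bins[i].push_back(s);
  }
};

// The median needs the full sample, so it has no online update rule.
// Takes a copy because std::nth_element reorders its input.
// Precondition: v is not empty.
inline double median(std::vector<double> v) {
  const auto mid = v.size() / 2;
  std::nth_element(v.begin(), v.begin() + mid, v.end());
  const double upper = v[mid];
  if (v.size() % 2 == 1) return upper;
  // Even size: average the two central values.
  const double lower = *std::max_element(v.begin(), v.begin() + mid);
  return 0.5 * (lower + upper);
}
```

Memory grows with every call to `fill`, which is the unbounded-memory caveat the guide text mentions.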

1 change: 1 addition & 0 deletions include/boost/histogram/accumulators.hpp
@@ -17,6 +17,7 @@
[1]: histogram/reference.html#header.boost.histogram.accumulators.ostream_hpp
*/

#include <boost/histogram/accumulators/collector.hpp>
#include <boost/histogram/accumulators/count.hpp>
#include <boost/histogram/accumulators/fraction.hpp>
#include <boost/histogram/accumulators/mean.hpp>
110 changes: 110 additions & 0 deletions include/boost/histogram/accumulators/collector.hpp
@@ -0,0 +1,110 @@
// Copyright 2024 Ruggero Turra, Hans Dembinski
//
// Distributed under the Boost Software License, version 1.0.
// (See accompanying file LICENSE_1_0.txt
// or copy at http://www.boost.org/LICENSE_1_0.txt)

#ifndef BOOST_HISTOGRAM_ACCUMULATORS_COLLECTOR_HPP
#define BOOST_HISTOGRAM_ACCUMULATORS_COLLECTOR_HPP

#include <algorithm> // for std::equal
#include <boost/core/nvp.hpp>
#include <boost/histogram/detail/detect.hpp>
#include <boost/histogram/fwd.hpp> // for collector<>
#include <initializer_list>
#include <type_traits>
#include <utility> // for std::forward, std::declval

namespace boost {
namespace histogram {
namespace accumulators {

/** Collects samples.
Input samples are stored in an internal container for later retrieval, which stores the
values consecutively in memory. The interface is designed to work with std::vector and
other containers which implement the same API.
Warning: The memory of the accumulator is unbounded.
*/
template <class ContainerType>
class collector {
public:
using container_type = ContainerType;
using value_type = typename container_type::value_type;
using allocator_type = typename container_type::allocator_type;
using const_reference = typename container_type::const_reference;
using iterator = typename container_type::iterator;
using const_iterator = typename container_type::const_iterator;
using size_type = typename container_type::size_type;
using const_pointer = typename container_type::const_pointer;

// make template only match if forwarding args to container is valid
template <typename... Args, class = decltype(container_type(std::declval<Args>()...))>
explicit collector(Args&&... args) : container_(std::forward<Args>(args)...) {}

// make template only match if forwarding args to container is valid
template <class T, typename... Args, class = decltype(container_type(std::initializer_list<T>(),std::declval<Args>()...))>
explicit collector(std::initializer_list<T> list, Args&&... args)
: container_(list, std::forward<Args>(args)...) {}

/// Append sample x.
void operator()(const_reference x) { container_.push_back(x); }

/// Append samples from another collector.
template <class C>
collector& operator+=(const collector<C>& rhs) {
container_.reserve(size() + rhs.size());
container_.insert(end(), rhs.begin(), rhs.end());
return *this;
}

/// Return true if collections are equal.
///
/// Two collections are equal if they have the same number of elements
/// which all compare equal.
template <class Iterable, class = detail::is_iterable<Iterable>>
bool operator==(const Iterable& rhs) const noexcept {
return std::equal(begin(), end(), rhs.begin(), rhs.end());
}

/// Return true if collections are not equal.
template <class Iterable, class = detail::is_iterable<Iterable>>
bool operator!=(const Iterable& rhs) const noexcept {
return !operator==(rhs);
}

/// Return number of samples.
size_type size() const noexcept { return container_.size(); }

/// Return number of samples (alias for size()).
size_type count() const noexcept { return container_.size(); }

/// Return read-only iterator to start of collection.
const_iterator begin() const noexcept { return container_.begin(); }

/// Return read-only iterator to end of collection.
const_iterator end() const noexcept { return container_.end(); }

/// Return const reference to value at index.
const_reference operator[](size_type idx) const noexcept { return container_[idx]; }

/// Return pointer to internal memory.
const_pointer data() const noexcept { return container_.data(); }

allocator_type get_allocator() const { return container_.get_allocator(); }

template <class Archive>
void serialize(Archive& ar, unsigned version) {
(void)version;
ar& make_nvp("container", container_);
}

private:
container_type container_;
};

} // namespace accumulators
} // namespace histogram
} // namespace boost

#endif
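The constructor constraint used in this header (a defaulted template parameter of the form `class = decltype(container_type(std::declval<Args>()...))`) can be studied in isolation. The sketch below is a reduced, hypothetical `wrapper` class and not the library's `collector`; it only shows that the forwarding constructor drops out of overload resolution when the arguments cannot construct the container.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>
#include <utility>
#include <vector>

// Hypothetical wrapper, reduced to the constructor constraint only.
template <class Container>
class wrapper {
public:
  // Participates in overload resolution only if Container(args...) is well-formed.
  template <class... Args, class = decltype(Container(std::declval<Args>()...))>
  explicit wrapper(Args&&... args) : c_(std::forward<Args>(args)...) {}

  std::size_t size() const noexcept { return c_.size(); }

private:
  Container c_;
};

using vec_wrapper = wrapper<std::vector<int>>;

// An empty argument pack is accepted, so the wrapper is default-constructible.
static_assert(std::is_constructible<vec_wrapper>::value, "no arguments");
// (count, value) is a valid std::vector constructor call.
static_assert(std::is_constructible<vec_wrapper, std::size_t, int>::value,
              "count and value");
// A string literal cannot construct std::vector<int>, so the overload is removed.
static_assert(!std::is_constructible<vec_wrapper, const char*>::value,
              "unrelated argument is rejected");
```

Without the constraint, `std::is_constructible` would report true for any argument list and the error would only surface inside the constructor body.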
14 changes: 14 additions & 0 deletions include/boost/histogram/accumulators/ostream.hpp
@@ -101,6 +101,20 @@ std::basic_ostream<CharT, Traits>& operator<<(std::basic_ostream<CharT, Traits>&
return detail::handle_nonzero_width(os, x);
}

template <class CharT, class Traits, class U>
std::basic_ostream<CharT, Traits>& operator<<(std::basic_ostream<CharT, Traits>& os,
const collector<U>& x) {
if (os.width() == 0) {
os << "collector{";
auto iter = x.begin();
if (iter != x.end()) os << *iter++;
for (; iter != x.end(); ++iter) os << ", " << *iter;
os << "}";
return os;
}
return detail::handle_nonzero_width(os, x);
}

} // namespace accumulators
} // namespace histogram
} // namespace boost
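The separator logic of the new stream operator (print the first element alone, then prefix every further element with `", "`) is easy to check on its own. The helper below is a hypothetical stand-alone function, not part of the library, and it leaves out the non-zero width handling.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper that mirrors the separator logic only.
template <class Range>
std::string format_braced(const char* name, const Range& r) {
  std::ostringstream os;
  os << name << "{";
  auto it = r.begin();
  if (it != r.end()) os << *it++;                 // first element, no separator
  for (; it != r.end(); ++it) os << ", " << *it;  // remaining elements
  os << "}";
  return os.str();
}
```

An empty range yields `name{}` because neither the `if` nor the loop body runs.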
6 changes: 3 additions & 3 deletions include/boost/histogram/detail/axes.hpp
@@ -17,7 +17,7 @@
#include <boost/histogram/detail/priority.hpp>
#include <boost/histogram/detail/relaxed_tuple_size.hpp>
#include <boost/histogram/detail/static_if.hpp>
#include <boost/histogram/detail/sub_array.hpp>
#include <boost/histogram/detail/static_vector.hpp>
#include <boost/histogram/detail/try_cast.hpp>
#include <boost/histogram/fwd.hpp>
#include <boost/mp11/algorithm.hpp>
@@ -381,13 +381,13 @@ std::size_t offset(const T& axes) {
// make default-constructed buffer (no initialization for POD types)
template <class T, class A>
auto make_stack_buffer(const A& a) {
return sub_array<T, buffer_size<A>::value>(axes_rank(a));
return static_vector<T, buffer_size<A>::value>(axes_rank(a));
}

// make buffer with elements initialized to v
template <class T, class A>
auto make_stack_buffer(const A& a, const T& t) {
return sub_array<T, buffer_size<A>::value>(axes_rank(a), t);
return static_vector<T, buffer_size<A>::value>(axes_rank(a), t);
}

template <class T>
103 changes: 103 additions & 0 deletions include/boost/histogram/detail/chunk_vector.hpp
@@ -0,0 +1,103 @@
// Copyright 2019 Hans Dembinski
//
// Distributed under the Boost Software License, version 1.0.
// (See accompanying file LICENSE_1_0.txt
// or copy at http://www.boost.org/LICENSE_1_0.txt)

#ifndef BOOST_HISTOGRAM_DETAIL_CHUNK_VECTOR_HPP
#define BOOST_HISTOGRAM_DETAIL_CHUNK_VECTOR_HPP

#include <boost/core/span.hpp>
#include <boost/throw_exception.hpp>
#include <initializer_list>
#include <iterator> // for std::distance
#include <stdexcept>
#include <vector>

namespace boost {
namespace histogram {
namespace detail {

// Warning: this is not a proper container and is only used to
// test the feasibility of using accumulators::collector with a
// custom container type. If time permits, this will be expanded
// into a proper container type.
template <class ValueType>
class chunk_vector {
public:
using base = std::vector<ValueType>;
using allocator_type = typename base::allocator_type;
using pointer = typename base::pointer;
using const_pointer = typename base::const_pointer;
using size_type = typename base::size_type;
using const_reference = boost::span<const ValueType>;
using reference = boost::span<ValueType>;
// this is wrong and should make a copy; it is not a problem for
// the current use-case, but a general purpose implementation cannot
// violate concepts like this
using value_type = const_reference;

template <class Pointer>
struct iterator_t {
iterator_t& operator++() {
ptr_ += chunk_;
return *this;
}

iterator_t operator++(int) {
iterator_t copy(*this);
ptr_ += chunk_;
return copy;
}

value_type operator*() const { return value_type(ptr_, ptr_ + chunk_); }

Pointer ptr_;
size_type chunk_;
};

using iterator = iterator_t<pointer>;
using const_iterator = iterator_t<const_pointer>;

// this creates an empty chunk_vector
explicit chunk_vector(size_type chunk, const allocator_type& alloc = {})
: chunk_(chunk), vec_(alloc) {}

chunk_vector(std::initializer_list<value_type> list, size_type chunk,
const allocator_type& alloc = {})
: chunk_(chunk), vec_(list, alloc) {}

allocator_type get_allocator() noexcept(noexcept(allocator_type())) {
return vec_.get_allocator();
}

void push_back(const_reference x) {
if (x.size() != chunk_)
BOOST_THROW_EXCEPTION(std::runtime_error("argument has wrong size"));
// we don't use std::vector::insert here to have amortized constant complexity
for (auto&& elem : x) vec_.push_back(elem);
}

auto insert(const_iterator pos, const_iterator o_begin, const_iterator o_end) {
// reject ranges that do not consist of whole chunks
if (std::distance(o_begin, o_end) % chunk_ != 0)
BOOST_THROW_EXCEPTION(std::runtime_error("argument has wrong size"));
return vec_.insert(pos, o_begin, o_end);
}

const_iterator begin() const noexcept { return {vec_.data(), chunk_}; }
const_iterator end() const noexcept { return {vec_.data() + vec_.size(), chunk_}; }

value_type operator[](size_type idx) const noexcept {
return {vec_.data() + idx * chunk_, vec_.data() + (idx + 1) * chunk_};
}

size_type size() const noexcept { return vec_.size() / chunk_; }

private:
size_type chunk_;
base vec_;
};

} // namespace detail
} // namespace histogram
} // namespace boost

#endif
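The storage scheme of this test container, fixed-size groups kept back to back in one flat `std::vector`, can be sketched with the standard library alone. `flat_chunks` and `append_flat` are hypothetical names chosen for this illustration; the sketch returns copies instead of `boost::span` views and rejects any input that is not a whole number of chunks.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical, simplified analogue of a chunked container: values live in one
// flat std::vector and are exposed in groups of chunk_ elements.
class flat_chunks {
public:
  explicit flat_chunks(std::size_t chunk) : chunk_(chunk) {
    if (chunk == 0) throw std::invalid_argument("chunk must be positive");
  }

  // Append one chunk; the input length must equal the chunk size.
  void push_back(const std::vector<double>& x) {
    if (x.size() != chunk_) throw std::runtime_error("argument has wrong size");
    flat_.insert(flat_.end(), x.begin(), x.end());
  }

  // Append raw values; the count must be a whole number of chunks.
  void append_flat(const std::vector<double>& values) {
    if (values.size() % chunk_ != 0)
      throw std::runtime_error("argument has wrong size");
    flat_.insert(flat_.end(), values.begin(), values.end());
  }

  // Return a copy of the idx-th chunk (no bounds check in this sketch).
  std::vector<double> operator[](std::size_t idx) const {
    const auto first = flat_.begin() + static_cast<std::ptrdiff_t>(idx * chunk_);
    return std::vector<double>(first, first + static_cast<std::ptrdiff_t>(chunk_));
  }

  // Number of chunks, not number of stored values.
  std::size_t size() const noexcept { return flat_.size() / chunk_; }

private:
  std::size_t chunk_;
  std::vector<double> flat_;
};
```

Because both append paths validate the length before touching `flat_`, a rejected call leaves the container unchanged.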
2 changes: 2 additions & 0 deletions include/boost/histogram/detail/detect.hpp
@@ -66,6 +66,8 @@ struct detect_base {
// reset has overloads, trying to get pmf in this case always fails
BOOST_HISTOGRAM_DETAIL_DETECT(has_method_reset, t.reset(0));

BOOST_HISTOGRAM_DETAIL_DETECT(has_method_push_back, (&T::push_back));

BOOST_HISTOGRAM_DETAIL_DETECT(is_indexable, t[0]);

BOOST_HISTOGRAM_DETAIL_DETECT_BINARY(is_transform, (t.inverse(t.forward(u))));
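The definition of `BOOST_HISTOGRAM_DETAIL_DETECT` is not part of this diff, so the following is only a generic sketch of how a `push_back` detector can be written with expression SFINAE. The trait name `has_push_back` is hypothetical. Note that this sketch tests a call expression, which is a different technique from taking the member pointer `&T::push_back`: for an overloaded member such as `std::vector::push_back`, the address expression is ambiguous inside `decltype`, which is the same limitation the comment on `reset` above describes.

```cpp
#include <array>
#include <cassert>
#include <type_traits>
#include <utility>
#include <vector>

// Primary template: assume there is no usable push_back.
template <class T, class = void>
struct has_push_back : std::false_type {};

// Chosen only if t.push_back(const value_type&) is a valid expression.
template <class T>
struct has_push_back<
    T, decltype(void(std::declval<T&>().push_back(
           std::declval<const typename T::value_type&>())))> : std::true_type {};

static_assert(has_push_back<std::vector<int>>::value, "vector has push_back");
static_assert(!has_push_back<std::array<int, 3>>::value, "array has none");
static_assert(!has_push_back<int>::value, "int has no value_type at all");
```

A trait like this could be used to reject container types that a collector-style accumulator cannot append to, although the diff does not show where the library applies its own detector.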
@@ -4,8 +4,8 @@
// (See accompanying file LICENSE_1_0.txt
// or copy at http://www.boost.org/LICENSE_1_0.txt)

#ifndef BOOST_HISTOGRAM_DETAIL_SUB_ARRAY_HPP
#define BOOST_HISTOGRAM_DETAIL_SUB_ARRAY_HPP
#ifndef BOOST_HISTOGRAM_DETAIL_STATIC_VECTOR_HPP
#define BOOST_HISTOGRAM_DETAIL_STATIC_VECTOR_HPP

#include <algorithm>
#include <boost/throw_exception.hpp>
@@ -15,10 +15,11 @@ namespace boost {
namespace histogram {
namespace detail {

// Like std::array, but allows to use less than maximum capacity.
// Cannot inherit from std::array, since this confuses span.
// A crude implementation of boost::container::static_vector.
// Like std::vector, but with static allocation up to a maximum capacity.
template <class T, std::size_t N>
class sub_array {
class static_vector {
// Cannot inherit from std::array, since this confuses span.
static constexpr bool swap_element_is_noexcept() noexcept {
using std::swap;
return noexcept(swap(std::declval<T&>(), std::declval<T&>()));
@@ -34,19 +35,19 @@ class sub_array {
using iterator = pointer;
using const_iterator = const_pointer;

sub_array() = default;
static_vector() = default;

explicit sub_array(std::size_t s) noexcept : size_(s) { assert(size_ <= N); }
explicit static_vector(std::size_t s) noexcept : size_(s) { assert(size_ <= N); }

sub_array(std::size_t s, const T& value) noexcept(
static_vector(std::size_t s, const T& value) noexcept(
std::is_nothrow_assignable<T, const_reference>::value)
: sub_array(s) {
: static_vector(s) {
fill(value);
}

sub_array(std::initializer_list<T> il) noexcept(
static_vector(std::initializer_list<T> il) noexcept(
std::is_nothrow_assignable<T, const_reference>::value)
: sub_array(il.size()) {
: static_vector(il.size()) {
std::copy(il.begin(), il.end(), data_);
}

@@ -90,7 +91,7 @@ class sub_array {
std::fill(begin(), end(), value);
}

void swap(sub_array& other) noexcept(swap_element_is_noexcept()) {
void swap(static_vector& other) noexcept(swap_element_is_noexcept()) {
using std::swap;
const size_type s = (std::max)(size(), other.size());
for (auto i = begin(), j = other.begin(), end = begin() + s; i != end; ++i, ++j)
@@ -104,12 +105,12 @@ class sub_array {
};

template <class T, std::size_t N>
bool operator==(const sub_array<T, N>& a, const sub_array<T, N>& b) noexcept {
bool operator==(const static_vector<T, N>& a, const static_vector<T, N>& b) noexcept {
return std::equal(a.begin(), a.end(), b.begin(), b.end());
}

template <class T, std::size_t N>
bool operator!=(const sub_array<T, N>& a, const sub_array<T, N>& b) noexcept {
bool operator!=(const static_vector<T, N>& a, const static_vector<T, N>& b) noexcept {
return !(a == b);
}

@@ -119,8 +120,9 @@ bool operator!=(const sub_array<T, N>& a, const sub_array<T, N>& b) noexcept {

namespace std {
template <class T, std::size_t N>
void swap(::boost::histogram::detail::sub_array<T, N>& a,
::boost::histogram::detail::sub_array<T, N>& b) noexcept(noexcept(a.swap(b))) {
void swap(
::boost::histogram::detail::static_vector<T, N>& a,
::boost::histogram::detail::static_vector<T, N>& b) noexcept(noexcept(a.swap(b))) {
a.swap(b);
}
} // namespace std