Skip to content

Latest commit

 

History

History
401 lines (266 loc) · 18.8 KB

File metadata and controls

401 lines (266 loc) · 18.8 KB

Sorting and searching algorithms

Table of contents


Introduction and overview


Comparison sorting

In comparison sorting one may compare two element (checking whether ai < aj). Other operations on element (e.g., using them as indices) are not allowed. Any comparison-based algorithm of sorting an array of size N requires Ω(N log N) comparisons in the worst case. Determining the exact number of comparisons is a computationally hard problem even for small N. No simple formula for the solution is known. For practical applications one should always consider constant factors hidden in the big-O notation. Typically, O(N2) algorithms (e.g., insertion sort) are faster than O(N log N) ones (e.g., quick sort) for small inputs. For example, std::sort implementation in libstdc++ resorts to the insertion sort if the input size doesn’t exceed 16 elements, and Microsoft’s implementation uses the value 32.

🔗

🎥

📖

📄

💫

Insertion sort

Insertion sort iterates, consuming one input element each repetition, and growing a sorted output range. At each iteration, insertion sort removes one element from the input data, finds the location it belongs within the sorted range, and inserts it there. It repeats until no input elements remain. Insertion sort is commonly used to sort a small number of elements. It is employed in many std::sort implementations as a final step of recursion when a sub-range is small enough.

template<class Bidir_it>
void linear_insert(Bidir_it first, Bidir_it last) {
    auto value = std::move(*last);
    for (auto curr = std::prev(last); value < *curr; --curr) {
        *last = std::move(*curr);
        if ((last = curr) == first)
            break;
    }
    *last = std::move(value);
}

template<class Bidir_it>
void insertion_sort(Bidir_it first, Bidir_it last) {
    if (first == last)
        return;
    for (auto next = std::next(first); next != last; ++next)
        linear_insert(first, next);
}

🔗

📖

  • Ch. 9: Medians and order statistics – T.H.Cormen, C.E.Leiserson, R.L.Rivest, C.Stein. Introduction to algorithms (2009)
  • Ch. 8: Elementary sorting methods, Sec.: Insertion sort – R.Sedgewick. Algorithms (1983)

Selection sort

Selection sort divides the input range into two parts: a sorted sub-range of items which is built up from left to right at the front of the range and a sub-range of the remaining unsorted items that occupy the rest of the range. The algorithm proceeds by finding the smallest element in the unsorted sub-range, swapping it with the leftmost unsorted element (putting it in sorted order), and moving the sub-range boundaries one element to the right. Selection sort makes only O(N) writes in the average and the worst cases, and is useful when writes are significantly more expensive than reads, e.g. when elements have small keys and very large associated data or when elements are stored in flash memory.

template<class Forward_it>
Forward_it min_element(Forward_it first, Forward_it last) {
    auto min = first++;
    for (; first != last; ++first)
        if (*first < *min)
            min = first;
    return min;
}

template<class Forward_it>
void selection_sort(Forward_it first, Forward_it last) {
    for (; first != last; ++first)
        std::iter_swap(first, min_element(first, last));
}

🔗

📖

  • Ch. 8: Elementary sorting methods, Sec: Selection sort – R.Sedgewick. Algorithms (1983)

Heap sort

🔧

  • Implementation of std::partial_sort that rearranges a given range such that k smallest elements become the first k elements of the range, in sorted order.

Merge sort

📝

  • Merge sort is useful for sorting linked lists and for external sorting of data that doesn’t fit into main memory.

🔗

📖

  • Sec. 5.1.: A first recurrence: The mergesort algorithm – J.Kleinberg, É.Tardos. Algorithm designPearson (2005)

Inversions counting

Problem: count the number of inversions in a permutation P = (a1, ..., aN), i.e. the number of pairs (ai, aj) with i < j and ai > aj.

🔗

🎥

📖

Quick sort

Gist of the algorithm: choose the pivot value; partition the range into two sub-ranges, according to whether they are < pivot, == pivot or > pivot, then sort sub-ranges recursively.

template<class Rand_it>
void quicksort(Rand_it first, Rand_it last) {
    if (last - first <= 1)
        return;
    const auto part = partition_hoare(first, last);
    quicksort(first, part);
    quicksort(part, last);
}

🔗

📖

📄

💫

Lomuto partition

Lomuto partition scheme permutes the range [first, last) into {{< pivot}, pivot, {>= pivot}}, where pivot is the value of some pivotal element.

template<class Bidir_it>
Bidir_it partition_lomuto(Bidir_it first, Bidir_it last) {
    const auto& pivot = *--last;
    auto part = first;
    for (; first != last; ++first)
        if (*first < pivot)
            std::iter_swap(part++, first);
    std::iter_swap(part, last);
    return part;
}

🔗

📖

Hoare partition

Hoare partition scheme permutes the range [first, last) into {{<= pivot}, {>= pivot}}, where pivot is the value of some pivotal element. Hoare partition does three times fewer swaps on average than Lomuto’s partition, and it creates efficient partitions even when all values are equal. This partition scheme is used to implement std::sort in libstdc++.

template<class Rand_it>
Rand_it partition_hoare(Rand_it first, Rand_it last) {
   const auto pivot = *(first + (last - first) / 2);
   while (true) {
       while (*first < pivot)
           ++first;
       --last;
       while (pivot < *last)
           --last;
       if (!(first < last))
           return first;
       std::iter_swap(first++, last);
   }
}

🔗

📖

📄

“Fat pivot” partition / Dutch national flag problem

“Fat pivot” partition scheme permutes the range [first, last) into {{< pivot}, {= pivot}, {> pivot}}, where pivot is the value of some pivotal element.

template<class Rand_it>
std::pair<Rand_it, Rand_it> partition_fat_pivot(Rand_it first, Rand_it last) {
    const auto& pivot = *--last;
    auto part1 = first, part2 = last;
    while (first != part2)
        if (*first < pivot)
            std::iter_swap(first++, part1++);
        else if (pivot < *first)
            std::iter_swap(first, --part2);
        else
            ++first;
     std::iter_swap(part2++, last);
     return {part1, part2};
}

🔗

📖

📄

Order statistics and quick select

The Kth order statistic of an array is its Kth smallest element. Any comparison-based algorithm of finding the smallest element in an array of size N requires at least N - 1 comparisons in the worst case. Any comparison-based algorithm of finding both the smallest and the largest elements in an array of size N requires at least ⌈3N / 2⌉ - 1 comparisons in the worst case. Any comparison-based algorithm of finding both the smallest and the second smallest element in an array of size N requires at least N + ⌈log2 N⌉ - 2 comparisons in the worst case.

🔗

🎥

📖

  • Ch. 6: Linear-time selection – Roughgarden T. Algorithms illuminated (Part 1): The basics – Soundlikeyourself Publishing (2017)
  • Ch. 9: Medians and order statistics – T.H.Cormen, C.E.Leiserson, R.L.Rivest, C.Stein. Introduction to algorithms (2009)
  • Sec. 6.5.2: Finding the kth smallest element, Sec. 6.11.2: Finding the two largest elements in a set – U.Manber. Introduction to algorithms: A creative approach (1989)
  • Sec. 3.6: Order statistics, Sec. 3.7: Expected time for order statistics – A.V.Aho, J.E.Hopcroft, J.D.Ullman. The design and analysis of computer algorithms (1974)

📄

Median finding

🎥


Linear-time sorting

🎥

Count sort

Radix sort

🔗

🎥

Sorting by swapping

Adjacent swaps

The minimum number of adjacent swaps required to sort a permutation P (i.e. convert it into the identity one) is equal to the number of inversions in P.

Arbitrary swaps

The minimum number of arbitrary swaps required to sort a permutation P (i.e. convert it into the identity one) is equal to the size of P minus the number of cycles in P.

🔗


Other sorting algorithms

Pancake sorting

📝

  • The problem of finding the shortest sequence of flips for a given stack of pancakes is NP-hard (L.Bulteau, G.Fertin, I.Rusu, 2011)

🔗

Spreadsort

🔗


Searching

📖

Binary search

See also Binary search – Algorithm analysis.

📖

  • Col. 4: Writing correct programs, Sec. 9.4: Major surgery – binary search – J.Bentley. Programming pearls (1999)