Skip to content

Commit

Permalink
Merge pull request julycoding#382 from tastynoodle/master
Browse files Browse the repository at this point in the history
add ebook/en/03.0.md
  • Loading branch information
julycoding committed Oct 17, 2014
2 parents 34aeba4 + b9485d9 commit 9ff103d
Show file tree
Hide file tree
Showing 2 changed files with 109 additions and 3 deletions.
105 changes: 105 additions & 0 deletions ebook/en/03.0.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,105 @@
Chapter 3: Find Smallest K Numbers
==========

## Problem Description

Given n integers, output the smallest k numbers.

## Analysis and Solutions
### Solution One

To get the smallest k numbers, the simplest way is, we sort the array, then output the first k numbers.

As for the sorting algorithms,I guess quick-sort is a good choice ( as we known, the average time complexity of Quick-Sort is `n*logn`), then we just output the first k numbers. So the total time complexity is: `O(n * log n) + O(k)=O(n * log n)`.

### Solution Two

The problem does not require the smallest k numbers to be sorted, neither are the last n-k numbers. So we do not need to sort all the numbers. We can use Selection-Sort or Swap-Sort:

1. Traverse the n numbers, put the first k numbers into a new array, assume they are the smallest k numbers;
2. For these k numbers, use Selection-Sort or Swap-Sort to get the biggest element kmax (need to traverse these k numbers, the time complexity is `O(k)`);
3. Traverse the rest n-k numbers. Suppose each time the new element is x, compare x with kmax: if `x < kmax`, replace kmax with x, then jump to step 2 and find the biggest element kmax; if `x >= kmax`, just continue traversing.

In each traverse, the time complexity is `O(k)` if we update the array, `O(1)` if we do not. So totally the time complexity is `n*O(k) = O(n*k)`.

### Solution Three

A better way is maintaining a max-heap with capacity k, it is similiar with the solution two:

1. Store the first k numbers in a max-heap with capacity k, assume they are the smallest k numbers;
2. Sort the heap, make k1<k2<...<kmax (kmax is the biggest number in the heap);
3. Traverse the rest n-k numbers. Suppose each time the new element is x, compare x with kmax: if `x < kmax`, replace kmax with x and reheapify the heap (`O(logk)`); Otherwise do nothing.

In this case the time complexity is: `O(k+(n-k)*logk)=O(n*logk)`. This is because in the heap we just need `O(logk)` to operate a search or reheapify (In solution two, to find the biggest number in an array we need `O(k)`).

### Solution Four

The book \<Date Structures and Algorithm Analysis in C\>, in Chapter7, Section7.7.6, introduced a Quick-Selection algorithm. In average case, its time complexity is `O(n)`:

- Suppose the number set is S, in S we choose a number v as the pivot. Divide S-{v} into S1 and S2, like the Quick-Sort.
- If k <= |S1|, the k-th smallest number is in S1. In this case, return QuickSelect(S1, k).
- If k = 1 + |S1|, the pivot is the k-th smallest number, return it.
- Otherwise, the k-th smallest number is in S2. It is the (k - |S1| - 1) smallest number in S2, so we return QuickSelect(S2, k - |S1| - 1).

The time complexity is O(n).

You can find the code below:

```cpp
//Quick-Select, put the k-th smallest number in a[k-1]
void QuickSelect( int a[], int k, int left, int right )
{
int i, j;
int pivot;

if( left + cutoff <= right )
{
pivot = median3( a, left, right );
//set the median as the pivot, this can avoid the wort case, to some extent
i = left; j = right - 1;
for( ; ; )
{
while( a[ ++i ] < pivot ){ }
while( a[ --j ] > pivot ){ }
if( i < j )
swap( &a[ i ], &a[ j ] );
else
break;
}
//reset the pivot
swap( &a[ i ], &a[ right - 1 ] );

if( k <= i )
QuickSelect( a, k, left, i - 1 );
else if( k > i + 1 )
QuickSelect( a, k, i + 1, right );
}
else
InsertSort( a + left, right - left + 1 );
}

```
This Quick-Select algorithm is kind of like the partition method in Quick-Sort. Store the N numbers in array S, set the 'median of median' as the pivot. Then divide array into Sa and Sb, Sa<=X<=Sb. If there are more than k elements in Sa, return the smallest k numbers in Sa, otherwise return all the elements in Sa and the smallest k-|Sa| numbers in Sb. In average case the time complexity is `O(n)`.
Furthermore, the book \<Introduction to Algorithms\>, in Chapter9 Section9.3, introduced a Select Algorithm whose time complexity is O(n) in the worst case. I recommend you to have a look.
## Think Deeper
1.Google interview: input two arrays consisting of integers, sum each two of them we can get a new array, how to find the k-th smallest numbers in this new array?
Analysis:
‘Suppose the two arrays are A and B, they all have N numbers, sum each two of them we can get a new array C, which has N^2 elements.
Then consider these as N sorted arrays:
A[1]+B[1] <= A[1]+B[2] <= A[1]+B[3] <=…
A[2]+B[1] <= A[2]+B[2] <= A[2]+B[3] <=…
A[N]+B[1] <= A[N]+B[2] <= A[N]+B[3] <=…
The problem turns into finding the smallest k numbers in these N^2 sorted array’
2.Suppose there are two increasing arrays A and B, A=(a1,a2,...,ak), B=(b1,b2,...,bk). For 1<=i, j<=k, find the smallest k (ai+bj). The algorithm is required to be efficient.
3.Given an array a1,a2,a3,...,an and m triples, each triple represents a query. For each query (i, j, k), output the k-th smallest number in ai,ai+1,...,aj.
7 changes: 4 additions & 3 deletions ebook/en/Readme.md
Original file line number Diff line number Diff line change
Expand Up @@ -7,13 +7,14 @@
- [1.1 - Left Rotating String](01.0.md)

* Chapter 2 Array's Mystery
- [2.6 - Maximum Subarray Problem](07.0.md)
- [2.1 - Find Smallest K Numbers](03.0.md)
- [2.6 - Maximum Subarray Problem](07.0.md)

* Chapter 3 Searching and Sorting
- [3.1: Binary Search](25.0.md)

* Chapter 4 Algorithm Optimization
- [4.3: Perfect shuffle algorithm](35.0.md)
* Chapter 4 Algorithm Optimization
- [4.3: Perfect shuffle algorithm](35.0.md)

* Chapter 5 Dynamic Programming
- [5.1 - Longest Common Subsequence](11.0.md)
Expand Down

0 comments on commit 9ff103d

Please sign in to comment.