Finding K Closest Numbers in Linear-Time

Given an array of n distinct numbers and a positive integer k, we need to:

Find the kth smallest number in the array
Return the k numbers that are closest to this value

For example, if we have the array [7, 3, 9, 1, 5] and k=3, we would:

First find the 3rd smallest number (5)
Then return the 3 numbers closest to 5 (3, 5, 7)

The Solution Approach

FIND-K-CLOSEST(A, n, k)
    # Step 1: Find the kth smallest number
    target = MM-SELECT(A, n, k)
    
    # Step 2: Create array to store differences
    diffs = new array[1...n]
    for i = 1 to n
        diffs[i] = {
            value: A[i],
            diff: |A[i] - target|
        }
    
    # Step 3: Find k numbers with smallest differences
    result = new array[1...k]
    for i = 1 to k
        minDiffIndex = 1
        for j = 2 to n
            if diffs[j].diff < diffs[minDiffIndex].diff
                minDiffIndex = j
            else if diffs[j].diff == diffs[minDiffIndex].diff and 
                     diffs[j].value < diffs[minDiffIndex].value
                minDiffIndex = j
        
        result[i] = diffs[minDiffIndex].value
        diffs[minDiffIndex].diff = INFINITY  # Mark as processed
    
    return result

Breaking Down the Algorithm

Step 1: Finding the Target

We begin by finding the kth smallest number using the Median of Medians (MM-SELECT) algorithm. This is crucial because it guarantees linear time performance in the worst case, unlike its randomized counterpart. This number becomes our reference point for finding the closest values.

Step 2: Computing Differences

Next, we calculate how far each number in our array is from our target value. We store both the original value and its difference from the target. This step helps us track which numbers are closest to our target value while maintaining the ability to return the original numbers.

Step 3: Selecting the Closest Numbers

Finally, we select the k numbers with the smallest differences from our target. We do this by repeatedly finding the minimum difference and adding the corresponding number to our result. To handle ties (when two numbers are equally distant from the target), we prefer the smaller number.

Time Complexity Analysis

Let’s analyze each component of our algorithm:

Finding the kth smallest number: O(n)
Computing differences: O(n)
Selecting k closest numbers: O(k*n)

The total time complexity is O(k*n). When k is considered a constant, this gives us a linear-time algorithm. Even in the worst case where k = n, the algorithm runs in O(n²) time.

Space Complexity

The algorithm uses additional space to store:

The differences array: O(n)
The result array: O(k)

This gives us a total space complexity of O(n).