Counting Sort & Radix Sort: Linear-Time Sorting

Understand non-comparison sorting algorithms that achieve O(n+k) time complexity by exploiting known bounds on input values.

published: reading time: 12 min read author: GeekWorkBench

Counting Sort & Radix Sort: Linear-Time Sorting

Most sorting algorithms compare elements pairwise—quick sort, merge sort, heap sort all fundamentally need O(n log n) comparisons to make decisions. But when you know something about your input range, comparison-based sorting is overkill. Counting sort and radix sort exploit structure: the range of possible values for counting sort, and digit-by-digit processing for radix sort. Both achieve linear O(n + k) time complexity, making them favorites when the input meets specific constraints.

Counting sort works by tallying occurrences of each value, then reconstructing the sorted array from those counts. Radix sort extends this by sorting on each digit position from least to most significant, using counting sort as a stable subroutine for each digit. The catch: both require additional memory proportional to the input range or number of digits, and counting sort only works when the range is bounded and reasonable.

When to Use

Counting sort and radix sort excel when:

  • Input range is bounded — Counting sort needs O(range) auxiliary space
  • Sorting integers or fixed-length strings — Radix sort processes character by character
  • Need stable sorting — Counting sort is stable by nature, preserving equal-element order
  • Hardware sorting acceleration — These algorithms parallelize well on GPUs and FPGAs
  • Bucket-based data — When data naturally clusters within a known range

When Not to Use

  • Floating-point numbers with large ranges
  • When memory is severely constrained and range is huge
  • General-purpose sorting where you can’t bound the input
  • When O(n log n) comparison sort is fast enough in practice

Architecture: Counting Sort Flow

graph TD
    A[Input: arr = 2,5,0,2,8,3,2,1] --> B[Count array: index=value, count=frequency]
    B --> C[Prefix sum: cumulative count]
    C --> D[Output array: place elements using count]
    D --> E[Sorted: 0,1,2,2,2,3,5,8]

The prefix sum step converts counts to positions, enabling stable sorting where equal elements maintain their original order.

Trade-off Table

AlgorithmTimeSpaceStableConstraints
Counting SortO(n + k)O(k)Yesk = max value range
Radix SortO(d × (n + b))O(n + b)Yes (using counting sort)b = base, d = digits
Quick SortO(n log n)O(log n)NoNone
Merge SortO(n log n)O(n)YesNone

Implementation

Counting Sort

def counting_sort(arr):
    """
    Sort array of non-negative integers.
    Time: O(n + k), Space: O(k) where k = max value.
    """
    if not arr:
        return arr

    max_val = max(arr)
    count = [0] * (max_val + 1)

    # Count frequencies
    for val in arr:
        count[val] += 1

    # Build output array
    output = []
    for val, freq in enumerate(count):
        output.extend([val] * freq)

    return output

Radix Sort (Base 10)

def counting_sort_by_digit(arr, exp):
    """
    Stable counting sort by digit at position exp (1, 10, 100, ...).
    """
    n = len(arr)
    output = [0] * n
    count = [0] * 10  # Digits 0-9

    # Count occurrences of each digit
    for i in range(n):
        digit = (arr[i] // exp) % 10
        count[digit] += 1

    # Cumulative count for positions
    for i in range(1, 10):
        count[i] += count[i - 1]

    # Build output (traverse from right for stability)
    for i in range(n - 1, -1, -1):
        digit = (arr[i] // exp) % 10
        count[digit] -= 1
        output[count[digit]] = arr[i]

    return output


def radix_sort(arr):
    """
    Sort integers using radix sort (least significant digit first).
    Time: O(d × (n + b)), Space: O(n)
    """
    if not arr:
        return arr

    max_val = max(arr)
    exp = 1
    output = arr[:]

    while max_val // exp > 0:
        output = counting_sort_by_digit(output, exp)
        exp *= 10

    return output

Production Failure Scenarios

FailureCauseMitigation
Large range causing memory blowupUsing counting sort on 32-bit integers with max=2³¹Use radix sort instead, which spaces by digits
Negative numbersCounting sort assumes non-negative indicesOffset negatives, sort separately, combine
String sorting length mismatchRadix sort assumes equal-length stringsProcess from rightmost, pad shorter strings
Overflow in prefix sumAccumulated counts exceed integer rangeUse appropriate integer types

Common Pitfalls

  1. Memory explosion for large values — Counting sort on values up to 10⁹ requires 10⁹-element array
  2. Negative number handling — Not handling offset for negative values
  3. Confusing stability requirement — Radix sort needs stable digit sort to work correctly
  4. Integer division rounding — Use // (floor division) not / (float division) for digit extraction

Quick Recap Checklist

  • Input range bounded and reasonable? Use counting sort
  • Need stable sort? Counting sort preserves order
  • Sorting integers? Radix sort works digit-by-digit
  • Large range but many digits? Radix sort uses less memory
  • Test with: negative numbers, all same values, already sorted

Interview Q&A

1. Why is radix sort O(d × (n + b)) but called "linear"?

Expected answer points:

  • Radix sort is linear in the number of digits d, not the input size n
  • For k-bit integers, d = k/b where b is the base (usually 10 or 2ⁿ)
  • Worst case is O(k × n / b), which is linear in n but depends on key length
  • For fixed-length keys like 32-bit integers, this is effectively O(n)
2. Why does radix sort require counting sort to be stable?

Expected answer points:

  • Stability ensures that when sorting by digit d, elements with equal digit d-1 preserve their relative order
  • Without stability, sorting by more significant digits could scramble the order established by less significant digits
  • Example: sorting "21" and "22" by second digit (1 vs 2), then by first digit—the stable version ensures "21" comes before "22"
3. When would you choose counting sort over quick sort?

Expected answer points:

  • Choose counting sort when: the range k is O(n) (so O(n+k) ≈ O(n))
  • You need stability and memory is sufficient
  • Quick sort's O(n log n) becomes worse than O(n+k) when k << n log n
  • Example: sorting 10⁶ students by score (0-100) needs only 101 counters—counting sort is vastly superior
4. How does counting sort achieve O(n + k) time complexity?

Expected answer points:

  • Three sequential passes: count frequencies O(n), compute prefix sums O(k), build output O(n + k)
  • No comparisons between elements needed—exploits known value range
  • Each pass is linear in its domain size, making the total O(n + k)
5. What is the difference between LSD and MSD radix sort?

Expected answer points:

  • LSD (Least Significant Digit) sorts from rightmost digit first—simpler, natural fit for fixed-width integers
  • MSD (Most Significant Digit) sorts from leftmost digit first—can finish early for short keys, better for variable-length strings
  • LSD is cache-friendly and easier to implement with counting sort
  • MSD enables early termination but requires recursion
6. How do you handle negative numbers in counting sort?

Expected answer points:

  • Apply an offset to shift all values into non-negative range before sorting
  • Common approach: find minimum value, add |min| to all elements
  • After sorting, subtract the offset to restore original values
  • Alternative: sort negative and non-negative separately, then combine
7. Why is counting sort not used more widely in practice?

Expected answer points:

  • Memory requirement O(k) where k can be very large (e.g., 32-bit integers need 4B elements)
  • Only works for bounded, discrete value ranges—not general purpose
  • O(n log n) comparison sorts are "good enough" for most inputs
  • Cache performance suffers when k >> n (sparse count array)
8. What determines the base b in radix sort complexity?

Expected answer points:

  • Base determines how many digits d needed to represent max value
  • Common bases: 10 (decimal), 2 (binary), 256 (byte-sized chunks)
  • Larger base = fewer digits but larger count array per digit (b elements)
  • Memory vs iterations trade-off: b × d = log_b(max_val) relationship
9. Can radix sort be used for floating-point numbers?

Expected answer points:

  • Yes, with careful handling of sign bit, exponent, and mantissa
  • Floats have fixed 32-bit or 64-bit representation (IEEE 754)
  • Sign bit must be processed separately—flip it for positive representation
  • Then treat mantissa and exponent as integer parts for sorting
10. How does radix sort compare to bucket sort?

Expected answer points:

  • Radix sort is a specialized bucket sort with fixed base and digit-by-digit processing
  • Bucket sort distributes based on value ranges; radix sort uses digit positions
  • Radix sort is more cache-friendly for integer sorting
  • Bucket sort can use variable bucket sizes based on data distribution
11. What is the space complexity of radix sort and why?

Expected answer points:

  • Radix sort typically uses O(n + b) space: n for output array, b for count array per digit
  • Count array size equals base b (commonly 10 for decimal)
  • Space can be reduced to O(n) by reusing arrays in-place, but requires careful index management
  • Not O(k) like counting sort because we process one digit position at a time
12. Why must counting sort be stable when used as a subroutine in radix sort?

Expected answer points:

  • Radix sort processes digits from least to most significant
  • When two numbers have the same digit at position i, their relative order from position i-1 must be preserved
  • Stability guarantees this ordering carries forward
  • If unstable, the sort by more significant digits can destroy progress from less significant digits
13. What happens to counting sort's performance when k >> n?

Expected answer points:

  • Time complexity becomes O(k) dominated by the large range
  • Count array becomes sparse, wasting memory
  • Cache performance degrades significantly
  • In this regime, comparison sorts like quick sort (O(n log n)) become faster
14. How would you sort strings of different lengths with radix sort?

Expected answer points:

  • Pad shorter strings to equal length with a sentinel value (e.g., 0 or -1)
  • Process from the rightmost character (LSD) to handle variable lengths naturally
  • Ensure the sentinel sorts after all real characters
  • Alternatively, MSD with early termination for shorter strings
15. What is the worst-case scenario for radix sort?

Expected answer points:

  • When all numbers are identical—still O(d × n) since we process all n elements d times
  • Maximum number of digits d for the data type (e.g., 2³² for 32-bit integers)
  • No early termination because max_val // exp > 0 for all digit positions
  • Best case: small max_val with few digits—sorts in O(n) with d=1
16. How does parallel processing improve counting sort and radix sort?

Expected answer points:

  • Counting sort: parallelize frequency counting with atomic operations on count array
  • Radix sort: independent digit sorts can run in parallel per pass
  • GPU implementations achieve near-linear speedup for large arrays
  • Main bottleneck is memory bandwidth, not computation
17. When should you NOT use either counting or radix sort?

Expected answer points:

  • When input range is unbounded or much larger than n
  • For floating-point numbers with huge range (use comparison sort)
  • When memory is severely constrained
  • For variable-length keys like arbitrary strings (use MSD radix with caution)
  • When O(n log n) is fast enough and implementation simplicity matters
18. How do you choose between base 2, base 10, and larger bases for radix sort?

Expected answer points:

  • Base 2 is simple but requires 32 passes for 32-bit integers—more cache misses
  • Base 10 is intuitive for decimal numbers but requires division/mod by 10 (slower than bit ops)
  • Larger bases (256, 2¹⁶) reduce passes but increase count array size and iteration count
  • Modern CPUs: base 2¹⁶ or 2⁸ often optimal due to cache and SIMD considerations
19. What is the relationship between bucket sort and counting sort?

Expected answer points:

  • Counting sort is a special case of bucket sort with one bucket per possible value
  • General bucket sort uses variable-size buckets and a secondary sort method within buckets
  • Counting sort is deterministic—no hashing or distribution assumptions
  • Both achieve O(n + k) in the average and best cases
20. Explain how radix sort handles very large integers beyond native word size.

Expected answer points:

  • Break the large integer into multiple machine-word-sized chunks
  • Process chunk-by-chunk from least to most significant using counting sort per chunk
  • Handle carry/borrow between chunks carefully
  • Alternative: use arbitrary-precision libraries and MSD with recursive handling

Further Reading

  • Cormen, Leiserson, Rivest, & Stein - Introduction to Algorithms (CLRS) - Chapters 8.2 and 8.3
  • “Sorting in Linear Time” - University of Washington Algorithms Course Notes
  • “Radix Sort” - Wikipedia (historical context and variants)
  • “Counting Sort vs Radix Sort” - GeeksforGeeks comparison article
  • “Parallel Counting Sort” - GPU sorting implementations

Conclusion

Counting sort and radix sort break the O(n log n) barrier for comparison-based sorting by exploiting structure in the input. Counting sort excels when the range k is O(n), delivering true linear time. Radix sort extends this to arbitrary integers by processing digit-by-digit, using counting sort as a stable subroutine. The trade-off is auxiliary space: O(k) for counting sort, O(n + b) per digit for radix sort.

In practice, these algorithms shine for bounded integer data—student scores, age ranges, coordinate systems, and fixed-width keys. For general-purpose sorting or unbounded ranges, comparison sorts remain the safer choice. When implementing, watch for negative number handling, range overflow in count arrays, and the critical requirement that radix sort’s digit sort subroutine be stable.

Category

Related Posts

Individual Sorting Algorithms: Bubble, Selection, Insertion, Merge, Quick, Heap

Deep dive into each major sorting algorithm with implementations, complexity analysis, and when to use each.

#bubble-sort #selection-sort #insertion-sort

Sorting Algorithm Comparison: When to Use Which

Compare all major sorting algorithms by time complexity, space usage, stability, and practical use cases.

#sorting #algorithms #comparison

Space Complexity Analysis: Measuring and Optimizing Memory

Master space complexity analysis — learn how to measure, analyze, and optimize the memory footprint of algorithms from O(1) to O(n²) and beyond.

#space-complexity #memory #algorithms