open encyclopedia * Article Search: * *
*
*

Counting sort

From open-encyclopedia.com - the free encyclopedia.

Counting sort (like bucket sort) is a sorting algorithm which takes advantage of knowing the maximum value within the array to be sorted. If the maximum value is not known, an initial pass of the data will be necessary to find the largest value (this pass will be O(n)). It uses this to create an array (referenced by the values within the array to be sorted) that will serve to count the number of values of a certain number and, after this, the future positions of the values on a third array, which is the sorted permutation of the second array (i.e., A[i] <= A[j], i < j).

This algorithm is stable and it is O(n+k), where k is the maximum value within the array. In order for this algorithm to have any semblance of efficiency, k must be comparatively small. The values within the original array have to be known and non-negative. If there are some negative numbers in the array, an initial pass of O(n) is necessary to identify the most negative number, and then the absolute value of this number can be added to all numbers in the list for purposes of calculating which array bin to increment.

Note that the first array that is used in the counting sort has a length of the number of possible elements between 0 and the maximum known number. This is not a problem for small ranges of numbers, but large ranges of numbers will eat up more memory (or even hard drive space) than current systems have available. For instance, if one is ordering a list of numbers whose range is between 0 and 100, the counting sort may be the best possible sort. However, if one is trying to order a list of names alphabetically, the array would have to be of length 27n (using a numeric representation of letters, where 27 accounts for all the letters of the alphabet plus "space" to pad out names which are not of the maximum length), where n is the length of the largest name in characters. It is important to note that the length of the sorting array is independent on the number of items in the list to be sorted and dependent solely on the total possible values in the search space. Even a sort of a list of names where the maximum name length is 8 would require prohibitive amounts of RAM for array storage.

References:

  • Cormen, et al. Introduction to Algorithms 2nd. ed. 2001. The MIT Press.
  • Seward, Harold H. Information sorting in the application of electronic digital computers to business operations Masters thesis. MIT 1954.
Contribute Found an omission? You can freely contribute to this Wikipedia article. Edit Article
Copyright © 2003-2004 Zeeshan Muhammad. All rights reserved. Legal notices. Part of the New Frontier Information Network.