Talk:Counting sort

Page contents not supported in other languages.
From Wikipedia, the free encyclopedia

sort[edit]

Hello. I am Diego R. Martinez, the original author of this page. I'd like to find out where did you get the second refe,mn.mn.mn.nj.jkrence for this page and what does it refer to?

Hi Diego. I am the inventor of counting sort and radix sort. "Information sorting..." is my 1954 Masters thesis. Many other sorts are included in its 60 pages. Knuth's sorting and searching (vol. 3) describes my sorts, including "replacement selection." How do I contact you again?

I'd like to request this paper :) bogdan.barbu at ymail dot com —Preceding unsigned comment added by 79.114.19.49 (talk) 21:52, 8 July 2010 (UTC)[reply]

Harold H. Seward hseward@alum.mit.edu

Hello Harold[edit]

I am the inventor of the Rapid sort, Check sort, Instant sort and Simple routines who's publication in the Wikipedia has sparked a controversy that these sorts are duplications of your Counting sort although I invented them quite independently and without any knowledge of your Counting sort prior to their invention. Although both sorts use array indexing and increment the value of a numerical array as prescribed by the algorithms they follow I strongly disagree that their differences are insufficient to tell them apart. For one thing the Rapid sort can be duplicated directly in hardware by simply substituting an address bus of a random access memory chip for the array index and a single bit data bus as the means of indicating the presence of checkmark data (a single digit binary count) in the case of the Check sort or its hardware version the Instant sort and a multiple bit data bus in the case of the Rapid sort which has the capability of counting the number of identical pieces of data when they occur. The single bit memory is used to eliminate duplicates while the multiple digit data bus is used to include a count of duplicates. Along with a counter that can be sequentially incremented to retrieve the sorted data and a comparator and another counter to recognize memory content which has been incremented (even if only once as in the case of the checkmark function) and to accommodate its decrement to zero in order to provide a means of duplicating the count the random access memory chip make up the core of the hardware digital circuit. I would therefore greatly appreciate having your opinion as to the differences between the two sorts so that this argument might be resolved. Thanks. Patrick C Eberhart. ...IMHO (Talk) 23:38, 3 June 2006 (UTC)[reply]

Potential implementation error[edit]

Hello.

I think there is an error in the C and Java implementations presented here. The variable 'range' is assigned a value of 'max-min+1', and the array 'count' has this many elements, indexed from 0 to 'range-1'. However, in filling the array (in the second for loop), an index 'c' equal to 'array[i]+1-min' is used. With a maximum value of 'array[i]' being, presumably, 'max', 'c' may be given a maximum value of 'max+1-min' which is equal to 'range' and is therefore beyond the bounds of the array 'count'. (Similarly, the minimum value of 'c' is 'min+1-min' = 1, meaning count[0] is never used).

There are probably a number of ways to correct this, but as I'm in the process of trying to understand and implement the algorithm myself (as opposed to actually knowing it well), I'm probably not the best person to be suggesting a solution.

If and when a fix is made, please let me know. Thanks.

Cheers, David Fernandez (i.am.david.fernandez(at)gmail.com)

I believe you're correct - for some reason the code here was written for one-based arrays rather than zero-based. If the original author doesn't object I'll fix it. Deco 03:29, 25 November 2005 (UTC)[reply]

No, the arrays are not one-based - surely you can see that by looking at the other array accesses? The problem was that the first element of the count array is never written to (because there are zero elements lower than the first) and that the one-past-the-end element was written to (but never read). I increased the size of the count array by one to allow for this. JonathanWakely 09:52, 28 January 2006 (UTC) P.S why are there two almost identical implementations? The Java one should be removed or replaced with something sufficiently different to be worthwhile.[reply]

I believe this C implementation (as of 20 May 2006) is not stable. To make it stable, shouldn't the sorting loop (the 2nd last loop of the code) be run from end-1 to 0 and count[i]-- be used instead of count[i]++?

--221.134.24.54 18:32, 20 May 2006 (UTC)[reply]


I noticed the C implementation had some typos and did not compile. I took the liberty of fixing the typos and enhancing the comments, including a Doxygen style header, as well as giving some of the variables more descriptive names. I did not change the algorithm and tested it to make sure it works as advertised. Barbarab111 06:10, 14 December 2006 (UTC)[reply]

I'm new to C++ but the code doesn't compile as is. "int [] counts = new int[distinct_element_count];" raises "expected unqualified-id before '[' token" compiler error (g++). Shouldn't it be "int counts[distinct_element_count];" or, if you want to allocate the memory manually "int * counts = new int[distinct_element_count]"?

I've reduced the example to a much smaller C example that also compiles for C++. Really, the code should become pseudocode in the style that is found on the other sorting algorithm pages on Wikipedia. --Ashawley (talk) 23:45, 3 April 2009 (UTC)[reply]

Remove Performace Measurement from Visual Basic Implementation[edit]

In my opinion the performance measurement goes beyond the presentation of the Sorting algorithm in one single implementation. It could be presented in an extra section. It should be programming language independent, if possible.

The Basic code will be much more readable. And it could do with a bit of indenting.

-- Torsten T. Will

Stability[edit]

The article claims that counting sort is a stable sort... I don't believe this is true. The question of stability is not applicable when sorting integers, since repeated integers are not distinct. I'm removing all references to stability. ~ Booya Bazooka 19:19, 14 September 2006 (UTC)[reply]

The question of stability is actually a valid one, since you may be sorting aggregate data by a numeric key - which is the short version of what Rob was trying to say below. Dcoetzee 03:34, 24 September 2008 (UTC)[reply]



I think the stability references MUST be undeleted! The reason is in this example:
Suppose you're a teacher and you've sorted the exams by name. The exams have signs from 1 to 5 (or A to E). Then you make 5 places to put the exams. Starting from the last put the exams to the right place. Then if you examine the packs, each will be ordered by name. Here the "repeated integers" ARE distinct, because each belongs to another student. This is why it's called stable, it keeps the previous ordering of not distinct elements.
Try this example with some papers, and you will find out... if you don't beleive me :P TWiStErRob 01:53, 6 December 2006 (UTC)[reply]

So called "Tally" sort[edit]

The reference cited was published in 2000. Even though the reference describes the method in very computer scientific terms, as if coming out of the era when computers were programed in binary or with patch cords and provides pseudo code, the reference does not use the term "Tally sort" or any term or name for that matter to refer to the "Pearl" of which it's author, Jon Bentley, speaks. The method, however, was copyrighted and published online here in 1996 prior to Bentley's publication. Because the reference cited only describes the method in a very computer scientific way and does not provide a source of any kind for the method and does not refer to the method by any name including the "Tally" sort, the question must be asked where did the name "Tally" sort and the method it names come from?. While the Rapid sort and the Counting sort are both listed by the National Institute of Standards and Technology, the so called "Tally" sort is not.

At one time the method described as the "Tally" sort was published here in the Wikipedia under a different name but was immediately nominated for deletion as original research. In that regard the claim that the method is a "well-known variant" of the counting sort is bogus. This is also evident from it's inability to count multisets. 71.100.3.239 (talk) 07:47, 13 September 2008 (UTC) [reply]

Key also is the date the Counting sort was first published. Other clues might be found in the dates the "count" functions were first included in Excel or Mathcad or other computer programs. 71.100.10.11 (talk) 17:43, 14 September 2008 (UTC) [reply]

71.100.*.* has a history of creating original research articles such as Articles for deletion/Rapid sort and Articles for deletion/Optimal classification. This is just his perverted way of trying to get his articles back to Wikipedia. Ignore everything he says, and contact the appropriate parties if personal attacks gets too much (see Wikipedia:Administrators' noticeboard/IncidentArchive459#User:Julie Dancer, repeated personal attack and harrassment and Requests for checkuser/Case/Julie Dancer) --Jiuguang (talk) 12:55, 15 September 2008 (UTC)[reply]

The algorithm[edit]

The algorithm appears to disagree with the description and the algorithm referenced by the National Institute of Standards and Technology. 71.100.10.11 (talk) 05:50, 15 September 2008 (UTC) [reply]


Who cares about US standardisations? Do they use the metric system by now? — Preceding unsigned comment added by 2A02:8108:96C0:2E3C:BD73:8080:718E:1DE7 (talk) 14:53, 3 August 2023 (UTC)[reply]

More implementations[edit]

How does the In-Place Count Sort algorithm compare with the counting sort algorithm?[edit]

The In-Place Count Sort article has been nominated for deletion, however it looks related to the counting sort algorithm. Could an expert comment on that article? --50.53.53.93 (talk) 15:11, 27 September 2014 (UTC)[reply]

Regarding variant algorithms -- eliminate duplicate keys[edit]

Hi! I've read the following paragraph many times and I still get confused (pay attention to the bold sentence)

If additionally the items are the integer keys themselves, both second and third loops can be omitted entirely and the bit vector will itself serve as output, representing the values as offsets of the non-zero entries, added to the range's lowest value. Thus the keys are sorted and the duplicates are eliminated in this variant just by being placed into the bit array.

Is this clearly stated? Or, is it that just I don't get it? --Ustcgcy (talk) 03:19, 20 April 2018 (UTC)[reply]

Can we change this line of pseudocode?[edit]

I'm talking about this line:

count[i], total = total, count[i] + total

It is hard to unerstand exactly what is assigned to what here. — Preceding unsigned comment added by Celsiuss (talkcontribs) 01:50, 23 February 2021 (UTC)[reply]

Do you really think it is easier to read and understand as three lines
useless_temporary_variable = count[i]
count[i] = total
total = count[i] + useless_temporary_variable
? —David Eppstein (talk) 02:17, 23 February 2021 (UTC)[reply]
Yes actually I find that way more clear. Celsiuss (talk) 02:40, 23 February 2021 (UTC)[reply]