Talk:Quicksort/Archive 2

Page contents not supported in other languages.
From Wikipedia, the free encyclopedia
Archive 1 Archive 2 Archive 3

Fat partitioning complexity

Here's an interesting remark:

The standard version of Quicksort requires n log n comparisons, where n is the number of input items; the UNIX version requires at most n log m comparisons, where m is the number of distinct input values. —McMahon, Lee E.; Cherry, Lorinda L.; Morris, Robert (1978). "UNIX Time-Sharing System: Statistical text processing" (PDF). Bell System Technical Journal. 57 (6).

This seems to refer to the use of a "fat" (three-way) partition function, which was implemented in Unix's qsort at a very early stage. QVVERTYVS (hm?) 12:54, 24 April 2014 (UTC)

dual-pivot

The variant described at http://permalink.gmane.org/gmane.comp.java.openjdk.core-libs.devel/2628 is apparently missing from the article. — Preceding unsigned comment added by 68.183.92.232 (talk) 23:30, 10 May 2014 (UTC)

first time through loop in "In Place" code

    storeIndex := left
    for i from left to right - 1
        if array[i] ≤ pivotValue
            swap array[i] and array[storeIndex]
            storeIndex := storeIndex + 1

On the first time through, i is "left" (why not just say 0 for a zero-based array?) and storeIndex is also "left"

if array[left] is actually less than pivotValue then: swap array[left] and array[left]

What?

68.183.92.16 (talk) 00:10, 13 May 2014 (UTC)

This special case of "swap" is commonplace as it simplifies many algorithms. It just means "do nothing". McKay (talk) 00:46, 13 May 2014 (UTC)

Two answers for two questions.

  1. 'Why left, not 0?'
    Because the same routine is called in a recursion for many different parts of the array, and it is usually described with three parameters: array, leftmost index of the part being handled and rightmost index of the part being handled. If you program e.g. in a C language then of course you can pass a reference to the leftmost item and the part's length. However, some languages do not offer such flexibility in transforming array item's address into array, so describing the algorithm this way would make it 'impossible' for those languages. Anyway this kind of optimization does not speed things up significantly – the asymptotic complexity remains the same.
  2. 'Why swap item with itself?'
    Because it's simpler. Such swaps do not hurt (they don't destroy data) nor they slow down the program significantly, whilst dropping the test makes the algorithm more readable. You can easily check inside the swapping routine whether the two objects are actually the same and then skip swapping and return. You can also add the test in the caller routine, if you care of additional jump into the subroutine.

--CiaPan (talk) 20:24, 14 May 2014 (UTC)

Re: impossible in other languages, that doesn't matter because this is pseudocode. I've actually been pondering a pointer-based pseudocode for some time, but I haven't found a syntax that is obvious enough. Pointer-based quicksort can be written more compactly and is IMHO easier to verify than the version using indices. QVVERTYVS (hm?) 22:07, 14 May 2014 (UTC)
I disagree. That does matter, especially in pseudocode, because pseudocode is for outlining the general algorithm structure for those who can't read a specific programming language. That also means non-programmers. And for many readers it is easy to consider 'sorting THIS PART of a GIVEN ARRAY' and rather strange to accept 'sorting some (NEW? ANOTHER?!) array, overlapping (?) a part of the given array'. The latter is common in C, but the pseudocode should translate easily to as many languages as possible. We explain the algorithm here, so we need notation simple and general, with no tricks and shortcuts. We do not introduce any optimizations specific to some language or environment – that is an implementor's role.
BTW, I'm a bit curious about 'Pointer-based quicksort can be written more compactly' – have you tried to write a 'pointer-based quicksort' in Delphi/Pascal? in Fortran? or in Java? Do they support such flexible pointer arithmetics as C does, allowing to use 'dereference(pointer + number)' as 'n-th item of the array'? --CiaPan (talk) 22:41, 16 May 2014 (UTC)
I beg to differ. I prefer my pseudocode consistent, short and readable, and that means I'll invent any language feature that I need to get rid of irrelevant details. Easy translation is not the goal, understanding is. Hence, I don't care how well an algorithm description maps to Pascal, Fortran, Java, or the languages that I use (which include C and languages without pointer arithmetic). I want it to be written in an easily understood pseudo-language. QVVERTYVS (hm?) 11:00, 18 May 2014 (UTC)
I side with CiaPan here. Pointers are not needed for the algorithm and if it is in place then it should be possible for exposition purposes to say quickly exactly where in place the subarray is. However as far as swapping with itself is concerned I think there should at least be a comment saying that is possible on the line. In pointer terminology it is important to note if pointers are restricted or not. Dmcq (talk) 11:51, 18 May 2014 (UTC)
I won't change the algorithm to use pointers. But I don't see the point in explicating the possibility of a self-swap. It should be pretty obvious that swap(a, i, i) is a no-op. QVVERTYVS (hm?) 13:50, 18 May 2014 (UTC)
I agree about swap, but it is good way to catch smarty-pants C programmers who define SWAP(a,b) as a ^= b; b ^= a; a ^= b so that they don't need a temporary variable. (If a and b have the same location, this just sets it to 0.) McKay (talk) 06:25, 21 May 2014 (UTC)

Wrong algorithm

Several of the statements on this page including the definition of the "simple" algorithm are quite baffling when considered in the context of Hoare's original 1962 paper about quicksort, which explicitly defined quicksort as an in-place algorithm that used in-situ exchanges to partition with the explicitly stated goal of minimizing memory consumption in order to improve performance. These characteristics are what makes the genuine quicksort algorithm quick. The "simple" algorithm described on this page captures none of these characteristics and, consequently, is far from quick. Specifically, it is orders of magnitude slower. Perhaps this "simple" algorithm should be stripped from this page along with the unjustified statements about its relative popularity and a citation to the authoritative primary source be added?
194.74.155.52 (talk) 19:59, 17 February 2011‎

I put a {{disputed section}} on the section Simple algorithm, because I agree with you. I've seen publications from the FP (Haskell) community where it's claimed that this is quicksort:
-- Off the top of my head
sort []  = []
sort [x] = [x]
sort (p:xs) =  sort left ++ [p] ++ sort right
  where (left:right) = partition p
        partition p xs = ([x | x <- xs, x < p], [x | x <- xs, x >= p])
... and this is what this section is describing, except in "simple", quasi-imperative pseudocode. As for WP:V, it is possible to find sources that describe this kind of "quicksort". However, it is not how CLSR define quicksort:
Quicksort(A, p, r)
 if p < r
  q = Partition(A, p, r)
  Quicksort(A, p, q - 1)
  Quicksort(A, q + 1, r)
To me, this is classic quicksort. It's an imperative, in-place algorithm, with no append or concatenate operations. Just move the pivot around, and then recursively sort the subarrays as completely independent subproblems, with no postprocessing. QVVERTYVS (hm?) 21:21, 28 May 2014 (UTC)

algorithm is incorrect

the shown partition function has a different signature then the one used in quicksort:

partition(array, left, right, pivotIndex) // 4 arguments — Preceding unsigned comment added by 62.178.185.230 (talk) 09:25, 5 June 2014 (UTC)

Fixed that. QVVERTYVS (hm?) 11:17, 5 June 2014 (UTC)

Article needs restructuring

We currently have the following sections after the History:

  • Algorithm. Includes a subsection Implementation issues that describes small practical tricks, but also an alternative pivot algorithm.
  • Formal analysis. Contains what it promises.
  • Relation to other algorithms. This contains a list of Variants that are not really different algorithms, but quicksort variants (formally these are different algorithms, but the algorithms literature does not award all of them a custom name).

I suggest we restructure to get:

  • Basic algorithm (or Overview?)
  • Variants and improvements
  • Formal analysis (put here because we need to describe the main algorithms first)
  • Relation to other algorithms (selection goes here, but what about Generalization?)

Comments? QVVERTYVS (hm?) 09:55, 5 September 2014 (UTC)

Parallelization description needs work

I'm not happy with the discussion on parallelization.

Consider this text:

If the split is blind, ignoring the values, the merge naïvely costs O(n). If the split partitions based on a succession of pivots, it is tricky to parallelize and naïvely costs O(n). Given O(log n) or more processors, only O(n) time is required overall, whereas an approach with linear speedup would achieve O(log n) time for overall. One advantage of this simple parallel quicksort over other parallel sort algorithms is that no synchronization is required, but the disadvantage is that sorting is still O(n) and only a sublinear speedup of O(log n) is achieved.

Now, as given, the text suggests that writing a scalable quicksort is hard and that we should prefer this sub-optimal parallelization that can achieve at most logarithmic speedup. It isn't that hard to explain (or write in code for that matter) a scalable quicksort, however.

As a quick example, merging two sorted arrays of total size n can be done in O(log^2 n) span and O(n) work --- or put another way: O(n/P + log^2 n) time. See slide 51 of http://ocw.mit.edu/courses/electrical-engineering-and-computer-science/6-172-performance-engineering-of-software-systems-fall-2010/video-lectures/lecture-14-analysis-of-multithreaded-algorithms/MIT6_172F10_lec14.pdf for more details on the parallel merge. This would allow you to merge P sorted arrays in O(log(P)log^2(n/p)) span and O(n log P) work by performing merges in a balanced binary tree with the P sorted arrays at the leaves.

I'm not sure the solution to this problem is to add a description or reference to a parallel merge routine to the current text. I don't see why we don't just describe the recursive parallelization of quicksort that performs parallel partitions using extra space (standard parallel packing algorithm). That would give you an O(n log n) work algorithm with O(log^2 n) span. This alternative algorithm is also arguably a lot simpler than one that explicitly partitions among processors.

Any thoughts?

-Tim Kaler 23:07, 19 August 2014 (UTC) — Preceding unsigned comment added by Timkaler (talkcontribs)

I made a revision to the parallelization section a few moments ago. I think that it still needs some work, but it seems a lot better (to me at least) than what was there before. I think it would be good to add a description of how to implement quicksort via explicitly partitioning the array --- as this may be more readily understandable to those unfamiliar with parallel algorithms. We should probably tell them, in such an explanation, that the sorted partitions can be merged in parallel and give them a citation to find that algorithm :) (the slides I linked to earlier have the algorithm, but the right citation is the textbook CLRS Introduction To Algorithms) -Tim Kaler 23:38, 10 October 2014 (UTC) — Preceding unsigned comment added by Timkaler (talkcontribs)

Thanks for the citation. I looked up the 2009 ed. of CLRS in the meantime, but it only discusses parallel mergesort, not quicksort. QVVERTYVS (hm?) 22:29, 13 October 2014 (UTC)

Algorithm notation/style is unprofessional

The best notation for pseudo-code is used in MIT's wonderful standard textbook on the subject: http://www.amazon.com/Introduction-Algorithms-Edition-Thomas-Cormen/dp/0262033844

It is language agnostic, and allows short expression of sophisticated algorithms.

The pseudo-C/pseudo-Pascal like notation tried in this article is hard on the eyes, it does not look like an easily readable pseudo-code at all, and seems unpleasant from a computer science point of view. It is plain text. No mathematical notation. It looks quite amateurish, in my opinion. I will replace the algorithm here with a much better looking scientific notation. You are welcome.

Eray Ozkural, PhD Computer Scientist — Preceding unsigned comment added by Exa (talkcontribs) 15:32, 4 February 2015 (UTC)

Hello! You're more than welcome to contribute to the article by providing a better version of the pseudocode. Please do so, and we'll start from there. — Dsimic (talk | contribs) 15:53, 4 February 2015 (UTC)
+1 for the Cormen et al. pseudocode, it's very pretty and it's something to emulate. QVVERTYVS (hm?) 17:48, 4 February 2015 (UTC)

Visualisation image and text.

As a newcomer, I see this: "Visualization of the quicksort algorithm. The horizontal lines are pivot values." THERE ARE NO HORIZONTAL LINES IN THAT IMAGE! They're all vertical!

This may seem unimportant to those who know, but the whole point of Wikipedia, and any teaching, is to teach those who DO NOT know. Therefore the FIRST thing is not to make errors like that, because they leave a newcomer floundering and wondering what they missed when it is not their mistake that leaves them that way. Judging by the other comments on this talk page, the whole article is riven with problems like this, and we're probably better off ignoring it entirely and searching on Stackoverflow.com instead. It may need to be re-written from base principles in one go by someone fully competent to do it. If it's a turd, no amount of polishing will do.

The only improvement I know enough to offer is to reinforce someone else's call to distinguish any mention of quicksort from qsort, which may not be the same. I read on Stackoverflow that calling qsort can be very slow, and in some cases, surprisingly quick when known indications suggested it should be otherwise, probably because it may change to a different internal method if the set to be sorted is very small. Therefore, this article should NOT be linked directly with a standard, possibly complex and not explicitly described, function, in any language. It should be restricted to spelling out exactly what quicksort is, with good example code, ideally in C, and why it may be chosen above another method. 86.143.141.236 (talk) 16:59, 29 January 2015 (UTC)

False alarm: there's a blue horizontal line in the picture. We've removed all the example code from the article because language-neutral pseudocode suffices, and actual code tends to clutter the article while going stale. RosettaCode offers example implementations (of varying quality). QVVERTYVS (hm?) 18:19, 29 January 2015 (UTC)
Possibly OP's browser does not play the GIF animation? Then the user can see only the first frame, which actually doesn't have any horizontal lines... (But then it would not be an 'algorithm visualization'.) --CiaPan (talk) 07:08, 30 January 2015 (UTC)
If some browsers don't display the animation, then maybe we should put "animation" in the caption to show people their browser is broken. QVVERTYVS (hm?) 08:31, 30 January 2015 (UTC)
Good idea. Could be discussed as a more general recommendation for editors. But I would wait with that and try to see if OP confirms their browser doesn't play animated GIFs, because the source of their problem might be totally different. --CiaPan (talk) 09:32, 30 January 2015 (UTC)
Hello! Hm, it's 2015 and modern browsers can almost serve as coffee makers. :) Anyway, there are old platforms incapable of many things, but putting that note below each animated GIF would IMHO be really redundant; extending the MediaWiki preview so it detects whether a browser is capable or not and displays an appropriate message within the preview itself, would be a much better option. — Dsimic (talk | contribs) 13:24, 30 January 2015 (UTC)
That's right. However sometimes those coffee makers may have the option of milk foaming switched off. Anyway I don't think Wikipedia servers should determine whether the option is present and on, it should be enough to add the 'Animation' word in the caption: 'Animation of the quicksort algorithm' or 'Animated visualization of the quicksort algorithm'. Something like that, a bit redundant, but informative in special cases. --CiaPan (talk) 13:33, 30 January 2015 (UTC)
Well, I've never thought of disabling animated GIFs in a browser; Flash, yes, but animated GIFs, no. On second thought, saying "Animated visualization of..." in a caption would be a good thing to do. — Dsimic (talk | contribs) 14:05, 30 January 2015 (UTC)

Actually I did miss it. :) I hate distractions on web pages, so had long since disabled animations and flash, etc... Today I noticed the edit on the main page, and it's a wise one. That said, the really helpful thing was the RosettaCode suggestion in the first response here. Thanks for that. There is no substitute for being able to watch the code output, tweak the code, learn from changing (including breaking) it.

I understand that this may be something difficult in the context of Wikipedia which strives for a very scholarly presentation, but the danger is that it may leave ONLY an erudite scholar able to understand it! Which strongly defeats the point, no? Many of Wikipedia's pages on advanced technical subjects have this problem, which I usually resolve by looking for practical examples (Great Circle derivations for accurate speed from GPS, or phase modulation synthesis are two examples I have worked on, where Wikipedia was mainly useful for its links rather than its own content). As in this case, examples of practical study really help, nearly every time, where beautifully presented maths nearly always fails. I learned to write any text guidance on my own work, as I go along, as I learn, not after the fact. (I wish I had been that wise in school! :)

I think the moral of the tale is: Wikipedia would do well to forge stronger links with applied examples. Given that RosettaCode seems to use Wikipedia software, it appears that much of the required work may already be done. Forging links with StackOverflow would help a great deal too in the context of computing (and StackExchange cover many other studies in a very practical way). One danger that anyone may face, is that while agonising over complexity (like how and why and when and where a certain sort algorithm is fast when new to the subject and needing to make a quick decision), they may be too deeply involved to notice that a GIF was meant to be animated. :) No single quicksort example is perfect, so no matter how carefully the 'pseudocode' is written in the main page, it still opens up more questions than it answers, and to solve that in any situation, a person MUST be able to test actual code directly, so I think, as do many people on Stackoverflow, that providing an easy access to complete working examples is important, and a link might best be placed as prominently as that GIF image, for best effect. :) Coming back to the Wikipedia article is likely to make better sense after trying at least one of those, so it's probably in Wikipedia's interests to do it.86.147.255.80 (talk) 03:35, 1 February 2015 (UTC)

This is not a direct answer, but a defense of the use pseudocode that I've been wanting to write for some time.
I try to write pseudocode so that the algorithms are actually implementable. That's what most algorithms textbooks do, and they've worked fine for generations of CS students, not just "erudite scholars". (It's harder on the absolute beginner who hasn't mastered their programming language, but an experienced programmer can translate good pseudocode without too much trouble.)
I started replacing code snippets by pseudocode after I'd tried, several times, to run example code coming from WP and finding it broken. Too many people think they can program and start "fixing" (breaking) the code, "optimizing" (obfuscating) it or translating it into their favorite programming language (cluttering the page). Since we don't have unit test suites, edits to code have to be checked for correctness by editors. The experience is that this is much easier to do with pseudocode than with actual code, if only because checking editors need not be familiar with any programming language's syntactic and semantic quirks.
The Wikipedia model just doesn't work for code, even aside from the fact that Wikipedia is a reference, not a howto. QVVERTYVS (hm?) 12:31, 1 February 2015 (UTC)
Totally agreed with QVVERTYVS regarding code vs. pseudocode. At the same time, if an article starts with a C implementation, for example, someone else will eventually add a Java implementation, then there comes Python a bit later, etc. and the whole thing turns slowly into a source code repository, multiplying the issues. In other words, using pseudocode also leaves no language favored over the others. — Dsimic (talk | contribs) 14:36, 1 February 2015 (UTC)
I agree too, what you both say makes good sense, but there is a good case for placing a direct and prominent link to a repository of code, no? The difference is that if you know the subject well, you are in a better place than the newcomer visiting Wikipedia, to know which repository is likely to be of greatest use. This will also make people less likely to bring snippets back here if they know that Wikipedia appears to know of them already. I had already been looking around Stackoverflow, but the discussion there is usually between experts, on subtleties like choosing a 'pivot' in this specific case. Not one of them mentioned RosettaCode. :) Very few of them provide a complete example, even though they often upvote those who do! I found disparate structures for Quicksort too, in various places, some do the partition in the main function with a secondary function doing nothing but swapping values, others (as with the pseudocode here) do most of the work in a secondary function. Even without resorting to actual code, the pseudocode itself only describes one of these possible structures. If alternatives are viable, maybe there should be some pseudocode for them too. For myself, I'm looking directly at the base principle of putting low values one side, high values the other, of a movable 'pivot' and being willing to reinvent every 'wheel' from that point on just so I know I've exhausted the good possibilities, and understood why to eliminate any bad ones. I am ok with that, it's thorough and will teach me a lot, but having to do that is partly a result of not being able to trust that the presented examples I find, pseudocode or otherwise, have already eliminated the worst or found the best. It is also a result of my certainty that if I do not do this, I probably will not understand many assertions made in the Wikipedia test either. ANYTHING that can ease this task is worth considering, and placing links to a source of likely good external code is a small but vital thing that you can do to help others who otherwise have to guess on their own till they do their own hard work. As I say, I'm ok with doing it, but I thought the point of a good reference was to reduce the wasted time. :) 86.144.135.179 (talk) 17:16, 1 February 2015 (UTC)
Partitioning in the main routine or a subroutine is a matter of implementation/presentation and anyone with basic programming skills can determine to inline such routines or create new ones. Demanding we discuss all the cosmetic variations is like requiring the article noun to discuss fonts in which nouns can be printed. QVVERTYVS (hm?) 17:43, 1 February 2015 (UTC)
True, IF they are merely cosmetic. I'm not convinced they are, and I think people who know a lot more than me will dispute it too. My view probably counts for nothing, but I think the calling of a simple swap-values routine as secondary function is far more elegant and easy to follow (and a simple swap-values routine is more likely useful in external contexts too, so is a better division of labour!). To me it looks like there is an easy way, and a hard way, and Wikipedia shows the hard way. As I mentioned in a much earlier posting here, I usually find useful, comprehensible answers elsewhere, more often than not. Unless I am EXTREMELY unconventional, you may wisely assume I am not alone. Something inherent to Wikipedia's methods seems to make this happen. I don't know enough to try to explain it (except perhaps that assuming certain conventions and prior teaching is mandatory before visiting here, itself a BAD assumption), but it seems to me that the fact needs exploring, by those who directly affect how Wikipedia operates. I'm taking the time to express an outsider's point of view. You may value it or discount it as you wish, but you're probably lucky you're getting one at all. 86.144.135.179 (talk) 18:18, 1 February 2015 (UTC)
"I think people who know a lot more than me will dispute it too" is one of the weakest arguments from authority that I've ever seen. Partitioning in quicksort or separately is a cosmetic issue. I don't see how it represents "the hard way" of doing quicksort. It's just a presentation device: we can put narrative in between the routines and separate discussion of the two tricky parts of quicksort (partitioning and recursion).
Weak it may be, but I'm not familiar with Wikipedia's ways as you are, so what's your excuse? :) I've been looking at Wikipedia is a reference, not a howto and seen that Wikipedia is not a science journal, either, and that no article should start out with assumptions of reader's familiarity. When you suggest that any 'experienced programmer who has mastered their language' will understand, as if my lack of grasp is entirely my fault, you make a mistake that Wikipedia directly advise you not to make. So sure, add that narrative you mentioned. Also, a revision of the opening paragraphs would help. The terminology currently there makes exactly the kind of assumptions of a reader than the Wikipedia page you cited says it should not do. It gets clearer later, but the point of entry is the page top, not the middle. :) I find that Wikipedia has been less than helpful, too often, leaving me to get clearer guidance from working examples that sometimes take a lot of effort to find, having to make my own best decision about which unfamiliar example is most useful. Despite this obstacle, it is OFTEN easier than wading though Wikipedia's technical presentations. And Wikipedia's own guidelines indicate why so many articles fail to help a newcomer, apparently. Please read them before you hit newcomers with them! You can hit us with the rulebook when we start erroneously editing the main pages, not here.
We seem to have a disagreement on how detailed an encyclopedic presentation should be. My point is that not every article about an algorithm can explain basic programming from the bottom up, so we have make some minimal assumptions and leave things to the reader's imagination. You're absolutely right about the lede paragraphs, though, as it presumes the reader is familiar with big-O notation. I'll have a go at it, and so can you. QVVERTYVS (hm?) 10:28, 2 February 2015 (UTC)
Thankyou. Likewise, I could do a LOT worse than learn this notation, it's not the first time I've seen it, I just never learned it because it never seemed as indispensible as Ohm's and Watt's laws in electronics... I tend to go with empirical testing to gauge code speed, rather than plan to meet an ideal case, but the O notation looks useful in idealised comparisons.
I think I'm still too new to this to attempt to edit the main point of entry. :) Even so, I did try to write a paragraph that says the sort of thing that I think might have helped me most, in plain text: "Quicksort divides a list, usually at midpoint, and swaps that value with the one at the far end. It then runs from the near end looking for a value higher (or lower if reversing sort order) than the new far end value. If it finds one, it swaps it with the near end value, using the next nearest for the next swap it needs to make. It continues until it finds the last value, which it always swaps, marking a boundary between values higher or lower than itself. For each side, the function calls itself until there is no more work to do. Values may be swapped more than once, and equal values may end up reversed (the sort is not 'stable'), but in efficient functions it is much faster than comparing every value with every other value, hence the name." If any of that is useful, have at it. The one caveat I can think of is that this description does not match exactly with the both-ends iteration implied by the diagram, but I think the method that diverges from the start end is easier to imagine, and uses less steps to do its stuff. 86.144.135.179 (talk) 13:05, 2 February 2015 (UTC)
Re: swap routines, who says we're not calling one? The code just says "swap A[pivotIndex] and A[hi]". You can implement that as a subroutine, use one from the library (C++ has std::swap) or use the appropriate idiom in your programming language (Python has A[pivotIndex], A[hi] = A[hi], A[pivotIndex]). QVVERTYVS (hm?) 23:41, 1 February 2015 (UTC)
My point was that two functions will work, you now seem to need three! You can do this with only the swap in the second function, the main one can hold everything else. Like this:
void Exchange(int *A,int X,int Y){int Z=A[X]; A[X]=A[Y]; A[Y]=Z;}

void FastSort(int *A,int S,int E){
  if(E>S){int T=(S+E)/2,V=A[T],C,I,X;
    Exchange(A,T,E);
    I=C=S; while(I<E){
      if(A[I]>V)Exchange(A,I,C++); I++;
    }
    Exchange(A,C,E);
    FastSort(A,S,C-1); FastSort(A,C+1,E);
  }
}
86.144.135.179 (talk) 02:21, 2 February 2015 (UTC)
Of course you can. But then you don't have a separate partition function, which you can reuse in introsort, quickselect or quickselsort, and it's harder to explain later on that the partition function can be replaced by a different one to handle repeated keys gracefully. QVVERTYVS (hm?) 10:28, 2 February 2015 (UTC)
Ok, I'll concede that. Unreusable code was what I was trying to avoid, but if it's got a context for reuse, all's good. 86.144.135.179 (talk) 13:05, 2 February 2015 (UTC)
86.144.135.179, I know what you're talking about, but it isn't rare that even including links to different implementations into "External links" sections leads to an overgrown selection. For example, in the Bloom filter article we had almost 20 entries in its "External links" section, each linking to an implementation in another language, what really made a mess of it and went against WP:LINKFARM. And that list of implementations just kept growing over time. :) — Dsimic (talk | contribs) 17:52, 1 February 2015 (UTC)
Fair enough, but what about the specif of RosettaCode? Two things seem to commend it: 1, it uses Wikipedia web code, so some useful compatibility likely exists. 2, it's got many examples neatly in one place, so you only need one link. For bang per buck, it looks hard to beat, and likely easier to justify the link too. 86.144.135.179 (talk) 18:14, 1 February 2015 (UTC)
Totally makes sense, went ahead and added it to the Quicksort § External links section. A single external link could hardly fit better or produce more bang for a buck. :) — Dsimic (talk | contribs) 03:43, 2 February 2015 (UTC)
Thanks for that. :) The C example is actually nice, very. I may be missing something, because I'm still new to this, but the animated diagram (beautiful though it is, my having now seen it properly), and many examples I found roaming at large, suggest a concurrent iteration from each end, converging. This require more tests, more loops, etc. The one I found (modified to help me grasp it better, cited above on this page, starts with convergence, only moves the low end up when business results, and is actually a lot easier to grasp than the both-ends-at-once-converging interation implied by the diagram. Whether this influences the main page here is not a decision I will risk taking, but I have to say that this worked for me. +1 for RosettaCode. So also, thanks to Qwertyus whose hint first got me there. 86.144.135.179 (talk) 03:54, 2 February 2015 (UTC)

RosettaCode EL

Hey Glrx, regarding your preference not to have RosettaCode page as an entry in the "External links" section, please read the discussion above. I really don't see why having that external link would be such a bad thing, despite the fact there's already a Wikibook page providing different implementations? It shouldn't hurt, unless I'm missing something. — Dsimic (talk | contribs) 17:02, 7 February 2015 (UTC)

Follow the wikibook links, and you will reach lots of implementations for QS and other sorting algorithms; the link for RC belongs there; if somebody is interested in programming details, that should be the place to go. WP is not intended to be a programming tutorial or guide. WP should not put multiple imps in the text; it is not code repository. The purpose of ELs are to cover what the article is currently missing or provide more details of encyclopedic value; multiple imps don't have much encyclopedic value; it says the same thing over and over ad nauseum; an encyclopedia is not intended to cover a topic in all possible detail. See WP:ELNO #1 and #12 (maybe RC has enough editors to pass the wiki test, but what is it saying that is not already in the article?). Glrx (talk) 17:29, 7 February 2015 (UTC)
I was already very well aware of the guidelines for external links, and I still think you're treating the RosettaCode page too strictly. If we were to apply all those guidelines to their last words, I'd say that more than 50% of external links in all articles could be deleted. Regarding what is RosettaCode page providing that the article doesn't already have, well, obviously implementations in different programming languages. However, if you insist I'm fine with leaving out the RosettaCode page. — Dsimic (talk | contribs) 17:44, 7 February 2015 (UTC)

Infobox animation description

Hello, CiaPan! Regarding this edit, please watch the animation – pivotal value isn't always "the rightmost value" so noting that is somewhat misleading. It is the case only in the beginning of the animation. — Dsimic (talk | contribs) 09:51, 2 March 2015 (UTC)

Animation too fast

The animation given here is way too fast. It does not really enable the reader to have a clearer understanding of the algorithm. It would be wonderful is it could be slowed down if possible. — Preceding unsigned comment added by Vinay93 (talkcontribs) 07:40, 3 May 2015 (UTC)

qsort

qsort should not redirect to this page, but rather something C-related as it is part of the standard library. Whenever qsort is used, it is always used in reference to the C-implementation which may not necessarily implement the Quicksort algorithm. To any extent; this should at least be mentioned.

— Preceding unsigned comment added by 88.88.94.133 (talk) 14:20, 31 January 2012 (UTC)
and modified in next 8 minutes by the same anonymous user.

Error in code

The code added in Sytelus's version from 3 August 2015, section Hoare partitioning scheme is not a Hoare's algorithm. It just resembles that of Hoare, at best.

Even worse, it is incorrect. --CiaPan (talk) 10:07, 5 August 2015 (UTC)


Whoops, the code has been changed by Qwertyus on 5 August 2015. Alas that one is wrong, too. They fail on different input data, anyway they both fail. --CiaPan (talk) 10:20, 5 August 2015 (UTC)


Ping: @Sytelus and Qwertyus:. --CiaPan (talk) 11:50, 5 August 2015 (UTC)

I only changed the pseudocode cosmetically, or so I thought, but you may be right. I just translated the Hoare partition function into Python and it goes into an infinite loop. It also doesn't match the Hoare partition discussed by Cormen et al. (3e, p. 185):
Hoare-Partition(A, p, r)
    x = A[p]
    i = p - 1
    j = r + 1
    while TRUE
        repeat
            j = j - 1
        until A[j] ≤ x
        repeat
            i = i + 1
        until A[i] ≥ x
        if i < j
            exchange A[i] with A[j]
        else return j
Note the additional swap in our version, and the lack of (hi+1) when initializing j. QVVERTYVS (hm?) 11:56, 5 August 2015 (UTC)
I reverted my edit since it changes the algorithm; the do-while loops execute at least once, but my while loops don't. QVVERTYVS (hm?) 11:59, 5 August 2015 (UTC)
After edit-conflict:
The Sytelus's code fails for example on the array [7, 5]. First pivot becomes 5. The first do-while loop increments i and exits, because 7 is not less than 5. The second loop decrements j, then finds A[lo] greater than pivot, decrements again and ...tests the item which is outside (before, strictly speaking) the range being partitioned, because i == lo-1.
Your modified version fails to increment i if the first (leftmost) item of the array is greater than, or equal to, pivot. As a result it will later similary swap item A[i] which is before the array. In addition, if all items from lo till hi-1 are greater than the last one, the second loop will continue decrementing j until it reaches value of lo, and in the next iteration it will compare item which is outside the array... :( --CiaPan (talk) 12:12, 5 August 2015 (UTC)
Fixed:
The code was from Sebastian Wild's thesis, Page 119, Listing 4. I also confirmed that code is wrong and fails to maintain invariants. The reason I'd used this version as opposed to CLRS version was that this code required recursion on [lo..p) and (p..hi] partitions, i.e., same as Lomuto's version. The CLRS version requires recursing on [lo..p] and (p..hi] instead. But I don't see any other choice that can be used with good quality citation so I've reverted the change to use CLRS version as well as have both schemes use their own quicksort() function. I think this is good in long run because in future if new schemes are added then they are not forced to use same quicksort() function as Lomuto's. Thanks for pointing out this error. I'll also contact Sebastian Wild about the error in his thesis. — Preceding unsigned comment added by Sytelus (talkcontribs) 01:45, 7 August 2015 (UTC)
Ok, but now we have two versions of the main routine that only differ in their convention for handling the last index (inclusive/exclusive). Surely that's not necessary. QVVERTYVS (hm?) 09:03, 7 August 2015 (UTC)
As far as I can see the current version of the Hoare partition is not working at all - tests result in an infinite loop. In my tests the version discussed in Sebastian Wild's thesis, Page 119, Listing 4 works as intended. TS 138.246.2.232 (talk) 13:55, 12 October 2015 (UTC)
The current revision, originally made on 23:33, 27 September 2015, seems to be fixed now. Rcgldr (talk) 11:52, 19 November 2015 (UTC)

Pseudocode notation

Hey, Qwertyus! Sorry for reverting your edit, as I've noted in my edit summary we should simply let such complaints go by. IMHO, := is much more readable than and I really don't see the need for using an inferior notation. Hope you agree. — Dsimic (talk | contribs) 22:47, 5 June 2015 (UTC)

I don't see why := is superior to ←. Both are used in the literature and in actual programming languages (R uses <-), but in mathematics, := denotes a definition. I'm trying to avoid that meaning and find ← a good substitute. QVVERTYVS (hm?) 08:43, 6 June 2015 (UTC)
To me, := is simply more readable than , which is much easier to overlook when glancing over pseudocode, and the pseudocode is all about good readability. Also, as we know Pascal uses := as the assignment operator, which qualifies it for such use. — Dsimic (talk | contribs) 21:55, 6 June 2015 (UTC)
(ec) ...and it's not just Qwertyus' invention – see for example https://www.cs.rochester.edu/~gildea/csc282/slides/C07-quicksort.pdf page 4 or http://www.mif.vu.lt/~valdas/ALGORITMAI/LITERATURA/Cormen/Cormen.pdf page 25. --CiaPan (talk) 22:05, 6 June 2015 (UTC)
I've never called to be a Qwertyus' invention, I've just called := more readable. :) — Dsimic (talk | contribs) 22:19, 6 June 2015 (UTC)
Well, I didn't mean literary Qwertyus' invention. It was rather a fig. 'this is nothing new', as Qwertyus' was not the first one, some known authors (like Cormen) have used the notation before. BTW, I am quite familiar with both symbols ( and :=) and can read them with no trouble, but it's a personal point of view which notation is 'more readable' so I'm not going to advocate on either side :) --CiaPan (talk) 06:59, 24 June 2015 (UTC)
Agreed, the whole thing depends on one's point of view. — Dsimic (talk | contribs) 07:02, 24 June 2015 (UTC)

Fylwind just changed the whole thing to Python. There are several reasons for not doing this:

  • Python code has several constructs that are not familiar to, say, novice Java programmers, among them the notation for swapping (i, j = j, i.
  • Actual code tends to attract edits that "optimize" the code for some metric or other, making it less readable.
  • We can't change the algorithm to use non-Python control structures such as a do-while loop, which hurt readability.

So I suggest we keep the pseudocode unless there is consensus not to do that. QVVERTYVS (hm?) 10:33, 29 November 2015 (UTC)

Statement in relative efficiency between Lomuto and Hoare is a null-op

Hi, the statement that: "... and the next two segments that the main algorithm recurs on are [lo..p] and (p..hi] as opposed to [lo..p) and (p..hi] as in Lomuto's scheme" ... is a null statement: The two 'segment' descriptions are in fact identical, not contrasting. The statement is actually confusing to the reader. Despite studying Algorithmics I'm not sure what the correct expression is that's a contrast to Hoare in Lomuto's scheme that the writer is trying to convey, or I'd edit it myself. Needless to say, it does need to be fixed by someone familiar with Lomuto's scheme and how it should be described as differing from Hoare. Bladeborn comment added 00:50, 6 March 2016 (UTC)

wrong 'fix' in Hoare's method

Just reverted the special:diff/748314999 edit by 106.187.101.74. The modified code fails e.g. on array like A = [3, 1, 2]. Let's see how the 'partition' routine proceeds with it.

Suppose array indices range from 1 to 3. First, A[1] = 3 is chosen for pivot. The first for loop stops at i=1 and the second one at j=3. Items A[1] and A[3] are swapped, so A = [2, 1, 3], and algorithm starts a new iteration of loop forever.

Now the first for skips A[2] < pivot and stops at i=3, then the second for decrements j from 3 to 2 and stops, because A[2] = 1 is not greater than pivot. Now i >= j and the 'partition' routine returns 2, the j value.

The modified 'quicksort' routine would then handle separately a single-item part of A from lo = 1 to p–1 = 1 and a single-item part from p+1 = 3 to hi = 3, eventually leaving the unsorted array A = [2, 1, 3]. --CiaPan (talk) 19:20, 7 November 2016 (UTC)

Please provide a reference for the changed algorithm. It does not match the current reference, Introduction to Algorithms, which gives the algorithm in the form before the change. There may be other versions of the algorithm in that book. StarryGrandma (talk) 21:14, 7 November 2016 (UTC)
The problem was not in the sort algorithm. The problem is in the partition algorithm. Someone changed i = low - 1 to i = lo at some point. I'm changing the whole thing back to match the original editor's entry. You are right. The algorithm as changed fails. StarryGrandma (talk) 21:47, 7 November 2016 (UTC)
Plus I was looking a the wrong algorithm. I was looking at and changed the Lomuto partition, not the Hoare. Please check that this all makes sense. StarryGrandma (talk) 22:06, 7 November 2016 (UTC)

Three-way partitioning has O(n) on best case performance

The info box states that three-way partitioning is O(n) on best case performance but there is no further discussion about it, nor reference. — Preceding unsigned comment added by Dmelo87 (talkcontribs) 18:01, 18 September 2016 (UTC)

By inspection. Sort an array whose keys are all the same. The 3-way partition chooses the only key as the partition value and throws all entries into the middle (fat) partition. Low and high partitions are empty, so there is nothing left to recursively sort. The sort is over after Ω(n) comparisons. Glrx (talk) 18:30, 18 September 2016 (UTC)
That's true – however, inspection is a kind of an original research. Another possible reasoning is that 'any routine which essentially depends on n pieces of data, can not run in a time shorter than O(n)' and sorting must at least inspect every piece of data to make sure they are sorted – but that might also be a kind of original research for 'an average' reader, even though it may be pretty obvious for professionals. --CiaPan (talk) 06:03, 19 September 2016 (UTC)
See WP:CALC. Your argument suggests that all sorting algs are Ω(n), but that bound is not tight. Ordinary quicksort and mergesort are also Ω(n log n). Glrx (talk) 02:14, 22 September 2016 (UTC)
@Glrx: My argument suggests nothing like that; I haven't used Ω symbol and I explicitly said 'at least', which means 'or more', not 'exactly'. --CiaPan (talk) 21:07, 23 November 2016 (UTC)
Ω says bounded below -- the bound cannot be beaten. See Big O notation. Your statement is about a lower bound, so Ω is the appropriate complexity measure. Using a qualifier like "at least" with an upper bound notation such as O is confusing. Think about what "Cannot run in a time shorter than O(n)" means. An algorithm with a tight O(n) bound will take k*n steps to complete on some inputs; it may finish in k steps on some other inputs. Glrx (talk) 22:52, 23 November 2016 (UTC)

Hoare partition scheme

The reference for the Hoare partition scheme is written in terms of indices into an array, not programming code starting at zero. So it is incorrect to say "Where lo is index [0] and hi is index [n-1] for an n element array:". StarryGrandma (talk) 00:20, 2 March 2017 (UTC)

 Fixed Fixed by StarryGrandma on 02 March 2017 in Special:Diff/768134661. --CiaPan (talk) 13:30, 10 April 2017 (UTC)

External links modified

Hello fellow Wikipedians,

I have just modified one external link on Quicksort. Please take a moment to review my edit. If you have any questions, or need the bot to ignore the links, or the page altogether, please visit this simple FaQ for additional information. I made the following changes:

When you have finished reviewing my changes, you may follow the instructions on the template below to fix any issues with the URLs.

This message was posted before February 2018. After February 2018, "External links modified" talk page sections are no longer generated or monitored by InternetArchiveBot. No special action is required regarding these talk page notices, other than regular verification using the archive tool instructions below. Editors have permission to delete these "External links modified" talk page sections if they want to de-clutter talk pages, but see the RfC before doing mass systematic removals. This message is updated dynamically through the template {{source check}} (last update: 18 January 2022).

  • If you have discovered URLs which were erroneously considered dead by the bot, you can report them with this tool.
  • If you found an error with any archives or the URLs themselves, you can fix them with this tool.

Cheers.—InternetArchiveBot (Report bug) 16:30, 17 September 2017 (UTC)

Bug in Partition

I implemented partition as presently suggested:

algorithm partition(A, lo, hi) is
    pivot := A[hi]
    i := lo - 1    
    for j := lo to hi - 1 do
        if A[j] < pivot then
            i := i + 1
            swap A[i] with A[j]
    if A[hi] < A[i + 1] then
        swap A[i + 1] with A[hi]
    return i + 1

In python. Here is the ugly implementation:

 def partition_broken(arr, lo, hi):
       pivot = arr[hi]
       print "starting partition.  pivot is arr[%d]=%d" % (hi, pivot)
       i = lo - 1
       for j in range(lo, hi):
               print "i=%d, j=%d" % (i, j), arr
               if arr[j] < pivot:
                       print "arr[j] < pivot, swapping"
                       i = i + 1
                       arr[i], arr[j] = arr[j], arr[i]
       if arr[hi] < arr[i + 1]:
               print "arr[%d]=%d < arr[%d]=%d; swapping" % (hi, arr[hi], i + 1, arr[i+1])
               arr[i + 1], arr[hi] = arr[hi], arr[i + 1]
       print arr
       return i + 1
 partition_broken([1, 2, 1, 3, 1, 4, 1], 0, 6)

Here is what happens when you run this code:

  starting partition.  pivot is arr[6]=1
  i=-1, j=0 [1, 2, 1, 3, 1, 4, 1]
  i=-1, j=1 [1, 2, 1, 3, 1, 4, 1]
  i=-1, j=2 [1, 2, 1, 3, 1, 4, 1]
  i=-1, j=3 [1, 2, 1, 3, 1, 4, 1]
  i=-1, j=4 [1, 2, 1, 3, 1, 4, 1]
  i=-1, j=5 [1, 2, 1, 3, 1, 4, 1]
  [1, 2, 1, 3, 1, 4, 1]

The problem seems to stem from the if arr[j] < pivot: which should be if arr[j] <= pivot:. Restoring the <= seems to make the algorithm work properly:

  starting partition.  pivot is arr[6]=1
  i=-1 j=0 [1, 2, 1, 3, 1, 4, 1]
  arr[j] <= pivot, swapping
  i=0 j=1 [1, 2, 1, 3, 1, 4, 1]
  i=0 j=2 [1, 2, 1, 3, 1, 4, 1]
  arr[j] <= pivot, swapping
  i=1 j=3 [1, 1, 2, 3, 1, 4, 1]
  i=1 j=4 [1, 1, 2, 3, 1, 4, 1]
  arr[j] <= pivot, swapping
  i=2 j=5 [1, 1, 1, 3, 2, 4, 1]
  arr[6]=1 < arr[3]=3; swapping
  [1, 1, 1, 1, 2, 4, 3]

Dbprice (talk) 19:08, 4 August 2017 (UTC)

@Dbprice: This isn't a bug. When the pivot is 1, and the first element is the same as the pivot, and all elements are >= 1, then the array is already partitioned. Testing for < or <= are both valid, but it changes the interpretation of the partition point. It is either the last element in the lower partition (if using <=), or it is the first element in the upper partition (if using <). -- Drummist180 (talk) 15:04, 24 November 2017 (UTC)

Intro paragraph

The first paragraph of the article includes this statement: "Quicksort can operate in-place on an array, requiring small additional amounts of memory to perform the sorting.", but the best case amount of stack space used is log2(n) stack frames for the recursive calls, and the worst case is (n-1) stack frames. Rcgldr (talk) 16:51, 9 December 2017 (UTC)

That's correct: the algorithm can operate in-place on an array, it does not require to copy data from an array elsewhere to put them later back to the original array (except making a copy of a single item during the swap step).
And a logarithmic size of stack data actually is small amount of additional memory. Be aware that sorting five or ten items is a matter of school excersises. In real applications one often needs to sort thousands or even hundreds thousand items – then allocating 14-level stack to handle sorting of 10,000-items' array really IS a small amount...
And the worst case of (n-1) leves of stack happens only in most primitive implementations. A simple cure for it, published by Robert Sedgewick, is described in the #Space complexity section of the article. --CiaPan (talk) 17:25, 9 December 2017 (UTC)
I can read Rcgldr's comment as the naive quicksort is not an in-place algorithm. The narrow notion of in-place is O(1) space, but the relaxed notion of in place is satisfied by O(log n) additional space for something O(n). So naive quicksort is not in place but Sedgewick variation is in place. The phrase "can operate" sort of saves it, but a better statement would be QS can be an in-place algorithm in the best case when it uses log(n) stack frames. Glrx (talk) 22:05, 20 December 2017 (UTC)
What is a 'naive quicksort'? How does it differ from the quicksort, as defined by T.Hoare? --CiaPan (talk) 23:16, 20 December 2017 (UTC)
Here, Hoare/non-Sedgewick. (Modern imps should also do fat partition.) Glrx (talk) 18:32, 21 December 2017 (UTC)

Cole-Kandathil's algorithm

The (currently) last subsection is called "generalization," but it does not explain in what sense this algorithm family generalizes quicksort (as implied by the title). AmirOnWiki (talk) 09:40, 30 March 2018 (UTC)

The section is titled "Relation to other algorithms", which seems to justify inclusion of this algorithm. I agree the subsection "generalisations" is a bit misleading. Perhaps it should be changed to "partition sorts", which seems to be the name applied by the source. Kleuske (talk) 09:45, 30 March 2018 (UTC)

Bug in Lomuto Partition

There are two 3 errors in the Lomuto Partition:

algorithm partition(A, lo, hi) is
    pivot := A[hi]
    i := lo     // place for swapping
    for j := lo to hi – 1 do
        if A[j] ≤ pivot then
            swap A[i] with A[j]           
            i := i + 1
    swap A[i] with A[hi]
    return i

i := lo should be i := lo -1

and

swap A[i] with A[j]
i := i +1

should be

i := i +1
swap A[i] with A[j]

and

return i

should be

return i+1

I'm going to make the edit, but I'm leaving this comment first in case the edit is controversial. Also, I'm not particularly fond of the pseudo code. I'm keeping it mostly the same since that's how I found it, and it's used elsewhere in the Article. However I am changing the ordering to look more like the example in "Introduction to Algorithms" (p.146), which I think reads better. --FuturePrefect (talk) 09:27, 27 November 2016 (UTC)

Why don't you change it? I think your comment is right, but it can be improved:
for j := lo to hi – 1 do
should be changed to:
for j := lo+1 to hi do
swap A[i] with A[j]
i := i +1
should be
i := i +1
swap A[i] with A[j]
everything else should be left unchanged then. --Walkerlala (talk) 15:54, 12 February 2017 (UTC)


@FuturePrefect: WHY dou you think these are errors? Could you, please, show an example array, which would not be correctly partitioned with the algorithm presented in the article, and how your change woild fix the incorrect partitioning? --CiaPan (talk) 18:23, 12 February 2017 (UTC)
@CiaPan: consider the corner case where there are only two sorted elements: [4, 5] -- Walkerlala (talk) 23:07, 12 February 2017 (UTC)
@FuturePrefect: I think we should return i; rather than return i+1 because i is the position where the pivot finally live, which is what `partition()' should return. -- Walkerlala (talk) 04:24, 13 February 2017 (UTC)

This is still not correct:

algorithm partition(A, lo, hi) is
    pivot := A[hi]
    i := lo - 1    
    for j := lo to hi - 1 do
        if A[j] ≤ pivot then
            i := i + 1
            swap A[i] with A[j]
    swap A[i] with A[hi]
    return i

Consider the case that A[lo..hi-1]=1 and A[hi]=2. By the end of the j-loop, A[lo..hi-1] will have been swapped with themselves and i will be equal to hi-1. But then the final swap puts the pivot (2) into A[hi-1] with a 1 above it. What should actually happen is that the pivot is swapped into A[i+1] and i+1 is returned. The loop invariant is that A[lo..i] ≤ pivot; this is violated by the final swap if A[i] < pivot. McKay (talk) 07:18, 13 February 2017 (UTC)

@McKay: yes you are right. I just miss that corner case.

But, then what is true implementation of Lomuto Partition? I know a seemingly correct implementation:

algorithm partition(A, lo, hi) is
    pivot := A[lo]
    i := lo    
    for j := lo+1 to hi do
        if A[j] ≤ pivot then
            i := i + 1
            swap A[i] with A[j]
    swap A[i] with A[lo]
    return i

which can handle the corner cases [3, 4] and [1, 1, 1, 2]. But, as someone suggest(in the revision history), pivot selecting is arbitrary and historically people use the last element. So I guess, we should change the implementation to this:

algorithm partition(A, lo, hi) is
    pivot := A[hi]
    i := hi    
    for j := hi - 1 to lo do
        if A[j] > pivot then
            i := i - 1
            swap A[i] with A[j]
    swap A[i] with A[hi]
    return i

what do you think? --Walkerlala (talk) 09:33, 13 February 2017 (UTC)


All right, after consulting the literature, I think the original implementation should be:

algorithm partition(A, lo, hi) is
    pivot := A[hi]
    i := lo - 1    
    for j := lo to hi - 1 do
        if A[j] ≤ pivot then
            i := i + 1
            swap A[i] with A[j]
    swap A[i+1] with A[hi]
    return i + 1

-- Walkerlala (talk) —Preceding undated comment added 09:40, 13 February 2017 (UTC)

I agree with this final algorithm, which matches the reference. I think the problem began when someone marked i as "place for swapping", even though it was at that point a pointer to one before the place, and would be incremented before the swap. Then other editors back then started modifying the algorithm. StarryGrandma (talk) 22:31, 13 February 2017 (UTC)
@Walkerlala: Although I don't like this implementation, since it doesn't work when using unsigned indices, it is the version found in Introduction to Algorithms, and it is consistent with the text describing the algorithm, while the current code is not. In particular, values equal to the pivot are in the right partition, not the left partition. Also, the text implies that the current algo is a stable sort, but it is not, since the final swap will move the last-occurring instance of the pivot to the front of the right partition. Changing to the above suggested code would resolve these inconsistencies. -- Drummist180 (talk) 04:12, 25 November 2017 (UTC)
@Drummist180: The current text (oldid=810951039#Lomuto partition scheme) does not claim the algorithm stable.
And the proposed change can not 'resolve' the Quicksort's non-stability. --CiaPan (talk) 21:29, 25 November 2017 (UTC)
@CiaPan: Ah, yes. I misunderstood. Can we make the change, though? --Drummist180 (talk) 22:51, 25 November 2017 (UTC)

FWIW, I think that the current pseudocode for both algorithms could be made clearer by increasing the i values by one and adjusting accordingly. To me it seemed odd on first inspection that the very first value of i is set to minus one, only for it to be incremented before use. In the case of Hoare's, it would mean a switch from do-while to while which is also clearer IMO. I suppose this is a matter of style; it seems like people here have strong opinions and the battle has already been fought, but just wanted to add one more voice. --AndrewChinery (talk) 16:12, 14 April 2018 (UTC)

Suggestion: Add external link to quick sort implementation in 10 different languages with a widget for trying it out

Hi, I have a suggestion for a new external link.

This link contains an implementation of quick sort in 10 different programming languages. It also has a widget for running quick sort directly in the browser.

https://repo.progsbase.com/repoviewer/no.inductive.libraries/QuickSort/0.1.0/main.QuickSort/QuickSort/quickSortNumbersBounds

I think it is of value and interest to a reader of this page. — Preceding unsigned comment added by Martinfjohansen (talkcontribs) 11:35, 6 July 2018 (UTC)

Elegant quicksort *Musatov

  algorithm quiksort (A, lo, hi) is 
    if lo <hi then 
        p: = partition (A, lo, hi) 
        quicksort (A, lo, p - 1 
        )

  algorithm partition (A, lo, hi) is 
    pivot: = A [hi] 
    i: = lo      // place for swapping 
    for j: = lo to hi - 1 do 
        if A [j] p pivot then 
            swap A [i] with A [j]            
            i: = i + 1 
    swap A [i] with A [hi] return i

The array is arranged by summing a quicksort (A, 1, length (A)) . — Preceding unsigned comment added by Abuseslang (talkcontribs) 05:59, 10 October 2018 (UTC)

@Abuseslang: It's known as Lomuto algorithm, not Musatov; see Quicksort#Lomuto partition scheme section. Additionally, your code lacks one recursive call in quiksort and one comparision operator in partition. --CiaPan (talk) 06:17, 10 October 2018 (UTC)

Hoare's Algorithm

Why does the suggested algorithm not infinitely recourse if we choose a pivot of the last element with a sorted array as input?

algorithm quicksort(A, lo, hi) is
    if lo < hi then
        p := partition(A, lo, hi)
        quicksort(A, lo, p - 1)
        quicksort(A, p + 1, hi)

algorithm partition(A, lo, hi) is
    pivot := A[hi]
    i := lo
    for j := lo to hi - 1 do
        if A[j] < pivot then
            swap A[i] with A[j]
            i := i + 1
    swap A[i] with A[hi]
    return i

— Preceding unsigned comment added by Ilyakooo0 (talkcontribs) 14:49, 27 January 2019 (UTC)

@Ilyakooo0: The for loop in the partition routine scans the input array with the j variable and conditionally swaps its items. Whenever a swap occurs, the i variable is incremented. So, i is incremented at most the same number of times as j variable. Since they start from the same value of lo, eventually i ≤ j at the end of the routine. Right?
Now, the loop ends with j = hi, hence the result of the partition routine is lo ≤ i ≤ hi. And then the quicksort routine recurs on two parts of the array which do not contain the A[p] item, where p was substituted the value returned from partition. As a result the arrays processed on next level of recursion are always strictly shorter than the array on the previous level. This implies the size must eventually drop to 1 or even 0, which is the recursion's base case (see Recursion (computer science)#Recursive functions and algorithms), and then the recursion terminates, thanks to the condition in the first line of quicksort (see Recursion termination).
In the worst case, if p=lo or p=hi on each return from partition, we'll see at most hi-lo levels of recursion. In real applications this might be intolerably high, anyway it certainly is finite. --CiaPan (talk) 08:12, 28 January 2019 (UTC)

Terribly sorry. I meant to paste the other pseudocode.

algorithm quicksort(A, lo, hi) is
    if lo < hi then
        p := partition(A, lo, hi)
        quicksort(A, lo, p)
        quicksort(A, p + 1, hi)

algorithm partition(A, lo, hi) is
    pivot := A[(lo + hi) / 2]
    i := lo - 1
    j := hi + 1
    loop forever
        do
            i := i + 1
        while A[i] < pivot

        do
            j := j - 1
        while A[j] > pivot

        if i >= j then
            return j

        swap A[i] with A[j]

— Preceding unsigned comment added by Ilyakooo0 (talkcontribs) 22:14, 28 January 2019 (UTC)

  • It will infinitely recurse, repeatedly splitting the array into 0 elements, n elements. Rcgldr (talk) 07:48, 30 March 2019 (UTC)

repeated elements

Repeated elements only presents an issue for Lomuto partition scheme. For Hoare partition scheme, repeated elements can result in needless swapping of elements equal to the pivot, but the splitting will be closer to the ideal size of 1/2 and 1/2, and the overall effect of repeated elements with Hoare partition scheme reduces the running time, at least on the PC based benchmarks I have run. I'll wait for feedback before updating the article. Rcgldr (talk) 07:52, 30 March 2019 (UTC)

New external links

I would like to offer my page of animated illustrations about Quicksort (http://www.chrislaux.com/quicksort.html) for the external links section. I believe the animations explain Quicksort nicely visually to go with the text-focussed explanations here.

Ctlaux (talk) 19:00, 3 September 2019 (UTC)

@Ctlaux: Please see WP:External links and WP:Directory. Adding where you have a conflict of interest was not the best approach. — billinghurst sDrewth 23:52, 22 September 2019 (UTC)

Hoare's 1962 paper - mentions some things credited to Sedgewick

A link to a web page with a link to a pdf file of one of Hoare's 1962 papers can be found here (you don't have to pay or join a group to download this one).

https://academic.oup.com/comjnl/article/5/1/10/395338

The computers Hoare was working with didn't have a stack, so he essentially invented one which he called a "nest". Hoare mentions in order to limit "nest" (stack) size, only the smaller partition parameters would be pushed onto the "nest" while iteration would be used on the larger partition. The Wiki article currently credits this idea to Sedgewick in two places (optimizations, space complexity).Rcgldr (talk) 14:06, 23 September 2019 (UTC)

@Rcgldr: This description in your comment is wrong:
Hoare mentions in order to limit "nest" (stack) size, only the smaller partition parameters would be pushed onto the "nest" while iteration would be used on the larger partition
Such routine could lead to a stack holding O(N) single-item partitions in the worst case.
It's exactly the other way in the Hoare's text:
In order to ensure the number of segments postponed at any one time never exceeds the logarithm (base 2) of the number of items to be sorted it is sufficient to adopt the rule of always postponing the processing of the larger of the two segments.
– at each step the larger partition gets pushed to the stack and postponed, while the partitioning is iterated on the shorter one. As a result, at each recursion level we get at most half of the previous level's data, hence it's up to log(N) recursion levels (and used stack levels) possible before the partition size drops to 1. --CiaPan (talk) 20:51, 22 October 2019 (UTC), expanded 10:00, 23 October 2019 (UTC)

Hoare also mentions that pivot values need to be reasonably random (implies that choice of pivot may need to be randomized) in order to avoid O(n^2) time complexity. I don't see where the Wiki article credits Hoare for this idea.Rcgldr (talk) 14:06, 23 September 2019 (UTC)

Hoare's paper is missing some details. It mentions the scanning stops when the pointers cross, but the scanning also has to stop if the pointers meet (equal). It doesn't mention which pointer should be used to split the partitions, other than to guarantee that a split reduces partition size by at least 1 element (to prevent infinite loop). Rcgldr (talk) 14:06, 23 September 2019 (UTC)

Hoare mentions using sentinel values for keys (the comparison values) to stop the scans rather than bounds checking on pointers, but if the pivot is chosen from the middle, or at least not from either end of a sub-array, this isn't an issue, as either the pivot or a swapped value will stop either scan.Rcgldr (talk) 14:06, 23 September 2019 (UTC)

Hoare Partition Scheme - reverted psuedocode back to the prior scheme in the article

I reverted the pseudocode back to a prior version (one that had been in the wiki article for years), since that variation corresponds to most references for Hoare partition scheme (apparently Hoare's original algorithm?) and it has also been confirmed to work (while the version I reverted appeared to be off a bit from known working variations, and went through a couple of updates, and I don't know which variation of those updates is a known working implementation. Rcgldr (talk) 14:04, 22 September 2019 (UTC)

We work to cited versions, so please cite a version, and that version should be supported by references. Just saying that I am going back (something) is just problematic. Especially when you have left in other problematic edits, which were part of the undoing. — billinghurst sDrewth 23:50, 22 September 2019 (UTC)
@Billinghurst:My edit reverted the psuedocode back to what it was from 11:24, 3 August 2015‎ made by Sytelus, when the Hoare partition scheme pseudocode was first added to the article, through 05:20, 21 June 2019‎. I did not research the references for the pseudcode that was added back in August 2015, but it's been there for almost 4 years with no changes. The change from |do while| to |while| was made on 00:03, 28 June 2019‎ by Cosmic Cloud, with no cited references for that change. My feeling is that the psuedocode should be reverted back to its original version, before the unreferenced change made by Cosmic Cloud. Rcgldr (talk) 05:53, 23 September 2019 (UTC)
@Rajgopalbh4:Please read prior comment. The revision you made undoes the pseudocode that originates back to when Hoare partition scheme was first added to the article in August 3, 2015 to a change that was made almost 4 years later in June 28, 2019, by Cosmic Cloud with no cited references for that change. Rcgldr (talk) 06:29, 23 September 2019 (UTC)
  • The time and date should be 01:37, 7 August 2015 also made by‎ Sytelus, as that was the first version with a cited source: Introduction to Algorithms.[1] Rcgldr (talk) 20:37, 23 September 2019 (UTC) Rcgldr (talk) 12:36, 24 September 2019 (UTC)
    • @Rcgldr: Timestamps of specific edits can be sometimes difficult to use if a page was edited multiple times within a day, because users may have different timezones defined in their preferences and see different (local) times at the same edits. It may be most confusing if the edit in question was made early or very late, close to midnight, as not only an hour but also a date may appear different. The safest, and the unique, ID of an edit is its serial number. You can use it either as a link to a specific version of the page: Special:PermanentLink/674924601 or as a link to changes made in that version: Special:Diff/674924601. --CiaPan (talk) 09:35, 4 February 2020 (UTC)
@CiaPan: I wasn't aware that the timestamps could be changed from UTC or how to reference an edit by serial number. However in this case, I did wan't to establish a date as part of the timeline of changes (Hoare Partition scheme added, updated with cited reference, 4 year later change (not cited)). If I do this again, I'll mention it was a UTC time stamp and include the edit id. Rcgldr (talk) 14:57, 4 February 2020 (UTC)
  • The current version of the algorithm doesn't work. The pivot element will end up randomly on either side of the partition. It has to be "put aside" and swapped back in at the end, just like with the Lomuto partition scheme (see https://algs4.cs.princeton.edu/23quicksort/) --212.51.137.13 (talk) 14:34, 2 February 2020 (UTC)
It's not an issue, and it's already explained in that article's section: "In this scheme, the pivot's final location is not necessarily at the index that was returned, and the next two segments that the main algorithm recurs on are (lo..p) and (p+1..hi) as opposed to (lo..p-1) and (p+1..hi) as in Lomuto's scheme. However, the partitioning algorithm guarantees lo ≤ p < hi which implies both resulting partitions are non-empty, hence there's no risk of infinite recursion." The pivot and/or elements = pivot may not be placed into their final sorted position until partition size is reduced to 1, but the algorithm works. Rcgldr (talk) 17:52, 2 February 2020 (UTC)
(ec) If you say ‘the current version of the algorithm doesn't work’, please show an array which is not properly handled by the algorithm.
The point is we simply do not care where the pivot element lands at the end of the partitioning routine. By the algorithm principle, for the algorithm to be correct it's enough that both partitions are not empty each partition is shorter than the run being partitioned and each item in the left partition is not greater than each element of the right partition. The partitioning routine fulfills both requirements, hence the algorithm is correct. If you say otherwise, please support your claim with appropriate counterexample. --CiaPan (talk) 18:05, 2 February 2020 (UTC)
Fixed the above entry. It's actually not necessary that both partitions are non-empty, it's enough both are shorter than the array being partitioned. This can be achieved either by excluding a pivot element from further processing (which requires the pivot be placed in its final position as a side result of partitioning) or by keeping both partitions non-empty. The presented algorithm for Hoare's method goes the latter way, while the Lomuto's one the former.

References

  1. ^ Cormen, Thomas H.; Leiserson, Charles E.; Rivest, Ronald L.; Stein, Clifford (2009) [1990]. "Quicksort". Introduction to Algorithms (3rd ed.). MIT Press and McGraw-Hill. pp. 170–190. ISBN 0-262-03384-4.

Hoare partition scheme is incorrect

The while tests like (while A[i] < pivot) should be (while A[i] <= pivot).

  • It will never return with an array like: [1,3,2,4,5]
  • It will access invalid elements with an array like: [1,2,3,4,5]

— Preceding unsigned comment added by 179.186.127.46 (talk) 02:37, 10 February 2020 (UTC)

I'm afraid I can't see it. Can you, please, describe – step by step – the partitioning routine execution leading to accessing invalid element? --CiaPan (talk) 07:25, 10 February 2020 (UTC)
The usage of < or >= actually depends on how the do/while is written, so double checking it's actually correct, but the tests on Cormen's book code are exit expressions (>=), and in the Wikipedia version they are continue expressions (<). — Preceding unsigned comment added by 179.186.127.46 (talk) 14:38, 10 February 2020 (UTC)
And...? It's quite obvious that for different control structures you need different conditional expressions. The loop do instr while condition; is a control structure different from repeat instr until condition;, and they both differ from while condition do instr;.
But you didn't say the code can be written different way but rather you said it is incorrect – it will never return or it will access data at invalid index. Please demonstrate it actually is incorrect. That is, if implemented the way it is presented it will go wrong. Or point out a specific part of the pseudocode which is ambiguous, such that it can be reasonably translated into actual code in different ways, one/some of which do not work as intended. --CiaPan (talk) 16:02, 10 February 2020 (UTC), improved a bit 11:48, 11 February 2020 (UTC)

"quicksort" or "Quicksort"?

What should it be, "quicksort" or "Quicksort"?

The article currently uses both (is inconsistent).

--Mortense (talk) 22:51, 8 March 2020 (UTC)

Intro claims a well tuned quicksort 2 or 3 times faster than merge sort?

This claim should be clarified. Merge sort does more moves but fewer compares than average case quicksort, so the relative speed depends on the cost of moving elements versus comparing elements. Based on various benchmarks, if sorting an array of integers, quicksort is about 15% faster than merge sort on a typical PC. If sorting an array of records where the keys that are compared are a relatively small part of the record, then the advantage of quicksort increases, due to fewer moves. At some record size, it's faster to sort an array of pointers to records, in which case merge sort ends up faster, because moving pointers takes less time than compares that require indirection via the pointers. Rcgldr (talk) 07:45, 15 June 2020 (UTC)

Addition of Content on Quicksort for Linked List

Information to be added to the end of the Variants section:

Linked List Quicksort

The Quicksort is typically taught using a bidirectional partition method appropriate for in-place exchange of array elements[1]. However, the Quicksort algorithm allows partitioning to be performed in any manner[2]. In this variant, the partitioning method is performed by a unidirectional iteration in which nodes with a lesser key than the pivot node's key are unlinked from their current list location and then inserted into the sublist before the pivot node[3], which can be implemented in a way that makes the Quicksort stable for linked lists[4]. This variant was compared empirically with linked list merge sort, and it performed more slowly due to non-optimal list dividing, which resulted in more key comparisons[3].

A bottom up linked list merge sort is faster still than a top down merge sort, and doesn't involve any halving of lists. A bottom up merge sort considers a list of n nodes as n sub-lists of size 1, merging one node at a time into a small (26 to 32) array of lists where array_of_list[i] has 2^i nodes. Merge_sort#Bottom-up_implementation_using_lists Rcgldr (talk) 21:17, 1 January 2020 (UTC)
Hi @Rcgldr:, I agree that what you say is true here, but it seems like it is most relevant where it already is. In this quicksort page, existence of linked list quicksort is missing along with the point that it is empirically slower than merge sort. That linked list quicksort might therefore be even slower than a bottom up merge sort is likely true but draws a conclusion that crosses the line into originality (no direct source to back up the direct comparison). So, it seemed better to reword to talk about the quicksort's suboptimal dividing (which increased key comparisons) and sidestep talking specifically about the exact operation of the merge sort that was actually used. The quicksort would need more comparisons in any case. Thanks for commenting, JohnBoyerPhd (talk) 00:30, 4 January 2020 (UTC)
@JohnBoyerPhd: I'll have to see if I can find web site sources for this. I had correspondence about this issue with P._J._Plauger - one of the authors of the Dinkumware template library used by Microsoft for Visual Studio and other compiler makers, noted that the scanning of lists to split them added significant overhead to merge sort that bottom up merge sort avoids. This is probably mentioned in one of his published works. I've also seen this mentioned in some old textbooks (an analysis showed something like a 40% overhead), but I don't recall the name of the textbook names, so finding a reference there would be difficult. Rcgldr (talk) 16:27, 9 January 2020 (UTC)

Explanation of issue:

The quicksort is most typically taught as a partition _exchange_ sort (see Knuth and many others). The exchange design was evidently due to Quicksort being developed originally for an array, and this design precluded its use on singly linked lists (which have no backward pointer). Via web search, one can easily find implementations of linked list quick sort that have appeared on many knowledge sharing platforms in the last 10 years (For example, see [cboard post]). As the content in this edit request shows, linked list quicksort has been available since the 1990s. Since it is easily found on other knowledge sharing platforms, the proposed content fills an omission from Wikipedia as a knowledge sharing platform.

References supporting change:

  • [1] Donald Knuth The Art of Computer Programming, Volume 3: Sorting and Searching, Third Edition. Addison-Wesley, 1998. ISBN 0-201-89685-0. Pages 113–122 of section 5.2.2: Sorting by Exchanging.

References

  1. ^ a b Donald Knuth The Art of Computer Programming, Volume 3: Sorting and Searching, Third Edition. Addison-Wesley, 1998. ISBN 0-201-89685-0. Pages 113–122 of section 5.2.2: Sorting by Exchanging.
  2. ^ a b Hoare, C. A. R. (1961). "Algorithm 64: Quicksort". Comm. ACM. 4 (7): 321. doi:10.1145/366622.366644.
  3. ^ a b c Boyer, John M. (May 1998). "Sorting and Searching Linked Lists in Java". Dr. Dobb's Journal (285): 126–129, 137–138. Retrieved November 23, 2019.
  4. ^ a b Robert Sedgewick Algorithms in C++, Part 3: Sorting, Third Edition, p. 321. Addison-Wesley, 1998. ISBN 0-201-35088-2.

Reply 24-NOV-2019

  Edit request declined  

  • The policies against using only self-published sources as references for these particular claims, as well as those advising against using original research, prevent these changes from being made. The COI editor is urged to provide references from reliable, independent, WP:SECONDARY sources in order to verify the claims proposed here.

Regards,  Spintendo  23:56, 24 November 2019 (UTC)

Response 15-DEC-2019

The matter has been discussed extensively on the @Spintendo:'s talk page. The discussion has resulted in the above substantially revised proposed text. Based on the discussion, the sources were not self-published sources. The one article authored (not published) by this author in the revised proposed content is permitted by Wikipedia policy here as it is a single citation published by a reliable source. The revised proposed content also does not violate the no original research policy because the revised proposed content follows the policy requirements for citing a primary source. Specifically, the revised proposed text does not "analyze, evaluate, interpret, or synthesize material found in a primary source," it changes less than 1% of the Quicksort article and so does not "base an entire article on primary sources," and the revised proposed content makes only two "straightforward, descriptive statements of facts that can be verified by any educated person with access to the primary source but without further, specialized knowledge" and so does not "add unsourced material from [my] personal experience, because that would make Wikipedia a primary source of that material." It's clear that the no original research policy exists to prevent Wikipedia from becoming a primary source, not to prevent it from citing primary sources. Therefore, please reconsider this revised edit request. JohnBoyerPhd (talk) 22:35, 15 December 2019 (UTC)

I still don't see any independent sourcing for the variant. Without that, this seems like undue weight. - MrOllie (talk) 23:07, 15 December 2019 (UTC)
There is an error in grammar in the newer request: It states the Quicksort algorithm allows partitioning to be performed in any manner. (referenced by the Hoare source). The very next sentence begins with In this variant, the partitioning method is performed by a unidirectional iteration.. (referenced by the Boyer source). My question is, which of Hoare's "any manners" does this variant in the second sentence refer to? My guess is it's the variant that Boyer has proposed in their research. If that's the case, then considering the Hoare source was written 38 years before Boyer's was published, how does that source verify Boyer's claims? Please advise. I would also add that the changes you've made to your request did not follow the guidelines for changing previous posts at WP:REDACTED, thus there is no simple way to see what you've changed as opposed to what you previously submitted, which is unfortunate. Would you please repost the references that you removed (and place them below this post).  Spintendo  23:51, 15 December 2019 (UTC)
Hi @Spintendo:, If you click through to Hoare's paper to see what it says, you see that Algorithm 64 is really a short piece of text that presents only the algorithm Partition, Quicksort, Quicksort. It is and is intended to be the most broadly applicable algorithm that covers any method that partitions first in any way then recursively sorts the two partitions. This is exactly how broadly stated algorithms can apply to other papers written later. Hoare's algorithm applies to all other papers cited in the Quicksort page as well as the DDJ source. In the variant that is discussed in the proposed subsection, the any manner is given in the cited source as a unidirectional iteration, which is what the sentence says. Regarding the WP:REDACTED, noted for the future, as nothing on the edit request page describes this process. In the meantime, the design of Wikipedia platform is such that the source appear on the [history page]. Thanks, JohnBoyerPhd (talk) 06:03, 16 December 2019 (UTC)
Hi @MrOllie:, The undue weight characterization is not at all applicable to this type of content in general, nor this specific content. This content proposes 2 sentences that provide one point of view among the plethora appearing in the rest of the 5000+ word article. Thanks, JohnBoyerPhd (talk) 06:03, 16 December 2019 (UTC)
Hi @Spintendo:, I'm just checking in to see if you have been able to decide if you want still want to decline the revised comment for a revised reason, or if you see that the compromises made in revising the content along with the answers to your questions above and the discussion of the applicable policies are now sufficient for you to accept the content? Thank you, JohnBoyerPhd (talk) 03:34, 20 December 2019 (UTC)
The intro to the Boyer paper states "This paper presents a novel approach for a generalized comparison by transforming the problem into comparing executed code size of a benchmark imperative algorithm with a partially declarative variant of the same algorithm." The key words here are a novel approach for a generalized comparison.[a] Please indicate what it is, about this novel approach, which negates its novelty simply because the generalized comparison it is performing is upon a broadly-stated algorithm. Regards,  Spintendo  21:37, 30 December 2019 (UTC)

Notes

  1. ^ The main problem here are the words novel approach, simply because Wikipedia is not generally predisposed to publishing novel research without secondary sources verifying it, per WP:NOR. That includes any analysis or synthesis of published materials that serve to reach or imply a conclusion not stated by those sources. You must be able to cite reliable, published sources that directly support the material being presented — in this case that material would be the "novel approach for a generalized comparison" as stated in the paper. The question is thus: Do you consider your paper to be a new analysis or synthesis (in your words, a novel approach) of the Hoare source that serves to advance a position not clearly advanced by Hoare? If your answer is yes, then your paper is original research. If your answer is no, then you must provide other sources that also describe the same novel approach you discuss in your paper.
Hi @Spintendo:, The paper that you are quoting from is not cited in and not a part of the linked list quicksort content proposal that we are discussing. At the very beginning of this, you merged two separate and distinct content requests that I made (variants of quicksort: Linked List Quicksort and Partially Declarative Quicksort). Since then, you've expressed such an amount of concern about the second of those requests, based on the article you continue to object to, that I removed the whole content and citation related to that second request (partially declarative quicksort). Please see the actual revised content request above. The article you object to just simply is not there. The only thing that we are discussing is my first content addition request about Linked List Quicksort-- not partially declarative quicksort and so not the article you have been objecting to. Can you please review the content currently proposed? Also, to reflect the current state of the discussion, please remove your prior objection for the cause of SPS since we got past that on your talk page a long time ago. Thank you, JohnBoyerPhd (talk) 22:25, 2 January 2020 (UTC)
Hi @Spintendo:, I've now added a secondary reference to substantiate the variant, per your request. The secondary source support comes from none other than Robert Sedgewick, so hopefully that will be satisfactory to both you and @MrOllie:. Thanks, JohnBoyerPhd (talk) 00:30, 4 January 2020 (UTC)

@JohnBoyerPhd: First of all, the changes that you've made on this talk page to your older posts were not made according to the guidelines at WP:REDACTED.[a] Those guidelines state that you must use underlined font for additions and strikeout font for subtractions. Nothing is supposed to be deleted. The fact that you've gone back and deleted content that we've been talking about here is not at all helpful. How are editors supposed to evaluate your "newer" request when you've deleted the older request, so much so that there is now no longer a starting point to compare the newer material to. Editors are no longer able to easily see how and in what way a problematic reference that you removed applies to the evolution of your request, and your recommendation that editors simply "check the history" is hardly constructive to the overall request in general.[b] In order to proceed with full transparency, I would ask that you restore the deleted content in a post below, and that you then explain how it differs from what you're requesting now. I think you'll find that editors are better able to follow your explanations when you don't delete content they've already read. Regards,  Spintendo  09:41, 5 January 2020 (UTC)

Notes

  1. ^ That guideline states that you are allowed to make as many changes as you want until another editor posts after you, then older posts of yours must use underlined and strikeout fonts for any changes made.
  2. ^ I would add that the requirements at WP:REDACTED apply to all talk pages — not just in edit requests. It's only on a user's own talk pages where these requirements are lax, and even then, going against them is not considered to be editing in good faith.
Hi @Spintendo:, I did not delete the so-called problematic reference from my request. You merged two different requests I made, a fact that I have mentioned to you many times. I removed the text of the second request to undo the content merge that you performed. The linked list quicksort variant never cited that reference.
Furthermore, there is no lack of full transparency. It's four sentences. And, it's very much aligned with what I proposed during our discussion on your talk page a long time ago. This was a discussion that I assumed was following a cooperative Wikipedia:Negotiation process, so yes the content is different now-- that's what compromise looks like.
And... it's four sentences now. So, no, I will not restore the deleted content because 1) it would restore your damaging merge of my two requests, 2) it is easy enough to see in the history anyway, and 3) the revised content represents sufficient compromise that it is much easier to simply consider it as if it were a new request... to review four sentences.
Since it is the same as withdrawing my request and just submitting a new one with the four sentences, I'd suggest it be handled as if that happened. Because I'm not really going to be jumping through any more hoops. This is simple undergraduate material at or below the level of the rest of the content in the Quicksort article-- not something "scientists" might dispute, as you asserted during our discussion on your talk page and not undue emphasis on some flat-earth-like fringe viewpoint as @MrOllie: seems to believe. It's important for a content editor to have command of the content in order to appropriately apply policies to the content rather than just partial pattern matching on policies out of the context of the content. Frankly, this edit request isn't even necessary anymore. Even the incorrectly labelled "problematic" source is not problematic. Wikipedia policy supports the addition without an edit request of neutrally worded content with a single self-cite from a reliable primary source if the content added to the article performs no analysis or synthesis and just states facts reported in the reliable primary source. Despite your insistence, it does not add original research to a wikipedia article to add a fact to the article that was original research in the primary source. The source and the article are two different things. I've already pointed this out, including occurrences in the Quicksort page itself, but ultimately I grew tired of explaining the policy language, so I simply withdrew the second request containing that so-called problematic reference. You know, this page asks those who can improve the content to do so, but it's too difficult to do so in the face of having more important things to do than continue arguing about this level of content, so the astringency of this process needs to be substantially reduced. Please process my request as it is now (four sentences) or simply indicate that you refuse. JohnBoyerPhd (talk) 04:19, 7 January 2020 (UTC)
It's not the same as submitting a new request, because you've left behind a bunch of discussion that appears to be about your new request, but is really about an old request. It should be obvious by now what problems this leads to. Please just start a new section with your new request. Also, I don't believe this is 'some flat-earth-like fringe viewpoint', I think it is probably correct, but also something that hasn't been impactful enough on the field that it should be discussed in the article. - MrOllie (talk) 12:11, 7 January 2020 (UTC)
You can see from this diff that the only items I changed were in how your request was presented. I didn't change any of the content:
Reformatted request
The changes to the content were of your own making, they were not mine, and I respectfully ask that you retract your accusation that I altered the substance of your request. In contrast to the formatting changes I made, your changes were far more extensive, they were not redacted, and they made the process of understanding what you wanted here much more difficult.
@Spintendo: Thanks for clearly showing that the two requests became one, which was not my intent. That is the damage I was talking about because you complained repeatedly about a source in proposal 2 that you merged into the request for proposal 1 but which didn't have to do with proposal 1. In order to try to unblock your rejection of the edit request I removed the proposal 2 content that you merged into the edit request. Really need to stop arguing about this, so I'll do as you and MrOllie suggest and do it as a separate request. Still going to need you to revise and update your understanding of the Wikipedia policy on ability to use a primary source to support a content addition, as is done throughout wikipedia, including in the page we are talking about. JohnBoyerPhd (talk) 08:52, 9 January 2020 (UTC)
Here in Wikipedia the WP:BURDEN is on the person wishing to add content to an article to persuade others, first by providing the necessary sources, and then by providing a coherent rationale. If you find others unwilling to join you after making that argument then it's time to reexamine your reasoning. I agree with Ollie that you should start all over from square one with a new approach to your request. Regards,  Spintendo  04:55, 8 January 2020 (UTC)

Hi @MrOllie:, OK, sure, let’s try it as a new request to review the four sentences. If you do consider the new request, I’d ask that you use the Wikipedia policy on weighing impact based on the content being supported by a reliable source, rather than using your own evaluative policy. The problem is that the stating that the material is “probably correct” (rather than knowing that it is) demonstrates movement toward the Comprehension level of Bloom’s taxonomy, which is 4 levels below the evaluative level required to make that call on the impact. Therefore, please instead decide on the sufficiency of impact based on the facts that 1) it is only a few sentences added as a variant, 2) that it is supported by a citation of a publication in a reliable source (DDJ, software developer journal for 38 years with 160K circulation), and 3) that the explanation provided with the edit request includes that this variant appears in programmer-focused knowledge sharing platforms. If I were anyone but the author of #2, I would expect to add this material without it being reverted by anyone. Since my adding the content while also being the author of #2 is supported by Wikipedia policy, there should be no obstacle to what would otherwise be permissible content. Also, I don’t really understand why a simple web search wasn’t used to assuage your concerns. While getting no results doesn’t prove a negative, a plethora of results can provide the extra support you desire. In this case, rather than rejecting so quickly, one could just search for something reasonable like “how to do a linked list quicksort in java”. Many knowledge sharing platforms for programmers come up in the results, including stack overflow, geeks for geeks, tutorial points dev, code speedy, and compute patterns. Because I really don’t want to argue this anymore, I’ll put links to half a dozen or so of these into the explanation section of the new request (despite not being required by Wikipedia sourcing policies). JohnBoyerPhd (talk) 08:52, 9 January 2020 (UTC)

June 2020

Hello @MrOllie:, Although it's been more than 120 days for this inactive discussion to be deleted from this page, the automated archiving agent has either malfunctioned in not deleting it or is maldesigned to keep dormant discussions to have some minimum number of discussions around (which would give a false appearance of the number of active discussions). You have also informed me of your preference that I not manually delete this inactive discussion. Although deleted conversations are still archived in the page's version history, if you require this discussion to be kept around, then it makes less sense to me to start a new request and discussion on the same topic as this one, rather than just just reactivating and finishing the discussion here. It seems particularly low cognitive load considering you're actually the one who last expressed a dissenting opinion, so here goes:

The content we're discussing has been reduced, by the discussion, to 4 sentences of about 60 words within an article of about 10,000 words, so your classification of it as "undue emphasis" seems to mean you think it is "unimportant" (how else could less than 1% content be undue emphasis?). Can you please say upon what do you base your opinion? Would you be able to change your opinion and remove your objection if a web search reveals that this is important enough for others to have written about it on, say, half a dozen other knowledge sharing web sites? Here are half a dozen that easily come up using a web search for linked list quicksort: stackoverflow, geeksforgeeks, tutorialcup, studytonight, computepatterns, and codespeedy

These are just the ones that show up at the top of a web search results list. So, if the topic is discussed by Sedgwick, accepted by DDJ editors, and widely written about by others in knowledge sharing sites, can you say whether or not you still assert that this content really fits the intent of "undue emphasis" when it is, as I said, well less than 1% content in the article? If not, great; if so, why?

Thank you, JohnBoyerPhd

As I mentioned on your user talk page, there is indeed a minimum count of threads kept on this page by the archiving bot. This is working as intended. It is configured to archive discussions no earlier than 120 days after last activity, this is not some sort of guarantee that discussions be archived at exactly 120 days. They'll go when we actually need the space.
I looked at the Sedgewick book, and while he does give an overview of Quicksort, he doesn't cite you at all. Wikipedia is not a howto site, so the fact that Q&A sites, tutorials, and other howto style sites write about something doesn't mean we should, any more than the existence of tutorials on fixing drywall means that our article on Drywall should include tips for mixing drywall compound. I also note that those links don't cite you either. We continue to need independent sources and they just aren't here. - MrOllie (talk) 04:04, 11 June 2020 (UTC)
Hello @MrOllie:, I mentioned those sites and the Sedgwick book because they talk about this topic, not me. The problem is that you justified your dissenting opinion by citing the Wikipedia policy on undue weight. If you read that policy, you see it is saying to forbid a minority viewpoint, in other words a topic that is unimportant or not supported by a reliable source. I had already established that DDJ is a reliable source. If you read the reliable sources text linked by your "undue weight" citation, you will see exact language supporting the use of a non-academic magazine publication such as the DDJ source I used. Therefore, I concluded that you must be saying this is an unimportant topic. The sources I gave were meant to refute that, only.
So that misunderstanding is cleared up, as it's clear now that your actual concern doesn't fit the definition of "undue weight" but rather is about the use of a reliable source _that is_ a self citation. And in that case, the problem remains that, despite your concerns, it is expressly permitted by the appropriate Wikipedia policies that I already provided in explanations above.
We can test this hypothesis while being in keeping with the intent of a talk page (to refine content through discussion). Suppose I were to propose the following revised content (in Variants section):
The Quicksort is typically taught using a bidirectional partition method appropriate for in-place exchange of array elements[1]. However, the Quicksort algorithm allows partitioning to be performed in any manner[2]. This variant iterates through a singly linked list and can be implemented in a way that makes the Quicksort stable for linked lists[3].
This revised content would introduce linked list quicksort but simply removes information that was supported by my DDJ article. The question is, would you still claim "undue weight" on this version of the content, or would you approve this revised content?
Thank you, JohnBoyerPhd

References

  1. ^ Donald Knuth The Art of Computer Programming, Volume 3: Sorting and Searching, Third Edition. Addison-Wesley, 1998. ISBN 0-201-89685-0. Pages 113–122 of section 5.2.2: Sorting by Exchanging.
  2. ^ Hoare, C. A. R. (1961). "Algorithm 64: Quicksort". Comm. ACM. 4 (7): 321. doi:10.1145/366622.366644.
  3. ^ Robert Sedgewick Algorithms in C++, Part 3: Sorting, Third Edition, p. 321. Addison-Wesley, 1998. ISBN 0-201-35088-2.
These remaining cites talk about quicksort in general, but not this variant you'd like to add. We need secondary sources specifically about the variant. FYI, I took a look at the closest algorithms book to my desk, which happened to be Cormen, Leiserson & Rivest's Introduction to Algorithms, and they also don't mention this. - MrOllie (talk) 11:36, 12 June 2020 (UTC)
  • It's not clear to me what the issues are here. Lomuto partition scheme is unidirectional. I found a 1993 university paper on quick sort for linked list qucksort linked list pdf. In that paper, there is a reference to "Weg82" which specifically mentions linked lists, but I don't know if that reference uses quick sort. That paper also describes a top down merge sort for linked lists, but Merge_sort#Bottom-up_implementation_using_lists is faster. It's also not clear to me if a quick sort partition without swaps but instead partition by splitting a list into 2 or 3 lists would be considered to be a quick sort or just quick sort like. Lomuto partition scheme using the first element as a pivot could be implemented for a linked list using swaps. On a side note, Hoare's original implementation of quicksort was done on a computer without a stack, and used what he called a "nest", a stand-alone stack, to store indexes, for an iterative implementation of quick sort using a stand alone stack. Rcgldr (talk) 20:27, 12 June 2020 (UTC)
Hello @MrOllie:, I appreciate you looked in a book to support you viewpoint, but a counterexample is used to disprove universality, not to prove something is unimportant. Despite having 'Introduction' in its name, the CLR book's audience is not first semester where discussion of this topic would more naturally belong.
The other citations used in the content do say many things about quicksort but they do also provide ample support of the existence of a quicksort for linked lists, including quoting those exact words in this tweaked version:
The Quicksort is typically taught using a bidirectional partition method appropriate for in-place exchange of array elements[1]. However, the Quicksort algorithm allows partitioning to be performed in any manner[2]. Forward iteration can be used to implement "a stable quicksort for linked lists" [3].
How about now?
For what it's worth, I also agree with @Rcgldr:'s assertion that it's not really clear what the issue with this latest content. By removing the DDJ article, I also removed all COI, so at this point I'm really just trying to see if anything would be found agreeable or if disagreement is just the order of the day. The top of the talk page invites users to make improvements if they can, so it'd be useful if that were actually possible, much less if the COI submission policies are understood well enough to use them here. To answer Rcgldr's question about whether a unidirectional quicksort would still be a quicksort or just quicksort like, that is the intent of the second sentence. It is generally accepted that Hoare's invention is a generalized algorithm that covers anything with the pattern of [partition, sort, sort]. You can see this from reading the source that this is really all it says, and you can also see it in the direct words of Sedgwick (he didn't say "a stable quicksort like algorithm for linked lists". I also ran across the 1993 paper, and there is of course this from 1990, but works from sources like these were asserted by others to be not sufficiently reliable sources. While I don't happen to agree, it's not germane to the current trajectory of finding out whether there is any content on this topic of linked list quicksort that could be deemed passable.
Thank you, JohnBoyerPhd

References

  1. ^ Donald Knuth The Art of Computer Programming, Volume 3: Sorting and Searching, Third Edition. Addison-Wesley, 1998. ISBN 0-201-89685-0. Pages 113–122 of section 5.2.2: Sorting by Exchanging.
  2. ^ Hoare, C. A. R. (1961). "Algorithm 64: Quicksort". Comm. ACM. 4 (7): 321. doi:10.1145/366622.366644.
  3. ^ Robert Sedgewick Algorithms in C++, Part 3: Sorting, Third Edition, p. 321. Addison-Wesley, 1998. ISBN 0-201-35088-2.
What I meant by "unidirectional" was the scanning to compare elements versus pivot, which is what Lomuto does. "Bi-directional" would be Hoare type scheme, scanning from both ends inwards until the two scans cross. This is independent of how out of order elements are handled, normally by swapping, which could be done with a linked list, or what might be better for a linked list, separating into 3 lists (< pivot, == pivot, > pivot) and using recursion for list != pivot, then concatenating. Another side note, quick sort isn't quite like other divide and conquer algorithms, the reordering of data takes places in each partition step, before a recursion step, and nothing happens on the return chain. Compare this to top down merge sort, where nothing but splitting happens during divide steps, and reordering (merging) takes place on the return chain. Rcgldr (talk) 17:14, 13 June 2020 (UTC)
@JohnBoyerPhd: You said:

It is generally accepted that Hoare's invention is a generalized algorithm that covers anything with the pattern of [partition, sort, sort]. You can see this from reading the source that this is really all it says (...)

and AFAIR the Tony Hoare's work, you must have made that up.
@JohnBoyerPhd: @CiaPan: Tony Hoare's first implementation of quicksort predates Algol 60, and the computer system was probably an Elliott_803 or similar system, one without a native stack. His early papers refer to a "nest", which is a stand alone stack, used by an iterative algorithm, such as this 1962 paper than includes some benchmarks on the older (1956) Elliott 405: quicksort pdf. The introduction of the recursive language Algol 60 made a recursive implementation of quicksort possible, but few computer systems had implemented Algol 60, and even his later papers like the 1962 paper I used as a reference, still refer to using a "nest" with an iterative implementation. That paper also mentions bidirectional scanning from both ends of an array, as well as pushing the parameters (indexes) for the larger partition onto the nest and iterating on the smaller partition to limit "nest" (stack) space complexity to O(log2(n)). Rcgldr (talk) 16:00, 14 June 2020 (UTC)
I read the PDF available at https://doi.org/10.1093/comjnl/5.1.10 again, just to make sure, and here is what actually is written in a few initial paragraphs of Hoare's description of the Hoare's quicksort:

A problem of sorting a mass of items, occupying consecutive locations in the store of a computer, may be reduced to that of sorting two lesser segments of data, provided that it is known that the keys of each of the items held in locations lower than a certain dividing line are less than the keys of all the items held in locations above this dividing line. (...)

The first step of the partition process is to choose a particular key value which is known to be within the range of the keys of the items in the segment to be sorted. (...)

The items to be sorted are scanned by two pointers: one of them, the lower pointer, starts at the item of the lowest address, and moves upwards in the store, while the other, the upper pointer, starts at the item with the highest address and moves downwards. (...)

And so on.
This clearly indicates that the algorithm Hoare described is designed specifically for a compact, continuous block of homogeneous data, stored one after one in consecutive locations in the computer's memory. Which is exactly an array as defined by Fortran, Pascal, C and many other programming languages.
Of course that can be easily adapted for data scattered accross the storage and indexed with some array of references, or (possibly not that easily) to dynamically linked lists – but that would be an adaptation, a quicksort-like algorithm, as Rcgldr called it, not the Tony Hoare's quicksort itself. --CiaPan (talk) 23:26, 13 June 2020 (UTC)


Hoare's paper specifically mentions bidirectional scanning: "The items to be sorted are scanned by two pointers: one of them, the lower pointer, starts at the item of the lowest address, and moves upwards in the store, while the other, the upper pointer, starts at the item with the highest address and moves downwards ", so how did Lomuto unidirectional scanning with only a lower pointer end up being accepted as a type of quicksort? Rcgldr (talk) 01:54, 15 June 2020 (UTC)

@Rcgldr: Apparently that quicksort is not a Hoare's quicksort. Possibly it should be specifically mentioned every time, whose version of quicksort is being discussed. And maybe it should be each time defined, whether it is still a quicksort, or a quicksort-like algorithm. :) Just kidding. --CiaPan (talk) 21:10, 15 June 2020 (UTC)

Hello @Rcgldr:, The 1961 CACM paper appearing as my second reference (and already referenced in the wikipedia quicksort page) articulates the quicksort algorithm as the recursive pattern of partition, sort, sort. If you click on that doi link and read the pdf, you'll see that I'm not making up anything. Algorithm 64: Quicksort doesn't include the partitioning scheme as the algorithm is intended to be as broad as possible, as this is what algorithm designers (and inventors) tend to do. The 1962 paper you recall and provided is a deeper articulation that also includes the partitioning scheme. Various algorithm authors, including Cormen et al. that @MrOllie: mentioned, describe Lomuto's work as another partitioning method within the broad scope of the quicksort algorithm. Here is another nice explanation of Hoare versus Lomuto partitioning schemes within the quicksort algorithm.

This brings us back to the fact that quicksort on a linked list is still a quicksort, as the Sedgwick words I've now directly quoted indicate. Furthermore, you pointed out that Lomuto is unidirectional, but my second sentence isn't just about the fact that partitioning can be unidirectional. The first sentence mentions that the Hoare partition scheme that is usually taught is bidirectional and exchanges elements. Element exchange is what makes quicksort not stable, even if Lomuto's unidirectional partition method is used. When Sedgwick refers to linked list quicksort being stable, he is alluding to the fact that the unidirectional iteration need only move elements with a lesser key than the pivot node to a position immediately before the node. Elements are not exchanged. But these are just variations of implementation of the quicksort algorithm that are suitable for a different data structure than an array. All this being said, we can pop the stack and ask @MrOllie:, do you still object to this content that, I would emphasize, doesn't even have a COI for me anymore?

Thank you, JohnBoyerPhd

stability - exchange of adjacent items doesn't break stability (for example bubble sort), but exchange of non-adjacent items can break stability. Rcgldr (talk) 17:45, 15 June 2020 (UTC)
linked list - I don't have an opinion on calling such a sort for linked list "quicksort" (similar to Lomuto versus Hoare). Since quicksort is not stable, I don't see a requirement for an equivalent sort for linked list to be stable, although it would be nice. Quicksort for a linked list is slower than bottom up merge sort for linked list, but being slower doesn't make it an invalid method. Rcgldr (talk) 17:45, 15 June 2020 (UTC)
Hi @Rcgldr:, I agree there is no requirement for the linked list quicksort variant to be stable. In fact, in my DDJ paper and 1990 works, the partition was implemented slightly differently to eliminate a line of code and run slightly faster, at the expense of stability (because it wasn't required). These days, I'd prefer the stability. Anyway, I mentioned the stability because Sedgwick mentioned it, at which point it's clear he's talking about a better variant than something trivially array-like, such as just converting the linked list to a doubly linked list and then using Hoare's partition method. Thank you, JohnBoyerPhd
Side note - The Hoare paper actually has errors in it. " divided ... keys (on left) ... less than keys (on right)", when it should be less than or equal. It correctly stated later with "The aim ... divided ... equal or less than bound ... equal or greater than bound". The lower pointer scan description is wrong, stops at key "greater than bound", when it should be "equal to or greater than bound". The upper pointer is described correctly. Keys equal to the bound (pivot) or the bound (pivot) itself can end up in either part of a partition. In Hoare's bet with his boss about a sort faster than insertion sort, his boss probably wasn't considering a non-stable sort, since insertion sort was typically used as a stable sort to create runs of sorted data for a tape sort (merge sort(1945), polyphase merge sort(1960)), where stability was expected. I'm not sure how common it was at the time to sort a file small enough to fit in memory and where stability was not expected. Rcgldr (talk) 17:45, 15 June 2020 (UTC)
Arbitrary partitioning method

Hi, JohnBoyerPhd!

Here: Special:Diff/930928823 you wrote 'the Quicksort algorithm allows partitioning to be performed in any manner' and referenced https://doi.org/10.1145%2F366622.366644 – alas, I can't see the linked page supports the claim. Can you, please, explain in two or three sentences, how 'performing in any manner' follows from the linked source? --CiaPan (talk) 21:46, 15 June 2020 (UTC)

Hi @CiaPan:, Yes, I can do that in a few sentences. The linked article does not have a quotable sequence of characters that says you can do partitioning in any way. Rather, it is the fact that the article describes the quicksort algorithm as the recursive pattern around partition-sort-sort while placing no bound on how partitioning is done. Placing no bound in how partitioning is done means that it can be implemented in any way because there is no bound on how it is to be implemented. Thanks, JohnBoyerPhd
@JohnBoyerPhd: That's right! There is no quotable sequence of characters in the article that says you can do partitioning in any way. Instead, there is an explicit definition of the partitioning algorithm, just on the previous page as Algoritm 63: https://dl.acm.org/doi/pdf/10.1145/366622.366642 --CiaPan (talk) 07:23, 17 June 2020 (UTC)
@JohnBoyerPhd: @CiaPan: - In "algorithm 64", there is a statement that no extra space is required, yet the recursive calls require worst case O(n) stack space, since the code is missing the handling of the smaller partition first. Rcgldr (talk) 20:48, 16 June 2020 (UTC)
@Rcgldr: That's a completely another topic. --CiaPan (talk)
@CiaPan: - I somehow lost a comment about about algorithm 64 not specifying at all what partition does, other than returning I and J, which could seem to imply Hoare (bidirectional) scheme, but Lomuto (unidirectional) scheme could set I to pivot index - 1 and J to pivot index + 1. Rcgldr (talk) 16:37, 17 June 2020 (UTC)
Conclusion

FYI @Rcgldr:, @CiaPan:. Hello @MrOllie:, Although we've had discussions recently in which you haven't exactly agreed with the veracity of my proposed content, I've noticed that the issue is a bit of a moot point now. I would reiterate that the objections to my sources are not consistent with the language of the Wikipedia policies that were cited. However, a month after my request was initially declined, it looks like somebody else went on in and added [new content about linked list quicksort] that is along the lines of what I'd hoped could be added. And, wonderfully, they didn't even cite any sources to support the content!

I'm so pleased the content is represented, and I quite agree with the person who added the content that "quicksort can be implemented as a stable sort using linked lists." Of course, I did also provide sources such as the reliably sourced DDJ article and the quoted words from Sedgwick to support it. Plus, added bonus, the new content also implies that the mergesort is a better choice for linked lists. I really wanted this to also appear in the content, but the main source I had for supporting that fact was the DDJ article.

The person who added the content did wobble a bit at the end of the sentence by claiming that poor pivot choices would be made due to not having random access. This isn't really the reason. For one thing, in a random list, the first element is just as random as any other, so using it as the pivot choice is no poorer. More generally, though, partition is O(n) anyway, so a sequential implementation of median-of-three or any other pivot selection method does not increase the order of magnitude of partitioning and hence does not change the analysis of the linked list quicksort. The reason mergesort on linked list outperforms quicksort on a linked list is because mergesort always performs optimal divisions of the list, whereas quicksort often makes pivot choices that result in suboptimal divisions. The mergesort on a linked list can also be done "in place" (i.e. the data structure already contains O(n) extra space for pointers). I discuss this and show it empirically in the DDJ article that the linked list mergesort is faster than the linked list quicksort. Since DDJ is a reliable source, pretty much anybody except me could, without a COI, have said something along the lines of "Although quicksort can be implemented as a stable sort using linked lists[1], the linked list mergesort is faster than linked list quicksort because it can be done in place and always performs optimal divisions.[2]". So, it's a little bit unfortunate that the new content referred to no references to support the point. Maybe somebody should fix it.

Cheers, JohnBoyerPhd

@JohnBoyerPhd: @CiaPan: A bottom up merge sort for a linked list is faster since it avoids the repeated scanning of lists to divide them. Instead, it uses a small (25 to 32) array of lists, typically pointers to first nodes of lists, and immediately starts merging nodes into the array, where array[i] is either nullptr or points to a list with 2^i nodes. Similar to bottom up merge sort for arrays, you could think of the division process as dividing a list of n nodes into n lists of 1 node each, merging them into the array one node at a time. Once all nodes are merged into the array, the lists in the array are merged to form the sorted linked list. On a large list with scattered nodes (a lot of cache misses), bottom up can be ~30% faster than top down (== top down is ~40% slower than bottom up). Merge_sort#Bottom-up_implementation_using_lists Rcgldr (talk) 00:59, 17 June 2020 (UTC)
Hi @Rcgldr:, I think the important bit is to support assertions with external reliable sources such as the DDJ article[2]. If a "bottom up" (non-recursive) implementation is even faster than a recursive one, that wouldn't undermine the point being made by the external reliable source[2] that merge sort is better than quicksort for linked lists. That being said, if you check out the external reliable source[2], you'll see that it does cover two variants of non-recursive linked list mergesort. It may be that non-recursive is faster on arrays and/or languages that are "close to the metal" (or with a better implementation), but for linked lists and with Java at that time, the difference in the implementations tried was negligibly slower. So, another source would be needed to support the point that linked list bottom-up/non-recursive mergesort is faster than recursive mergesort. That being said, such a source would be used to support the point if it were being added to the mergesort page, rather than the sections of the quicksort page that describes its relation to other algorithms like the mergesort. For the quicksort page, which we're talking about here, it suffices to point out (and support with an external reliable source) that linked list merge sort (whether recursive or not) is better than linked list quicksort[2]. Cheers, JohnBoyerPhd (talk) 14:44, 17 June 2020 (UTC)
@JohnBoyerPhd: - I assumed that the recursive DDJ was top down, but the DDJ recursive merge sort for linked list is different than a typical top down merge sort for linked lists, which splits lists in half by scanning. The DDJ version avoids the scanning by not setting F2 until the base case of 1 node is reached and merge returns the value to use for F2 in NP (node pair), going back up the call chain. This reduces the overhead to O(log2(n)) for the passing and updating of parameters, versus bottom up iteration through the array (which I thought would be slightly faster - (update - turns out it is slightly faster)). Essentially the DDJ recursive version is using the stack in the same manner as bottom up uses the array. I was thinking that the recursion, calculation of sizes, and stack overhead would create a small time penalty versus the array access. If the list doesn't have a size member, the recursive method would have to do a 1 time pass to get the count of nodes. Rcgldr (talk) 03:10, 18 June 2020 (UTC)
@JohnBoyerPhd: - I benchmarked recursive versus bottom up and found bottom up to be slightly faster, but not much, typically less than 4% faster. I added the source code for both to my github repository: github/jeffareid/misc. For the recursive, since the test code always passes a "before" node, I removed the conditional. I also switched to primitive integer type (int). For bottom up, merge() uses a static temporary node ("temp"), which would need to be local for a multi-threaded implementation. The benchmark runs twice using pseudo random integers. The first sort will have sequentially allocated nodes, the second sort with scattered nodes. On my system, both sorts take ~7 seconds for the first sort, ~12 seconds for the second sort. Rcgldr (talk) 23:08, 17 June 2020 (UTC)
Hi @Rcgldr:, I appreciate your instinct toward testing via coding and execution. Indeed, before suggesting the addition of text for linked list quicksort and mergesort last year, I reimplemented the recursive versions in straight C, just to double-check with close-to-the-metal code. And sure enough, the linked list quicksort still had O(n log n) performance on random data, but the merge sort was still about 10% faster. I also had the code count key comparisons, and those were down about 10% too. However, empirical results from our code are really only useful for discussion here. They can't be added to the page because they are "original research" (in the way Wikipedia means it). Still, all results in this discussion only serve to reinforce what has been published in the reliable source DDJ article. In particular, any version of the merge sort is quicker than the linked list quicksort, and not because of being unable to use a good pivot selection technique like median of three. As I said earlier, partition is O(n) anyway, so one can afford to spend O(n) time on median of 3 or any other pivot selection technique at the front end of partition. Rather, the linked list quicksort is efficient but the linked list merge sort is quicker because it does less key comparisons, and that happens because quicksort partitions are likely to be suboptimally divided by the pivot key. Anyway, although Wikipedia policies do permit me to add content along the lines I described above after a COI review to ensure the rules are met, which they are, there have been numerous misunderstandings of the words used in those policies, so the COI review process has proven to be impractical for this page. That doesn't prevent anyone else who is qualified to read the DDJ article from using it as a reliable source to correct and support the content that is now in the page. Cheers, JohnBoyerPhd (talk) 04:18, 18 June 2020 (UTC)
@JohnBoyerPhd: - I wasn't intending to make any changes to Quicksort. I have wondered if the DDJ recursive version for linked list should be added to Merge_sort, but I never proposed it. Rcgldr (talk) 08:56, 18 June 2020 (UTC)
Hi @Rcgldr:, That's fine, I meant that anyone who is so inclined could fix the incorrectness at the end of that sentence. I'm not inclined to do so since I don't want to experience any further acrimony related to sources and since I am satisfied that the two main gist points I wanted do now appear in the text. I'm delighted they were added without the unnecessary obstructions related to sources that transpired earlier in this thread, so as I alluded in the section title, for me this is concluded. JohnBoyerPhd (talk) 17:20, 18 June 2020 (UTC)

References

  1. ^ Robert Sedgewick Algorithms in C++, Part 3: Sorting, Third Edition, p. 321. Addison-Wesley, 1998. ISBN 0-201-35088-2.
  2. ^ a b c d e Boyer, John M. (May 1998). "Sorting and Searching Linked Lists in Java". Dr. Dobb's Journal (285): 126–129, 137–138. Retrieved June 16, 2020.

External sort

I redid this section, after finding a reference source that doesn't require paying a fee. The sort is not stable and has worst case of O(n) passes, equivalent to time complexity O(n^2) for internal sort, and the average number of passes is worse than a typical external merge sort. The reference was written in 1983, at which time mainframe high end hard drives had reached 1 GB capacity, but PC hard drives were 10 MB. My impression is that this was an academic exercise, versus what was being used in commercial products that included file sorting, such as databases. Rcgldr (talk) 00:59, 19 June 2020 (UTC)

The quicksort algorithm is wrong

The quicksort algorithm on this page doesn't work. The algorithm given under Hoare partition scheme gives incorrect results for some carefully chosen inputs.

For example, it will incorrectly sort these integers: [10 0 5 2 4 7 1 6 3 8 9] (the result will be [0 3 4 2 5 6 1 7 9 8 10]).

It's hard to point to exactly one thing that's wrong with the algorithm, but the basic issue is that a carefully chosen input can "hide" an incorrectly sorted value in between [lo, p-1] and [p+1, hi]. The partitioning function must detect and correct.

A correct version of this algorithm can be found, e.g., in Sedgewick & Wayne: Algorithms 4th Edition. I can reproduce it here, if copyright permits (does it permit?) — Preceding unsigned comment added by 212.51.146.228 (talk) 09:34, 2 January 2021 (UTC)

Not true. I have just implemented the algorithm in C and run it at https://www.onlinegdb.com/online_c_compiler. The result was:
Input:  10  0  5  2  4  7  1  6  3  8  9
Result:  0  1  2  3  4  5  6  7  8  9 10
Probably you've made some mistake in translating the algorithm from pseudocode to your favourite programming language.
Below is my code, feel free to test it at OnlineGDB or elsewhere. --CiaPan (talk) 12:10, 2 January 2021 (UTC)
Source in C by CiaPan
#include <stdio.h>

int partition(int array[], int lo, int hi)
{
    int pivot = array[(hi + lo) / 2];
    int i = lo - 1;
    int j = hi + 1;
    while(1)
    {
        do
            i = i + 1;
        while(array[i] < pivot);
        do
            j = j - 1;
        while(array[j] > pivot);
        if(i >= j)
            return j;
        int t = array[i];
        array[i] = array[j];
        array[j] = t;
    }
}

void quicksort(int array[], int lo, int hi)
{
    if(lo < hi)
    {
        int p = partition(array, lo, hi);
        quicksort(array, lo, p);
        quicksort(array, p + 1, hi);
    }
}

void printArray(int array[], int len, const char* title)
{
    printf("%s", title);

    for(int i = 0; i < len; i++)
        printf("%3d", array[i]);

    printf("\n");
}

int main()
{
    int array[] = {10, 0, 5, 2, 4, 7, 1, 6, 3, 8, 9};
    printArray(array, 11, "Input: ");   // 11 = length of the array
    
    quicksort(array, 0, 10);            // 10 = last index
    printArray(array, 11, "Result:");

    return 0;
}
After a closer look at the result presented I think you confused sorting with partitioning. What you present as a 'result':
[0 3 4 2 5 6 1 7 9 8 10]
is a result of partitioning – the central element 7 has been chosen from the input array, then the array has been reordered into a lower partition with items 0 through 6 and the higher partition with values 8 through 10. Partitions are separated by the pivot value 7:
[0 3 4 2 5 6 1  7  9 8 10]
That, however, does not complete sorting! Now both partitions: [0 3 4 2 5 6 1] and [9 8 10] need to be independently sorted by recursive calls, so that the whole array gets ordered. --CiaPan (talk) 12:47, 2 January 2021 (UTC)