Talk:Euclidean algorithm/Archive 2


Optimization

I'm removing this from the "C/C++ Code" section:

An optimization of this algorithm would be:

 int gcd(int a, int b) {
   int t;
   while (a %= b)
   {
     t = a;
     a = b;
     b = t;
   }
   return b;
 }

This variation does not aid in understanding the Euclidean Algorithm, and is essentially just a rearrangement of the previous code snippet. It's not even clear whether it's an optimization. The two existing code snippets should probably be rewritten in pseudocode, since it's not clear that "%" means "modulo" unless you've used C. --Dantheox 19:31, 23 January 2006 (UTC)

I believe this optimization makes the code slower.

Explanation of the example of algorithm

I've written a rough explanation of the example of the algorithm, though it can be improved. Besides, I'm not sure whether the explanation should be included in the table or placed outside it as a paragraph of its own, just like the other paragraphs of this article. But if we choose the latter, I'm not sure how to place the paragraph side by side with the table. Thanks.--Lemontea 02:00, 25 January 2006 (UTC)

All of these are possible. Post your paragraph here and we can help to incorporate it in the best way. Deco 02:01, 25 January 2006 (UTC)
I've tried my best in polishing the description of the algorithm section. Currently the explanation of the example is written in the last column of the table. I hope somebody can now clean up and finish that part, if anything needs to be done at all. Thanks. --Lemontea 09:18, 26 January 2006 (UTC)

Technical difficulties

The last edit is me trying to change the double equal sign into a single equal sign in the first pseudocode; however, it seems the server could not complete my edit and screwed the article up. If this comment ever gets submitted successfully, could someone please do that minor edit for me? Thanks. --Lemontea 11:16, 28 January 2006 (UTC)

Did you mean the one under Concept block?
I did the edit anyway. I mean "b = 0" instead of the original. Check the history if you want to know exactly where it is. --Lemontea 13:36, 31 January 2006 (UTC)

Loop unrolling version

Should the following version (using loop unrolling) be mentioned?

 int gcd(int a, int b) {
   while (true)
   {
     a = a % b;
     if (a == 0)
     {
       return b;
     }
     b = b % a;
     if (b == 0)
     {
       return a;
     }
   }
 }

For normal C/C++ integers this does not make much difference; in the case of big integers (several hundreds of digits), however, it avoids the cost of unnecessary copies of the integers. --NeoUrfahraner 07:07, 12 April 2006 (UTC)

While personally I think this version, using some optimization tricks, does save the temporary variable in a clever way, I don't think it is a good idea to optimize this far: this is an article explaining an algorithm, not a demonstration of programming tricks, and thus readability and ease of comprehension should always come before efficiency of the code. Nonetheless, it is ultimately up to the reader (especially those who don't already know the algorithm and check this page for information) to decide which version is better. --Lemontea 07:55, 12 April 2006 (UTC)
Might be a good snippet for wikisource. Is there a collection of implementations of the Euclidean algorithm there? --Dantheox 22:25, 12 April 2006 (UTC)
A search on wikisource returned no results, so yes, go ahead and add it :) --Lemontea 02:24, 13 April 2006 (UTC)
Done, see http://en.wikisource.org/wiki/Euclidean_algorithm. I also added the two basic versions mentioned in the wikipedia article. This is my first contribution to wikisource, so please feel free to adapt it to the conventions used there. In particular, I do not know how to set the interwiki links. --NeoUrfahraner 05:23, 13 April 2006 (UTC)
Apparently Wikisource no longer considers source code to be legitimate content for their site (above link is marked for deletion). I find this odd, since that's the only content I've ever read on Wikisource... does anyone know what the appropriate place to put source code like this is? I saw Wikibooks suggested on Wikisource, but it seems excessive to create a whole book just for the Euclidean algorithm. Perhaps a subpage on Wikipedia, like Euclidean algorithm/Source code would be more appropriate? --Dantheox 16:56, 6 May 2006 (UTC)
Do you have a link to the discussion on Wikisource? In particular, where did they suggest Wikibooks? Actually I think we should not create a special solution for the Euclidean algorithm but should, as far as possible, follow whatever general solution is suggested (if there is any). --NeoUrfahraner 06:04, 8 May 2006 (UTC)
I'm afraid Wikibooks is not the place for source code either, except as an appendix to an algorithms book; for the more general case of source code, that is not a long-term solution. Sadly, we now have only limited options. Either create another Wikipedia page for it (but don't use a subpage; that is not Wikipedia practice except for project pages, user pages, temporary pages for article rewrites, etc.), perhaps called Implementations of the Euclidean algorithm; I observe that more detailed proofs of mathematical theorems take this route as well, as in Proofs of Fermat's little theorem, and another example of the same type is Sample chess game. Or store a private copy in a comment in the article itself, on this talk page, on our user pages, or locally: this is of course the very last resort if all else fails. P.S.: Speaking of a general solution, unfortunately there isn't any that community consensus has produced. I think we've uncovered a rather large problem: many algorithm pages are actually more "implementation" pages than description pages. --Lemontea 07:56, 9 May 2006 (UTC)


NeoUrfahraner -- some links to check out: [1], and perhaps [2]. Source code is classified as "reference data," akin to the first million digits of Pi. There was a fairly recent vote on whether to include such "reference data" on Wikisource, and it came out rather strongly in favor of exclusion. This seems like a bit of a bombshell... the only reason I've ever had to go to Wikisource was for extra source code snippets. It is WikiSOURCE, after all. --Dantheox 15:26, 9 May 2006 (UTC)
I see. Fortunately, for the special case of the Euclidean algorithm there is no urgent reason to react, because the code on Wikisource is additionally available in the article or here on the discussion page. Personally I still think that Wikibooks might be a good place for something like "The book of source code". In the German Wikibooks, a related case is the Archive of (mathematical) proofs, which should become something like Paul Erdős's "The Book," an imaginary book in which God had written down the best and most elegant proofs for mathematical theorems. This book was created because it was agreed that Wikipedia is not a good place for mathematical proofs, but a referenceable place is needed. IMHO source code is a similar case. --NeoUrfahraner 06:40, 10 May 2006 (UTC)

Should the code stay?

I know that Wikipedia has no set-in-stone policy on whether code implementations are allowed in algorithm articles. Of the many random algorithm pages I checked, many don't include an implementation (pseudocode only), while others do the opposite (implementation only), and yet others do both. While Wikipedia is not a code repository, I think we shouldn't be too strict and include *no* implementation at all. Having 50% of the article made up of implementation code is of course a bad thing, but I don't see any reason why the little, short, and concise implementation, as in this case, can't stay, even at the end of the article. Feel free to voice your opinion on this. --Lemontea 09:49, 14 April 2006 (UTC)

I agree that the short implementation in this article was very illustrative, and I have no problem if it is included again. On the other hand, I copied these code examples (together with my version mentioned above) to Wikisource, so a different solution might be to better cross-reference the articles on Wikipedia and Wikisource. --NeoUrfahraner 15:06, 14 April 2006 (UTC)
The code should absolutely stay. The main reason is that readers want it, and this particular piece of code is clear and brief. I don't really believe Wikisource is any better a place for code than Wikipedia - I have a category for the Euclidean algorithm on my own wiki, LiteratePrograms, but that's not a Wikimedia Foundation project. Deco 16:18, 12 May 2006 (UTC)

Not logical?!

This text is from the begining of the article -

The Euclidean algorithm (also called Euclid's algorithm) is an algorithm to determine the greatest common divisor (gcd) of two integers.

And this after -

Description of the algorithm

Given two natural numbers a and b, and assuming a is greater ...

First we have that a and b are integers, and then natural numbers.

--Čikić Dragan 15:24, 12 May 2006 (UTC)
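
For what it's worth, the usual way to reconcile the two statements is to reduce the integer case to the natural-number case via absolute values, since gcd(a, b) = gcd(|a|, |b|). A minimal C sketch of that reduction (the name gcd_int is made up here, this is not the article's code, and the INT_MIN corner case is ignored):

 int gcd_int(int a, int b)
 {
   if (a < 0) a = -a;   /* gcd(a, b) = gcd(|a|, |b|) */
   if (b < 0) b = -b;
   while (b != 0)       /* ordinary Euclidean loop on nonnegative values */
   {
     int t = a % b;
     a = b;
     b = t;
   }
   return a;            /* gcd(0, 0) comes out as 0 by convention */
 }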

Complexity

Shd, I'm reverting your edit because it is wrong on all accounts. First, it is a standard fact that the Euclidean algorithm has time complexity O(n²), although division is not known to be linear time. It works even with the quadratic school-book long division, essentially because the quotients are small (just do the math carefully). Second, it does not make any sense to plug the time complexity of division into the time bound on the original, subtraction-based Euclid's algorithm, as there are no divisions (or multiplications) performed by that algorithm. -- EJ 17:54, 4 June 2006 (UTC)
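
To spell out "the math" for readers of this archive (a sketch only, assuming the school-book long-division cost of O(n·(m + 1)) bit operations when the quotient has m bits, the bound also discussed in the "Wrong time complexity for division" section below):

   total time = sum over steps i of O(n·(m_i + 1))      (m_i = length of the i-th quotient)
              = O(n·(sum of m_i + number of steps))
              = O(n·(n + n)) = O(n²)

   (sum of m_i = O(n), because the quotients multiply up to at most a;
    the number of steps is O(n) by the Fibonacci worst case.)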

I think this section needs a quick sketch of why the algorithm is O(n²) when it requires O(n) divisions, and division is not in general O(n). A reference would also do nicely. As is, this assertion is somewhat surprising, and comes out of nowhere. --Dantheox 17:27, 14 July 2006 (UTC)
Sketch done. As for a reference, both books we have in the "References" section include it as an exercise; I tend to think this is sufficient. -- EJ 15:36, 16 July 2006 (UTC)
Thanks for the correction. However, I was not wrong on all accounts; I changed the bad use of atomic operations. Shd 14:27, 17 July 2006 (UTC)

Implementation again

I'll add some more source to the "real programming language vs. pseudocode" flame. Personally I am for pseudocode, but it won't hurt to add an implementation in another language, if people here decide to do so. Here is Haskell (untested, beware of bugs and errors!).

gcd :: Integer -> Integer -> Integer
gcd a 0 = a
gcd a b = gcd b a%b

--88.68.34.243 12:24, 9 June 2006 (UTC)

Notice that the only reason the C++ code stayed here is that Wikipedia has no official policy saying whether to use pseudocode or a programming language to show an algorithm, and so we have pseudocode for non-programmers and C++ for programmers, as C++ is rather popular, which effectively satisfies most people. It would then be unacceptable to add other programming languages' code, since Wikipedia is not a code repository. (Think again, Haskell? I can imagine people adding code for PHP, Perl, Pascal, Python, Prolog... it's endless.) Previously this wasn't a big problem, as we could just put all these implementations on Wikisource, but now Wikisource has decided to reject them, so, unfortunately, there's no place for your code to stay. --Lemontea 03:02, 10 June 2006 (UTC)

The Haskell is wrong anyway. (%) is not the mod operator in Haskell (`mod` or `rem` is, depending on whether integer division truncates towards negative infinity or zero, respectively). (%) is used to construct a Ratio type. 70.132.14.22 07:12, 30 July 2007 (UTC)

Why Fibonacci numbers?

Under "running time", it is not clear why the worst case occurs when two successive Fibonacci numbers are used... could anyone add some explanation? PIX 17:43, 20 September 2006 (UTC)

I added a parenthetical remark about the way I understand it, but it might be obscure if you're not familiar with continued fractions. A simpler explanation would take up a lot of space. —Keenan Pepper 03:26, 21 September 2006 (UTC)
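
For readers of this archive, a compact way to see it (a sketch, not the article's wording; Lamé's theorem is mentioned only as the standard name for the worst-case bound):

   gcd(F(k+2), F(k+1)):  F(k+2) = 1·F(k+1) + F(k), so every quotient is 1,
   each step merely replaces the pair (F(k+2), F(k+1)) by (F(k+1), F(k)),
   and about k steps are needed to finish. Since F(k) grows like φ^k,
   the step count is proportional to the number of digits of the inputs,
   which is exactly the worst case described by Lamé's theorem.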

Problem in the proof

The proof says that

"Any common divisor of a and b is also a divisor of r. To see why this is true, consider that r can be written as r = a − qb. Now, if there is a common divisor d of a and b such that a = sd and b = td, then r = (s − qt)d. Since all these numbers, including s − qt, are whole numbers, it can be seen that r is divisible by d."

"The above analysis is true for any divisor d; thus, the greatest common divisor of a and b is also the greatest common divisor of b and r. Therefore it is enough if we continue searching for the greatest common divisor with the numbers b and r."

I think there's a problem in the second paragraph. What we can get from the first paragraph is that the greatest common divisor of a and b is a common divisor of b and r, but we cannot conclude directly that gcd(a, b) is the GREATEST common divisor of b and r.

A viable proof would be as follows:

Suppose a=qb+r, and gcd(a, b)=d. Then there are s and t such that a=sd, b=td and gcd(s, t)=1. So r=sd-qtd=(s-qt)d. From gcd(s, t)=1 we get gcd(t, s-qt)=1. Therefore,

gcd(b, r)
= gcd(td, (s−qt)d)
= d × gcd(t, s−qt)
= d,

i.e., gcd(a, b) = gcd(b, r).

Jiangwu 21:49, 22 January 2007 (UTC)

The implication in the second paragraph holds if we know both that "any common divisor of a and b is also a divisor of r", and that "any common divisor of b and r is also a divisor of a". Then the sets "divisors of a and b" and "divisors of b and r" must be the same set, and therefore the greatest element of one is also the greatest element of the other. I'm not sure whether both premises are proved in the current proof — they're certainly not clearly proved.
However, your alternative proof seems to use a lot of non-trivial properties of GCD, for example that gcd(a, b) = gcd(a, b − na), and that gcd(na, nb) = n·gcd(a, b). Can it be made simpler?

--Quuxplusone 01:46, 23 January 2007 (UTC)
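
Spelled out in full, the suggested argument runs as follows (a sketch; CD(x, y) here denotes the set of common divisors of x and y, the notation used in the next comment):

   a = qb + r, so every common divisor of b and r also divides a:  CD(b, r) ⊆ CD(a, b)
   r = a − qb, so every common divisor of a and b also divides r:  CD(a, b) ⊆ CD(b, r)
   hence CD(a, b) = CD(b, r), and in particular their greatest elements coincide:
   gcd(a, b) = gcd(b, r).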

Let CD(·,·) be the set of common divisors of two integers, and D(·) be the set of divisors of an integer. Then what you suggest is: CD(a, b) ⊆ D(r) and CD(b, r) ⊆ D(a) together imply CD(a, b) = CD(b, r). Again, there seems to be a big gap between the premises and the conclusion.

Jiangwu 03:56, 23 January 2007 (UTC)

It seems that there is an error in the iterative version of the function gcd(a,b). The while loop condition should be 'b not equal 0'. Please check. And, the 'a / b' notation is not clear for denoting the modulus operation. Maybe 'remainder(a,b)' could be used. [John Reed]

201.51.144.238 11:32, 30 January 2007 (UTC)

Sorry, wrong place. [John Reed]

201.51.144.238 11:38, 30 January 2007 (UTC)

At the planetmath link, they use contradiction to make this clear. Perhaps changing:

"The above analysis is true for any divisor d; thus, the greatest common divisor of a and b is also the greatest common divisor of b and r."

to something like:

"The above analysis is true for any divisor d, including the greatest common divisor of a and b. That d is then also the greatest common divisor of b and r can be seen by looking for another divisor f that divides both b and r but is larger than d. If there is such an f, it is a divisor of a (by the same logic as in the above analysis). This makes f a common divisor of a and b. But d is already the largest common factor of a and b, so no such f > d can exist. That leaves d as the largest common factor of b and r."

Jefferys 01:25, 25 July 2007 (UTC)

--78.50.94.70 (talk) 12:34, 17 January 2009 (UTC)

I hope my addition to the proof clarifies the problem. Could someone check it for language errors (I'm not a native speaker)?
@Quuxplusone: As CD(a, b) ⊆ D(r) implies CD(a, b) ⊆ CD(b, r), and CD(b, r) ⊆ D(a) implies CD(b, r) ⊆ CD(a, b), the conclusion holds.
As to why I didn't use contradiction: The planetmath proof uses the same argumentation as the one stated by Jiangwu, so the contradiction makes it less understandable, because it diverts you from the central idea, IMO.

Error in iterative version of gcd(a,b)

It seems that there is an error in the iterative version of the function gcd(a,b). The while loop condition should be 'b not equal 0'. Please check. Also, the 'a / b' notation is not clear for denoting the modulus operation. Maybe 'remainder(a,b)' could be used. [John Reed]

201.51.144.238 11:43, 30 January 2007 (UTC)

The condition on the while loop is correct (b not equal to 0). Also, where's the 'a / b' notation used? fetofs Hello! 20:13, 30 January 2007 (UTC)
Yes, the condition is correct now; there had temporarily been a bad edit on the page which used "a/b" and "b != 1". Not-just-yeti 16:18, 17 February 2007 (UTC)
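
For reference, a minimal C rendering of the iterative version under discussion, with the 'b not equal 0' loop condition and the remainder step written out explicitly (a sketch, not necessarily the article's exact code; gcd_iter is an illustrative name):

 int gcd_iter(int a, int b)
 {
   while (b != 0)      /* loop while b is not equal to 0 */
   {
     int r = a % b;    /* r = remainder(a, b) */
     a = b;
     b = r;
   }
   return a;
 }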

Why 'anomalous cancellation'

Why have a link to see-also 'anomalous cancellation'? I don't see the connection... not-just-yeti 02:36, 28 March 2007 (UTC)

No need to keep it. I was looking for places to link it from, and this one was probably a stretch. Tom Harrison Talk 02:39, 28 March 2007 (UTC)

Original euclidean algorithm is faster on Pentium

On Pentium processors, division is very slow (the manual says it is 41x slower than subtraction).

I did not trust your statement about subtraction being less efficient, so I tested it.


Running a series of tests (all numbers in the range 1-2000, tested on an AMD at 1300 MHz) produced these results:

dvr/pas: Time=1394.980 (division using "recursion", in pascal)

div/pas: Time=1281.991 (division using "while", in pascal)

div/asm: Time=1197.653 (division using "while", in assembler)

sub/pas: Time=1258.099 (subtract in pascal)

sub/asm: Time=1147.782 (subtract in assembler)

sub/pro: Time= 729.497 (subtract in pentium-pro specific assembler)

(all numbers are in mega-ticks, each varying ±20 across multiple tests; all functions running at the same aligned address, at high priority interleaved with sleeps, 10 passes averaged)


You can see that division+recursion is the least efficient, and division+while is also slow. Subtraction using a conditional is only a little more efficient than division (a mispredicted conditional also carries a big time penalty), unconditional subtraction (sub/asm) is faster, and unconditional subtraction using the cmovcc instructions of the Pentium Pro is by far the most efficient...

You cannot reliably test this in an interpreted language like Java or Python, which has such a large interpreter overhead that it wipes out processor-specific characteristics.


The best code (sub/pro) looks like this (in Delphi syntax):

function EuclidAsmSubPro(  A,  B: integer): integer; // pentium pro specific...
asm // EAX=A, EDX=B
  //
  push   ebx
  //while (B<>0)
  test   edx,edx
  jz     done
  //
cy:
  //t1:=A-B
  mov    ebx,eax
  sub    ebx,edx // A-B
  //t2:=B-A
  mov    ecx,edx // B-A
  sub    ecx,eax
  //
  //compare (A ? B)
  cmp    eax,edx
  //
  //if (A <= B) then B:=t2;
  cmovle edx,ecx  // db $0F,$4E,$D1
  //
  //if (A > original_B) then A:=t1;
  cmovg  eax,ebx  // db $0F,$4F,$C3
  //
  //while (B<>0)
  test   edx,edx
  jnz    cy
  //
done:
  pop    ebx
end;

The sub/asm version, ranked 2nd in my test and compatible with any 386, uses (sbb, not, 4x and, 2x or) instead of the two cmovcc instructions...


~ Semi, 20070522
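
For readers who do not want to decode the assembler: the sub/pro loop above is essentially the following C, with both conditional assignments compiled to cmovcc instead of branches (a sketch under the same assumptions as the original subtraction algorithm, i.e. strictly positive inputs; the function name is made up):

 int gcd_sub_branchless(int a, int b)
 {
   while (b != 0)
   {
     int t1 = a - b;          /* candidate new value of a */
     int t2 = b - a;          /* candidate new value of b */
     int a_gt_b = (a > b);    /* single comparison, as in the asm */
     if (!a_gt_b) b = t2;     /* a <= b:  b := b - a  (the cmovle) */
     if (a_gt_b)  a = t1;     /* a >  b:  a := a - b  (the cmovg)  */
   }
   return a;
 }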

Interesting numbers, though be aware that numbers below 2000 are very small. The subtraction version is insanely slow for numbers of any reasonable size. Consider gcd(2 billion, 3): the remainder approach takes about three steps, while the subtraction version requires about 666 million subtractions. Or better: for 100-digit numbers (as routinely used in https), it's the difference between a few microseconds and centuries.
So the way I interpret your numbers: if you're running on a Pentium, and you are only using small numbers, and you are computing the gcd millions of times every few seconds, then subtraction should be used. In all other cases, avoid it.
A note on "You cannot reliably test this [in a situation which] wipes out processor-specific characteristics" -- this means your numbers are of interest on a certain platform, but don't extend anywhere else. So it's not a statement about the eternal algorithm, but about a technology-specific platform. Even if two statements (out of about ten) take 40x longer, it means a four-fold increase in processor power will make your numbers misleading. More importantly, an algorithm with more than a four-fold difference in efficiency still wins, even on an older processor.
The more I think about it, the more surprised I am that even for small numbers the subtraction algorithm wins... interesting.

not-just-yeti 15:36, 22 May 2007 (UTC)

The right way of avoiding divisions in the Euclidean algorithm to improve efficiency is to use the binary gcd algorithm. -- EJ 11:51, 23 May 2007 (UTC)

Wrong time complexity for division

The article states

"The reason is that division of two n-bit numbers takes time O(n(m+1)), where m is the length of the quotient."

This, however, seems to be the time complexity of long division, which is notoriously inefficient for large numbers. Just consider the special case of constant-length divisors: now the quotient's length is about n, asymptotically, and the time complexity becomes O(n²) for a single division. Asymptotically much faster algorithms exist, e.g. the Newton-Raphson algorithm, which uses O(log n) multiplication operations (plus subtractions which are each O(n)). Together with the asymptotically fastest multiplication algorithm currently known, this results in a total time complexity of O(n log²n log log n) (assuming the above-mentioned worst case of a constant-length divisor). Aragorn2 19:48, 29 July 2007 (UTC)
It makes no difference. Even if division was linear time, the Euclidean algorithm would still take Θ(n²). It is thus better to keep things simple and use the obvious bound for long division. (This holds also in practice: asymptotically faster division algorithms usually make the Euclidean algorithm slower, as they merely add a lot of overhead which increases the constant hidden in the big-O notation. Asymptotically fast gcd algorithms are not based on Euclid, but on some divide-and-conquer approach.) The reason is that the running time of the Euclidean algorithm is dominated by cases where the size of the quotient is constant. -- EJ 09:46, 30 July 2007 (UTC)

The original algorithm does not terminate

The description of the algorithm said "Given two natural numbers a and b, not both equal to zero ..."; however, if a = 0 and b > 0, the original algorithm does not terminate. --AndresSR 13:38, 12 September 2007 (UTC)

It does, and with a correct result as well. 0 mod b is 0, hence the first iteration makes new a equal to the original b, and new b equal to 0, which triggers the check on b and makes the algorithm stop and output a, i.e., the original b. Where do you see any problem? -- EJ 09:33, 13 September 2007 (UTC)
I am not talking about the algorithm using iteration, it works. I am talking about the algorithm using repeated subtraction. --AndresSR 13:31, 15 September 2007 (UTC)
Oh I see. I'm sorry for the confusion, you are right. -- EJ 11:40, 18 September 2007 (UTC)

Please make the article more accessible

Please, this is supposed to be a place where EVERYONE can find information. Using unnecessarily exclusive language ("standard" mathematical formulae MAY be necessary but can often be replaced by ordinary English) makes Wiki exclusive. IMHO I would have thought that completely contradicted the central purpose of Wikipedia, no?? Can anyone out there translate this stuff?? If so, please do it. LookingGlass 17:36, 21 October 2007 (UTC)

Reverse algorithm

There is a very similar algorithm that can be used to find the least common multiple of two numbers. The original algorithm is

 function gcd(a, b)
     while b ≠ 0
         if a > b
             a := a - b
         else
             b := b - a
     return a

To avoid causing an infinite loop, it can be changed to

 function gcd(a, b)
     while a ≠ b     (the change is here)
         if a > b
             a := a - b
         else
             b := b - a
     return a        (or b)

The other algorithm is

 function lcm(a, b)
     m:=a ; n:=b
     while m ≠ n
         if m < n
             m := m + a
         else
             n := n + b
     return m        (or n)

Should I add this to the article, or is it considered OR? --George (talk) 19:01, 8 February 2008 (UTC)
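
For concreteness, here is the lcm pseudocode above rendered in C (a sketch only; it assumes a, b > 0 and ignores possible overflow of m and n):

 int lcm(int a, int b)
 {
   int m = a, n = b;
   while (m != n)       /* grow whichever running multiple is smaller */
   {
     if (m < n)
       m = m + a;       /* next multiple of a */
     else
       n = n + b;       /* next multiple of b */
   }
   return m;            /* first multiple common to both */
 }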

Also, please take a look at this image and tell me what you think. --George (talk) 22:30, 8 February 2008 (UTC)

Only Two Numbers!?

This article describes how to take the gcd of two numbers, but not more. There should be a quick explanation of how to expand the method to gcd(a, b, c, ..., n). —Preceding unsigned comment added by OrganicSolar (talkcontribs) 18:51, 20 September 2008 (UTC)
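
The standard extension is to fold the two-argument gcd over the list, using the identity gcd(a, b, c) = gcd(gcd(a, b), c). A minimal C sketch (gcd and gcd_array are illustrative names, not taken from the article):

 int gcd(int a, int b)                /* two-argument Euclidean gcd */
 {
   while (b != 0)
   {
     int t = a % b;
     a = b;
     b = t;
   }
   return a;
 }

 int gcd_array(const int *v, int n)   /* gcd of v[0..n-1], assuming n >= 1 */
 {
   int g = v[0];
   for (int i = 1; i < n; i++)
     g = gcd(g, v[i]);                /* gcd(a, b, c, ...) = gcd(gcd(a, b), c, ...) */
   return g;
 }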