Ted's Computer World Some Permutation Algorithms

March, 2018

Despite the fact that there already are many web pages on this topic, and despite my admittedly limited expertise in this area, I would like to contribute something while attempting to avoid duplication of content.

It also occurred to me that some others might feel the same as I, in that many other solutions are detailed in code that cannot be understood without a working knowledge of the specific programming language being used.  To that end, all code on this page is presented in my beloved PowerBASIC, which is at least as readable as most pseudo-code and is more easily interpreted than other syntaxes.

No Recursion: Although generatimg permutations is ideal fodder for a recursive program, it is well-documented that recursion not only is slower than non-recursive methods, but it can be prohibitively resource-intensive as well; therefore, nothing recursive will be studied here.

Testing: The included algorithms have been analyzed for speed and efficiency.  Every method produces results quickly enough to satisfy us mere mortals in actual practice, because we are not trying to launch a space shuttle or design a robot to be smarter than we are.  Although I certainly do appreciate the quest for the 'perfect algorithm', choosing one compatible with individual programming needs is another matter.

About the code: All example algorithms process a string of consecutive digits starting with [1]; but unless it is otherwise specified, any integer values or character strings could be accommodated.  The following format applies to all listed code:

· A() contains the string of items to be permuted.
· C() is utility storage.
· Z() stores computed data as necessary.
· R() is just for double-precision randoms.
· TOT = number of items to be permuted.
· PERMS = total permutations.
· P and Q reference positions in the string of items.
· CT is just for counting.
· XX is a bit-array integer.
· ODD(n) and EVEN(n) are macros that simulate the named functions.

For the sake of readability, some routine setup such as the initialization of variables is not shown.  It is assumed that you know that stuff already.

---

Okay.  These algorithms purport to compile a listing of all the different ways in which the items of a data set can be listed with regard to the specific sequences.  For example, there are six permutations of the digits [1,2,3]123,132,213,231,312,321.  Reproducing that sequence, and in lexicographic (that is, sorted) order, should be relative child's play for anyone who can count that high; but that is because the human brain is vastly more capable than any computer.

One might guess that teaching a machine to sequence those digits would be equally trivial, and simple solutions do exist.  For a long time now, however, mathematicians have been attempting to 'improve' upon the straightforward methods, partly so as to reduce program run-time, and partly in order to get their stuff published in some scientific journal or other.  These efforts have produced some amazing algorithms, as shall be seen.


Let's begin with an unlikely method that you will not find elsewhere.  I wrote it only to get myself started in this investigation, knowing that it would work.  The idea is simply to shuffle the string of four digits multiple times, recording each new pattern as it is generated until all permutations have been collected.  This is accomplished using Ted's Tricky Array-Mapping Technique.

— A Shuffle of Sorts —

SUB ArrayMatch 'Arguably primitive, but effective '===== TOT=4 'total items '===== DIM Z(1000000) AS LONG DIM A(1 TO TOT) AS LONG DIM R(1 TO TOT) AS DOUBLE FOR J=1 TO TOT: A(J)=J: NEXT 'initialize data PERMS=1: FOR J=1 TO TOT: PERMS*=J: NEXT 'total perms = tot! CT=0 WHILE CT<PERMS FOR J=1 TO TOT: R(J)=RND: NEXT 'generate randoms ARRAY SORT R(), TAGARRAY A() 'sort R() = shuffle A() K=0 FOR J=1 TO TOT K+=10^(TOT-J)*A(J) 'combine data to an integer NEXT IF Z(K)=0 THEN Z(K)=K: INCR CT 'flag integer if new WEND '(ct<perms) '---Output Z() here, ignoring empty cells END SUB '(arraymatch)

Every shuffled string is logged by writing its value to the corresponding cell in the array itself if that has not already been done.  The CT variable knows when all the PERMS finally have been generated.  Then, scrolling back through the array produces a list that is sorted in ascending order by default!  What a deal.  Of course, this requires that array Z() be dimensioned at least to the greatest integer permutation possible — [4321] in this example.  Also, the value of PERMS always must be known in advance, in order to prevent an infinite search for new patterns.

Although any shuffling mechanism would suffice, the nifty TAGARRAY function provides a solution with an absolute minimum of program code.


Meanwhile, there should be a more efficient method than that for our purposes, and there is.  The following simple algorithm surely is the most straightforward manner with which to permute the digits [1,2,3,4], because it simulates the way that a human actually counts:

— Forloop Fiesta —

SUB FourLoops 'Just for digits 1-4 FOR J=1 TO 4 FOR K=1 TO 4 IF K=J THEN ITERATE FOR 'digit already in use FOR L=1 TO 4 IF L=J OR L=K THEN ITERATE FOR '' FOR M=1 TO 4 IF M=J OR M=K OR M=L THEN ITERATE FOR '' '---Output J,K,L,M here NEXT:NEXT:NEXT:NEXT END SUB '(fourloops)

For every position in the string, digits are tried in ascending numeric order, but are rejected if they match any digit already in use.  Without interruption, 4×4×4×4, or 256 total loops would be run.  With the IF-THEN testing, just 168 loops are needed to generate the 24 permutations.

Rather than using all the IF-THEN functions to check on integer usage, though, it occurred to me to try a Bit-Array to accomplish the same goal:

— Bits and Pieces —

SUB ForBITS 'Controls integer allocation with a bit-array FOR J=1 TO 4 BIT SET XX,J 'flag the counter-var as already used FOR K=1 TO 4 IF BIT(XX,K) THEN ITERATE: ELSE BIT SET XX,K '' FOR L=1 TO 4 IF BIT(XX,L) THEN ITERATE: ELSE BIT SET XX,L '' FOR M=1 TO 4 IF BIT(XX,M) THEN ITERATE '---Output J,K,L,M here BIT RESET XX,M: NEXT M 'clear unwanted bit for the next loop BIT RESET XX,L: NEXT L '' BIT RESET XX,K: NEXT K '' BIT RESET XX,J: NEXT J '' END SUB '(forbits)

Disappointingly, this routine runs more than three times slower than FourLoops.  Frequently, the elegant bitwise functions perform best, but not this time.


In any case, the two prior algorithms are limited to handling a data set of specific length.  In order to be truly worthwhile, the code must be more flexible.  To that end, I restructured FourLoops  to permute any number of consecutive integers:

— Homemade Handiwork —

SUB TedsPerm 'Output is lexicographic. 'Simulates TOT nested ForLoops. '===== TOT=6 'number of integers to permute '===== P=0 WHILE P<TOT INCR P DO '---Get One Permutation--- WHILE A(P)=TOT A(P)=0 DECR P: IF P=0 THEN EXIT SUB 'nothing more to do WEND ' INCR A(P) IF P>1 THEN '---compare with prior a() for match--- Match=No FOR J=1 TO P-1 IF A(P)=A(J) THEN Match=Yes: EXIT FOR NEXT IF Match THEN ITERATE '(do) 'try another digit END IF ' IF P<TOT THEN EXIT '(do) '---Output A() here LOOP '(get one perm) WEND '(p<tot) END SUB '(tedsperm)

This routine iterates more than a million permutations per second on my 3.0-Ghz PC, but of course that isn't good enough for everyone.


— A Heap of Data —

In 1963, B.R. Heap came up with an ingenious method for generating permutations by means of calculated swaps of data items.  Because no extra loops are involved, the routine was considered to be more or less as fast as possible.  This is the famous Heap's Algorithm:

SUB Heap 'Output not in lexicographic order REGISTER J AS LONG, N AS LONG 'greatly increases the speed '===== TOT=8 'total items '===== DIM A(0 TO TOT) AS LONG, C(0 TO TOT) AS LONG FOR J=0 TO TOT: A(J)=J+1: NEXT 'initialize data '---Output first A() here P=0 WHILE P<TOT IF C(P)<P THEN IF EVEN(P) THEN SWAP A(0),A(P): ELSE SWAP A(C(P)),A(P) '---Output A() here INCR C(P): P=0 ELSE C(P)=0: INCR P END IF '(c(p)<p) WEND '(p<tot) END SUB '(heap)

If that code seems almost magical in its simplicity, that's because it is.  Try to analyze its inner workings to see what I mean.  It runs four times as fast as TedsPerm  on four data items and nearly ten times as fast on eight items, when using the registered variable.  You can read more about it here:

<wikipedia.org/HeapsAlgorithm>


— Even Faster —

Just after getting Heap  to work, I discovered this newer one published by Phillip Paul Fuchs.  It actually runs heaps faster than Heap.  Perhaps you can figure out why.

SUB QuickPerm 'Output not in lexicographic order '====== TOT=10 'total items '====== DIM A(0 TO TOT) AS LONG DIM C(0 TO TOT) AS LONG FOR P=0 TO TOT: A(P)=P+1: C(P)=P: NEXT 'initialize arrays '---Output first A() here P=1 WHILE P<TOT DECR C(P) IF ODD(P) THEN J=C(P) ELSE J=0 SWAP A(J),A(P) '---Output A() here P=1 WHILE C(P)=0 C(P)=P: INCR P WEND WEND '(p<tot) END SUB '(quickperm)

This variation is even better yet:

SUB QuickPermReversal 'Output not in lexicographic order REGISTER P AS LONG '====== TOT=12 'total items '====== DIM A(0 TO TOT) AS LONG DIM C(0 TO TOT+1) AS LONG FOR J=0 TO TOT: A(J)=J+1: C(J)=J: NEXT 'initialize arrays '---Output first A() here P=1 WHILE P<TOT DECR C(P) J=0 WHILE J<P 'reverse remaining data SWAP A(J),A(P) INCR J: DECR P WEND '---Output A() here P=1 WHILE C(P)=0 C(P)=P: INCR P WEND WEND '(p<tot) END SUB '(quickpermreversal)

Kudos to anyone who can decipher what actually is happening in this code.  The rest can read all about the QuickPerms  here:  <http://quickperm.org>

Here is a chart of the output of the last three algorithms, each permuting four digits:

*


— A Golden Oldie —

Although many other algorithms are known, I'll round out this study with my favorite method of them all, which was conceived in the 14th Century by a lesser-known albeit gifted Indian mathematician named Narayana Pandita.  His insight is truly elegant in its simplicity, because once again it essentially replicates the manner in which a human would generate such a listing in real-time.  Apparently no one had done such a thing previously.  Here it is, paraphrased for maximal readability:

· Find the rightmost P such that A(P)<A(P+1)
· Find the rightmost Q to the right of P such that A(Q)>A(P)
· Swap A(P) with A(Q)
· Reverse the elements to the right of P

My requisite code is decidedly non-mysterious, and perhaps it could be improved:

SUB Pandita 'Output in lexicographic order REGISTER J AS LONG, TOT AS LONG '===== TOT=5 'total items '===== DIM A(1 TO TOT) AS LONG FOR J=1 TO TOT: A(J)=J: NEXT 'initialize data '---Output first A() here DO '---[Get One Permutation per Loop]--- P=0 FOR J=1 TO TOT-1 'find rightmost P-value with IF A(J)<A(J+1) THEN P=J 'a greater value immediately NEXT 'to its right. IF P=0 THEN EXIT SUB 'finished! (5-4-3-2-1) ' FOR J=P+1 TO TOT 'find rightmost Q-val to the IF A(J)>A(P) THEN Q=J 'right of P that is greater NEXT 'than P-val, SWAP A(P),A(Q) 'and swap them. ' FOR J=1 TO (TOT-P)\2 'reverse positions of all SWAP A(P+J),A(TOT+1-J) 'elements to the right of P. NEXT '---Output A() here LOOP '(get one perm) END SUB '(pandita)

All that it took was for the man to spell out, in simplest terms, what actually happens during a lexicographic sequencing.  What a concept.

---

CODE TWEAKING

A program designed for speed naturally should be coded so as to promote that objective.  One aspect of QuickPerm, as presented, is not best (but see the Addendum):

IF ODD(P) THEN J=C(P) ELSE J=0 SWAP A(J),A(P)

Before continuing, however, perhaps it is time to reveal the details of the macros ODD() and EVEN(); for their usage is merely intended to make the algorithms a bit more readable:

ODD(P)  -is the same as-  P MOD(2) -and- EVEN(P)  -is the same as-  NOT (P MOD(2))

The QuickPerm  author shows his student readers an elegant little C++ function for assigning a value to J depending upon the status of P, and invites them to improve upon it.  Its BASIC equivalent looks like this:

J=C(P)*(P MOD(2)) SWAP A(J),A(P)

Well, I don't know whether one can do better than to utilize the MOD() function for this purpose; for I have tested various options.  It is correct, however, that the overall module can be improved.  A better solution is to eliminate both the multiplication and the redundant J-variable altogether:

IF (P MOD(2)) THEN SWAP A(C(P)),A(P) ELSE SWAP A(0),A(P)

This adjustment enables the entire routine to run fully 3% faster, and that increase is worth having.  My Heap  code is structured in a similarly efficient manner, although it tests for EVEN() instead:

IF NOT (P MOD(2)) THEN SWAP A(0),A(P) ELSE SWAP A(C(P)),A(P)

Oddly enough (sic), even though those two constructs seem functionally equivalent, the ODD() function runs slightly faster than EVEN() on my machine.


Much the same condition holds true for the data-reversal mechanism that I had set up for Pandita, as compared to the one in QuickPermReversal:

FOR J=1 TO (TOT-P)\2 'reverse positions of all SWAP A(P+J),A(TOT+1-J) 'elements to the right of P. NEXT -or- J=P+1: K=TOT WHILE J<K '<< this loop runs 33% faster! >> SWAP A(J),A(K) INCR J: DECR K WEND

Although the Forloop design is more straightforward (as always), once again the logical comparison handily outperforms the math formulas.  Pandita  will be amended accordingly.  Of course, anyone whose compiler features a REVERSE function might as well use it.  My own IDE doesn't have one of those.

---

SPEED TESTS

Each algorithm was merely run through many complete loops while total time was tracked — a million loops for just four integers, a thousand for eight digits, etc., and just a single run for strings of ten or more items.  No data actually were output at all, because doing so would have consumed the same amount of time irrespective of the other code.  (FourLoops  and ForBits  were augmented to accommodate eight integers.)

The following chart shows the results obtained after maximizing each algorithm as per previous discussion.

TOTAL ITEMS:
[permutations]
4
[24]
8
[40,320]
10
[3,628.800]
12
[479,001,600]
13
[6,227,020,800]
ArrayMatch 32.8/M
(31.8-R)
FourLoops .70/M
(.78-R)
.0043
ForBits 2.36/M
(2.27-R)
.138
TedsPerm 3.54/M
(3.21-R)
.017 2.06 437
=1.1M/sec
Heap .66/M
(.54-R)
.0010
(.0008-R)
.093
(.075-R)
17.2
(16.8-R)
161
(131-R)
=45M/sec
QuickPerm
Reversal
.48/M
(.39-R)
.0007
(.0006-R)
.066
(.055-R)
8.74
(7.30-R)
=66M/sec
113
(93.4-R)
=67M/sec
Pandita 1.0/M
(.69-R)
.0029
(.0019-R)
.302
(.185-R)
46.3
(26.8-R)
=18M/sec
641
(365-R)
=17M/sec

---

SO WHICH ALGORITHM IS BEST?

Not surprisingly, the answer is, "It depends".  What is wanted from such code?  Those dabbling in Artificial Intelligence naturally would demand the greatest possible speed, but the rest of us surely don't need that kind of performance.

Even if one is able to generate 67 million permutations per second using QuickPermReversal, so what?  I don't know about you; but my own 'life-clock' is not callibrated in millionths of a second, or even in thousandths.  My various programming projects would be happy with the speed of any of the algorithms discussed.

Another important factor is functionality.  Personally, I cannot conceive of a practical use for the likes of Heap  or QuickPerm, because they do not produce data in a lexicographic sequence; so that limitation would seem to relegate them to the toybox of esoteric discussion.  In contrast, the other algorithms do output in ascending order, and that's what my programs want.  How about yours?

---

OTHER CONSIDERATIONS

Thus far, only data strings of unique items have been studied; but what if there are some matching items?  The values [1,2,3,3] would be a simple example.  In fact, some interesting features come to light.  Apparently, Heap  and both QuickPerms  treat the two 3's as discrete elements such as 3(a) and 3(b), because all permutations are duplicated during the calculations; moreover, any additional matches would multiply the duplication-factor.

In contrast, Pandita  and TedsPerm  suppress all duplication; but they have a limitation of their own.  In order to generate every permutation, the data string must be seeded starting with the lexicographically minimum value.  If the inital submitted order is [3,1,4,2], for example, only the permutations 'greater' than that would be found; therefore, the sequences from [1,2,3,4] through [3,1,2,4] would not be generated.  This could be a desirable feature, though, I suppose.

That having been said, I must finish by pointing out that one of the algorithms does not suffer from either of those issues.  That's right; despite your actual opinion of ArrayMatch, it does generate all permutations and does avoid duplication, irrespective of the original data sequence or values!

Did the tortoise beat the hare again?

---

ADDENDUM

In May of 2019, the creator of QuickPerm, having reviewed my presumed 'fix' for his algorithm, wrote and graciously explained that the choice of code was not an oversight; rather, it is dependent upon whether one employs a recursive structure or an object-oriented one.  (This also helps to explain why I use PowerBASIC; I am not a big fan of recursive code.)  In any case, thanks to Mr. Fuchs for clarifying this and for reading my stuff.

Go Back