Sorting, Shuffling & Dealing
Out of Sorts?
Probably more research has been devoted to sort routines than
any other algorithm. Literally trillions of items are sorted
every day in the computer world, and speed is crucial. If you
could present IBM with a sort package that would save its mainframe
users 5% on their sort-times, you could name your price.
I have analyzed a multitude of sorts — Bubble, Insertion,
Tree, Heap, Selection, Merge, O-Sort, Endsort, Jumpsort, Flipsort,
Fastsort, Shell, and Quicksort. The Quicksort is rated by most as
the fastest of the bunch; however, the routine is recursive, and that
causes GW-BASIC to run out of stack space in a hurry.
Of the remaining options, the Shell Sort performed by far the best in
my testing, so that is what will be featured here.
Included are three other methods, which aren't actually sorts at all. Although their usefulness is limited to certain types of data, they are worth a look, because they are dramatically faster and more economical than any traditional sorting routine!
The stated [times] are just for comparison purposes; they represent the average runtime on several arrays of 10,000 random integers, in BASIC direct mode. Compiled runtimes would be far faster.
40 DIM A(10000), X(12500)   'data array, mirror array
50 SIZE=10000               'total elements to sort
SHELL-METZNER SORT [time: 1.21 sec]
This is essentially a multi-layer Insertion Sort. A decreasing
offset lets elements that must travel far make their big moves early,
rather than being shifted later in many small increments. No one knows the best
strategy for values of D, but Donald Knuth has suggested a series of the
form (3^n-1)/2, or 1, 4, 13, 40, 121, 364, 1093,
etc. Knuth also suggests that the starting point be one step lower
in the series than the greatest D-value that is less than the sort size. Example:
to sort 500 items, the starting value for D would be 121 (one step below 364). This strategy
proved best overall in my testing, and Knuth is a better programmer than I;
so his recommendation is featured in this sort. The strategy
D=2^n-1 performed nearly as well, however;
the only absolute requirement is that the final value is
D=1.
2100 '-----------[ Shell-Metzner ]-----------
2101 '
2102 'In: SIZE = number of data elements
2103 ' A() = array to be sorted
2104 'Out: A() as sorted list
2109 '
2110 DLOOPS = INT(LOG(2*SIZE)/LOG(3))-1
2115 D=(3^DLOOPS-1)/2: IF D=0 THEN D=1
2119 '
2120 WHILE D
2130 FOR J=D+1 TO SIZE
2135 T=A(J)                       'element being inserted
2140 FOR K=J-D TO 1 STEP -D
2150 IF T>=A(K) GOTO 2165         'insertion point found
2155 A(K+D)=A(K)                  'shift the larger element upward
2160 NEXT K
2165 A(K+D)=T
2170 NEXT J
2180 D=D\3                        'next smaller increment
2190 WEND
Lines 2110-2115 calculate the starting value for D.
It is decremented at 2180 (integer division). The FOR/NEXT
K-loops keep the routine fast, as BASIC handles them
quite efficiently at the machine level.
String input can be accommodated by setting DEFSTR A,
or by changing the array type to A$().
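For instance, here is a standalone sketch of the DEFSTR route. It assumes a RETURN has been appended to the routine above at line 2199, so that the sort can be called with GOSUB:
10 DEFSTR A                    'A() now holds strings; <= compares in ASCII order
20 DIM A(5): SIZE=5
30 A(1)="pear": A(2)="apple": A(3)="fig": A(4)="plum": A(5)="date"
40 GOSUB 2100                  'the Shell sort above, unchanged
50 FOR J=1 TO SIZE: PRINT A(J);" ";: NEXT
60 END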
TED'S SHELL SORT [time: 1.13 sec]
One can do better, however. According to my testing,
the following improvement results in a 2-5% increase in
efficiency. Making the swap at each opportunity at 2155
obviates the temporary variable T, and it eliminates the
assignment at 2165, which is redundant when no swap is to be made.
2100 '-----------[ Shell-Metzner-Ted ]-----------
2101 '
2102 'In: SIZE = number of data elements
2103 ' A() = array to be sorted
2104 'Out: A() as sorted data
2109 '
2110 DLOOPS=INT(LOG(2*SIZE)/LOG(3))-1
2115 D=(3^DLOOPS-1)/2: IF D=0 THEN D=1
2119 '
2120 WHILE D
2130 FOR J=D+1 TO SIZE
2140 FOR K=J-D TO 1 STEP -D
2150 IF A(K)<=A(K+D) GOTO 2170
2155 SWAP A(K),A(K+D)
2160 NEXT K
2170 NEXT J
2180 D=D\3
2190 WEND
That is the pretty, readable version. However, as with
most routines, runtime can be minimized by packing the code.
Every time BASIC accesses a new line, some processor time is consumed.
Also, whenever a GOTO or GOSUB is executed, BASIC hunts for the target
line, starting at the top of the program. This means not only that
speed-critical routines ideally would be positioned early in the
code, but that time can be saved by eliminating cosmetic spacing and
combining statements to reduce the number of logical lines.
This three-line packing of Ted's Shell Sort saves noticeable time
in direct mode:
[time: 1.10 sec]
2120 WHILE D:FOR J=D+1 TO SIZE:FOR K=J-D TO 1 STEP -D:IF A(K)<=A(K+D) GOTO 2170
2155 SWAP A(K),A(K+D):NEXT
2170 NEXT J:D=D\3:WEND
My tests indicate that the packing of code cuts runtimes substantially
when using the BASCOM 2.0 Compiler; but the benefits are negligible
when using the far smarter QuickBASIC 4.0 Compiler. And yes,
it's okay to jump out of the K-loop, as long as you don't make a
habit of it. GW-BASIC provides stack space for tracking only
nine concurrent FOR-loops; however, if you re-initialize
a loop using the same counter variable, no additional stack space is
required, and your program will not crash for that reason.
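A small illustration of that point, with made-up data:
10 DIM A(100): FOR J=1 TO 100: A(J)=J: NEXT: T=42
20 FOR K=1 TO 100: IF A(K)=T THEN 40    'jump out; K's stack entry remains
30 NEXT K
40 PRINT "found at";K
50 FOR K=1 TO 5: PRINT K;: NEXT         're-using K discards the stale entry first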
The next three routines are unlike traditional sorts, in that no
data values are compared or swapped at any time. All are linear
algorithms — that is, no nested loops. They run up
to ten times faster than any conventional sort. The tradeoff is that the
data must conform to specific criteria.
ARRAY MAP [time: 0.15 sec]
This routine is for use with positive integers only, with no
duplication of values. A mirror array X() needs
as many elements as the greatest integer to be sorted. Any matching
values are absorbed; that is, if two or more elements have the same
integer value, only one such element will survive the process.
If that effect is unwanted, then use Array Match.
What it does: For each value in A(), the X()
element of the corresponding number is flagged with -1 (although
any value could be used). Example: if A(24) = 375,
then X(375) is set equal to -1. Then
variable N steps through X(); when it
encounters a flag, it writes the corresponding index (not the
flag value) back to the lowest-numbered available element of
A(). The result is a sorted list!
2400 '-------[ Array Map Sort - no matching data ]-------
2401 '
2402 'In: SIZE = number of elements to sort
2403 ' MAXINT = greatest value in list
2404 '     A() = array of unique positive integers
2405 '     X() dimensioned >= greatest integer
2406 'Out: A() as sorted data
2409 '
2410 FOR J=1 TO MAXINT: X(J)=0: NEXT
2420 FOR J=1 TO SIZE: X(A(J))=-1: NEXT
2430 N=0
2440 FOR K=1 TO SIZE
2450 N=N+1: IF X(N)=0 THEN 2450
2460 A(K)=N
2470 NEXT
That's it! Two variable assignments per data element, and it is finished.
The code could easily be adjusted to process other types of numeric data having a rigid format, such as decimal currency or baseball batting averages. It would be necessary only to convert the data to or from integers before and after the sort.
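For instance, here is a minimal sketch of the currency case. It assumes the prices convert to cent values small enough for X() to be dimensioned accordingly, that MAXINT has been set to the greatest cent value, and that a RETURN has been appended to the Array Map routine at line 2480:
2300 '----[ sketch: sorting decimal currency with Array Map ]----
2310 FOR J=1 TO SIZE: A(J)=CINT(A(J)*100): NEXT   'e.g. 19.95 becomes 1995; CINT rounds off any floating-point fuzz
2320 GOSUB 2400                                   'Array Map, unchanged
2330 FOR J=1 TO SIZE: A(J)=A(J)/100: NEXT         'restore the decimal point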
ARRAY MATCH [time: 0.18 sec]
This adjustment accommodates repetitions of data values by counting
them, then writing each value back to A() as many times as it was counted.
The changes from ARRAY MAP are at lines 2520, 2530, and 2550, plus the
addition of line 2570: the search must not advance to the next value
until the current count is exhausted.
2500 '-------[ Array Match - matching data ok ]-------
2501 '
2502 'In: SIZE = number of elements to sort
2503 ' MAXINT = greatest value in list
2504 '     A() array of positive integers
2505 '     X() dimensioned >= greatest integer
2506 'Out: A() sorted
2509 '
2510 FOR J=1 TO MAXINT: X(J)=0: NEXT
2520 FOR J=1 TO SIZE: X(A(J))=X(A(J))+1: NEXT
2530 N=1
2540 FOR J=1 TO SIZE
2550 IF X(N)=0 THEN N=N+1: GOTO 2550   'stay on N until its count is used up
2560 A(J)=N
2570 X(N)=X(N)-1
2580 NEXT J
BITMAP [time: 0.16 sec]
This is similar in concept to Array Map. But this time,
instead of flagging whole array elements, each data integer in
A() is represented by a single bit in X();
that's 16 bits per array element. Array X()
needs to be only 1/16 as large as the greatest integer.
The data are packed and unpacked in binary fashion, using the bitwise
AND/OR.
This option could be useful if the data integers were large and/or
array space was at a premium. (In GW-BASIC, it's not difficult
to run short of such resources.) Any matches in the data are absorbed,
being flagged by the same bit; therefore, if there are two or more instances
of any integer in the data list, only one would be written to the output.
400 '-----------[ Bitmap Sort - matching values are absorbed ]-----------
401 '
402 'In: SIZE = number of elements to sort
403 ' MAXINT = greatest value in list
404 '     A() = array of non-negative integers
405 ' X() = initialized at 0, must be at least 1/16 the size of the greatest integer
406 'Out: A() sorted
409 '
410 FOR J=1 TO SIZE
420 EL=A(J)\16: BIT=A(J) MOD 16
430 IF BIT<15 THEN X(EL)=X(EL) OR 2^BIT
440 IF BIT=15 THEN IF X(EL)>=0 THEN X(EL)=X(EL) OR -32768!
450 NEXT J
499 '
500 N=1
510 FOR EL=0 TO MAXINT\16
520 BIT=0
530 WHILE BIT<16 AND N<=SIZE
540 IF BIT=15 AND X(EL)<0 THEN A(N)=16*EL+15: N=N+1
550 IF BIT<15 THEN IF X(EL) AND 2^BIT THEN A(N)=16*EL+BIT: N=N+1
560 BIT=BIT+1
570 WEND
580 NEXT EL
This method also can accommodate larger sorts that cannot be handled
by other methods. In GW-BASIC, the memory limit for total
integer array elements is approximately 25,000. Of course, most
lists are far smaller than that, so no problem arises. Theoretically,
however, if A() were maximally dimensioned, a list of
25,000 × 16, or 400,000, unique values could be sorted to and from
a disk file. Unfortunately, another limitation of BASIC would restrict
such a list to a maximum of 32767, as that's all the unique positive integers
that are available! Newer programming languages, with 32-bit
or 64-bit integers, are not similarly restricted.
Shuffling Around
As with sort routines, there has been an overwhelming amount of research on shuffling. Gaming casinos and online poker sites alike are keenly interested in maximizing the randomness of their shuffles; otherwise, the results could be predicted, and their games could be mathematically beaten. Despite all the study, debates still rage over the relative validity of certain algorithms.
Being addicted to the game of contract bridge as well as to computer
programming, I have devoted a lot of independent research to shuffling,
particularly in regard to mixing a deck of cards. For my testing,
I deal four 13-card hands using the various shuffles, then compare
the results with known mathematical expectations. Everything
meaningful is counted: hand patterns, suit lengths, runs, individual card
placements, and more.
I am satisfied as to the validity of a shuffle routine only if it passes such tests within close tolerances. The following two algorithms, both demonstrably sound, fared better than the others.
SORTING RANDOMS
This routine generates a list of random numbers, then sorts them. Each swap of two random values is mirrored by a swap of the corresponding data elements, so the data are mixed as the randoms are put in order. Any sorting module could be utilized. For the sake of simplicity, a Bubble Sort is used in this example:
3000 '-----------[ Shuffle: Sort Randoms ]-----------
3001 '
3002 'In: A() = data to be shuffled
3003 ' SIZE of list
3004 'Out: A() as shuffled data
3009 '
3010 FOR J=1 TO SIZE: X(J)=RND: NEXT
3020 FOR J=1 TO SIZE-1: FOR K=J+1 TO SIZE
3030 IF X(J)>X(K) THEN SWAP X(J),X(K): SWAP A(J),A(K)
3040 NEXT K,J
This is the method used by the American Contract Bridge League (ACBL) for its preparation of hand records. One can do far better, however, in terms of speed:
COLLECTION SHUFFLE
This one is my personal favorite for a number of reasons.
Being another linear algorithm, it is more or less as fast as possible.
From the source list A(), a random element is selected and
"collected" sequentially in X(). Then the
last data element is placed in the slot vacated by the selected element,
and the range of the random search is decremented. The process
repeats until all data elements have been assigned. Array
X() then contains a shuffled list.
7000 '--------------[ Collection Shuffle ]---------------
7001 '
7002 'In: A() data to be shuffled
7003 ' SIZE of list
7004 'Out: X() as shuffled data
7005 ' A() trashed
7009 '
7010 D=SIZE
7020 FOR J=1 TO SIZE
7030 R=INT(RND*D)+1 'random, 1 to D
7040 X(J)=A(R) 'assign the value
7050 A(R)=A(D) 'fill the hole
7060 D=D-1 'decrement the data choices
7070 NEXT J
Filling the "hole" left by a selected element eliminates the problem of subsequent random selections landing on an already collected value, a retry process that could theoretically run indefinitely. Oddly enough, in practice such a retry procedure invariably ran approximately three times as long as the more robust setup (perhaps pi (3.14) is the true average ratio, or is it e (2.72)?).
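For comparison, here is a sketch of that retry approach. It marks collected slots with -1, so it assumes -1 never occurs in the data:
7100 '---[ retry shuffle - for comparison only ]---
7110 FOR J=1 TO SIZE
7120 R=INT(RND*SIZE)+1            'draw from the full range every time
7130 IF A(R)=-1 THEN 7120         'already collected: draw again
7140 X(J)=A(R): A(R)=-1           'collect the value, mark the slot empty
7150 NEXT J
By the coupon-collector argument, the expected number of random draws is SIZE multiplied by the harmonic number of SIZE, so the observed ratio depends on the list size as well as on the relative cost of a draw versus an assignment.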
A WORD ABOUT HUMAN SHUFFLES
Using a deck of Kem cards of average wear and average grime, I made
100 normal shuffles, noting the sizes of both the cuts and the resultant
"packets". The cuts varied from 24-28 cards, and the
packets ranged between 1 and 7 cards each. Starting with an
ordered deck, a shuffling routine then simulated my human shuffles by making
weighted random choices of cut and packets, based upon their empirically
suggested frequencies of occurrence.
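Here is a sketch of one such simulated riffle. The uniform packet sizes shown are placeholders, not my measured frequencies, and the deck is assumed to be in DK(1-52) with T(52) dimensioned as scratch space:
9000 '---[ one simulated riffle - placeholder weights ]---
9010 C=24+INT(RND*5)                    'cut point: 24 to 28 cards
9020 L=1: R=C+1: N=1                    'next card of each half; output slot
9030 WHILE L<=C OR R<=52
9040 P=1+INT(RND*7)                     'packet of 1-7 (uniform placeholder)
9050 FOR K=1 TO P: IF L<=C THEN T(N)=DK(L): L=L+1: N=N+1
9060 NEXT K
9070 P=1+INT(RND*7)
9080 FOR K=1 TO P: IF R<=52 THEN T(N)=DK(R): R=R+1: N=N+1
9090 NEXT K
9100 WEND
9110 FOR K=1 TO 52: DK(K)=T(K): NEXT    'copy the riffled deck back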
Results of testing proved reasonable, albeit unacceptable by computer standards. Six shuffles yielded much better statistics than three, and ten shuffles were even better yet. Perhaps that disparity would be lessened by starting with a more randomly organized deck, such as from a live game.
Dealing (with) a Deck of Cards
Sometime in the 1980s, when computer bridge scoring was just getting
started, the ACBL's Contract Bridge Forum — a monthly
newspaper — published a card-shuffling routine in
BASIC code. It is unfortunate, however, that no one —
including, apparently, the programmer — had bothered to test
or analyze the routine, for it was hideously flawed. It suffered from
what I call the 'clubs problem', an error in the handling
of probabilities.
Any dealing routine must begin with the pack arrayed in some ordered
fashion; otherwise, there would be no way of tracking the cards'
locations. One popular choice is to arrange the cards spades first,
ace to deuce, then the hearts, diamonds, and clubs. So card #1
would be the ace of spades, and card #52 the deuce of clubs.
Here is the error. It is invalid to repetitively select a
random hand, then deal a card to it. Why? Because the hands do
not fill up evenly. After dealing, say, 40 cards, they might be
distributed 12 - 11 - 9 - 8, or any other pattern. Given that
distribution, if the next card were dealt truly at random, it would be
5 times as likely to go to the hand missing 5 cards as to the
hand missing only 1 card. Assigning equal weight to the four
hands themselves distorts the randomness, because the possible outcomes do
not carry equal weight. Additionally, the odds are against the
first 48 cards being distributed 12 - 12 - 12 - 12.
This means that in the majority of cases, the last two (or more) cards
are dealt to the same hand.
Why is that bad? Because it results in the lowest-ranking
clubs going to the same hand an inordinate number of times.
That's why we call it the 'clubs problem'. This
distortion of the probabilities has a number of disastrous effects, most
notably a great increase in frequency of the more distributional hands.
A different ordering of the original deck would not help, either; for the
same last few cards dealt still would be paired up far too frequently.
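For concreteness, here is a sketch of the flawed approach. This is a reconstruction of the method described, not the Forum's actual listing, and it assumes DEAL() and CNT() are dimensioned and zeroed:
100 '-----[ the INVALID method: equal weight per hand ]-----
110 FOR C=0 TO 51                 'ordered deck: card #0 = ace of spades
120 H=INT(RND*4)                  'each non-full hand gets EQUAL weight...
130 IF CNT(H)=13 THEN 120         '...no matter how many slots it has open
140 DEAL(H,CNT(H))=C: CNT(H)=CNT(H)+1
150 NEXT C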
So let us opt for a mathematically valid routine instead, by selecting
the cards at random, and dealing them to the hands in some orderly
sequence. Although it makes no difference whether the cards are
dealt out one at a time or thirteen at a time, the following routine
simulates a standard human deal of one card at a time around the table.
It combines the Collection Shuffle with a simple dealing mechanism.
The randomly selected cards actually are dealt directly to
Array Deal(), without the requirement of a prior shuffle.
7000 '----------[ Deal 52 Cards to 4 Hands ]----------
7001 '
7002 'Out: DEAL(h,c) hands #0-3, cards #0-12
7003 '     DK(0-51) used as scratch space for the pack
7009 '
7010 FOR C=0 TO 51: DK(C)=C: NEXT 'initialize the pack
7019 '
7020 FOR CD=0 TO 51
7030 R=INT(RND*(52-CD))            'random pick among cards not yet dealt
7040 HAND=CD MOD 4                 'deal around the table
7050 CARD=CD MOD 13                'slot within the hand (4 and 13 are coprime)
7060 DEAL(HAND,CARD)=DK(R)
7070 DK(R)=DK(51-CD)               'fill the hole with the last active card
7080 NEXT CD
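As a usage sketch, hand #0 could be displayed like so, given the encoding above (card #0 is the ace of spades, #51 the deuce of clubs):
7090 '---[ usage sketch: display hand #0 ]---
7100 R$="AKQJT98765432": S$="SHDC"
7110 FOR C=0 TO 12
7120 V=DEAL(0,C)
7130 PRINT MID$(S$,V\13+1,1)+MID$(R$,V MOD 13+1,1)+" ";
7140 NEXT C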
Comprehensive methods don't necessarily require a lot of code!
To maximize speed, this entire routine could be packed into a single
logical line, saving 5-6% runtime. At home my Pentium-4
generates more than 2,200 deals per second in immediate mode, and nearly
8,000 deals per second compiled.
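One such packing of the routine above might look like this (a sketch, not separately timed):
7010 FOR C=0 TO 51:DK(C)=C:NEXT:FOR CD=0 TO 51:R=INT(RND*(52-CD)):DEAL(CD MOD 4,CD MOD 13)=DK(R):DK(R)=DK(51-CD):NEXT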