PowerBASIC Sorting Faster |

Would You Like a Faster Sort?

Probably more research has been devoted to sort routines than any other algorithm. Literally trillions of items are sorted every day in the computer world, and speed is crucial. If you could produce a sort package that was 5% faster than the best existing algorithms, you probably could name your price.

Well, I haven't written one of those, but I would offer an alternative to your
compiler's ~~built-in~~ sorting mechanism. This idea isn't actually a sort at all,
but a linear ~~algorithm —~~ that is, no nested loops. Under certain
conditions, the simple routine shown below can run **exponentially faster** than
PowerBASIC's function.*ARRAY SORT*

ARRAY MAP

There are two restrictions. The data must consist of **integers**; and the
utility array *X()* must be dimensioned at least to the
**size of the greatest data value**. So if the value 9,000,000 occurs somewhere
in the data, then must be dimensioned at least to nine million
cells. That is not an issue for PowerBASIC, which supports arrays exceeding
*X()*__four billion__ cells.

*DIM* A(100000000) *AS LONG* `'integer data to be sorted`
*DIM* X(100000000) *AS LONG* `'dimensioned >= greatest data value`
*DIM* SortSize& `'number of data items`
*DIM* MaxInt& `'greatest integer in data`
*DIM* J&, K&, N&
`'--------[ Array Map Sort ]--------`
*FOR* J=1 *TO* SortSize: *INCR* X(A(J)): *NEXT*
N=1
*FOR* K=1 *TO* SortSize `'for DESCENDING order, use: FOR K=SortSize TO 1 STEP-1`
*DO*
*IF* X(N) *THEN* A(K)=N: *DECR* X(N): *EXIT*
*INCR* N
*LOOP
NEXT* K

Each value in *A()* is flagged at the corresponding __address__ of
So if *X()*. then *A(any)=375*,
is incremented. Then variable *X(375)**N* steps through
whenever a positive value is encountered, the *X()*;__location__ of *N* is written
back sequentially to A nifty feature is that
*A()*. doesn't have to be cleared out before being rewritten.*A()*

Data which include negative integers can be accommodated either by adding a suitable
offset when writing to then subtracting that value as it is written
back to *X()*, or by setting up the array to include negative addresses
in the first place. PowerBASIC lets you do that. Also, using an appropriate
multiplier as an offset would enable the sorting of *A()*,~~non-integer~~ values that
conform to a rigid ~~format —~~ such as decimal currency or batting averages.

Because *ARRAY SORT* was written in assembler, it is of course more inherently
efficient than anything written in a high-level language such as PowerBASIC. That
is the only reason that it can compete with at all.
As it is, *Array Map* performs better with smaller data
*ARRAY SORT*~~samples —~~ up to about 25,000 items. But for a sort of, say, 50 million
items, runs *Array Map*__30 times faster__ on my computer when the
integer size is limited to 100 thousand, and five times faster when the integer values
can range up to ~~100 million.~~

BITMAP SORT

This works just like *Array Map*, except that instead of flagging whole array
elements, each data integer in is represented by a single *A()*__bit__
in The tradeoff is that there can be *X().***no matching data values**,
unless it is okay that only one of equal values would survive the process.

The sole advantage is that array can be *X()*__eight times smaller__
than before. Admittedly, in today's world of virtually unlimited memory resources,
this routine has become functionally obsolete; I include it just for nostalgia purposes;
for in my days it was a godsend.*DOS/GW-BASIC*

*DIM* (as above)
*DIM* EL& `'array element`
*DIM* BT& `'bit counter`
`'--------[ BitMap Sort ]--------`
*FOR* J=1 *TO* SortSize
EL=A(J)\32: BT=A(J) MOD 32
*BIT SET* X(EL),BT
*NEXT* J
K=1
*FOR* EL=0 *TO* (MaxInt+31)\32 `'step through the array`
*FOR* BT=0 *TO* 31 `'step through the bits`
*IF BIT*(X(EL),BT) *THEN*
N=32*EL+BT: A(K)=N: *INCR* K
*IF* K>SortSize *THEN EXIT*,*EXIT* `'done`
*END IF
NEXT* BT
*NEXT* EL