The Best-Kept Secret of Programming
Note: Although this article was originally written back in the heyday
of GW-BASIC, I have upgraded some of the code to look like output from modern
compilers. These methods are particularly useful in streamlining macros
for spreadsheets, any version of BASIC, and even some word-processors.
The GW-BASIC User's Guide and Reference, Microsoft, 1989, contains a small section explaining the function of relational operators. The text begins:
"Relational operators let you compare two values.
The result of the comparison is either true (-1) or
false (0). This result can then be used to make a decision
regarding program flow".
The GW-BASIC Reference, McGraw-Hill, 1990, features a similar section with this sentence:
"When two values are compared, the relational operators
return a result of -1 (true) or 0 (false)".
The GW-BASIC 3.20, Epson, 1986, is nearly as terse:
"A logical operation uses Boolean mathematics to define
a logical connection between the results (-1 or 0) of relational
operations. You can use logical operators to connect two or more
relationships and return a true or false value to be used in a decision".
Those manuals make no further reference to the numeric value of
logical expressions, except in regards to the usage of the bitwise
AND/OR, which is a different construct. Also, despite extensive searching,
I have found no other web page that addresses this issue, per se.
"Big deal", you might say. Computers work with
numbers. Every relational expression, such as P > Q,
is internally evaluated for its relative
truthfulness. The CPU's logic gates process such expressions using
numbers: specifically, 0 and -1 in the case of
BASIC. This should come as no surprise. My response would be,
"Those arithmetic values are immensely useful;
why is their application being largely ignored?"
Before going further, I would like to share my first experience as a
BASIC programmer, on a Commodore-64. At the BASIC prompt,
I entered the statement, PRINT 1=2. What happened? A zero
appeared on the screen, which was not what I had expected;
yet the absence of an error condition told me that such a statement had a
valid meaning to BASIC. Hmm. Then I entered this command: PRINT 2=2.
What happened? Again, there was no
error message, but this time a -1
appeared. Hmm. I tried
it again: PRINT 3 <= 4. This also begat a -1.
It seems that relational expressions need not exist only within an IF statement or as a condition of a WHILE loop, yet those are pretty much the only constructs in which one ever sees them being used. This brings us to The Great Unknown of Programming, and you may have heard it here first:
The relational and logical operators can be used in ANY mathematical context!
A corollary to that is:
Any LOGICAL CONDITION can be expressed as an ALGEBRAIC EQUATION:
That means formulas. (It also happens to mean that the keywords
IF and THEN are technically
redundant, and that any program can be written without them!
But that's the stuff of another article, and those keywords do make life
easier.) Let us see some examples, which I have termed
'Booleans'
for lack of a more imaginative designation.
    X=2: IF Y <> 0 THEN X=3
can be written as:
    X=2 -(Y <> 0)
The term (Y<>0) is evaluated. If true, it is
assigned a value of -1. The expression becomes X = 2-(-1), or 3.
Should Y = 0, then (Y <> 0)
would be false, and its value 0. Then, X= 2-(0), or 2.
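The two outcomes can be checked mechanically. The sketch below is in Python rather than BASIC, and Python counts a true comparison as +1, so a small helper named rel (an invented name, not a library function) converts each comparison to BASIC's -1/0 convention before the formula is applied.

```python
def rel(condition):
    # Hypothetical helper: mimic a BASIC relational result (-1 true, 0 false).
    return -1 if condition else 0


def x_with_if(y):
    # X=2: IF Y <> 0 THEN X=3
    x = 2
    if y != 0:
        x = 3
    return x


def x_with_boolean(y):
    # X=2 -(Y <> 0)
    return 2 - rel(y != 0)


for y in (-5, 0, 7):
    print(y, x_with_if(y), x_with_boolean(y))
```

For Y values of -5, 0, and 7, both versions give 3, 2, and 3 respectively.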
A multiplier can be utilized to effect a specific change in value:
    IF A=B THEN A=A-4
can be shortened to:
    A=A -4*-(A=B)
which might as well be written as:
    A=A+4*(A=B)
If A entered the equation equal to B, then
4*(-1) is added to it; in other words, 4 is subtracted.
Logical operators may be included in the mix.
The logical OR is equivalent to arithmetic ADDITION:
    IF P=5 OR P=6 THEN Q=Q*2
becomes:
    Q=Q -Q*(P=5 OR P=6)
or:
    Q=Q -Q*((P=5)+(P=6))
The logical AND equates to arithmetic MULTIPLICATION:
    IF R>0 AND S=7 THEN T=5 ELSE T=2
could be written as either:
    T=2 -3*(R>0 AND S=7)
or:
    T=2 +3*(R>0)*(S=7)
Notice the difference in the sign in front of the 3.
In the first case, there is but one Boolean term, which will evaluate to
(-1) if true. In the second case, there are two Booleans,
which would generate (-1)*(-1)
if true. An ODD
number of multiplied terms CHANGES the sign; an EVEN
number MAINTAINS it. Including the logical operators in the
code is more readable, but converting to pure algebra is more concise.
It is the user's option.
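The OR-as-addition and AND-as-multiplication forms can be compared side by side. This is a rough Python sketch, not BASIC: the helper rel and the four function names are invented for illustration, and rel forces BASIC's -1/0 results because Python's own comparisons yield +1 when true.

```python
def rel(condition):
    # Hypothetical helper: mimic a BASIC relational result (-1 true, 0 false).
    return -1 if condition else 0


def q_logical_or(p, q):
    # Q=Q -Q*(P=5 OR P=6)
    return q - q * rel(p == 5 or p == 6)


def q_sum(p, q):
    # Q=Q -Q*((P=5)+(P=6))
    return q - q * (rel(p == 5) + rel(p == 6))


def t_logical_and(r, s):
    # T=2 -3*(R>0 AND S=7): one Boolean term, so the sign flips once
    return 2 - 3 * rel(r > 0 and s == 7)


def t_product(r, s):
    # T=2 +3*(R>0)*(S=7): two Boolean terms, (-1)*(-1) = +1
    return 2 + 3 * rel(r > 0) * rel(s == 7)


for p in (4, 5, 6, 7):
    print(p, q_logical_or(p, 10), q_sum(p, 10))
```

The sum matches OR here only because P cannot equal 5 and 6 at the same time. If both conditions could hold together, the sum would be -2 rather than -1, and the result would differ from the OR version.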
One might be able to combine several statements, and string
comparisons are fair game:
    IF D$=E$ THEN J=J-1
    IF F<=3 THEN J=J+2
can become this:
    J=J +(D$=E$) -2*(F<=3)
Coding can be particularly concise when zeroes are involved:
    D=0: IF A <> 0 AND B <> 0 AND C <> 0 THEN D=8
reduces to:
    D=-8*(A*B*C<>0)
D goes from zero to 8 only if A, B, and C
all have value. Now that's pretty.
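The product trick rests on a simple fact: a product is nonzero only when every factor is nonzero. A minimal Python sketch of the comparison follows, with rel as an invented helper that supplies BASIC's -1/0 results.

```python
def rel(condition):
    # Hypothetical helper: mimic a BASIC relational result (-1 true, 0 false).
    return -1 if condition else 0


def d_with_if(a, b, c):
    # D=0: IF A <> 0 AND B <> 0 AND C <> 0 THEN D=8
    d = 0
    if a != 0 and b != 0 and c != 0:
        d = 8
    return d


def d_with_product(a, b, c):
    # D=-8*(A*B*C<>0)
    return -8 * rel(a * b * c != 0)


print(d_with_product(1, 2, 3), d_with_product(1, 0, 3))
```

The check uses small integers only. In a real BASIC, very large or very small operands may overflow or underflow the product, so the shortcut is best kept to values of modest size.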
Nested conditional statements can be combined:
    IF R < 0 THEN Q=Q+1: IF S=0 THEN Q=Q+1
equates to:
    Q=Q -(R<0)*(1-(S=0))
Nothing happens unless R<0, whereupon Q
is incremented by 1 or 2, depending upon the value of S.
In the olden days, constructs such as these were invaluable. For example,
GW-BASIC has no built-in function for converting an alphabetic
character to uppercase. The most concise solution for that was to create
a Function, which could be defined only as a single program statement:
    DEF FNUP$(C$) = CHR$(ASC(C$) -32*(ASC(C$)> 96)*(ASC(C$)< 123))
The ASCII value of any lowercase letter (97-122) is reduced by 32,
converting the character to its uppercase equivalent. If desired, one could
convert input to lower case by changing the formula's numbers to
+32, 64, 91.
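Both the uppercase formula and its lowercase variant can be tried out in a few lines. The sketch below is Python, with chr and ord standing in for CHR$ and ASC; rel, fnup, fnlow, and upper_string are invented names, and rel again imitates BASIC's -1/0 results.

```python
def rel(condition):
    # Hypothetical helper: mimic a BASIC relational result (-1 true, 0 false).
    return -1 if condition else 0


def fnup(c):
    # CHR$(ASC(C$) -32*(ASC(C$)> 96)*(ASC(C$)< 123))
    code = ord(c)
    return chr(code - 32 * rel(code > 96) * rel(code < 123))


def fnlow(c):
    # The same formula with its numbers changed to +32, 64, 91
    code = ord(c)
    return chr(code + 32 * rel(code > 64) * rel(code < 91))


def upper_string(text):
    # Process the characters one at a time, as the FOR loop must.
    return "".join(fnup(c) for c in text)


print(upper_string("Hello, World 42!"))  # prints HELLO, WORLD 42!
```

Characters outside the letter ranges pass through untouched, because at least one of the two Boolean factors is zero for them.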
In order to convert an entire string, the characters had to be processed individually:
    FOR J=1 TO LEN(A$)
      MID$(A$,J,1) = FNUP$(MID$(A$,J,1))
    NEXT: RETURN
Nowadays, we can simply apply a UCASE$()
function and be done with it. Many modern routines can be enhanced using
Boolean tactics nonetheless. Here's another old-fashioned tactic in
a menu-option routine that accepts only a proper response and sets
a variable accordingly, obviating some additional error-trapping code in
the process.
    SUB GetSessionTime
      PRINT "(M)orning"
      PRINT "(A)fternoon"
      PRINT "(E)vening"
      PRINT "(L)ate"
      PRINT "Select a time: ";
      Session=0
      DO UNTIL Session
        A$ = UCASE$(WAITKEY$)
        Session = -(A$="M") -2*(A$="A") -3*(A$="E") -4*(A$="L")
      LOOP
    END SUB
And yes, there's a more modern solution to that as well using
INSTR(); but you get the idea.
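The heart of that routine is the single assignment to Session: at most one of the four comparisons can be true, so the sum is 1, 2, 3, or 4 for a valid key and 0 for anything else. A Python sketch of just that step follows; the keyboard read is replaced by a plain sequence of characters, and rel, session_code, and first_valid are invented names.

```python
def rel(condition):
    # Hypothetical helper: mimic a BASIC relational result (-1 true, 0 false).
    return -1 if condition else 0


def session_code(key):
    # Session = -(A$="M") -2*(A$="A") -3*(A$="E") -4*(A$="L")
    a = key.upper()
    return (-rel(a == "M") - 2 * rel(a == "A")
            - 3 * rel(a == "E") - 4 * rel(a == "L"))


def first_valid(keys):
    # Stand-in for DO UNTIL Session: skip keys until one maps to nonzero.
    session = 0
    for key in keys:
        session = session_code(key)
        if session:
            break
    return session


print(first_valid("xq e"))  # prints 3
```

Because an invalid key yields zero, the loop condition doubles as the error trap; no separate validity test is needed.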
A leap year is any year evenly divisible by 4, excepting the century years not evenly divisible by 400. So 2000 is a leap year, but 1900 and 2100 are not. A reasonable subroutine for making such a determination might look like this:
    LEAPYR=0: IF YR MOD 4=0 THEN LEAPYR=1
    IF (YR MOD 100=0) AND (YR MOD 400 <> 0) THEN LEAPYR=0
compare that with:
    LEAPYR = (YR MOD 4=0)*((YR MOD 100 <> 0) +(YR MOD 400=0))
Both routines return 1=Yes, 0=No.
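The one-line leap-year formula is easy to cross-check against a trusted reference. The Python sketch below compares it with calendar.isleap from the standard library; rel and leapyr are invented names, and rel supplies BASIC's -1/0 results.

```python
import calendar


def rel(condition):
    # Hypothetical helper: mimic a BASIC relational result (-1 true, 0 false).
    return -1 if condition else 0


def leapyr(yr):
    # LEAPYR = (YR MOD 4=0)*((YR MOD 100 <> 0) +(YR MOD 400=0))
    return rel(yr % 4 == 0) * (rel(yr % 100 != 0) + rel(yr % 400 == 0))


for yr in (1900, 2000, 2023, 2024, 2100):
    print(yr, leapyr(yr), calendar.isleap(yr))
```

The inner sum is safe because a year cannot be both indivisible by 100 and divisible by 400, so the two terms are never true together.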
To access the number of days in a given month, one has options.
For purposes of this example, M
is the month in question, and there
is no leap-year adjustment:
    DIM DaysInMonth(12) AS INTEGER
    ARRAY ASSIGN DaysInMonth = 31,28,31,30,31,30,31,31,30,31,30,31
Alternatively, one could choose to eliminate the array by setting a variable (or a function) thusly:
    DaysInMonth = 31 +(M=4 OR M=6 OR M=9 OR M=11) +3*(M=2)
or:
    DaysInMonth = 31 +(M=4) +(M=6) +(M=9) +(M=11) +3*(M=2)
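Both month formulas can be verified against a calendar for a non-leap year. This Python sketch uses calendar.monthrange from the standard library as the reference and 2023 as an arbitrary non-leap year; rel and the two function names are invented for illustration.

```python
import calendar


def rel(condition):
    # Hypothetical helper: mimic a BASIC relational result (-1 true, 0 false).
    return -1 if condition else 0


def days_logical(m):
    # 31 +(M=4 OR M=6 OR M=9 OR M=11) +3*(M=2)
    return 31 + rel(m == 4 or m == 6 or m == 9 or m == 11) + 3 * rel(m == 2)


def days_sum(m):
    # 31 +(M=4) +(M=6) +(M=9) +(M=11) +3*(M=2)
    return (31 + rel(m == 4) + rel(m == 6) + rel(m == 9)
            + rel(m == 11) + 3 * rel(m == 2))


for m in range(1, 13):
    # monthrange returns (weekday of day 1, number of days in the month)
    print(m, days_logical(m), days_sum(m), calendar.monthrange(2023, m)[1])
```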
Here are a couple of useful macros that also could be set up as functions:
    MACRO OKchar(A)= (ASC(A)>= 32 AND ASC(A)<= 125)   'is a typewriter character
    MACRO ISdigit(N)= (ASC(N)>= 48 AND ASC(N)<= 57)   'is a digit
A discussion of Booleans is not complete without an acknowledgement
of the downside. In fact, Booleans run slower than
IF..THEN statements. Convenient though
they may be, math calculations require more CPU time than logic functions.
In most programming situations, the time differential is unnoticeable;
however, the programmer would do well to avoid the use of Booleans in
multiple-loop structures where speed is important.
Also, don't plan on having a colleague proofread your code, unless
he/she also has been assimilated into the Boolean fold!
To close out this section, I would like to share one last tricky
function call that won me a monthly prize from a trade magazine.
C-64 BASIC had nothing resembling a PRINT USING
statement (the entire operating system was only 8k in size!), so data
had to be organized for printout in some other way. I doubt whether you
are still programming a Commodore; but if you simply are weary of formatting
PRINT USING statements, you might enjoy this function
that aligns numeric input around the decimal point, irrespective of the size
of the numbers:
    DEF FNUSING(X) = INT(LOG(ABS(X)-(ABS(X)< 1))/LOG(10)) +(ABS(X)< 1) -(X=0 OR X=1000)
This scary-looking formula merely determines the number of characters that
are to the left of the decimal point, by exploiting the idiosyncrasies of the
LOG function. This example will print any number
with the decimal point aligned at Column 20:
    PRINT TAB(20-FNUSING(NUM)); NUM
Data values were limited to less than 1 million, due to a BASIC anomaly regarding the powers of 1000 (although that probably has been repaired by now); larger numbers could be accommodated by augmenting the last term to:
    ... -(X=0 OR X=1000 OR X=1000000)
In this case, X
would have to be defined as double-precision.
MANY PROGRAMMERS STILL DON'T GET IT
When I tried the command PRINT 2=2
to begin my first-ever
programming session, I discovered the most interesting feature of the
BASIC language. I promptly shared this find with my like-minded
brother, who now uses Booleans religiously in his own programs.
On the other hand, discussions with several university math instructors
didn't fare as well. I was amazed by the nearly universal ignorance
of the true capabilities of Boolean math, even though equivalent constructs
are valid in certain other programming languages and on any spreadsheet
(but be aware that they mostly use True = +1)! One professor
even claimed that "That isn't BASIC"; but his face-saving
edict was erroneous. If the interpreter/compiler doesn't
complain, then the syntax is valid by default, whether in BASIC or any
other programming language.
Actually, there are other folks familiar with
Booleans, or at least, there were. For
Commodore-64 users, every byte of RAM was so precious that
no spaces were required in program code! Because their usage
saved a lot of bytes, Boolean constructs were discussed regularly in the
C-64 trade journals; unfortunately, many correspondents admitted
that they simply couldn't understand them.
On the modern front, I recently viewed a German web page detailing some tricks for speeding up Visual BASIC programs. One example was the flawed design of this simple function:
    Private Function Comp(a$, b$) as Boolean
      If a$ > b$ Then Comp = True Else Comp = False
    End Function
The article correctly pointed out that not only is that ELSE clause redundant; but a simpler, faster construct is available:
    Private Function Comp(a$, b$) as Boolean
      Comp = a$ > b$
    End Function
The difference between those two calls:
    IF A$ > B$ THEN COMP = 1 ELSE COMP = 0
as compared to the Boolean equivalent:
    COMP = -(A$ > B$)
Another cited VB shortcut is an inherently Boolean concept. The following statement contains a redundancy, yet one sees this sort of code all too frequently:
    IF Z <> 0 THEN ... {so and so}
A numeric variable equal to zero is logically False. In fact,
that is the definition of False: having no value.
The same applies to string variables: a null string is False.
In all cases, a value of any kind renders a variable True by
default. Therefore,
    IF Z THEN ... {this and that}
is perfectly valid, and it runs faster to boot. Variable Z
is evaluated; if it exists (has a value), then it is True, and the next
portion of the statement is executed. The interpreter/compiler knows
the rules of symbolic logic; it doesn't need to be reminded that something
with a value is not equal to zero.
A living example of this redundancy can be found in one of the earlier constructs on this very page:
    D=0: IF A <> 0 AND B <> 0 AND C <> 0 THEN D=8
could have been written more competently as:
    D=0: IF A AND B AND C THEN D=8
or even:
    D=0: IF A*B*C THEN D=8
although that still is not as concise as the formulated equivalent:
    D=-8*(A*B*C <> 0)
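One caveat applies to the bare AND form. In BASIC, AND is a bitwise operator, so it behaves like a logical AND only on values that are already -1 or 0; with arbitrary numbers, such as 1, 2, and 4, the bits may not overlap and the result is zero. The Python sketch below uses the & operator as a stand-in for BASIC's AND to compare the forms. The function names and the helper rel are invented for illustration.

```python
def rel(condition):
    # Hypothetical helper: mimic a BASIC relational result (-1 true, 0 false).
    return -1 if condition else 0


def d_explicit(a, b, c):
    # D=0: IF A <> 0 AND B <> 0 AND C <> 0 THEN D=8
    return 8 if (rel(a != 0) & rel(b != 0) & rel(c != 0)) else 0


def d_bare_and(a, b, c):
    # D=0: IF A AND B AND C THEN D=8   (AND taken as bitwise)
    return 8 if (a & b & c) else 0


def d_product(a, b, c):
    # D=-8*(A*B*C <> 0)
    return -8 * rel(a * b * c != 0)


print(d_explicit(1, 2, 4), d_bare_and(1, 2, 4), d_product(1, 2, 4))  # prints 8 0 8
```

The product forms agree with the explicit comparison for any nonzero integers, while the bare AND form agrees only when the operands share set bits, as flags of -1 always do.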
The primary point is this: the fact that instructional pages such as the
German one exist at all indicates that even many present-day programmers
are largely ignorant of the Boolean Mystique. Perhaps this condition is
the result of yet another federal conspiracy, or perhaps some programmers
merely need to have their auras recharged.
I hope that you have enjoyed this introduction to the mysteriously beautiful, yet highly efficient world of Boolean math. Feedback is welcomed.