6.5.8 Floating Point

For the rules used by the text interpreter for recognising floating-point numbers see Number Conversion.

Gforth has a separate floating-point stack, but the documentation uses the unified notation.(9)
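As a small illustration of the two notations, the sketch below defines a hypothetical word fscale (it is not part of Gforth) that takes one integer and one float. Because the two stacks are independent, the order in which the integer and the float are pushed does not matter.

```forth
\ Hypothetical word for illustration; not part of Gforth.
\ Unified notation:  ( n r1 -- r2 )
\ Separate notation: ( n -- ) ( F: r1 -- r2 )
: fscale ( n r1 -- r2 )
  s>f f* ;

3 2.5e fscale f.   \ 7.5
2.5e 3 fscale f.   \ 7.5 again: the two stacks are independent
```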

Floating-point numbers have a number of unpleasant surprises for the unwary (e.g., floating-point addition is not associative) and even a few for the wary. You should not use them unless you know what you are doing or you don’t care that the results you get are totally bogus. If you want to learn about the problems of floating-point numbers (and how to avoid them), you might start with David Goldberg, What Every Computer Scientist Should Know About Floating-Point Arithmetic, ACM Computing Surveys 23(1):5-48, March 1991.
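The non-associativity mentioned above can be observed directly. This is a minimal sketch that assumes floats are IEEE 754 doubles, which is the usual case; sum-left and sum-right are hypothetical names.

```forth
\ Sketch, assuming floats are IEEE 754 doubles.
: sum-left  ( -- r )  0.1e 0.2e f+ 0.3e f+ ;   \ (0.1+0.2)+0.3
: sum-right ( -- r )  0.1e 0.2e 0.3e f+ f+ ;   \ 0.1+(0.2+0.3)

sum-left sum-right f= .   \ 0 (false): the two sums differ in the last bit
```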

Conversion between integers and floating-point:

s>f ( n – r ) floating-ext “s-to-f”
d>f ( d – r ) floating “d-to-f”
f>s ( r – n ) floating-ext “f-to-s”
f>d ( r – d ) floating “f-to-d”
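A short sketch of the conversions in use; imean is a hypothetical helper, not a Gforth word. The Forth standard specifies that f>s discards the fractional part, so the conversion truncates towards zero.

```forth
\ Hypothetical helper: mean of two integers, returned as a float.
: imean ( n1 n2 -- r )
  s>f s>f f+ f2/ ;

3 4 imean f.   \ 3.5
-7.9e f>s .    \ -7: the fractional part is discarded
```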

Arithmetic:

f+ ( r1 r2 – r3 ) floating “f-plus”
f- ( r1 r2 – r3 ) floating “f-minus”
f* ( r1 r2 – r3 ) floating “f-star”
f/ ( r1 r2 – r3 ) floating “f-slash”
fnegate ( r1 – r2 ) floating “f-negate”
fabs ( r1 – r2 ) floating-ext “f-abs”
fcopysign ( r1 r2 – r3  ) gforth-1.0 “fcopysign”

r3 takes its absolute value from r1 and its sign from r2

fmax ( r1 r2 – r3 ) floating “f-max”
fmin ( r1 r2 – r3 ) floating “f-min”
floor ( r1 – r2 ) floating “floor”

Round towards the next smaller integral value, i.e., round toward negative infinity.

fround ( r1 – r2 ) floating “f-round”

Round to the nearest integral value.

ftrunc ( r1 – r2  ) floating-ext “f-trunc”

Round towards 0.
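The three rounding words differ only in direction, which is easiest to see on a negative and a positive input. The helper .roundings is a hypothetical name used only for this sketch; inputs that lie exactly halfway between two integers are deliberately avoided.

```forth
\ Hypothetical helper: print floor, fround and ftrunc of r, in that order.
: .roundings ( r -- )
  fdup floor f.  fdup fround f.  ftrunc f. ;

-2.7e .roundings   \ floor: -3, fround: -3, ftrunc: -2
 2.7e .roundings   \ floor:  2, fround:  3, ftrunc:  2
```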

f** ( r1 r2 – r3 ) floating-ext “f-star-star”

r3 is r1 raised to the r2th power.

fsqrt ( r1 – r2 ) floating-ext “f-square-root”
fexp ( r1 – r2 ) floating-ext “f-e-x-p”
fexpm1 ( r1 – r2 ) floating-ext “f-e-x-p-m-one”

r2=e**r1-1

fln ( r1 – r2 ) floating-ext “f-l-n”
flnp1 ( r1 – r2 ) floating-ext “f-l-n-p-one”

r2=ln(r1+1)
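A likely reason to prefer fexpm1 and flnp1 over the obvious compositions is accuracy for arguments close to zero. The sketch below assumes IEEE 754 doubles and compares both routes for r1=1e-10.

```forth
\ Sketch, assuming IEEE 754 doubles. For small r1, forming e**r1 or r1+1
\ first rounds away most of r1's digits; fexpm1 and flnp1 avoid that step.
1e-10 fexp 1e f-  fs.   \ naive e**r1-1: only about 7 correct digits
1e-10 fexpm1      fs.   \ very close to 1e-10
1e-10 1e f+ fln   fs.   \ naive ln(r1+1): similar loss of digits
1e-10 flnp1       fs.   \ very close to 1e-10
```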

flog ( r1 – r2 ) floating-ext “f-log”

The decimal logarithm.

falog ( r1 – r2 ) floating-ext “f-a-log”

r2=10**r1

f2* ( r1 – r2  ) gforth-0.2 “f2*”

Multiply r1 by 2.0e0.

f2/ ( r1 – r2  ) gforth-0.2 “f2/”

Multiply r1 by 0.5e0.

1/f ( r1 – r2  ) gforth-0.2 “1/f”

Divide 1.0e0 by r1.
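The arithmetic words combine in the usual postfix manner. As a sketch, the hypothetical words fsquare and hypotenuse below (neither is part of Gforth) compute the length of the hypotenuse of a right triangle; two one-line uses of f** and 1/f follow.

```forth
\ Hypothetical words for illustration; not part of Gforth.
: fsquare    ( r1 -- r2 )     fdup f* ;
: hypotenuse ( r1 r2 -- r3 )  fsquare fswap fsquare f+ fsqrt ;

3e 4e hypotenuse f.   \ 5
2e 10e f** f.         \ 1024
4e 1/f f.             \ 0.25
```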

Vector arithmetic:

v* ( f-addr1 nstride1 f-addr2 nstride2 ucount – r ) gforth-0.5 “v-star”

Dot product: r=v1*v2. The first element of v1 is at f-addr1, the next at f-addr1+nstride1, and so on (similarly for v2). Both vectors have ucount elements.
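A minimal sketch of a dot product over two contiguous vectors. Since each stride is added to an address, a contiguous vector has a stride of one float in address units, i.e. 1 floats. The names vec-a and vec-b are hypothetical.

```forth
\ Two 3-element vectors laid out in the dictionary, float-aligned.
falign here  1e f, 2e f, 3e f,  constant vec-a
falign here  4e f, 5e f, 6e f,  constant vec-b

vec-a 1 floats  vec-b 1 floats  3 v* f.   \ 32 = 1*4 + 2*5 + 3*6
```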

faxpy ( ra f-x nstridex f-y nstridey ucount – ) gforth-0.5 “faxpy”

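vy=ra*vx+vy, where vx and vy are the vectors of ucount elements that start at f-x and f-y and have strides nstridex and nstridey.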
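The following sketch applies faxpy to two contiguous 3-element vectors and then prints the updated second vector. The names x-vec and y-vec are hypothetical, and strides are assumed to be in address units, as for v*.

```forth
falign here  1e f, 2e f, 3e f,  constant x-vec
falign here  4e f, 5e f, 6e f,  constant y-vec

2e  x-vec 1 floats  y-vec 1 floats  3 faxpy   \ y-vec = 2*x-vec + y-vec
y-vec f@ f.  y-vec float+ f@ f.  y-vec 2 floats + f@ f.   \ 6 9 12
```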

Angles in floating point operations are given in radians (a full circle has 2 pi radians).

fsin ( r1 – r2 ) floating-ext “f-sine”
fcos ( r1 – r2 ) floating-ext “f-cos”
fsincos ( r1 – r2 r3 ) floating-ext “f-sine-cos”

r2=sin(r1), r3=cos(r1)

ftan ( r1 – r2 ) floating-ext “f-tan”
fasin ( r1 – r2 ) floating-ext “f-a-sine”
facos ( r1 – r2 ) floating-ext “f-a-cos”
fatan ( r1 – r2 ) floating-ext “f-a-tan”
fatan2 ( r1 r2 – r3 ) floating-ext “f-a-tan-two”

r1/r2=tan(r3). ANS Forth does not require, but probably intends, this to be the inverse of fsincos. In Gforth it is.

fsinh ( r1 – r2 ) floating-ext “f-cinch”
fcosh ( r1 – r2 ) floating-ext “f-cosh”
ftanh ( r1 – r2 ) floating-ext “f-tan-h”
fasinh ( r1 – r2 ) floating-ext “f-a-cinch”
facosh ( r1 – r2 ) floating-ext “f-a-cosh”
fatanh ( r1 – r2 ) floating-ext “f-a-tan-h”
pi ( – r  ) gforth-0.2 “pi”

Fconstant: r is the value pi, the ratio of a circle’s circumference to its diameter.
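Because the trigonometric words work in radians, programs that deal in degrees need a conversion. The helpers deg>rad and rad>deg below are hypothetical, not Gforth words; the second line also shows the argument order of fatan2, with r1 as the y coordinate and r2 as the x coordinate.

```forth
\ Hypothetical helpers for illustration; not part of Gforth.
: deg>rad ( r1 -- r2 )  pi f* 180e f/ ;
: rad>deg ( r1 -- r2 )  180e f* pi f/ ;

90e deg>rad fsin f.        \ approximately 1
1e -1e fatan2 rad>deg f.   \ approximately 135: y=1, x=-1
```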

One particular problem with floating-point arithmetic is that comparison for equality often fails when you would expect it to succeed. For this reason approximate equality is often preferred (but you still have to know what you are doing). Also note that IEEE NaNs may compare differently from what you might expect. The comparison words are:

f~rel ( r1 r2 r3 – flag  ) gforth-0.5 “f~rel”

Approximate equality with relative error: |r1-r2|<r3*|r1+r2|.

f~abs ( r1 r2 r3 – flag  ) gforth-0.5 “f~abs”

Approximate equality with absolute error: |r1-r2|<r3.

f~ ( r1 r2 r3 – flag  ) floating-ext “f-proximate”

ANS Forth medley for comparing r1 and r2 for equality: r3>0: f~abs; r3=0: bitwise comparison; r3<0: fnegate f~rel.
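The difference between exact and approximate comparison shows up with a classic case. This sketch assumes IEEE 754 doubles, where 0.1+0.2 is not exactly 0.3; tenth+fifth is a hypothetical name and the tolerance 1e-9 is an arbitrary choice.

```forth
\ Sketch, assuming IEEE 754 doubles.
: tenth+fifth ( -- r )  0.1e 0.2e f+ ;

tenth+fifth 0.3e f= .           \ 0  (false): exact comparison fails
tenth+fifth 0.3e 1e-9 f~abs .   \ -1 (true): |r1-r2| < 1e-9
tenth+fifth 0.3e 1e-9 f~rel .   \ -1 (true): |r1-r2| < 1e-9*|r1+r2|
tenth+fifth 0.3e -1e-9 f~ .     \ -1 (true): r3<0 selects the relative test
```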

f= ( r1 r2 – f ) gforth-0.2 “f-equals”
f<> ( r1 r2 – f ) gforth-0.2 “f-not-equals”
f< ( r1 r2 – f ) floating “f-less-than”
f<= ( r1 r2 – f ) gforth-0.2 “f-less-or-equal”
f> ( r1 r2 – f ) gforth-0.2 “f-greater-than”
f>= ( r1 r2 – f ) gforth-0.2 “f-greater-or-equal”
f0< ( r – f ) floating “f-zero-less-than”
f0<= ( r – f ) gforth-0.2 “f-zero-less-or-equal”
f0<> ( r – f ) gforth-0.2 “f-zero-not-equals”
f0= ( r – f ) floating “f-zero-equals”
f0> ( r – f ) gforth-0.2 “f-zero-greater-than”
f0>= ( r – f ) gforth-0.2 “f-zero-greater-or-equal”

Special values in IEEE 754 can be produced by, for example, dividing by zero. The most common ones are predefined as floating-point constants for convenience.

infinity ( – r  ) gforth-1.0 “infinity”

Floating-point infinity.

-infinity ( – r  ) gforth-1.0 “-infinity”

Floating-point -infinity.

NaN ( – r  ) gforth-1.0 “NaN”

Floating-point Not a Number.
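A brief sketch of how these constants behave under comparison, assuming IEEE 754 semantics and a Gforth version that provides them (1.0 or later according to the entries above). The NaN lines illustrate the earlier remark that NaNs may compare differently from what you might expect.

```forth
\ Sketch, assuming IEEE 754 semantics.
infinity 1e f+ infinity f= .      \ -1 (true): adding a finite value changes nothing
infinity fnegate -infinity f= .   \ -1 (true)
NaN fdup f= .                     \ 0  (false): a NaN is not equal to itself
NaN fdup f<> .                    \ -1 (true)
```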


Footnotes

(9)

It’s easy to generate the separate notation from that by just separating the floating-point numbers out: e.g. ( n r1 u r2 -- r3 ) becomes ( n u -- ) ( F: r1 r2 -- r3 ).