REAL signature
The REAL signature specifies structures that implement floating-point numbers. The semantics of floating-point numbers should follow the IEEE standard 754-1985 and the ANSI/IEEE standard 854-1987. In addition, implementations of the REAL signature are required to use non-trapping semantics. Additional aspects of the design of the REAL and MATH signatures were guided by the Floating-Point C Extensions developed by the X3J11 ANSI committee and the lecture notes by W. Kahan on the IEEE standard 754.
The sign of a zero is ignored in all comparisons.
The relation between the comparison predicates defined here and those defined by IEEE, ANSI C and FORTRAN is specified in the following table.
| SML | IEEE | C | FORTRAN | 
|---|---|---|---|
| == | = | == | .EQ. | 
| != | ?<> | != | .NE. | 
| < | < | < | .LT. | 
| <= | <= | <= | .LE. | 
| > | > | > | .GT. | 
| >= | >= | >= | .GE. | 
| ?= | ?= | !islessgreater | .UE. | 
| not o ?= | <> | islessgreater | .LG. | 
| unordered | ? | isunordered | unordered | 
| not o unordered | <=> | !isunordered | .LEG. | 
| not o op < | ?>= | ! < | .UGE. | 
| not o op <= | ?> | ! <= | .UG. | 
| not o op > | ?<= | ! > | .ULE. | 
| not o op >= | ?< | ! >= | .UL. | 
In the functions below, unless specified otherwise, if any argument is a NaN, the return value is a NaN. In a list of rules specifying the behavior of a function in special cases, the first matching rule defines the semantics.
Rationale:
The specification of the default signature and structure for non-integer arithmetic, particularly concerning exceptional conditions, was the source of much debate, given the desire to allow implementations to provide efficient floating-point modules. Permitting implementations to differ on whether or not, for example, to raise Div on division by zero meant that the user really did not have a standard to program against. Portable code would require adopting the more conservative position of explicitly handling exceptions. A second alternative was to specify that functions in the Real structure must raise exceptions, but that implementations so desiring could provide additional structures matching REAL with explicit floating-point semantics. This was rejected because it meant that the default real type would not be the same as a defined floating-point real type. This conferred second-class status on the latter, while providing a default real of lesser performance and involving additional implementation complexity for little benefit.
Deciding if real should be an eqtype, and if so, what equality should mean, was also problematic. IEEE specifies that the sign of zeros be ignored in comparisons, and that equality evaluate to false if either argument is a NaN. These constraints are disturbing to the SML programmer. The former implies that 0 = ~0 is true while r/0 = r/~0 is false. The latter implies such anomalies as r = r is false, or that, for a ref cell rr, we could have rr = rr but not have !rr = !rr. We accepted the unsigned comparison of zeros, but felt that the reflexive property of equality, structural equality, and the desire that <> be equivalent to not o = ought to be preserved. Additional complications led to the decision to not have real be an eqtype.
Additional rationale:
The type, signature and structure identifiers real, REAL and Real, although misnomers in light of the floating-point-specific nature of the modules, were retained for historical reasons.
signature REAL
structure Real : REAL
structure LargeReal : REAL
structure Real{N} : REAL
type real
structure Math : MATH
val radix : int         
val precision : int         
val maxFinite : real       
val minPos : real       
val minNormalPos : real         
val posInf : real       
val negInf : real         
val + : (real * real) -> real       
val - : (real * real) -> real         
val * : (real * real) -> real         
val / : (real * real) -> real         
val *+ : real * real * real -> real       
val *- : real * real * real -> real         
val ~ : real -> real         
val abs : real -> real         
val min : (real * real) -> real       
val max : (real * real) -> real         
val sign : real -> int         
val signBit : real -> bool         
val sameSign : (real * real) -> bool         
val copySign : (real * real) -> real         
val compare : (real * real) -> order         
val compareReal : (real * real) -> IEEEReal.real_order         
val < : (real * real) -> bool       
val <= : (real * real) -> bool       
val > : (real * real) -> bool       
val >= : (real * real) -> bool         
val == : (real * real) -> bool       
val != : (real * real) -> bool         
val <> : (real * real) -> bool       
val ?= : (real * real) -> bool         
val unordered : (real * real) -> bool         
val isFinite : real -> bool         
val isNan : real -> bool         
val isNormal : real -> bool         
val class : real -> IEEEReal.float_class         
val fmt : StringCvt.realfmt -> real -> string       
val toString : real -> string         
val fromString : string -> real option         
val scan : (char, 'a) StringCvt.reader -> (real, 'a) StringCvt.reader         
val toManExp : real -> {man : real, exp : int}         
val fromManExp : {man : real, exp : int} -> real         
val split : real -> {whole : real, frac : real}         
val rem : (real * real) -> real         
val nextAfter : (real * real) -> real         
val checkFloat : real -> real
val realFloor : real -> real         
val realCeil : real -> real         
val floor : real -> Int.int         
val ceil : real -> Int.int         
val trunc : real -> Int.int         
val round : real -> Int.int         
val toInt : IEEEReal.rounding_mode -> real -> int         
val toLargeInt : IEEEReal.rounding_mode -> real -> LargeInt.int         
val fromInt : int -> real       
val fromLargeInt : LargeInt.int -> real         
val toLarge : real -> LargeReal.real       
val fromLarge : IEEEReal.rounding_mode -> LargeReal.real -> real         
type real
Note that real is not an eqtype.
structure Math
radix
          
precision
          
The number of digits, each between 0 and radix-1, in the mantissa.
maxFinite
          
          minPos
          
          minNormalPos
          
val posInf
val negInf
r1 + r2
          
          r1 - r2
          
r1 * r2
          
r1 / r2
          
0 / 0 = NaN and +-infinity / +-infinity = NaN. Dividing a finite, non-zero number by a zero, or an infinity by a finite number, produces an infinity with the correct sign. (Note that zeros are signed.) A finite number divided by an infinity is 0 with the correct sign.
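The division rules can be checked directly. The following is a minimal sketch, assuming an implementation in which the literal ~0.0 denotes negative zero and Real.== is available as in the signature above; the value names are illustrative only.

```sml
(* Non-trapping division: each binding exercises one special case. *)
val nan     = 0.0 / 0.0                   (* 0 / 0 = NaN *)
val infInf  = Real.posInf / Real.posInf   (* infinity / infinity = NaN *)
val posQuot = 1.0 / 0.0                   (* finite non-zero / +0 = +infinity *)
val negQuot = 1.0 / ~0.0                  (* finite non-zero / -0 = -infinity *)
val negZero = 1.0 / Real.negInf           (* finite / -infinity = -0 *)

val checks =
  [ Real.isNan nan,
    Real.isNan infInf,
    Real.== (posQuot, Real.posInf),
    Real.== (negQuot, Real.negInf),
    Real.== (negZero, 0.0) andalso Real.signBit negZero ]

val allHold = List.all (fn b => b) checks
```

None of these expressions raises an exception; the special values simply propagate.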
*+ (a, b, c)
          
          *- (a, b, c)
          
Return a*b + c and a*b - c, respectively. Their behaviors on infinities follow from the behaviors derived from addition, subtraction and multiplication.
The precise semantics of these operations depend on the language implementation and the underlying hardware. Specifically, certain architectures provide these operations as a single instruction, possibly using a single rounding operation. Thus, the use of these operations may be faster than performing the individual arithmetic operations sequentially, but may also cause different rounding behavior.
~ r
          
(0.0 - r).           ~ (+-infinity) = -+infinity.     
abs r
          
abs (+-infinity) = infinity.
min (a, b)
          
          max (a, b)
          
sign r
          
signBit r
          
Returns true if and only if the sign of r (infinities, zeros and NaNs included) is negative.
sameSign (r1, r2)
          
Returns true if and only if signBit r1 equals signBit r2.
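Because zeros are signed, signBit and sameSign can tell apart values that the comparison operators treat as equal. A small sketch, assuming the literal ~0.0 yields negative zero on the implementation at hand:

```sml
val negZero = ~0.0

val a = Real.signBit negZero            (* true: the zero carries a negative sign *)
val b = Real.signBit 0.0                (* false *)
val c = Real.signBit Real.negInf        (* true *)
val d = Real.sameSign (negZero, ~1.5)   (* true: both signs are negative *)
val e = Real.sameSign (negZero, 0.0)    (* false, although the two zeros compare equal *)
```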
copySign (x, y)
          
compare (r1, r2)
          
          compareReal (r1, r2)
          
The function compareReal behaves similarly except it returns values of type IEEEReal.real_order and returns IEEEReal.UNORDERED on unordered arguments.
Implementation note:
Implementations should try to optimize use of Real.compare, since it is necessary for catching NaNs.
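As a rough illustration of compareReal, the sketch below maps each IEEEReal.real_order result to a string. The helper name describe is hypothetical, and the constructor names LESS, EQUAL, GREATER and UNORDERED are assumed to be those of IEEEReal.real_order.

```sml
val nan = 0.0 / 0.0

(* Classify a pair of reals; NaN operands show up as UNORDERED. *)
fun describe (x, y) =
  case Real.compareReal (x, y) of
      IEEEReal.LESS      => "less"
    | IEEEReal.EQUAL     => "equal"
    | IEEEReal.GREATER   => "greater"
    | IEEEReal.UNORDERED => "unordered"

val r1 = describe (1.0, 2.0)    (* "less" *)
val r2 = describe (0.0, ~0.0)   (* "equal": zero signs are ignored *)
val r3 = describe (nan, 1.0)    (* "unordered" *)
```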
r1 < r2
          
          r1 <= r2
          
          r1 > r2
          
          r1 >= r2
          
Note that these operators return false on unordered arguments, i.e., if either argument is NaN, so that the usual reversal of comparison under negation does not hold, e.g., a < b is not the same as not (a >= b).
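The failure of reversal under negation is easy to observe. In the sketch below, lessOpt is a hypothetical helper that reports unordered arguments explicitly instead of silently returning false.

```sml
val nan = 0.0 / 0.0
val a = 1.0

val lt    = a < nan          (* false: the arguments are unordered *)
val ge    = a >= nan         (* also false *)
val notGe = not (a >= nan)   (* true, so it differs from a < nan *)

(* A "less than" that distinguishes unordered arguments from false. *)
fun lessOpt (x : real, y : real) : bool option =
  if Real.unordered (x, y) then NONE else SOME (x < y)
```

Code that negates a comparison to obtain its converse should test Real.unordered (or Real.isNan) first if NaNs may occur.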
== (x, y)
          
          != (x, y)
          
This is equivalent to the IEEE = operator.
The second function != is equivalent to not o op == and the IEEE ?<> operator.
<> (x, y)
          
          ?= (x, y)
          
The first function <> returns true if and only if x < y or x > y; in particular, neither y nor x is NaN. This is equivalent to the IEEE <> operator.
The second function ?= is equivalent to not o op <> and the IEEE ?= operator. It returns true if either argument is a NaN or if the arguments are bitwise equal, ignoring signs on zeros.
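The equality predicates differ only in how they treat NaNs. The following sketch uses ==, != and ?= in prefix form through the Real structure; it leaves out <>, which some implementations may not export from Real.

```sml
val nan = 0.0 / 0.0

val zerosEqual = Real.== (0.0, ~0.0)   (* true: zero signs are ignored *)
val nanEqual   = Real.== (nan, nan)    (* false: a NaN is not equal to itself *)
val nanNotEq   = Real.!= (nan, nan)    (* true: the negation of == *)
val nanMaybeEq = Real.?= (nan, 1.0)    (* true: either argument is a NaN *)
val plainMaybe = Real.?= (1.0, 2.0)    (* false: ordered and different *)
```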
unordered (x, y)
          
isFinite x
          
isNan x
          
isNormal x
          
class x
          
fmt spec r
          
          toString r
          
SCI (SOME i) or FIX (SOME i) with i < 0, or if spec is GEN (SOME i) with i < 1.
The value returned by toString is equivalent to:
          (fmt (StringCvt.GEN NONE) r)
	  
 	  
If r is a finite value, these functions will always produce a valid SML real constant unless constrained otherwise (e.g., spec is FIX (SOME 0), which specifies no fractional part). These functions return "NaN" on NaN values and "+inf" and "-inf" on positive and negative infinities, respectively.
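A brief sketch of the formatting functions on a finite value. The exact strings produced for NaNs and infinities vary between implementations, so only the finite case is shown.

```sml
val r = 3.14159

val fixed2  = Real.fmt (StringCvt.FIX (SOME 2)) r   (* two fractional digits *)
val general = Real.fmt (StringCvt.GEN NONE) r

(* toString is defined as fmt (StringCvt.GEN NONE). *)
val sameAsToString = (general = Real.toString r)
```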
fromString s
          
Returns SOME(r) if a real value can be scanned from a prefix of s, ignoring any initial whitespace; otherwise, returns NONE. Equivalent to StringCvt.scanString scan.
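For instance, fromString accepts a numeric prefix and ignores what follows it. Since real is not an eqtype, the sketch compares the scanned value with Real.== rather than with =.

```sml
(* Leading whitespace is skipped; scanning stops at the first
   character that cannot extend the number. *)
val parsed  = Real.fromString "  12.5abc"
val missing = Real.fromString "abc"

val ok =
  case parsed of
      SOME r => Real.== (r, 12.5)
    | NONE   => false
```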
scan getc a
          
Returns SOME(r,rest), where r is the scanned real value and rest is the unused portion of the character source a. Raises Overflow if the value cannot be represented as a real value.
The format for valid string representation of reals is given by the regular expression
	  [+~-]?(([0-9]+(\.[0-9]+)?)|(\.[0-9]+))([eE][+~-]?[0-9]+)?
          
      
toManExp r
          
Returns {man, exp}, where man and exp are the mantissa and exponent of r, respectively. Specifically, we have the relation
    r = man * radix^exp
where 1.0 <= man * radix < radix. This function is comparable to frexp in the C library.
If r is +-0, man is +-0 and exp is +0. If r is +-infinity, man is +-infinity and exp is unspecified. If r is NaN, man is NaN and exp is unspecified.
fromManExp {man, exp}
          
Returns man * radix^exp. This function is comparable to ldexp in the C library. Note that non-exceptional arguments can produce zeros or infinities, essentially because of underflows and overflows.
          If man is +-0, the result is +-0.           If man is +-infinity, the result is +-infinity.           If man is NaN, the result is NaN.     
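The mantissa/exponent decomposition and its inverse can be exercised as follows. This is a sketch for a finite, non-zero argument; the binding names are illustrative.

```sml
val r = 8.0
val {man, exp} = Real.toManExp r
val radix = Real.fromInt Real.radix

(* Normalization of the mantissa: 1.0 <= man * radix < radix. *)
val normalized = 1.0 <= man * radix andalso man * radix < radix

(* fromManExp rebuilds r from the two parts. *)
val roundTrip = Real.== (Real.fromManExp {man = man, exp = exp}, r)
```

With radix 2 this gives man = 0.5 and exp = 4, matching what frexp would return in C.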
split r
          
Returns {whole, frac}, where frac and whole are the fractional and integral parts of r, respectively. Specifically, whole is integral, |frac| < 1.0, whole and frac have the same sign as r, and r = whole + frac. This function is comparable to modf in the C library.
          If r is +-infinity, whole is           +-infinity and frac is +-0.           If r is NaN, both whole and frac are NaN.     
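A short sketch of split on finite values; both parts carry the sign of the argument, and they add back to it. The values chosen are exactly representable in binary, so the comparisons are exact.

```sml
val {whole = w1, frac = f1} = Real.split 3.75    (* whole = 3.0,  frac = 0.75 *)
val {whole = w2, frac = f2} = Real.split ~3.75   (* whole = ~3.0, frac = ~0.75 *)

(* r = whole + frac holds for finite r. *)
val sumsBack = Real.== (w1 + f1, 3.75) andalso Real.== (w2 + f2, ~3.75)
```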
rem (x, y)
          
Returns the remainder x - n*y, where n = trunc (x / y). The result has the same sign as x and has absolute value less than the absolute value of y.
          If x is an infinity or y is 0, rem returns NaN.           If y is an infinity, rem returns x.     
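The sign rule and the NaN cases of rem can be sketched as below; the operands are chosen so that the results are exact.

```sml
val a = Real.rem (7.5, 2.0)           (* 1.5:  7.5 - 3 * 2.0 *)
val b = Real.rem (~7.5, 2.0)          (* ~1.5: the sign follows x *)
val c = Real.rem (7.5, 0.0)           (* NaN: y is 0 *)
val d = Real.rem (Real.posInf, 2.0)   (* NaN: x is an infinity *)
```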
nextAfter (r, t)
          
If r = t, then it returns r. If r is +-infinity, it returns +-infinity. If either argument is a NaN, this returns NaN.
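A minimal sketch of nextAfter on finite arguments. It assumes the usual reading of the name (as for nextafter in C): the result is the representable neighbor of r in the direction of t.

```sml
val up   = Real.nextAfter (1.0, 2.0)   (* neighbor of 1.0 toward 2.0 *)
val down = Real.nextAfter (1.0, 0.0)   (* neighbor of 1.0 toward 0.0 *)
val same = Real.nextAfter (1.0, 1.0)   (* r = t, so 1.0 *)

(* Stepping up and then back down returns to the starting point. *)
val back = Real.nextAfter (up, 0.0)
```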
checkFloat x
          
This can be used to synthesize trapping arithmetic from the non-trapping operations given here. Note, however, that infinities can be converted to NaNs by some operations, so that if accurate exceptions are required, checks must be done after each operation.
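One way to build a trapping operation on top of checkFloat is sketched below. It assumes that checkFloat raises Overflow on an infinity and Div on a NaN and otherwise returns its argument, as in the published Basis Library; the outcome datatype and checkedDiv are hypothetical names introduced for the example.

```sml
(* Assumption: checkFloat raises Overflow on an infinity and Div on a NaN. *)
datatype outcome = Value of real | Overflowed | Invalid

(* Division that reports exceptional results instead of propagating them. *)
fun checkedDiv (x, y) =
  Value (Real.checkFloat (x / y))
  handle Overflow => Overflowed
       | Div      => Invalid
```

Here the check follows the single operation directly, so an infinity cannot be turned into a NaN by a later step before it is detected.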
realFloor r
          
          realCeil r
          
floor r
          
          ceil r
          
          trunc r
          
          round r
          
These are respectively equivalent to:
         toInt IEEEReal.TO_NEGINF r
         toInt IEEEReal.TO_POSINF r
         toInt IEEEReal.TO_ZERO r
         toInt IEEEReal.TO_NEAREST r
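The correspondence between the named conversions and the rounding modes can be checked on a sample value. The sketch uses an argument that is not halfway between two integers, so tie-breaking does not come into play.

```sml
val r = ~2.75

val viaNames = [Real.floor r, Real.ceil r, Real.trunc r, Real.round r]
val viaModes =
  map (fn mode => Real.toInt mode r)
      [IEEEReal.TO_NEGINF, IEEEReal.TO_POSINF, IEEEReal.TO_ZERO, IEEEReal.TO_NEAREST]

(* Both lists are [~3, ~2, ~2, ~3]. *)
val agree = (viaNames = viaModes)
```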
       
        
Question:
Presumably raises Domain on infinities as well. Also, toInt, toLargeInt, realFloor, realCeil.
toInt mode x
          
          toLargeInt mode x
          
Question:
Should we provide real-valued versions of this?
fromInt i
          
          fromLargeInt i
          
toLarge x
          
          fromLarge mode x
          
Implementation note:
Implementations may choose to provide a debugging mode, in which NaNs and Infs are detected when they are generated.
See also: MATH, IEEEReal, StringCvt
Last Modified January 21, 1997
Copyright © 1996 AT&T