G.2.1 Model of Floating Point Arithmetic
In the strict mode, the predefined operations of
a floating point type shall satisfy the accuracy requirements specified
here and shall avoid or signal overflow in the situations described.
This behavior is presented in terms of a model of floating point arithmetic
that builds on the concept of the canonical form (see A.5.3).
Associated with each floating point type is an infinite
set of model numbers. The model numbers of a type are used to define
the accuracy requirements that have to be satisfied by certain predefined
operations of the type; through certain attributes of the model numbers,
they are also used to explain the meaning of a user-declared floating
point type declaration. The model numbers of a derived type are those
of the parent type; the model numbers of a subtype are those of its type.
The model numbers of a
floating point type T are zero and all the values expressible in the
canonical form (for the type T), in which mantissa has T'Model_Mantissa
digits and exponent
has a value greater than or equal to T'Model_Emin.
(These attributes are defined in G.2.2.)
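The definition of model numbers can be checked mechanically. The following Python sketch is illustrative only and is not part of the model. It uses exact rational arithmetic; the function names and the parameter values (radix 2, 4 mantissa digits, a minimum exponent of -3) are assumptions describing a hypothetical small type, not values taken from any implementation.

```python
from fractions import Fraction

# Hypothetical parameters of a small floating point type T (assumed values
# standing in for T'Machine_Radix, T'Model_Mantissa and T'Model_Emin).
RADIX = 2
MODEL_MANTISSA = 4
MODEL_EMIN = -3


def canonical_exponent(x, radix=RADIX):
    """Exponent e of nonzero x in canonical form: 1/radix <= |x|/radix**e < 1."""
    m = abs(Fraction(x))
    e = 0
    while m >= 1:
        m /= radix
        e += 1
    while m < Fraction(1, radix):
        m *= radix
        e -= 1
    return e


def is_model_number(x):
    """True if x is zero or fits the canonical form with MODEL_MANTISSA digits."""
    x = Fraction(x)
    if x == 0:
        return True
    e = canonical_exponent(x)
    if e < MODEL_EMIN:
        return False
    # The mantissa has MODEL_MANTISSA radix digits exactly when
    # |x| * RADIX**(MODEL_MANTISSA - e) is an integer.
    scaled = abs(x) * Fraction(RADIX) ** (MODEL_MANTISSA - e)
    return scaled.denominator == 1
```

Under these assumed parameters, 5 is a model number and 25 is not (it would need five mantissa digits). Because the exponent has no upper bound, arbitrarily large powers of the radix are model numbers, which is why the set is infinite.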
A model interval of a
floating point type is any interval whose bounds are model numbers of
the type. The model interval of a type T associated with a value v
is the smallest model interval of T that includes
v. (The model interval associated with a model number of a type
consists of that number only.)
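As an illustration of the model interval associated with a value, the following Python sketch finds the nearest model numbers below and above a given exact value. It is a sketch under stated assumptions: the function names are hypothetical, and the parameters (radix 2, 4 mantissa digits, minimum exponent -3) describe an invented small type chosen so the results are easy to verify by hand.

```python
import math
from fractions import Fraction

# Hypothetical parameters of a small floating point type T (assumed values
# standing in for T'Machine_Radix, T'Model_Mantissa and T'Model_Emin).
RADIX = 2
MODEL_MANTISSA = 4
MODEL_EMIN = -3


def canonical_exponent(x, radix=RADIX):
    """Exponent e of nonzero x in canonical form: 1/radix <= |x|/radix**e < 1."""
    m = abs(Fraction(x))
    e = 0
    while m >= 1:
        m /= radix
        e += 1
    while m < Fraction(1, radix):
        m *= radix
        e -= 1
    return e


def model_interval(v):
    """Smallest interval bounded by model numbers that includes v."""
    v = Fraction(v)
    if v == 0:
        return (Fraction(0), Fraction(0))
    a = abs(v)
    e = canonical_exponent(a)
    if e < MODEL_EMIN:
        # Below the smallest positive model number: the neighbours are
        # zero and RADIX**(MODEL_EMIN - 1).
        lo, hi = Fraction(0), Fraction(RADIX) ** (MODEL_EMIN - 1)
    else:
        # Spacing of model numbers that share the exponent e.
        step = Fraction(RADIX) ** (e - MODEL_MANTISSA)
        lo = math.floor(a / step) * step
        hi = math.ceil(a / step) * step
    return (lo, hi) if v > 0 else (-hi, -lo)
```

With these assumed parameters, the value 1/10 has the model interval from 3/32 to 13/128, while the model number 5 has the single-point interval from 5 to 5.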
The accuracy requirements for the evaluation of certain
predefined operations of floating point types are as follows.
An operand interval is
the model interval, of the type specified for the operand of an operation,
associated with the value of the operand.
For any predefined
arithmetic operation that yields a result of a floating point type T,
the required bounds on the result are given by a model interval of T
(called the result interval) defined in terms of the operand values as follows:
- The result interval
is the smallest model interval of T that includes the minimum and the
maximum of all the values obtained by applying the (exact) mathematical
operation to values arbitrarily selected from the respective operand intervals.
- The result interval of an exponentiation is obtained
by applying the above rule to the sequence of multiplications defined
by the exponent, assuming arbitrary association of the factors, and to
the final division in the case of a negative exponent.
- The result interval of a conversion of a numeric
value to a floating point type T is the model interval of T associated
with the operand value, except when the source expression is of a fixed
point type with a small that is not a power of T'Machine_Radix
or is a fixed point multiplication or division either of whose operands
has a small that is not a power of T'Machine_Radix; in these cases,
the result interval is implementation defined.
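The general rule for result intervals (the smallest model interval that includes the minimum and maximum of the exact results over the operand intervals) can be sketched in Python as follows. This is an illustration under stated assumptions, not a definitive implementation: all names and the type parameters (radix 2, 4 mantissa digits, minimum exponent -3) are hypothetical, and only addition, subtraction and multiplication are handled. Division, exponentiation and conversion are deliberately left out.

```python
import math
import operator
from fractions import Fraction

# Hypothetical parameters of a small floating point type T (assumed values).
RADIX, MODEL_MANTISSA, MODEL_EMIN = 2, 4, -3


def canonical_exponent(x):
    """Exponent e of nonzero x in canonical form: 1/RADIX <= |x|/RADIX**e < 1."""
    m = abs(Fraction(x))
    e = 0
    while m >= 1:
        m /= RADIX
        e += 1
    while m < Fraction(1, RADIX):
        m *= RADIX
        e -= 1
    return e


def model_interval(v):
    """Smallest interval bounded by model numbers that includes v."""
    v = Fraction(v)
    if v == 0:
        return (Fraction(0), Fraction(0))
    a = abs(v)
    e = canonical_exponent(a)
    if e < MODEL_EMIN:
        lo, hi = Fraction(0), Fraction(RADIX) ** (MODEL_EMIN - 1)
    else:
        step = Fraction(RADIX) ** (e - MODEL_MANTISSA)
        lo = math.floor(a / step) * step
        hi = math.ceil(a / step) * step
    return (lo, hi) if v > 0 else (-hi, -lo)


def result_interval(op, x, y):
    """Result interval of op(x, y) for op in {add, sub, mul}."""
    xs = model_interval(x)  # operand interval of the left operand
    ys = model_interval(y)  # operand interval of the right operand
    # For +, - and *, the extreme exact results over the operand intervals
    # occur at the interval end points, so four candidates suffice.
    candidates = [op(a, b) for a in xs for b in ys]
    lo, _ = model_interval(min(candidates))
    _, hi = model_interval(max(candidates))
    return (lo, hi)
```

For example, `result_interval(operator.mul, 5, 5)` yields the interval from 24 to 26 for this assumed type, because the exact product 25 is not a model number. Adding two operands whose value is 1/10 yields the interval from 3/16 to 13/64, which contains the exact sum 1/5.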
For any of
the foregoing operations, the implementation shall deliver a value that
belongs to the result interval when both bounds of the result interval
are in the safe range of the result type T, as determined by the values
of T'Safe_First and T'Safe_Last; otherwise,
- if T'Machine_Overflows
is True, the implementation shall either deliver a value that belongs
to the result interval or raise Constraint_Error;
- if T'Machine_Overflows is False, the
result is implementation defined.
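The case analysis above can be summarized as a small decision function. The Python sketch below is illustrative only; the function name, the string labels and the safe range of -100 to 100 used in the example are assumptions, with `safe_first`, `safe_last` and `machine_overflows` standing in for T'Safe_First, T'Safe_Last and T'Machine_Overflows.

```python
def required_outcome(result_interval, safe_first, safe_last, machine_overflows):
    """Classify what the implementation is required to deliver."""
    lo, hi = result_interval
    # Both bounds of the result interval lie in the safe range.
    if safe_first <= lo and hi <= safe_last:
        return "value in result interval"
    # At least one bound lies outside the safe range.
    if machine_overflows:
        return "value in result interval or Constraint_Error"
    return "implementation defined"


# Example with an assumed safe range of -100 .. 100.
inside = required_outcome((24, 26), -100, 100, True)
outside = required_outcome((96, 112), -100, 100, True)
```

Here `inside` falls under the first case, while `outside` falls under the second because the upper bound 112 exceeds the assumed safe range.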
For any predefined relation on operands of a floating
point type T, the implementation may deliver any value (i.e., either
True or False) obtained by applying the (exact) mathematical comparison
to values arbitrarily chosen from the respective operand intervals.
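The freedom this gives to a relation can be made concrete for the "less than" comparison. The Python sketch below is an illustration under stated assumptions: the function name is hypothetical, and operand intervals are passed in directly as (lower, upper) pairs. It returns the set of Boolean values that some choice of values from the two operand intervals would produce.

```python
from fractions import Fraction


def possible_less_than(a, b):
    """Set of outcomes of x < y for x in interval a and y in interval b."""
    a_lo, a_hi = a
    b_lo, b_hi = b
    outcomes = set()
    # True is possible if the smallest x lies below the largest y.
    if a_lo < b_hi:
        outcomes.add(True)
    # False is possible if the largest x is at least the smallest y.
    if a_hi >= b_lo:
        outcomes.add(False)
    return outcomes


# Two operands that are model numbers: the outcome is fully determined.
exact = possible_less_than((1, 1), (2, 2))

# Two operands sharing the same non-degenerate operand interval:
# either outcome is permitted.
shared = (Fraction(3, 32), Fraction(13, 128))
loose = possible_less_than(shared, shared)
```

When both operand intervals are single model numbers the set has one element, so the relation behaves exactly; overlapping operand intervals are what allow either answer.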
The result of a membership test is defined in terms
of comparisons of the operand value with the lower and upper bounds of
the given range or type mark (the usual rules apply to these comparisons).
If the underlying floating point hardware implements
division as multiplication by a reciprocal, the result interval for division
(and exponentiation by a negative exponent) is implementation defined.