# IEEE 754 Machine Numbers and Machine Arithmetic

In order to make numerical programs more portable between different
machines, the IEEE 754 standard defines machine numbers and how arithmetic
operations should be performed. Virtually all current computers comply with
this standard.

William Kahan and the History of IEEE 754

Soon after its conception in 1977, this standard was implemented by
virtually all numerical processors.

## Machine Numbers

Machine numbers are stored as a sequence of *k* + *n* bits
(each of which is 0 or 1):

*s* *e*_{1} ... *e*_{k} *d*_{2} ... *d*_{n}

For **single precision** numbers we have *n* = 24, *k* = 8.

For **double precision** numbers we have *n* = 53, *k* = 11.

The **sign** is "+" for *s* = 0 and "-" for *s* = 1.

The **exponent** is obtained as *e* = (*e*_{1} ... *e*_{k})_{2} - *b*,
where *b* = 2^{k-1} - 2. The largest and smallest values of *e* are used to
represent special values. Hence the smallest remaining value is
*e*_{min} = 1 - *b* = 3 - 2^{k-1}, and the largest remaining value is
*e*_{max} = 2^{k} - 2 - *b* = 2^{k-1}.
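As a quick check of these formulas, the following sketch evaluates *b*, *e*_{min} and *e*_{max} for both formats. The function name `exponent_range` is made up for this illustration. The comparison with `sys.float_info` assumes that Python floats are IEEE 754 double precision numbers, which is the case on common platforms.

```python
import sys

def exponent_range(k: int) -> tuple[int, int, int]:
    """Return (b, e_min, e_max) for k exponent bits, using the formulas above."""
    b = 2 ** (k - 1) - 2          # bias
    e_min = 1 - b                 # smallest exponent of a normalized number
    e_max = 2 ** k - 2 - b        # largest exponent of a normalized number
    return b, e_min, e_max

print(exponent_range(8))          # single precision: (126, -125, 128)
print(exponent_range(11))         # double precision: (1022, -1021, 1024)

# sys.float_info mirrors C's <float.h>, which uses the same
# (.1 d_2 ... d_n)_2 * 2^e convention as this section.
print(sys.float_info.mant_dig, sys.float_info.min_exp, sys.float_info.max_exp)
```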

- For *e*_{min} <= *e* <= *e*_{max} we have
  *x* = ±(.1*d*_{2}...*d*_{n})_{2} · 2^{e}, representing **normalized numbers**.
- For *e* = *e*_{min} - 1 we have
  *x* = ±(.0*d*_{2}...*d*_{n})_{2} · 2^{e_{min}}, representing **±0**
  and **subnormal numbers** (aka *denormalized numbers*).
- For *e* = *e*_{max} + 1 we have
  *x* = **±Infinity** if all *d*_{j} = 0, and *x* = **NaN** otherwise.
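The three cases can be traced on actual bit patterns. The sketch below takes a Python float apart with the standard `struct` module and rebuilds its exact value as a `Fraction`. The names `decode`, `K`, `N`, `B`, `E_MIN` and `E_MAX` are chosen for this illustration only. It assumes that Python floats are stored as IEEE 754 double precision numbers.

```python
import struct
from fractions import Fraction

K, N = 11, 53                  # double precision: k exponent bits, n digits
B = 2 ** (K - 1) - 2           # bias b = 1022
E_MIN = 1 - B                  # -1021
E_MAX = 2 ** K - 2 - B         # 1024

def decode(x: float):
    """Return (kind, s, e, exact value) for the double x."""
    (bits,) = struct.unpack(">Q", struct.pack(">d", x))
    s = bits >> (K + N - 1)                     # sign bit
    stored = (bits >> (N - 1)) & (2 ** K - 1)   # (e_1 ... e_k)_2
    frac = bits & (2 ** (N - 1) - 1)            # (d_2 ... d_n)_2
    e = stored - B
    sign = -1 if s else 1
    if E_MIN <= e <= E_MAX:
        # normalized: (.1 d_2 ... d_n)_2 * 2^e
        value = sign * Fraction(2 ** (N - 1) + frac, 2 ** N) * Fraction(2) ** e
        return "normalized", s, e, value
    if e == E_MIN - 1:
        # zero or subnormal: (.0 d_2 ... d_n)_2 * 2^e_min
        value = sign * Fraction(frac, 2 ** N) * Fraction(2) ** E_MIN
        return ("zero" if frac == 0 else "subnormal"), s, e, value
    # e == E_MAX + 1: infinity or NaN, no rational value
    return ("infinity" if frac == 0 else "nan"), s, e, None

for x in (1.0, 0.1, 5e-324, -0.0, float("inf"), float("nan")):
    print(x, decode(x)[:3])
```

For finite inputs the reconstructed value should equal `Fraction(x)`, which is Python's own exact conversion of a float to a rational number.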

**Note:** All numbers with sign "+", arranged by size from +0
up to +Infinity correspond to all the bit sequences (0 0...0 0...0) up to
(0 1...1 0...0), arranged as binary integers. Therefore it is easy to compare
two machine numbers, or to find the next smaller or larger machine number.
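The ordering property in this note can be checked directly. In the sketch below, `bits_of`, `float_of` and `next_up` are hypothetical helper names, and Python floats are assumed to be IEEE 754 doubles. `next_up` simply adds 1 to the bit pattern, so it is only meant for numbers with sign "+" below +Infinity.

```python
import math
import struct
import sys

def bits_of(x: float) -> int:
    """Bit pattern of the double x, read as an unsigned 64-bit integer."""
    (b,) = struct.unpack(">Q", struct.pack(">d", x))
    return b

def float_of(b: int) -> float:
    """Double whose bit pattern is the unsigned 64-bit integer b."""
    (x,) = struct.unpack(">d", struct.pack(">Q", b))
    return x

def next_up(x: float) -> float:
    """Next larger machine number; valid for +0 <= x < +Infinity only."""
    return float_of(bits_of(x) + 1)

# Increasing non-negative doubles have increasing bit patterns.
samples = [0.0, 5e-324, 2.2250738585072014e-308, 1.0, 1.5, 1e308, math.inf]
patterns = [bits_of(x) for x in samples]
print(patterns == sorted(patterns))                # True

print(next_up(0.0))                                # 5e-324, smallest subnormal
print(next_up(1.0) == 1.0 + 2.0 ** -52)            # True
print(next_up(sys.float_info.max) == math.inf)     # True
```

On Python 3.9 or later, `math.nextafter(x, math.inf)` should agree with `next_up(x)` for these inputs.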

## Rounding

Normally rounding "to nearest" is enabled. Let *x*_{max} be the
largest finite machine number and *x* be an arbitrary real number.

- If |*x*| >= 2^{e_{max}} (1 - 2^{-n-1}), i.e. |*x*| lies at or beyond the
  midpoint between *x*_{max} and 2^{e_{max}}, then fl(*x*) = ±Infinity.
- Otherwise fl(*x*) is the nearest machine number. In the case of a tie the
  number with *d*_{n} = 0 is chosen.

Other rounding modes are "towards +Infinity", "towards -Infinity", and
"towards 0" (chopping).
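The default "to nearest" mode, including the tie rule, can be observed with a few additions near 1. This is a sketch under two assumptions: Python floats are IEEE 754 doubles (*n* = 53), and the default rounding mode is active. The variable names are illustrative.

```python
import sys

ulp = 2.0 ** -52     # gap between 1.0 and the next larger double
half = 2.0 ** -53    # exactly half of that gap

# 1 + half is a tie between 1 and 1 + ulp; 1 has d_n = 0 and is chosen.
tie_down = (1.0 + half == 1.0)

# (1 + ulp) + half is a tie between 1 + ulp (d_n = 1) and 1 + 2*ulp (d_n = 0).
tie_up = ((1.0 + ulp) + half == 1.0 + 2 * ulp)

# Slightly more than half the gap is no longer a tie and rounds up.
above_half = (1.0 + (half + 2.0 ** -60) == 1.0 + ulp)

# A result far beyond the largest machine number rounds to +Infinity.
overflow = (sys.float_info.max * 2 == float("inf"))

print(tie_down, tie_up, above_half, overflow)   # True True True True
```

Standard Python offers no portable way to switch the binary rounding mode, so the other modes are not shown here.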

## Machine Arithmetic

For **addition, subtraction, multiplication, division and square
roots** of machine numbers the **rounded exact result**
must be returned. E.g., adding two machine numbers *x*, *y* returns
the machine number fl(*x*+*y*). For all combinations of machine
numbers (including ±0, ±Infinity, NaN) the result of the operation is well
defined, e.g., 1/±0 = ±Infinity, Infinity+Infinity = Infinity,
Infinity-Infinity = NaN, 0/0 = NaN, 0*Infinity = NaN. Any arithmetic
operation involving NaN returns NaN.
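The "rounded exact result" rule can be tested by computing with exact rational numbers and rounding once at the end. The sketch below assumes that Python floats are IEEE 754 doubles and relies on `float(Fraction)` rounding to nearest. The helper name `rounded_exact` is made up for this illustration. One caveat: Python itself raises `ZeroDivisionError` for `1/0.0` and `0.0/0.0` instead of returning ±Infinity or NaN, so those two cases are left out.

```python
import math
import operator
from fractions import Fraction

def rounded_exact(op, x: float, y: float) -> float:
    """Apply op to the exact values of x and y, then round once: fl(x op y)."""
    return float(op(Fraction(x), Fraction(y)))

x, y = 0.1, 0.2
checks = {
    op.__name__: op(x, y) == rounded_exact(op, x, y)
    for op in (operator.add, operator.sub, operator.mul, operator.truediv)
}
print(checks)   # every entry should be True

inf = math.inf
print(inf + inf)               # inf
print(inf - inf)               # nan
print(0.0 * inf)               # nan
print(math.nan + 1.0)          # nan
```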

Note that there are two distinct machine numbers +0 and -0 which behave
differently in expressions such as 1/0. However, IEEE 754 defines the
comparison operator "==" such that +0==-0 is true. Note that NaN==NaN is
defined as *false*.
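These comparison rules are easy to observe. The sketch below again assumes Python floats are IEEE 754 doubles. Because Python raises `ZeroDivisionError` for `1/0.0`, it uses `math.copysign` and `math.atan2` to show that the two zeros behave differently.

```python
import math

pos_zero, neg_zero = 0.0, -0.0

print(pos_zero == neg_zero)                # True: the zeros compare equal
print(math.copysign(1.0, pos_zero))        # 1.0
print(math.copysign(1.0, neg_zero))        # -1.0: the sign bit is still there
print(math.atan2(0.0, pos_zero))           # 0.0
print(math.atan2(0.0, neg_zero))           # 3.141592653589793

nan = math.nan
print(nan == nan)                          # False
print(nan != nan)                          # True
print(math.isnan(nan))                     # True: the reliable NaN test
```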