Lesson 4. Fixed and Floating Point Binary
Lesson Objective
- Represent fractional numbers in fixed point form in binary in a given number of bits.
- Represent fractional numbers in floating point form in binary in a given number of bits.
- Know and be able to explain why both fixed point and floating point representation of decimal numbers may be inaccurate.
- Compare the advantages and disadvantages of fixed point and floating point forms in terms of range, precision and speed of calculation.
Lesson Notes
Fixed Point Binary
Using bits to the right of the units column (after a notional point) introduces fractional values.
Fractional values are negative powers of 2.
A fixed-point binary value uses a specified number of bits where the placement of the binary point is fixed.
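For example, in an 8-bit fixed-point binary value, the binary point could be set between the fourth and fifth bits. The leftmost bit has a negative place value because the number is stored in two's complement.
| 2^3 | 2^2 | 2^1 | 2^0 | • | 2^-1 | 2^-2 | 2^-3 | 2^-4 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| -8 | 4 | 2 | 1 | • | 1/2 | 1/4 | 1/8 | 1/16 |
| 0 | 0 | 0 | 1 | • | 1 | 0 | 0 | 1 |
0001.1001 = 1 + 1/2 + 1/16 = 1.5625 (base 10)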
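The place-value table above can be checked with a short Python sketch. The function name `fixed_point_value` and the idea of passing the bits as a string without the point are assumptions made for illustration only; the leftmost bit is given a negative weight, as in the table.

```python
from fractions import Fraction


def fixed_point_value(bits: str, int_bits: int) -> Fraction:
    """Decode a two's complement fixed-point bit string.

    bits     -- '0'/'1' characters written without the binary point
    int_bits -- how many of those bits sit to the left of the binary point
    """
    total = Fraction(0)
    for position, bit in enumerate(bits):
        # Place values run 2^(int_bits-1), ..., 2^0, 2^-1, 2^-2, ...
        weight = Fraction(2) ** (int_bits - 1 - position)
        if position == 0:
            weight = -weight  # the leftmost bit carries a negative weight
        total += weight * int(bit)
    return total


print(float(fixed_point_value("00011001", 4)))  # 1.5625
print(float(fixed_point_value("10011001", 4)))  # -6.4375
```

Changing only the leftmost bit from 0 to 1 subtracts 8 from the value, which is why the second result is negative.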
Floating Point Binary
A real number in floating point binary has three parts:
- The Sign: whether the number is positive or negative
- Mantissa: the part of a floating-point number which represents the significant digits of that number (the value)
- Exponent: the power of 2 that the mantissa is multiplied by (how far the binary point needs to be shifted)
| Mantissa | | | | | | Exponent | | |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| -1 | • | 1/2 | 1/4 | 1/8 | 1/16 | -4 | 2 | 1 |
| 1 | • | 1 | 0 | 0 | 0 | 0 | 0 | 1 |
- The sign bit is the leftmost bit of the mantissa and has a place value of -1
- Mantissa represented as a Two's Complement Number
- Exponent represented as a Two's Complement Number
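As a rough sketch of how the two parts combine, the Python below decodes a mantissa and an exponent that are both two's complement bit strings, with the binary point assumed to sit after the first mantissa bit. The function names are hypothetical and chosen only for this illustration.

```python
def twos_complement(bits: str) -> int:
    """Read a string of '0'/'1' characters as a two's complement integer."""
    value = int(bits, 2)
    if bits[0] == "1":  # a leading 1 means the number is negative
        value -= 1 << len(bits)
    return value


def floating_point_value(mantissa: str, exponent: str) -> float:
    """Decode mantissa x 2^exponent (binary point after the first mantissa bit)."""
    fraction_bits = len(mantissa) - 1
    m = twos_complement(mantissa) / 2 ** fraction_bits
    e = twos_complement(exponent)
    return m * 2.0 ** e


# Mantissa 1.1000 is -1 + 1/2 = -0.5 and exponent 001 is +1
print(floating_point_value("11000", "001"))  # -1.0
```

The same function can be used to check the worked examples in the sections that follow, for instance mantissa 0.1011 with exponent 010.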
Standard Form?
Standard form, also known as scientific notation, is a way of writing very large or very small numbers in a way that makes them easier to read and write. It is based on the idea of using powers of 10 to represent the number. Computers, however, work with binary values. So, instead of multiplying by powers of ten, they use floating point representation.
For example, 5,000,000 can be written as 5 × 10^6
- Mantissa = 5
- Exponent = 6
- Base = 10
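To see the same idea in base 10 and base 2 side by side, the sketch below uses Python's built-in `e` format code and `math.frexp`. Note that Python stores its floats in the IEEE 754 format rather than the two's complement teaching format used in these notes, so this only illustrates the mantissa and exponent split, not the exact bit layout.

```python
import math

number = 5_000_000

# Base 10: the 'e' format code prints mantissa and exponent in standard form
print(f"{number:e}")  # 5.000000e+06

# Base 2: frexp returns a mantissa m (0.5 <= m < 1) and an exponent e
# such that number == m * 2**e
mantissa, exponent = math.frexp(number)
print(mantissa, exponent)
print(mantissa * 2 ** exponent == number)  # True
```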
Floating Point - Positive Exponent
By using floating point binary we can represent a much wider range of numbers with the same number of bits, because the exponent moves the binary point to where it is needed for very large or very small values.
| Mantissa | | | | | | Exponent | | |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| -1 | • | 1/2 | 1/4 | 1/8 | 1/16 | -4 | 2 | 1 |
| 0 | • | 1 | 0 | 1 | 1 | 0 | 1 | 0 |
For the number to be stored in standard (normalised) form, the first and second digits of the mantissa have to be opposite: 0.1 for a positive number or 1.0 for a negative number.
The mantissa and exponent are stored together as one binary number. The mantissa holds the digits of the value; the exponent gives the position of the binary point.
Example:
- The binary point always starts at the same position, after the first bit of the mantissa.
- The exponent in the example is +2.
- This means the binary point moves 2 places to the right.
- The base 2 place values of the mantissa bits change to match.
| Mantissa | | | | | | Exponent | | |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| -4 | 2 | 1 | • | 1/2 | 1/4 | -4 | 2 | 1 |
| 0 | 1 | 0 | • | 1 | 1 | 0 | 1 | 0 |
2 + 0.5 + 0.25 = 2.75
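Moving the binary point to the right can be sketched as a simple string operation. The helper `shift_point_right` is hypothetical; it assumes the mantissa is given as a bit string with the point after the first bit, and it writes the point as `.` rather than `•`.

```python
def shift_point_right(mantissa: str, places: int) -> str:
    """Move the binary point `places` positions to the right.

    The mantissa is a bit string whose point sits after the first bit.
    """
    first_bit, fraction = mantissa[0], mantissa[1:]
    # Pad with 0s on the right if there are not enough digits to move past,
    # always leaving at least one digit after the point
    fraction = fraction.ljust(places + 1, "0")
    return first_bit + fraction[:places] + "." + fraction[places:]


# Mantissa 0.1011 with exponent +2
print(shift_point_right("01011", 2))  # 010.11
```

Reading `010.11` with the new place values gives 2 + 1/2 + 1/4, the same 2.75 as the worked example.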
Floating Point - Negative Exponent
In the example below the exponent is -2.
| Mantissa | | | | | | Exponent | | |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| -1 | • | 1/2 | 1/4 | 1/8 | 1/16 | -4 | 2 | 1 |
| 0 | • | 1 | 0 | 1 | 1 | 1 | 1 | 0 |
If the exponent is a negative number, the binary point moves to the left.
Extra bits are added to the mantissa to make room for the new position of the binary point.
Example:
| Mantissa | | | | | | | | Exponent | | |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| -1 | • | 1/2 | 1/4 | 1/8 | 1/16 | 1/32 | 1/64 | -4 | 2 | 1 |
| 0 | • | 0 | 0 | 1 | 0 | 1 | 1 | 1 | 1 | 0 |
1/8 (0.125) + 1/32 (0.03125) + 1/64 (0.015625) = 0.171875
Floating Point - Negative Mantissa and Negative Exponent
In the example below the mantissa is -0.8125 and the exponent is -2.
When the mantissa is negative and the exponent is negative, the binary point moves to the left and any new binary digit added after the point must be a 1.
| Mantissa | | | | | | Exponent | | |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| -1 | • | 1/2 | 1/4 | 1/8 | 1/16 | -4 | 2 | 1 |
| 1 | • | 0 | 0 | 1 | 1 | 1 | 1 | 0 |
The mantissa result in the example below shows the new place digits as 1s.
The same principle applies to positive mantissas: if the mantissa starts with 0, the new place digits will be 0s.
Example:
| Mantissa | | | | | | | | Exponent | | |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| -1 | • | 1/2 | 1/4 | 1/8 | 1/16 | 1/32 | 1/64 | -4 | 2 | 1 |
| 1 | • | 1 | 1 | 0 | 0 | 1 | 1 | 1 | 1 | 0 |
-1 + 1/2 + 1/4 + 1/32 + 1/64 = -0.203125
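The rule for the new digits (copy the first bit of the mantissa) can be sketched in Python as follows. The helper name `shift_point_left` is an assumption for illustration; the mantissa is passed as a bit string with the point after the first bit, and the point is written as `.`.

```python
def shift_point_left(mantissa: str, places: int) -> str:
    """Apply a negative exponent by making room after the binary point.

    The new digits copy the first bit of the mantissa: 1s for a negative
    mantissa, 0s for a positive one, so the value keeps its sign.
    """
    first_bit, fraction = mantissa[0], mantissa[1:]
    return first_bit + "." + first_bit * places + fraction


print(shift_point_left("10011", 2))  # 1.110011  (negative mantissa, padded with 1s)
print(shift_point_left("01011", 2))  # 0.001011  (positive mantissa, padded with 0s)
```

Both results match the worked examples: 1.110011 is -0.203125 and 0.001011 is 0.171875.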
Floating Point - Negative Mantissa and Positive Exponent
In this example the exponent is +4 and the mantissa is -0.875 (-1 + 1/8).
This means the binary point will need to move 4 places to the right.
| Mantissa | | | | | Exponent | | | |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| -1 | • | 1/2 | 1/4 | 1/8 | -8 | 4 | 2 | 1 |
| 1 | • | 0 | 0 | 1 | 0 | 1 | 0 | 0 |
The mantissa result in the example below shows that 2 extra digits (0s) have been added to the right.
Right of the binary point is the fractional value.
Left of the point is the standard base 2 weighting, with the leftmost bit carrying the negative place value.
Example:
| Mantissa | | | | | | | Exponent | | | |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| -16 | 8 | 4 | 2 | 1 | • | 1/2 | -8 | 4 | 2 | 1 |
| 1 | 0 | 0 | 1 | 0 | • | 0 | 0 | 1 | 0 | 0 |
-16 + 2 = -14
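Once the point has been moved, the mantissa can be read like any two's complement number with a fractional part. The sketch below, using a hypothetical helper `pointed_value`, takes a bit string with the point written as `.` and returns its exact value.

```python
from fractions import Fraction


def pointed_value(bits: str) -> Fraction:
    """Evaluate a two's complement bit string that contains a binary point."""
    whole, _, fraction = bits.partition(".")
    digits = whole + fraction
    raw = int(digits, 2)
    if digits[0] == "1":  # the leftmost bit has a negative place value
        raw -= 1 << len(digits)
    # Dividing by 2^(number of fractional digits) puts the point back in place
    return Fraction(raw, 1 << len(fraction))


print(pointed_value("10010.0"))         # -14
print(float(pointed_value("010.11")))   # 2.75
```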
Fixed vs Floating
Fixed and floating point each have their own advantages and disadvantages in terms of range, precision and the speed of calculation.
Floating point allows a far greater range of numbers using the same number of bits. Very large numbers and very small fractional numbers can be represented. The larger the mantissa, the greater the precision, and the larger the exponent, the greater the range.
Fixed-point numbers have a limited range, which is determined by the number of bits used to represent them. Fixed-point numbers have a fixed precision, which means that the same number of digits are always stored, regardless of the value of the number.
Fixed point binary is a simpler system and is faster to process compared to floating point.
Advantages of Fixed Point:
- Faster calculations
- Less hardware required
- Easier to debug
Advantages of Floating Point:
- Wide range
- Variable precision
- Can represent a wider variety of values
The best choice of representation will depend on the specific application. If speed and efficiency are critical, then fixed-point may be the better choice. If range or precision is critical, then floating-point may be the better choice.
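The difference in range can be put into numbers with a small sketch. The 16-bit splits used here (8 integer and 8 fractional bits for fixed point; a 10-bit mantissa and 6-bit exponent for floating point) are assumptions chosen only for illustration, and both function names are hypothetical. Both formats are taken to be two's complement, as in the examples above.

```python
from fractions import Fraction


def fixed_range(int_bits: int, frac_bits: int):
    """Return (lowest, highest, step) for two's complement fixed point."""
    step = Fraction(1, 2 ** frac_bits)
    lowest = -Fraction(2) ** (int_bits - 1)
    highest = Fraction(2) ** (int_bits - 1) - step
    return lowest, highest, step


def float_range(mantissa_bits: int, exponent_bits: int):
    """Return (lowest, highest, smallest normalised positive) for the teaching format."""
    max_exp = 2 ** (exponent_bits - 1) - 1
    min_exp = -(2 ** (exponent_bits - 1))
    largest_mantissa = 1 - Fraction(1, 2 ** (mantissa_bits - 1))  # 0.111...1
    highest = largest_mantissa * Fraction(2) ** max_exp
    lowest = -Fraction(2) ** max_exp                  # mantissa 1.000...0 is -1
    smallest_positive = Fraction(1, 2) * Fraction(2) ** min_exp  # mantissa 0.1
    return lowest, highest, smallest_positive


low, high, step = fixed_range(8, 8)
print(float(low), float(high), float(step))    # -128.0 127.99609375 0.00390625

low, high, tiny = float_range(10, 6)
print(float(low), float(high), float(tiny))
```

With the same 16 bits, the fixed-point value stops just below 128 and always moves in steps of 1/256, while the floating-point value reaches roughly 2 billion and can also store far smaller fractions, at the cost of larger gaps between neighbouring large numbers.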
Normalisation
Normalisation is the process of moving the binary point of a floating point number to provide the maximum level of precision for a given number of bits.
There are two main reasons why we need to normalise floating point binary numbers:
- To ensure maximum accuracy. When a floating point number is normalised, the mantissa has no redundant leading digits, so every available mantissa bit holds a significant digit. This gives the greatest possible precision for the number of bits used.
- To ensure uniqueness. When a floating point number is normalised, each unique number has only one possible bit pattern to represent it. This is important for ensuring that floating point operations are performed correctly.
To normalise a number:
- For a positive binary number, remove any leading zeros (0s) after the sign bit and reduce the exponent by one for each digit removed.
- For a negative binary number, remove any leading ones (1s) after the sign bit and reduce the exponent in the same way.
- This means that a normalised floating point number must always start with either 0.1 (positive) or 1.0 (negative).
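The steps above can be sketched as a loop that shifts the mantissa left until its first two bits differ. This is a minimal illustration, not a complete implementation: the name `normalise` is hypothetical, the mantissa is assumed to be a bit string of at least two bits with the point after the first bit, the exponent is a plain integer, and the sketch does not check that the new exponent still fits in its bit field.

```python
def normalise(mantissa: str, exponent: int):
    """Shift the mantissa left until it starts 0.1 or 1.0, adjusting the exponent."""
    if set(mantissa) == {"0"}:
        return mantissa, exponent  # zero has no normalised form
    while mantissa[0] == mantissa[1]:
        # Drop the redundant leading bit and pad with a 0 on the right
        mantissa = mantissa[1:] + "0"
        exponent -= 1  # each left shift doubles the mantissa, so lower the exponent
    return mantissa, exponent


print(normalise("00101", 3))  # ('01010', 2)   0.0101 x 2^3 becomes 0.1010 x 2^2
print(normalise("11011", 0))  # ('10110', -1)  1.1011 x 2^0 becomes 1.0110 x 2^-1
```

In both cases the value is unchanged (2.5 and -0.3125), but the normalised mantissa no longer spends a bit on a repeated leading digit.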