AlbaCode

Floating Point Representation

Learn how computers store real numbers

Introduction

We have now learned to use two's complement, which allows us to store negative numbers as well as positive numbers. However, this approach has limitations: it doesn't allow us to store real numbers (numbers with a fractional part), and it makes storing large integers inefficient.

In this section, we will learn how a computer uses floating point representation to break numbers up into a mantissa and exponent.

Floating Point Representation

When working with floating point numbers, each number is allocated a fixed number of bits. Common sizes include:

  • 16 bit
  • 24 bit
  • 32 bit
  • 64 bit

This is the number of bits that will make up the entire number. This has to be split between the mantissa and the exponent.

We can do this in many different ways. For example we could split 32 bits between the mantissa and exponent as follows:

Mantissa | Exponent
24 bits  | 8 bits
16 bits  | 16 bits
8 bits   | 24 bits

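The way in which we choose to split the bits affects the range of numbers that we can store and the precision of those numbers, as you will find out in the next section. More bits for the mantissa give greater precision; more bits for the exponent give a greater range.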
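To make the trade-off concrete, here is a short Python sketch for the three splits in the table. It is only an illustration: it assumes the exponent is stored in two's complement and that one bit of the mantissa is the sign bit, as in the examples later on. The function name exponent_range is made up for this sketch.

```python
def exponent_range(exponent_bits):
    """Smallest and largest exponent that fit in a two's complement field."""
    return -(2 ** (exponent_bits - 1)), 2 ** (exponent_bits - 1) - 1


# The three ways of splitting 32 bits shown in the table.
for mantissa_bits, exponent_bits in [(24, 8), (16, 16), (8, 24)]:
    lowest, highest = exponent_range(exponent_bits)
    significant_bits = mantissa_bits - 1  # one mantissa bit is the sign bit
    print(f"{mantissa_bits}-bit mantissa, {exponent_bits}-bit exponent: "
          f"{significant_bits} significant bits, "
          f"exponent from {lowest} to {highest}")
```

Moving bits from the mantissa to the exponent widens the exponent range but leaves fewer significant bits.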

Mantissa

The mantissa is made up of 2 parts:

  • Sign Bit
  • Significand

The sign bit is a single bit that stores whether the number is positive or negative. If the number is positive the sign bit will be 0 and if the number is negative the sign bit will be 1.

The significand is the remainder of the mantissa and stores the significant digits of the number we are storing.

Exponent

The exponent stores the number of places the decimal point has been moved.
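Putting the three parts together, the stored number is worth 0.significand with the decimal point moved back by the exponent, and the sign bit decides whether it is negative. The Python sketch below shows this. It is an illustration under our own assumptions: the function name decode is made up, and the exponent is passed in as an ordinary integer rather than as bits.

```python
def decode(sign_bit, significand, exponent):
    """Turn a sign bit, significand bits and an integer exponent into a number."""
    # Read the significand as the binary fraction 0.significand
    fraction = int(significand, 2) / 2 ** len(significand)
    # Moving the decimal point is the same as multiplying by a power of 2
    value = fraction * 2 ** exponent
    return -value if sign_bit == "1" else value


print(decode("0", "101", 2))  # 2.5, because 0.101 moved 2 places is 10.1
```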

Examples

The 5 examples below show 5 different types of question you could be asked. Each differs slightly, and to fully understand floating point representation you should make sure that you understand all 5.

Example 1

Convert the following binary number into floating-point representation.

-11 0000 1111.1101 1

There are 16 bits for the mantissa (including the sign bit) and 8 bits for the exponent.

Sign bit | Mantissa | Exponent
?        | ?        | ?

Sign Bit

The sign bit is a single binary digit that stores whether the number is positive or negative.

If the number is positive, the sign bit is 0

If the number is negative, the sign bit is 1

In this case we can see that the number contains a negative sign and therefore the sign bit is 1.

Sign bit | Mantissa | Exponent
1        | ?        | ?

Mantissa

The mantissa stores the significant figures.

All floating point numbers are stored in the following format:

0·1...

For any floating point number, we must move the decimal point until it is immediately to the left of the first 1.

In our current example, the decimal point starts after the first 10 digits:

11 0000 1111.1101 1

Moving it to the left of the first 1 gives:

0.1100 0011 1111 011

The numbers after the decimal point become the mantissa.

Sign bit | Mantissa           | Exponent
1        | 1100 0011 1111 011 | ?

We have 16 bits available for the mantissa, but this includes the sign bit.

We have used 1 bit to store the sign bit which leaves 15 bits available for the rest of the mantissa.

The rest of the mantissa must be exactly 15 bits long, which it is!

Exponent

The exponent stores the number of places that the decimal point has been moved.

We can simply count this.

The original number, 11 0000 1111.1101 1, has 10 digits to the left of the decimal point, so the decimal point has been moved 10 places to the left. Therefore, our exponent is 10!

We have 8 bits available for the exponent and will write 10 as an 8-bit binary number.

128 | 64 | 32 | 16 | 8 | 4 | 2 | 1
0   | 0  | 0  | 0  | 1 | 0 | 1 | 0
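The place-value method can also be worked through in code. The Python sketch below is only an illustration (the function name to_binary is our own) and assumes the number is not negative and fits in the bits available.

```python
def to_binary(value, width):
    """Write a non-negative whole number as `width` bits using place values."""
    if not 0 <= value < 2 ** width:
        raise ValueError(f"{value} does not fit in {width} bits")
    bits = ""
    for position in range(width - 1, -1, -1):
        place = 2 ** position  # 128, 64, 32, ... 1 when width is 8
        if value >= place:
            bits += "1"
            value -= place
        else:
            bits += "0"
    return bits


print(to_binary(10, 8))  # 00001010
```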

Therefore our complete floating point number is:

Sign bit | Mantissa           | Exponent
1        | 1100 0011 1111 011 | 0000 1010

And that is us finished!

It is worth noting that the mantissa won't always be exactly 15 bits long and the exponent won't always be a positive number as we will see in the next 2 examples.

Example 2

Convert the following binary number into floating-point representation.

1111 0011.001

There are 16 bits for the mantissa (including the sign bit) and 8 bits for the exponent.

Sign bit | Mantissa | Exponent
?        | ?        | ?

Sign Bit

The sign bit is a single binary digit that stores whether the number is positive or negative.

Again, if the number is positive, the sign bit is 0. If the number is negative, the sign bit is 1.

The number does not have a negative sign and therefore is positive. Our sign bit in this case is 0.

Sign bit | Mantissa | Exponent
0        | ?        | ?

Mantissa

The mantissa stores the significant figures.

All floating point numbers are stored in the following format:

0·1...

For any floating point number, we must move the decimal point until it is immediately to the left of the first 1.

In our current example, the decimal point starts after the first 8 digits:

1111 0011.001

Moving it to the left of the first 1 gives:

0.1111 0011 001

The numbers after the decimal point become the mantissa.

Sign bit | Mantissa      | Exponent
0        | 1111 0011 001 | ?

Again, we have 16 bits available for the mantissa, but this includes the sign bit.

We have used 1 bit to store the sign bit which leaves 15 bits available for the rest of the mantissa.

The rest of the mantissa must be exactly 15 bits long. Our current mantissa is only 11 bits long.

When this occurs, we pad the right side of the mantissa with 0s until it is the correct length.

In this case our mantissa becomes:

Sign bit | Mantissa           | Exponent
0        | 1111 0011 0010 000 | ?
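In Python, the padding step can be sketched with the built-in string method ljust, which adds characters on the right until the text reaches a given length. The variable names are our own, chosen for this illustration.

```python
significand = "11110011001"  # the 11 bits found above
available = 15               # 16 mantissa bits minus the sign bit

# Add 0s on the right until all 15 bits are filled.
padded = significand.ljust(available, "0")
print(padded)  # 111100110010000
```

Padding on the right does not change the value, in the same way that 0.5 and 0.500 are the same number.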

Exponent

Remember, the exponent stores the number of places that the decimal point has been moved.

The original number, 1111 0011.001, has 8 digits to the left of the decimal point, so the decimal point has been moved 8 places to the left. Therefore, our exponent is 8!

We have 8 bits available for the exponent and will write 8 as an 8-bit binary number.

Since 8 is 0000 1000 in binary, our complete floating point number is:

Sign bit | Mantissa           | Exponent
0        | 1111 0011 0010 000 | 0000 1000

And we are done!

Example 3

Convert the following binary number into floating-point representation.

-0.0001 1001 11

There are 16 bits for the mantissa (including the sign bit) and 8 bits for the exponent.

Sign bit | Mantissa | Exponent
?        | ?        | ?

Sign Bit

This time the number is negative and therefore our sign bit is 1.

Sign bit | Mantissa | Exponent
1        | ?        | ?

Mantissa

As stated before, all floating point numbers are stored in the following format:

0·1...

For any floating point number, we must move the decimal point until it is immediately to the left of the first 1.

In this example, there are 0s between the decimal point and the first 1, so the decimal point will move right instead of left. Ignoring the sign, we start with:

0.0001 1001 11

Moving the decimal point to the left of the first 1 gives:

0.1100 111

The numbers after the decimal point become the mantissa.

Sign bit | Mantissa | Exponent
1        | 1100 111 | ?

Again, since we have 16 bits available for the mantissa and have used 1 bit for the sign bit, we have 15 bits to fill.

Currently we have only used 7 of the 15 bits available, so we will add eight 0s to the right.

Sign bit | Mantissa           | Exponent
1        | 1100 1110 0000 000 | ?

Exponent

Remember, the exponent stores the number of places that the decimal point has been moved.

Ignoring the sign, 0.0001 1001 11 has become 0.1100 111. This time the decimal point has been moved 3 places to the right.

The direction the decimal point moved matters.

If the decimal point moves to the left, our exponent is positive.

If the decimal point moves to the right, our exponent is negative.

This time the decimal point has been moved 3 places to the right, so our exponent is -3.
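The counting rule, including the direction, can be sketched in Python. This is an illustration only: the function name places_moved is our own, and it assumes the input is a binary number written as text, with optional spaces and an optional minus sign. It returns a positive number for a move to the left and a negative number for a move to the right.

```python
def places_moved(binary):
    """Count how far the decimal point moves to sit left of the first 1."""
    text = binary.replace(" ", "").lstrip("+-")
    whole, _, fraction = text.partition(".")
    first_one = (whole + fraction).find("1")
    if first_one == -1:
        raise ValueError("the number has no 1 in it")
    # Digits before the point, minus the position of the first 1
    return len(whole) - first_one


print(places_moved("11 0000 1111.1101 1"))  # 10 (left, so positive)
print(places_moved("-0.0001 1001 11"))      # -3 (right, so negative)
```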

There are 8 bits available for the exponent. Therefore we will write -3 as an 8-bit two's complement number.

Sign bit | Mantissa           | Exponent
1        | 1100 1110 0000 000 | 1111 1101

Finished.
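The two's complement step for the exponent in this example can be checked with a short Python sketch. The function name twos_complement is our own, and the approach (wrapping the value around with the % operator) is just one convenient way to get the bit pattern, not the only one.

```python
def twos_complement(value, width):
    """Write a whole number as a two's complement pattern of `width` bits."""
    lowest, highest = -(2 ** (width - 1)), 2 ** (width - 1) - 1
    if not lowest <= value <= highest:
        raise ValueError(f"{value} does not fit in {width} bits")
    # For negative values, % wraps round to the matching positive pattern
    return format(value % (2 ** width), f"0{width}b")


print(twos_complement(-3, 8))  # 11111101
print(twos_complement(10, 8))  # 00001010
```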

Example 4

Convert the following binary number into floating-point representation.

1100 1111 1111.1101 11

There are 16 bits for the mantissa (including the sign bit) and 8 bits for the exponent.

Sign bit | Mantissa | Exponent
?        | ?        | ?

Sign Bit

This time the number is positive and therefore our sign bit is 0.

Sign bit | Mantissa | Exponent
0        | ?        | ?

Mantissa

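Let's move the decimal point and find the mantissa. The number has 12 digits to the left of the decimal point:

1100 1111 1111.1101 11

Moving the decimal point to the left of the first 1 gives:

0.1100 1111 1111 1101 11

This makes our mantissa:

Sign bit | Mantissa               | Exponent
0        | 1100 1111 1111 1101 11 | ?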

However, there are only 16 bits available for the mantissa. 1 is used for the sign bit leaving us with 15 for the rest of the mantissa.

We only have 15 bits available but need 18! This is an issue: we can't use more bits than are available.

In this case the mantissa is cut off (truncated) when it reaches the limit of the number of bits available.

The mantissa above is 3 bits over the allocation, so the last 3 bits (111) are cut off.

Sign bit | Mantissa           | Exponent
0        | 1100 1111 1111 110 | ?

This leads to a loss of data and is one of the issues that can arise when there are not enough bits allocated to the mantissa.
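To see how much is lost here, the Python sketch below compares the original number with the number that is actually stored once the last 3 bits are cut off. It is an illustration only; the function name binary_to_number is our own.

```python
def binary_to_number(text):
    """Convert unsigned binary text such as '101.01' into a number."""
    whole, _, fraction = text.replace(" ", "").partition(".")
    value = int(whole, 2) if whole else 0
    if fraction:
        value += int(fraction, 2) / 2 ** len(fraction)
    return value


original = binary_to_number("1100 1111 1111.1101 11")
stored = binary_to_number("1100 1111 1111.110")  # after cutting off 3 bits

print(original)           # 3327.859375
print(stored)             # 3327.75
print(original - stored)  # 0.109375
```

The stored number is close to the original but not equal to it, which is why more mantissa bits mean more precision.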

Exponent

The exponent stores the number of places that the decimal point has been moved.

The original number has 12 digits to the left of the decimal point, so this time the decimal point has been moved 12 places to the left.

We are moving left, so our exponent will be positive.

As stated in the question, the exponent is allocated 8 bits, therefore we will write 12 as an 8 bit binary number.

Sign bit | Mantissa           | Exponent
0        | 1100 1111 1111 110 | 0000 1100

Done.

Example 5

Convert the following binary number into floating-point representation.

-0.0011 1101 1100 1111

There are 28 bits for the mantissa (including the sign bit) and 4 bits for the exponent.

Sign bit | Mantissa | Exponent
?        | ?        | ?

Sign Bit

This time the number is negative and therefore our sign bit is 1.

Sign bit | Mantissa | Exponent
1        | ?        | ?

Mantissa

This time our mantissa has been allocated 28 bits in total. 1 bit will be used for the sign bit leaving 27 bits for the rest of the mantissa.

Again we will move the decimal point to achieve the following format:

0·1...

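Ignoring the sign, moving the decimal point to the left of the first 1 turns 0.0011 1101 1100 1111 into:

0.1111 0111 0011 11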

The numbers after the decimal point become the mantissa.

Sign bit | Mantissa          | Exponent
1        | 1111 0111 0011 11 | ?

There are 27 bits available for the rest of the mantissa and we have used 14, so we will fill the remaining space with thirteen 0s.

Sign bit | Mantissa                           | Exponent
1        | 1111 0111 0011 1100 0000 0000 000 | ?

Exponent

The exponent stores the number of places that the decimal point has been moved.

There are two 0s between the decimal point and the first 1, so this time the decimal point has been moved 2 places to the right.

Since we are moving to the right, our exponent will be negative.

We have been allocated 4 bits for our exponent. Therefore we will write -2 as a 4-bit two's complement number.

Sign bit | Mantissa                           | Exponent
1        | 1111 0111 0011 1100 0000 0000 000 | 1110

We have finished the final example.
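All 5 examples follow the same steps, so they can be gathered into one Python sketch. This is our own illustration of the method used on this page, not a standard library routine: the function name to_floating_point and its parameters are made up, it assumes the input text contains only binary digits, spaces, an optional minus sign and an optional decimal point, and it cannot store zero because zero has no first 1.

```python
def to_floating_point(binary, mantissa_bits, exponent_bits):
    """Return (sign bit, rest of mantissa, exponent) as strings of bits.

    `binary` is text such as "-11 0000 1111.1101 1"; spaces are ignored.
    `mantissa_bits` includes the sign bit, as in the examples.
    """
    text = binary.replace(" ", "")
    sign_bit = "1" if text.startswith("-") else "0"
    whole, _, fraction = text.lstrip("+-").partition(".")
    digits = whole + fraction

    first_one = digits.find("1")
    if first_one == -1:
        raise ValueError("zero has no first 1, so this method cannot store it")

    # Places moved: positive for a move to the left, negative for the right.
    exponent = len(whole) - first_one
    lowest, highest = -(2 ** (exponent_bits - 1)), 2 ** (exponent_bits - 1) - 1
    if not lowest <= exponent <= highest:
        raise ValueError(f"exponent {exponent} needs more than {exponent_bits} bits")

    # Cut off extra bits (Example 4) or pad with 0s on the right (Example 2).
    width = mantissa_bits - 1
    significand = digits[first_one:][:width].ljust(width, "0")

    # Wrapping with % gives the two's complement pattern for negative exponents.
    exponent_field = format(exponent % (2 ** exponent_bits), f"0{exponent_bits}b")
    return sign_bit, significand, exponent_field


print(to_floating_point("-11 0000 1111.1101 1", 16, 8))     # Example 1
print(to_floating_point("-0.0011 1101 1100 1111", 28, 4))   # Example 5
```

Running it on each question is a quick way to check an answer worked out by hand.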