Review of floating point representation from last time: the IEEE floating point standard (notes)

Lecture 22

The IEEE floating point standard

  • 1985

  • 3 key requirements

  • standardized representation format: single (32 bits), double (64 bits), extended (> 64 bits, in practice 80 bits)

  • correctly rounded arithmetic

  • exceptional operations (such as division by zero) generate infinities and NaNs instead of raising exceptions
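As a rough illustration of the single and double formats listed above, the sketch below pulls a value apart into its sign, biased exponent and fraction fields. The field widths (1/8/23 bits for single, 1/11/52 bits for double) are the standard ones. The class and method names (FloatLayout, singleFields, doubleFields) are made up for this example. Java has no 80-bit extended type, so that format is not shown.

```java
class FloatLayout {
    // Split a 32-bit single into { sign (1 bit), biased exponent (8 bits), fraction (23 bits) }.
    static int[] singleFields(float x) {
        int bits = Float.floatToIntBits(x);
        int sign = bits >>> 31;
        int exponent = (bits >>> 23) & 0xFF;
        int fraction = bits & 0x7FFFFF;
        return new int[] { sign, exponent, fraction };
    }

    // Split a 64-bit double into { sign (1 bit), biased exponent (11 bits), fraction (52 bits) }.
    static long[] doubleFields(double x) {
        long bits = Double.doubleToLongBits(x);
        long sign = bits >>> 63;
        long exponent = (bits >>> 52) & 0x7FFL;
        long fraction = bits & 0xFFFFFFFFFFFFFL;
        return new long[] { sign, exponent, fraction };
    }

    public static void main(String[] args) {
        System.out.println("single: " + Float.SIZE + " bits, double: " + Double.SIZE + " bits");

        // -6.5 = -1.625 * 2^2, so the biased exponent is 2 + 127 = 129.
        int[] s = singleFields(-6.5f);
        System.out.println("sign=" + s[0] + " exponent=" + s[1]
                + " fraction=0x" + Integer.toHexString(s[2]));

        // 1.0 = 1.0 * 2^0, so the biased exponent is 0 + 1023 = 1023.
        long[] d = doubleFields(1.0);
        System.out.println("sign=" + d[0] + " exponent=" + d[1]
                + " fraction=0x" + Long.toHexString(d[2]));
    }
}
```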

Correctly rounded arithmetic

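  • The arithmetic operations +,-,*,/ on 2 floats x,y must return the float that is closest to the exact answer (in the default round-to-nearest mode, a tie goes to the candidate whose last significand bit is 0)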

  • Possible to change rounding mode at the hardware level so that answers round up or down instead of to nearest, but this is not normally used and not easily accessible from Java
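One way to see the correct-rounding rule in action is to compare a double sum against the exact sum computed with BigDecimal. This is only a sketch: the class and method names (CorrectRounding, sumIsNearest) are invented here, and the check only makes sense for finite inputs whose sum does not overflow.

```java
import java.math.BigDecimal;

class CorrectRounding {
    // Returns true if the double result of x + y is at least as close to the exact
    // real-number sum as either neighbouring double. Assumes finite x, y and a
    // finite sum (new BigDecimal(double) rejects infinities and NaN).
    static boolean sumIsNearest(double x, double y) {
        // new BigDecimal(double) converts the binary value exactly, with no rounding.
        BigDecimal exact = new BigDecimal(x).add(new BigDecimal(y));
        double computed = x + y;
        BigDecimal err = exact.subtract(new BigDecimal(computed)).abs();
        BigDecimal errUp = exact.subtract(new BigDecimal(Math.nextUp(computed))).abs();
        BigDecimal errDown = exact.subtract(new BigDecimal(Math.nextDown(computed))).abs();
        return err.compareTo(errUp) <= 0 && err.compareTo(errDown) <= 0;
    }

    public static void main(String[] args) {
        // 0.1, 0.2 and 0.3 are not exactly representable, so the rounded sum
        // need not equal the double nearest to 0.3.
        System.out.println(0.1 + 0.2);          // prints 0.30000000000000004
        System.out.println(0.1 + 0.2 == 0.3);   // prints false
        // Even so, the sum is the double closest to the exact sum of the two operands.
        System.out.println(sumIsNearest(0.1, 0.2));   // prints true
    }
}
```

The point of the example is that the rounding error comes from the operands and the final rounding step, not from any sloppiness in the addition itself.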

Infinities and NaNs

  • x/0.0 is inf when x>0, -inf when x<0 and NaN when x==0

  • inf/x is inf when x is finite and positive (or +0), -inf when x is finite and negative (or -0), and NaN when x is inf or -inf

  • inf and -inf are distinct values with opposite signs; inf - inf and inf + (-inf) are NaN

  • there are different representations for 0 and -0, but they test equal

  • any arithmetic operation with NaN gives NaN

  • any comparison with NaN using <, <=, >, >= or == gives false, even x==x when x is NaN (only != gives true)
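The rules above can be checked directly in Java, where double arithmetic never throws on division by zero. The class and helper names (SpecialValues, divide, isNaNBySelfCompare) are illustrative only. Note that != is the one comparison that returns true for NaN, which is why x != x works as a NaN test.

```java
class SpecialValues {
    // Plain double division; kept as a method so the rules are easy to probe.
    static double divide(double x, double y) {
        return x / y;
    }

    // x != x is true only when x is NaN (Double.isNaN does the same job).
    static boolean isNaNBySelfCompare(double x) {
        return x != x;
    }

    public static void main(String[] args) {
        double inf = Double.POSITIVE_INFINITY;
        double nan = Double.NaN;

        System.out.println(divide(1.0, 0.0));     // Infinity
        System.out.println(divide(-1.0, 0.0));    // -Infinity
        System.out.println(divide(0.0, 0.0));     // NaN

        System.out.println(divide(inf, 2.0));     // Infinity
        System.out.println(divide(inf, -2.0));    // -Infinity
        System.out.println(inf - inf);            // NaN

        System.out.println(0.0 == -0.0);          // true: the two zeros test equal
        System.out.println(divide(1.0, -0.0));    // -Infinity: but they are not interchangeable

        System.out.println(nan + 1.0);            // NaN
        System.out.println(nan == nan);           // false
        System.out.println(nan < 1.0);            // false
        System.out.println(isNaNBySelfCompare(nan)); // true
    }
}
```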

The 80-bit extended format

  • All Pentium PCs have 80-bit floating point registers where the arithmetic operations are executed (whereas Sun SPARC, Apple PowerPC, etc. use 64-bit registers)

  • On a PC, in the C language, "long double" typically uses the same 80-bit format (the exact mapping depends on the compiler).

  • However, the Java language does not allow the use of the 80-bit format and, in Java 1.1, insisted that the operations be carried out as if they were being done in a 64-bit register

  • In Java 1.1, the result was that floating point was very slow on a Pentium, as it required software modification to hardware results

  • In Java 1.2, this was relaxed enough to make floating point fast again

  • The keyword strictfp is used to insist on identical results on all machines, but on a Pentium this brings back the slow floating point of Java 1.1
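A small sketch of where strict and relaxed evaluation can disagree: an intermediate result that overflows the 64-bit format but would still fit in the wider exponent range of an 80-bit register. The class and method names (StrictOverflow, doubleThenHalve) are made up for illustration. One assumption worth flagging: from Java 17 onward all floating point is evaluated strictly, so there the strictfp modifier is redundant and the compiler may print a warning about it.

```java
class StrictOverflow {
    // strictfp asks for every intermediate result to be rounded as a 64-bit double.
    static strictfp double doubleThenHalve(double x) {
        // For x near Double.MAX_VALUE, x * 2.0 overflows the 64-bit format.
        // Under strict evaluation the intermediate becomes Infinity and stays Infinity.
        // Relaxed evaluation on 80-bit registers was allowed to keep the intermediate
        // in range and return x instead.
        return (x * 2.0) / 2.0;
    }

    public static void main(String[] args) {
        System.out.println(doubleThenHalve(1.5));               // prints 1.5
        System.out.println(doubleThenHalve(Double.MAX_VALUE));  // prints Infinity
    }
}
```

Without strictfp, a Java 1.2-era JVM on a Pentium could legitimately print 1.7976931348623157E308 for the second call, which is exactly the machine-dependent behaviour the keyword rules out.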
