how can I make C++ support 128-bit floating point numbers?

0 ⤊

how can I make C++ support 128-bit floating point numbers?

2006-06-26 23:00:11 · 1 answers · asked by hadoo 2 in Computers & Internet ➔ Programming & Design

1 answers

1 Only some Fortran compilers, and very few C/C++ compilers, support quadruple ... mpieee.maple, Program to convert floating-point numbers in decimal

2 F_floating format (single precision floating), DEC VAX, 32 bits, the first bit (high order bit in a register, first bit in memory) is the sign magnitude bit (one=negative, zero=positive or zero), followed by 15 bits of an excess 128 binary exponent, followed by a normalized 24-bit fraction with the redundant most significant fraction bit not represented. Zero is represented by all bits being zero (allowing the use of a longword CLR to set a F_floating number to zero). Exponent values of 1 through 255 indicate true binary exponents of -127 through 127. An exponent value of zero together with a sign of zero indicate a zero value. An exponent value of zero together with a sign bit of one is taken as reserved (which produces a reserved operand fault if used as an operand for a floating point instruction). The magnitude is an approximate range of .29*10-38 through 1.7*1038. The precision of an F_floating datum is approximately one part in 223, or approximately seven (7) decimal digits).

3 32 bit floating format (single precision floating), AT&T DSP32C, 32 bits, the first bit (high order bit in a register, first bit in memory) is the sign magnitude bit (one=negative, zero=positive or zero), followed by 23 bits of a normalized two’s complement fractional part of the mantissa, followed by an eight bit exponent. The magnitude of the mantissa is always normalized to lie between 1 and 2. The floating point value with exponent equal to zero is reserved to represent the number zero (the sign and mantissa bits must also be zero; a zero exponent with a nonzero sign and/or mantissa is called a “dirty zero” and is never generated by hardware; if a dirty zero is an operand, it is treated as a zero). The range of nonzero positive floating point numbers is N = [1 * 2-127, [2-2-23] * 2127] inclusive. The range of nonzero negative floating point numbers is N = [-[1 + 2-23] * 2-127, -2 * 2127] inclusive.

4 D_floating format (double precision floating), DEC VAX, 64 bits, the first bit (high order bit in a register, first bit in memory) is the sign magnitude bit (one=negative, zero=positive or zero), followed by 15 bits of an excess 128 binary exponent, followed by a normalized 48-bit fraction with the redundant most significant fraction bit not represented. Zero is represented by all bits being zero (allowing the use of a quadword CLR to set a D_floating number to zero). Exponent values of 1 through 255 indicate true binary exponents of -127 through 127. An exponent value of zero together with a sign of zero indicate a zero value. An exponent value of zero together with a sign bit of one is taken as reserved (which produces a reserved operand fault if used as an operand for a floating point instruction). The magnitude is an approximate range of .29*10-38 through 1.7*1038. The precision of an D_floating datum is approximately one part in 255, or approximately 16 decimal digits).

2006-06-26 23:08:30 · answer #1 · answered by Joe_Young 6 · 0⤊ 0⤋