ESE 345 Lecture Notes - Lecture 10: Sticky Bit, Scientific Notation, Double-Precision Floating-Point Format

15 views12 pages

Document Summary

Supports a wide range of values (small and large) Helps programmers with errors in real arithmetic. Support infinity, not a number (nan), exponent overflow and underflow. Keep encoding that is somewhat compatible with two"s complement. Ex: 0 in fp is same as 0 in two"s complement. Make it possible to sort without needing to do a special floating point comparison. Binary point like decimal point signifies boundary between integer and fractional parts: S represents sign (1 is negative, 0 is positive) What if result x is too large> Overflow: exponent is larger than can be represented. Underflow: negative exponent is larger than can be represented. Truncate, round towards +infinity, round towards -infinity. Round to nearest even using guard, round, and sticky bits. Next multiple of word size (64 bits) Represent numbers as small as 2. 0 * 10-308 to almost as large as 2. 0 * 10308. Primary advantage is greater precision due to larger significand.

Get access

Grade+20% off
$8 USD/m$10 USD/m
Billed $96 USD annually
Grade+
Homework Help
Study Guides
Textbook Solutions
Class Notes
Textbook Notes
Booster Class
40 Verified Answers
Class+
$8 USD/m
Billed $96 USD annually
Class+
Homework Help
Study Guides
Textbook Solutions
Class Notes
Textbook Notes
Booster Class
30 Verified Answers

Related Documents