Class Notes (834,967)
Canada (508,836)
CSC207H1 (40)
Lecture

nov28c.docx

5 Pages
149 Views
Unlock Document

Department
Computer Science
Course
CSC207H1
Professor
Diane Horton
Semester
Fall

Description
NOVEMBER 28  Floating Point o ex. Some simple arithmetic, and checking that laws of mathematics hold. o Adding.java 1. package floatingpoint; 2. 3. public class Adding { 4. 5. public static void main(String[] args) { 6. 7. double x = 0.1; 8. double sum = x + x + x; 9. System.out.println(sum == .3); 10. System.out.println(sum); 11. 12. double bigger = 1.0; 13. double s = 1.0e-6; 14. double sum1 = s + s + s + s + s + s + s + s + s + s + bigger; // 10 s 15. double sum2 = bigger + s + s + s + s + s + s + s + s + s + s; // 10 s 16. System.out.println(sum1 == sum2); 17. System.out.println(sum1); 18. System.out.println(sum2); 19. } 20. } o Output >>> false >>> 0.30000000000000004 >>> false >>> 1.00001 >>> 1.0000099999999992  Representing numbers o It all makes sense if you understand how “real” numbers are represented. o First, consider an int like 42. Hardware doesn’t directly represent 4s or 2s -- everything is binary.  o So 42 can be represented by 101010 (base 2).  Of course, the binary digits are represented electromagnetically.  Representing fractions o Fractions can be handled using the same approach.  o So 0.4375 can be represented by using 0.0111 (base 2) o Another example  ̅̅̅̅̅̅̅  Problem Numbers o You already know from math that some numbers do not have a finite representation. o Even worse, some numbers that have a finite representation in decimal do not in binary! o We have finite memory. But we need to represent numbers that take an infinite number of bits.  IEEE-754 Floating Point o Like a binary version of scientific notation o 32 bits for a float as follows(and 64 bits for a double)  1 bit for the sign: 1 for negative and 0 for positive  8 bits for the exponent e  To allow for negative exponents,127 is added to the exponent. We say that the exponent is “biased” by 127.  So the range of possible exponents is not 0 to 28-1 = 0 to 255, but (0-127) to (255-127) = -127 to 128.  23 bits for the mantissa M  since the first bit must be 1, we don’t waste space storing it o Rounding  If we have to lose some digits, we don’t just truncate, we round.  In rounding a decimal to a whole number, an issue arises: If we have a 0.5, do we round up or down?  If we always round up, we are biasing towards higher values.  “Proper” rounding: round to the nearest even number. E.g., 17.5 is rounded up to 18 but 16.5 is rounded down to 16.  The IEEE standard uses proper rounding also. o Historical aside  30 years ago, computer manufacturers each had their own standard for floating point. NOVEMBER 28  Problem? Writing portable software!  Advantage to manufacturers? Customers got locked in to their particular computers.  In the late 1980s, the IEEE produced the standard that now virtually all follow.  Kahan spearheaded the effort, and won the 1989 Turing Award for it.  Examples o As we saw, 0.1cannot be represented exactly in binary, leading to the unexpected result. o And adding a very small quantity to a very large quantity can mean the smaller quantity falls off the end of the mantissa. o But if we add small quantities to each other, this doesn’t happen.  And if they accumulate into a larger quantity, they may not be lost when we finally add the big quantity in. o Totalling.java 1. package floatingpoint; 2. 3. public class Totalling { 4. 5. public static double sum1(double startingAmount, double amount, int howMany) { 6. double answer = startingAmount; 7. for (int i = 0; i < howMany; i++) { 8. answer += amount; // add all values to an accumulator 9. } 10. return answer; 11. } 12. 13. public static double sum2(double startingAmount, double amount, int howMany) { 14. double answer = 0; 15. for (int i = 0; i < howMany; i++) { 16. answer += amount; // 2 accumulations, this one is for small doubles 17. } 18. answer += startingAmount; // this is the final accumulation 19. return answer; 20. } 21. 22. public static void main(String[] args) { 23. double v1 = sum1(1, 10e-17, 10); // inaccurate 24. double v2 = sum2(1, 10e-17, 10); // accurate 25. System.out.println(v1 + " vs " + v2); 26. } 27. }  Output >>> 1.0 vs 1.000000000000001 o ArrayTotal.java 1. package floatingpoint; 2. 3. import java.util.Arrays; 4. import java.util.Random; 5. 6. public class ArrayTotal { 7. 8. public static double[] createList(int n) { 9. Random r = new Random(); 10. double[] list = new double[n]; 11. list[0] = 1; 12. for (int i = 1; i < n; i++) { 13. list[i] = r.nextDouble() * 1.0e-18; 14. } 15. return list; 16. } 17. 18. public static double sum1(double[] list) { 19. double answer = 0.0; 20. for (int i = 0; i < list.length; i++) { 21. answer += list[i]; // add 1 first 22. } 23. return answer; 24. } 25. NOVEMBER 28 26. public static double sum2(double[] list) { 27. Arrays.sort(list); // add 1 last 28. double answer = 0.0; 29. for (int i = 0; i < list.length; i++) { 30. answer += list[i];
More Less

Related notes for CSC207H1

Log In


OR

Join OneClass

Access over 10 million pages of study
documents for 1.3 million courses.

Sign up

Join to view


OR

By registering, I agree to the Terms and Privacy Policies
Already have an account?
Just a few more details

So we can recommend you notes for your school.

Reset Password

Please enter below the email address you registered with and we will send you a link to reset your password.

Add your courses

Get notes from the top students in your class.


Submit