false

Class Notes
(834,967)

Canada
(508,836)

University of Toronto St. George
(43,993)

Computer Science
(754)

CSC207H1
(40)

Diane Horton
(24)

Lecture

Unlock Document

Computer Science

CSC207H1

Diane Horton

Fall

Description

NOVEMBER 28
Floating Point
o ex. Some simple arithmetic, and checking that laws of mathematics hold.
o Adding.java
1. package floatingpoint;
2.
3. public class Adding {
4.
5. public static void main(String[] args) {
6.
7. double x = 0.1;
8. double sum = x + x + x;
9. System.out.println(sum == .3);
10. System.out.println(sum);
11.
12. double bigger = 1.0;
13. double s = 1.0e-6;
14. double sum1 = s + s + s + s + s + s + s + s + s + s + bigger; // 10 s
15. double sum2 = bigger + s + s + s + s + s + s + s + s + s + s; // 10 s
16. System.out.println(sum1 == sum2);
17. System.out.println(sum1);
18. System.out.println(sum2);
19. }
20. }
o Output
>>> false
>>> 0.30000000000000004
>>> false
>>> 1.00001
>>> 1.0000099999999992
Representing numbers
o It all makes sense if you understand how “real” numbers are represented.
o First, consider an int like 42. Hardware doesn’t directly represent 4s or 2s -- everything is binary.
o So 42 can be represented by 101010 (base 2).
Of course, the binary digits are represented electromagnetically.
Representing fractions
o Fractions can be handled using the same approach.
o So 0.4375 can be represented by using 0.0111 (base 2)
o Another example
̅̅̅̅̅̅̅
Problem Numbers
o You already know from math that some numbers do not have a finite representation.
o Even worse, some numbers that have a finite representation in decimal do not in binary!
o We have finite memory. But we need to represent numbers that take an infinite number of bits.
IEEE-754 Floating Point
o Like a binary version of scientific notation
o 32 bits for a float as follows(and 64 bits for a double)
1 bit for the sign: 1 for negative and 0 for positive
8 bits for the exponent e
To allow for negative exponents,127 is added to the exponent. We say that the exponent is “biased” by 127.
So the range of possible exponents is not 0 to 28-1 = 0 to 255, but (0-127) to (255-127) = -127 to 128.
23 bits for the mantissa M
since the first bit must be 1, we don’t waste space storing it
o Rounding
If we have to lose some digits, we don’t just truncate, we round.
In rounding a decimal to a whole number, an issue arises: If we have a 0.5, do we round up or down?
If we always round up, we are biasing towards higher values.
“Proper” rounding: round to the nearest even number. E.g., 17.5 is rounded up to 18 but 16.5 is rounded down to 16.
The IEEE standard uses proper rounding also.
o Historical aside
30 years ago, computer manufacturers each had their own standard for floating point. NOVEMBER 28
Problem? Writing portable software!
Advantage to manufacturers? Customers got locked in to their particular computers.
In the late 1980s, the IEEE produced the standard that now virtually all follow.
Kahan spearheaded the effort, and won the 1989 Turing Award for it.
Examples
o As we saw, 0.1cannot be represented exactly in binary, leading to the unexpected result.
o And adding a very small quantity to a very large quantity can mean the smaller quantity falls off the end of the mantissa.
o But if we add small quantities to each other, this doesn’t happen.
And if they accumulate into a larger quantity, they may not be lost when we finally add the big quantity in.
o Totalling.java
1. package floatingpoint;
2.
3. public class Totalling {
4.
5. public static double sum1(double startingAmount, double amount, int howMany) {
6. double answer = startingAmount;
7. for (int i = 0; i < howMany; i++) {
8. answer += amount; // add all values to an accumulator
9. }
10. return answer;
11. }
12.
13. public static double sum2(double startingAmount, double amount, int howMany) {
14. double answer = 0;
15. for (int i = 0; i < howMany; i++) {
16. answer += amount; // 2 accumulations, this one is for small doubles
17. }
18. answer += startingAmount; // this is the final accumulation
19. return answer;
20. }
21.
22. public static void main(String[] args) {
23. double v1 = sum1(1, 10e-17, 10); // inaccurate
24. double v2 = sum2(1, 10e-17, 10); // accurate
25. System.out.println(v1 + " vs " + v2);
26. }
27. }
Output
>>> 1.0 vs 1.000000000000001
o ArrayTotal.java
1. package floatingpoint;
2.
3. import java.util.Arrays;
4. import java.util.Random;
5.
6. public class ArrayTotal {
7.
8. public static double[] createList(int n) {
9. Random r = new Random();
10. double[] list = new double[n];
11. list[0] = 1;
12. for (int i = 1; i < n; i++) {
13. list[i] = r.nextDouble() * 1.0e-18;
14. }
15. return list;
16. }
17.
18. public static double sum1(double[] list) {
19. double answer = 0.0;
20. for (int i = 0; i < list.length; i++) {
21. answer += list[i]; // add 1 first
22. }
23. return answer;
24. }
25. NOVEMBER 28
26. public static double sum2(double[] list) {
27. Arrays.sort(list); // add 1 last
28. double answer = 0.0;
29. for (int i = 0; i < list.length; i++) {
30. answer += list[i];

More
Less
Related notes for CSC207H1

Join OneClass

Access over 10 million pages of study

documents for 1.3 million courses.

Sign up

Join to view

Continue

Continue
OR

By registering, I agree to the
Terms
and
Privacy Policies

Already have an account?
Log in

Just a few more details

So we can recommend you notes for your school.

Reset Password

Please enter below the email address you registered with and we will send you a link to reset your password.