Here is the plot of P(x) vs. PEX(x):
Well folks, I was really surprised. So what happened? I asked my teacher about it.
Basically, any time you do math on a computer, it is wrong. It is an approximation; the machine can never represent numbers as they are in our minds (where they are perfect, without compromise), so you are always getting the right answer to the wrong problem. I tried to determine whether I really understood this and came up with the following example.
Imagine some distance, say the distance between you and the wall. In your mind's eye, you can split up this distance into:
- 1 (100^0) equally spaced units
- And then 100 (100^1) equally spaced units
- And then 10,000 (100^2) equally spaced units
- And then 1,000,000 (100^3) equally spaced units
- And then 100,000,000 (100^4) equally spaced units
- And basically keep doing this until you have divided this distance into an extremely large number of small pieces, say a trillion.
- You were probably able to do this effortlessly; in your own mind there is no obstacle to conceiving of enough precision to represent that level of measurement.
The numbers in your mind are limitless; the problem is that the numbers a computer can represent are limited. Each floating-point type has a number, called epsilon, that measures the smallest relative difference it can distinguish between two numbers. If you start performing computations on very small numbers that are very close together, like subtracting them, you will eventually come up with differences that are smaller than epsilon; they are not representable, and you lose that precision in the subtraction. This is OK as long as you are aware of it and know what you are doing. This was the first homework, and it would become a running theme throughout the semester: “Know what you are doing, where, how, and why!”
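To make that concrete, here is a quick Java sketch of my own (the class name EpsilonDemo is just for illustration): Math.ulp(1.0) gives the machine epsilon for a double, and subtracting two numbers that are closer together than that spacing silently loses the difference.
EpsilonDemo.java
public class EpsilonDemo {
    public static void main(String[] args) {
        // Machine epsilon for a double: the gap between 1.0 and the next
        // representable number, about 2.22e-16.
        double eps = Math.ulp(1.0);
        System.out.println("epsilon = " + eps);

        // b is closer to 1.0 than the machine can distinguish, so it rounds
        // back to exactly 1.0 and the subtraction returns 0.0 instead of eps/4.
        double a = 1.0;
        double b = 1.0 + eps / 4;
        System.out.println("a - b = " + (a - b));
    }
}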
In the case of the problem above, there are probably both roundoff errors and subtractive cancellation going on; both are forms of loss of precision that screw up the calculation. Very roughly: the machine is trying to deal with differences between numbers that are so small it can’t keep track of them. How bad is the difference? Plug x=1.993 into both of the equations:
>> x=1.993;
>> P(x)
ans = -4.0354e-020
>> PEX(x)
ans = 1.5461e-011
>> P(x) - PEX(x)
ans = -1.5461e-011
That is a big difference.
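For what it is worth, the same gap shows up outside of MATLAB. Here is a rough Java sketch, assuming (just for illustration; the assignment is not reproduced here) that P(x) is the factored form (x - 2)^9 and PEX(x) is the same polynomial multiplied out term by term. Near x = 2 the factored form stays accurate while the expanded form loses nearly everything to cancellation, much like the session above.
CancellationDemo.java
public class CancellationDemo {
    // Factored form: stays accurate even when x is very close to 2.
    static double p(double x) {
        return Math.pow(x - 2.0, 9);
    }

    // Expanded form of (x - 2)^9, evaluated term by term; the large
    // intermediate terms nearly cancel and roundoff is what survives.
    static double pex(double x) {
        return Math.pow(x, 9) - 18 * Math.pow(x, 8) + 144 * Math.pow(x, 7)
                - 672 * Math.pow(x, 6) + 2016 * Math.pow(x, 5)
                - 4032 * Math.pow(x, 4) + 5376 * Math.pow(x, 3)
                - 4608 * Math.pow(x, 2) + 2304 * x - 512;
    }

    public static void main(String[] args) {
        double x = 1.993;
        System.out.println("P(x)   = " + p(x));
        System.out.println("PEX(x) = " + pex(x));
        System.out.println("diff   = " + (p(x) - pex(x)));
    }
}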
What has this got to do with the “real world”? A lot!
When you are doing math on a computer, you really need to grok all of the layers of the system all the way down; otherwise you will screw up the math and give somebody a really “bad day”: at least yourself, and probably the people paying you to do the work (e.g. the client).
Is this purely an academic problem? No way. Any time you do math on a computer you need to be conscious of how you are doing it and how “wrong” your answer is, because any time you do math on a computer, your math will always be wrong!
ADDENDUM: 07/10/11
To Nick’s point, is this a precision issue?
Via:
By default, MATLAB stores all numeric values as double-precision floating point.
And via:
MATLAB constructs the double-precision (or double) data type according to IEEE® Standard 754 for double precision.
Being unfamiliar with this, I read the Wikipedia page on IEEE-754 here. For double-precision it looks like the values are stored like this:
- The value itself is computed like this: (−1)^s × c × b^q
- s = the sign, 0|1
- c = the coefficient, the significant parts of the value
- b = the base, 2|10
- q = the exponent
Here are some examples of how these values might be used to store the number 123.45:
- b=10, s=0, c=12345, q=-2
- b=10, s=0, c=1.2345, q=2
- b=10, s=0, c=.12345, q=3
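For an actual IEEE-754 double, those fields are packed into 64 bits: 1 sign bit, an 11-bit exponent stored with a bias of 1023, and a 52-bit fraction (the leading 1 of c is implicit), with b fixed at 2. Here is a small Java sketch of my own that pulls the fields back out of a double:
Ieee754Fields.java
public class Ieee754Fields {
    public static void main(String[] args) {
        double value = 123.45;

        // Raw 64-bit pattern: [1 sign bit][11 exponent bits][52 fraction bits].
        long bits = Double.doubleToLongBits(value);

        long sign     = (bits >>> 63) & 0x1L;        // s
        long exponent = (bits >>> 52) & 0x7FFL;      // biased exponent
        long fraction = bits & 0xFFFFFFFFFFFFFL;     // stored bits of c (leading 1 implicit)

        System.out.println("s (sign)     = " + sign);
        System.out.println("q (unbiased) = " + (exponent - 1023));
        System.out.println("c (fraction) = 0x" + Long.toHexString(fraction));
    }
}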
So what is happening? Here is a MATLAB page that explains that a number like 0.1 cannot be represented precisely, with some examples. The examples there were straightforward; for instance, here is one in Java, which also uses IEEE-754 double precision:
App.java
package roundoff;

import java.math.BigDecimal;

public class App {
    // With doubles, 0.1 and 0.3 have already been rounded before we add them.
    public double computeDoubles() {
        double a = 0.1;
        double b = 0.3;
        double result = (a + a + a) + b;
        return result;
    }

    // BigDecimal(double) preserves exactly what the rounded doubles hold.
    public BigDecimal computeBigDecimals() {
        BigDecimal a = new BigDecimal(0.1);
        BigDecimal b = new BigDecimal(0.3);
        BigDecimal result = a.add(a).add(a).add(b);
        return result;
    }
}
AppTest.java
package roundoff;

import java.math.BigDecimal;
import junit.framework.TestCase;

public class AppTest extends TestCase {
    // Fails: (0.1 + 0.1 + 0.1) + 0.3 comes back as 0.6000000000000001.
    public void testDouble() {
        App app = new App();
        assertEquals(0d, app.computeDoubles());
    }

    // Fails: the BigDecimal sum shows every stored digit.
    public void testBigDecimal() {
        App app = new App();
        assertEquals(new BigDecimal(0), app.computeBigDecimals());
    }
}
junit.framework.AssertionFailedError: expected:<0.0> but was:<0.6000000000000001>
at roundoff.AppTest.testDouble
junit.framework.AssertionFailedError: expected:<0> but was:<0.6000000000000000055511151231257827021181583404541015625>
at roundoff.AppTest.testBigDecimal
Racket does the same:
#lang racket
(require rackunit)
(define (compute)
  (- (+ 0.1 0.1 0.1) 0.3))
(check = (compute) 0)
--------------------
FAILURE
name: check
location: (unsaved-editor601 8 0 81 21)
expression: (check = (compute) 0)
params: (# 5.551115123125783e-017 0)
--------------------
But I still had a hard time grokking how the representation was screwing up the number. Take, for example, 0.1. How is it stored? I need to do some more work before I understand how to find the difference between 0.1 and the number actually represented by its IEEE-754 double-precision value.
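One way to get at it, again in Java rather than MATLAB (but the same IEEE-754 double underneath): the BigDecimal(double) constructor converts the stored bits exactly, so it prints the precise value that the double closest to 0.1 actually holds. The class name is mine, just for illustration.
ExactTenth.java
import java.math.BigDecimal;

public class ExactTenth {
    public static void main(String[] args) {
        // The literal 0.1 is rounded to the nearest representable double.
        double tenth = 0.1;

        // new BigDecimal(double) converts the stored bits exactly, with no further
        // rounding, so this prints the value the double really holds:
        // 0.1000000000000000055511151231257827021181583404541015625
        System.out.println(new BigDecimal(tenth));

        // And the gap between "0.1 in our minds" and 0.1 as stored:
        System.out.println(new BigDecimal(tenth).subtract(new BigDecimal("0.1")));
    }
}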
The world is really digital; it is just that we do not have enough bits. For example, electricity comes in digits called electrons; elements come in digits called atoms, electromagnetic radiation comes in digits called photons….
Stating the obvious, the software solution is based on a normal range of numbers and needed to be better scaled to fit the problem. We humans are not very good at this either. In general, people have gotten used to the idea of a trillion-dollar debt with no real understanding of a trillion. Even examples fail: a trillion seconds is over 32,000 years, and 32,000 years is too big to understand. Or in the other direction, carbon dioxide makes up about 390 parts per million by volume of the earth’s atmosphere (or 0.039%). This is too small a number for us to understand without further thought. How do you envision 1 in a million? How about 1 in a trillion? (Some pollutants are measured at this level.)
I think we talked about this before, but I’d be curious what floating-point representation is used by Matlab. Is it single precision? Is it possible to switch to double precision and try again, or use a Decimal type? These are all different ways of representing floating-point numbers in computers that either use more bytes or assign more precision to the fractional portion over the integer portion.
Theoretically, each of them would produce a different graph, some with more fidelity to the ideal than others.
Brad:
Such an interesting point; not even sure where to begin on that.
Nick:
Tonight I looked into it. I feel like I confirmed what I thought I knew, but I am left with more work to do to understand the numerical representation.
Crazy late followup, hopefully you are notified, but, for the record:
In base b, consider a fraction (reduced, but possibly improper) with denominator D. If the prime factors of D are not a subset of the prime factors of b, then this fraction cannot be expressed in a finite expansion with a radix point. E.g., the prime factors of 10 are 2 and 5, so any rational number whose denominator has any number of 2s or 5s as factors (but no others!) can be represented finitely with a decimal point.
Then primorials are the “best” bases, in terms of the number of rationals they can represent finitely with a radix expansion. (2, 6, 30, 210, …)
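Just to make that test concrete, here is a small Java sketch of my own (names are mine): keep dividing the reduced denominator by whatever factors it shares with the base; the fraction has a finite radix expansion exactly when nothing but 1 is left.
FiniteExpansion.java
public class FiniteExpansion {
    // True if numerator/denominator has a terminating expansion in the given base,
    // i.e. every prime factor of the reduced denominator also divides the base.
    static boolean terminates(long numerator, long denominator, long base) {
        long d = denominator / gcd(numerator, denominator);  // reduce the fraction
        // Strip out every factor d shares with the base.
        for (long g = gcd(d, base); g > 1; g = gcd(d, base)) {
            while (d % g == 0) {
                d /= g;
            }
        }
        return d == 1;
    }

    static long gcd(long a, long b) {
        return b == 0 ? a : gcd(b, a % b);
    }

    public static void main(String[] args) {
        System.out.println(terminates(1, 8, 10));   // true:  1/8 = 0.125
        System.out.println(terminates(1, 3, 10));   // false: 1/3 = 0.333...
        System.out.println(terminates(1, 10, 2));   // false: 0.1 is not finite in binary
        System.out.println(terminates(1, 3, 6));    // true:  1/3 = 0.2 in base 6
    }
}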
ANONYMOUS:
Thanks.