
For the standard normal distribution, find the area within one standard deviation of the mean--that is, the area between -1 and 1.


1 Answer

The probability density function for the normal distribution is
 
P(t, μ, σ) = (1/√(2πσ^2)) e^(-(t-μ)^2/(2σ^2))
 
For a standard normal distribution, σ = 1 and μ = 0, but that may not be the case if you are asked to do this for some other specific μ and σ.
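 
As a concrete reference, here is a minimal sketch of that density in Python (the function name normal_pdf is my own choice, not anything given in the problem):
 
    import math

    def normal_pdf(t, mu=0.0, sigma=1.0):
        # normal probability density; the defaults give the standard normal (mu = 0, sigma = 1)
        return math.exp(-(t - mu) ** 2 / (2 * sigma ** 2)) / math.sqrt(2 * math.pi * sigma ** 2)

    print(normal_pdf(0.0))  # peak of the standard normal, 1/sqrt(2*pi), about 0.3989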
 
To find the area within 1.5 standard deviations on either side of the mean, you must integrate that function with respect to t, with lower and upper limits t = -1.5σ and t = 1.5σ respectively.
 
This function is not analytically integrable. (There is no way to compute a function that produces an exact solution)
 
You must do one of the following four things to obtain your answer:
 
1.) Use a computational service like Wolfram Alpha to perform the integration.
 
2.) Use your calculator's CDF function.
 
3.) Use a table of values to look up the area. 
 
4.) Perform the integration numerically using a Riemann sum, Simpson's rule, or some other finite approximation method. (For high accuracy you should not do this by hand, as the number of terms needed for convergence would be excruciating; instead, write a computer program to do it -- a short sketch follows this list.)
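 
As a rough illustration of option 4, here is a minimal sketch in Python of a composite Simpson's rule applied to the standard normal density (the function names are my own):
 
    import math

    def normal_pdf(t):
        # standard normal density (mu = 0, sigma = 1)
        return math.exp(-t * t / 2) / math.sqrt(2 * math.pi)

    def simpson(f, a, b, n=1000):
        # composite Simpson's rule on [a, b]; n must be even
        h = (b - a) / n
        total = f(a) + f(b)
        for i in range(1, n):
            total += (4 if i % 2 else 2) * f(a + i * h)
        return total * h / 3

    print(simpson(normal_pdf, -1.0, 1.0))   # area within 1 sigma, about 0.6827
    print(simpson(normal_pdf, -1.5, 1.5))   # area within 1.5 sigma, about 0.8664
 
If SciPy happens to be available, options 1 and 2 amount to something like scipy.stats.norm.cdf(1.5) - scipy.stats.norm.cdf(-1.5), which gives the same number.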
 
 
EDIT: Andre makes a very good point below, but since I am guessing you are in a statistics course rather than a power-series course, I will carry out the arduous calculus for you.
 
For a given function f(t) we would like to produce some series that can approximate it to arbitrary precision. 
 
Algorithmically you can think of this as follows: start with an initial guess a_0. Then recognize that the answer might depend on the first power of the variable, so we add a term in t to get a_0 + a_1(t - t_0). Recognize that it could depend on any power of t: by adding and subtracting small monomials we should be able to get nearer and nearer to the actual function. Since subtracting is the same thing as adding a negative coefficient, we can still represent our function as an infinite sum of monomials:
 
Σ a_k (t - t_0)^k
 
However, just finding an expression that represents f(t) at a known point t_0 is fairly useless, since we knew the value of the function there to begin with. What we want is an expression to extrapolate (predict) other values that are reasonably distant from the initial t_0.
 
Extrapolation in a linear sense is done by taking a point, looking at the slope, and calculating the next point. We can do the same thing here by looking at the derivative of the function. But a first derivative is not enough: there exist infinitely many functions with the same first derivative. If we truly want a polynomial expression that accurately defines our function, we need the second derivatives to be the same... and the third, fourth, fifth, ..., every derivative to be the same.
 
What effect does this have on our series?
 
Well, the first derivative gives a_1 + 2a_2(t - t_0) + 3a_3(t - t_0)^2 + ...
 
Evaluating at t = t_0, we note that a_1 = f'(t_0).
 
The second derivative gives 2a_2 + 3·2 a_3(t - t_0) + 4·3 a_4(t - t_0)^2 + ...
 
we note that a_2 = f''(t_0) / 2
 
The third derivative gives 3·2 a_3 + 4·3·2 a_4(t - t_0) + 5·4·3 a_5(t - t_0)^2 + ...
 
we note that a_3 = f'''(t_0)/(3·2) = f'''(t_0)/3!
 
Here a pattern emerges, and we see that
 
a_n = (1/n!) f^(n)(t_0)
 
Thus, our series must be 
 
f(t) = Σ (1/k!) f^(k)(t_0) (t - t_0)^k
 
 
And that will give you the Taylor series for any (sufficiently well-behaved) function -- an infinite polynomial series that gives a function's value to arbitrary precision. These are normally evaluated by computer, as the calculations are arduous.
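 
To make that formula concrete, here is a minimal sketch (my own, not part of the original answer) that sums the Taylor series of e^t about t_0 = 0, where every derivative f^(k)(0) equals 1, and compares the partial sum with the library value:
 
    import math

    def exp_taylor(t, n_terms=20):
        # partial sum of Σ (1/k!) f^(k)(0) t^k with f = exp, so each term is t^k / k!
        total = 0.0
        term = 1.0                 # the k = 0 term
        for k in range(n_terms):
            total += term
            term *= t / (k + 1)    # turns t^k / k! into t^(k+1) / (k+1)!
        return total

    print(exp_taylor(1.5))   # partial sum with 20 terms
    print(math.exp(1.5))     # library value; the two agree to many decimal places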
 
But now to the meat of the question: how do you calculate the integral of the Gaussian distribution?
 
We have a factor of the form e^x in the Gaussian distribution.
 
By the above method we can see that the Taylor series for e^x (expanded about x_0 = 0) is
 
Σ x^k / k!
 
Now, if we go to our definition of the distribution, we see that we really have a factor of the form e^(-t^2). (Strictly, the standard normal has e^(-t^2/2); rescaling t by a factor of √2 puts it into this form, and we will account for that factor at the end.)
 
We simply substitute -t^2 for x in our expansion, leaving us with Σ (-t^2)^k / k! = Σ (-1)^k t^(2k) / k!
 
Now we want an integral of the function. Where we could not previously integrate the distribution analytically, we can integrate the polynomial series analytically term by term, since the integral of a sum is merely the sum of the integrals.
 
Integrating each term from 0 to t leaves us with
 
Σ (-1)^k t^(2k+1) / ((2k+1) k!)
 
We then normalize the function, multiplying by 2/√π, and reveal our final expression
 
(2/√π) Σ (-1)^k t^(2k+1) / ((2k+1) k!)
 
Which is the error function that Andre was talking about. 
 
This expression is then evaluated to a reasonable number of terms in k at t = 1.5/√2 (the 1/√2 undoes the rescaling noted above), giving erf(1.5/√2) ≈ 0.866, the centered 3σ-wide spread (±1.5σ) that you asked for.
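 
Here is a minimal sketch (my own) of that evaluation in Python: it sums the series above for erf and uses erf(b/√2) for the area of the standard normal within ±b standard deviations, cross-checking against the library's math.erf:
 
    import math

    def erf_series(x, n_terms=40):
        # erf(x) = (2/sqrt(pi)) * Σ (-1)^k x^(2k+1) / ((2k+1) k!)
        total = 0.0
        for k in range(n_terms):
            total += (-1) ** k * x ** (2 * k + 1) / ((2 * k + 1) * math.factorial(k))
        return 2.0 / math.sqrt(math.pi) * total

    # the area of the standard normal within +/- b sigma equals erf(b / sqrt(2))
    for b in (1.0, 1.5):
        print(b, erf_series(b / math.sqrt(2)))   # about 0.6827 and 0.8664

    print(math.erf(1.5 / math.sqrt(2)))          # library cross-check, about 0.8664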
 
I am sorry if that is confusing; I was hoping to avoid that topic. It is important to remember that though Taylor series provide arbitrarily precise calculations of a value, they are still approximations: since you cannot actually add up an infinite number of terms, you must choose an appropriate number of them until you have the precision you desire.
 
Not all Taylor series converge at the same rate, or even to the function itself, so where one series might need 40 terms to converge to 10 decimal places of accuracy, another might take 50 million -- or it may never get there. The optimization of such series representations is an important topic in mathematics, used by engineers and computer scientists all over the world.
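 
As a small illustration of that point (the comparison series is my own choice, not anything from the discussion above), the sketch below counts how many terms a series needs before its partial sum first comes within a given tolerance of its limit. The Taylor series of e^x gets to ten decimal places in well under twenty terms, while the alternating harmonic series for ln(2), a famously slow series, never gets there within the cap:
 
    import math
    from itertools import count, islice

    def terms_until(terms, target, tol, max_terms=10**6):
        # number of terms until the partial sum first comes within tol of the target
        total = 0.0
        for n, term in enumerate(islice(terms, max_terms), start=1):
            total += term
            if abs(total - target) < tol:
                return n
        return None  # did not reach that tolerance within max_terms

    exp_terms = (1.5 ** k / math.factorial(k) for k in count(0))
    print(terms_until(exp_terms, math.exp(1.5), 1e-10))   # well under 20 terms

    ln2_terms = ((-1) ** (k + 1) / k for k in count(1))
    print(terms_until(ln2_terms, math.log(2), 1e-10))     # None: would need billions of terms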

Comments

The answer is independent of the values of µ and σ. For example, the area within one standard deviation of the mean will always be 0.6827 area units.
Also, when you say "there is no way to compute a function that produces an exact solution", that's technically not correct. You can define a function to be the integral of P(t) from 0 to x. (Up to a rescaling, it's the error function, erf(x).) Its Taylor series produces a solution as exact as you like.
Andre, yes, you can find not just one but many infinite series that approximate the answer to this function -- but a truncated series will never be exact. You may do it to arbitrary precision, but you cannot come up with a finite sum of elementary terms that exactly represents the quantity in question. There is a significant distinction here -- one which, given your background in numerical, real, and complex analysis, I see you are well aware of.
 
Secondly, defining a function as the integral of the standard distribution and then claiming that this function is the analytical solution to the original integral is, by definition, begging the question. Such logic is circular. It is as if I said that the Fresnel integral is the solution to the indefinite integral of sin(t^2). That is true, by definition, but the Fresnel integral is still not an "elementary" function. So to say that there is no way to compute a function that produces an exact solution is technically correct -- the best kind of correct (if you are a Futurama fan). But you are correct that you may compute a function that produces an arbitrarily accurate solution.
 
I think in today's era it is important for our students to learn the distinction between families of functions that may be computed algebraically and those that must be computed algorithmically. Since many of these functions are of chief importance in the sciences, exposing students early on to the idea that math is not "a thing" but a language and a toolkit, with restrictions on its expressiveness, assists their mastery of techniques while building an understanding of engineering tolerance under a given set of constraints. Naturally, you are free to disagree on a pedagogical level, but this hardly seems the place to debate such matters when it is no longer relevant to the question.
 
Since we are off topic anyway: I have been searching for a partner to write an interactive digital mathematics text leveraging technologies like mathbox -- an update for the 21st century, if you will. Would that interest you?
Hi Timothy. I'm a little confused by what you mean by "a function that produces an exact solution." We all agree that functions such as exp(-t²) or sin(t²) do not have finite-form antiderivatives (indefinite integrals); their antiderivatives are simply defined as power series, which converge everywhere. Nothing circular about that. There are many other Riemann-integrable functions that don't have antiderivatives, including functions that are not continuous but only piecewise continuous. I believe now this is what you mean.
When you spoke of an “exact solution” to the problem, I assumed you meant the area under the graph (definite integral), which is a positive real number. The total area has the “exact” value of 1 by construction. (The fact that the total area is finite, even though the function is positive everywhere, is remarkable in its own right.) However, most partial areas (all but a set of measure zero) do not have such an exact value. That’s because there is no such thing as an “exact” value of an irrational real number. Unlike integers or rationals, irrational numbers are only as accurate as you specify them to be, usually by sandwiching them between two rational numbers. (In fact, irrationals can be defined as limits of sequences of rationals.)
Back to this problem: there is no "exact area" within one standard deviation of the mean; the number 0.6827 is accurate to 4 significant figures.
That was precisely my point; thank you for summing it up more clearly than I did. Perhaps I misspoke when I used the term exact solution. Technically this is not solved via any sort of perturbation or weak formulation, and is therefore an "exact" solution as you say. What I intended to say was closed-form solution, but due to the ambiguity in that term (such as the erf(t) described above) I would refine it to say, I suppose, "those solutions defined by a finite set consisting of a finite number of algebraic polynomials" -- that should allow for piecewise formulations of the problem so long as it is not infinitely piecewise.
 
I am trying to exclude things such as erf(), the Fresnel integrals, and the Kronecker/Dirac delta distributions -- and, though most tacitly accept them, even sin(), cos(), etc., for those parts of the domain that do not map to rational values in the range.
 
I realize that is a bit restrictive, and lord knows it's pedantic, but I think it is an important distinction to draw between the two sets: that which may be solved by hand without iterative or recursive approximations, and that which must be approximated to arrive at a numeric solution.
 
Granted, even something as trivial as the square root of two falls into the second category, but I feel that is also an important point to make to students. It is quite common in calculus and in early statistics/distribution theory that I am asked "how can I solve this function?". Ignoring the semantics of the question, there is something to be said for what the word "solve" actually implies, and what students often think it means.
 
It always seems to surprise them when you begin to explain; they attempt to clarify, "yeah, but, like, how do you like *solve* it?", and then the answer is "you cannot find a closed-form solution." I am unsure what to infer from the general lack of exposure to these topics in my generation... I am sure it does not reflect well upon the state of public education... but that aside, it is generally more pragmatic to direct them toward the resource that lets them solve their problem, while hinting that this is one of many functions where their usual approach will fail -- reassuring them that it is not their "fault" (an inability to load the required algorithm for solution into their memory) that they cannot determine the answer.
 
I had hoped to avoid getting into the gritty (yet beautiful) details of functions without elementary antiderivatives -- not because I do not find them interesting, but because I wished not to confuse the OP any more than necessary. Hopefully, though, the OP will get this sorted with any of the methods we've provided thus far.
