Course Notes | Siqi Zheng

Supplement Solutions for New Questions in Chapter 1.4 to 1.6 in Understanding Analysis Second Edition

Tue, 10 Aug 2021 01:00:00 +0000

Note: There may be LaTeX display issues due to blogdown rendering limitations. A complete well-formatted solution can be found by clicking the download icon above.

One may notice that most questions in the second edition are the same as those in the first edition. However, there are still some new or modified questions in the latest edition that remain unanswered.

Therefore, in the following posts, I am going to present a collection of solutions to these new questions found on the internet and worked out by myself. To be more concise and clear, I also rewrote some of my solutions according to the internet sources (links are attached at the end of each question). The solution to the first edition can be found here: https://github.com/mikinty/Understanding-Analysis-Abbott-Solutions

Exercise 1.5.2. Review the proof of Theorem 1.5.6, part (ii) showing that $\Bbb R$ is uncountable, and then find the flaw in the following erroneous proof that $\Bbb Q$ is uncountable: Assume, for contradiction, that $\Bbb Q$ is countable. Thus we can write $\Bbb Q = \{r_1, r_2, r_3, \dots\}$ and, as before, construct a nested sequence of closed intervals with $r_n \notin I_n$. Our construction implies $\cap^\infty_{n=1} I_n = \emptyset$ while NIP implies $\cap^\infty_{n=1} I_n \neq \emptyset$. This contradiction implies $\Bbb Q$ must therefore be uncountable.

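(1) The construction only guarantees that no rational number lies in $\cap^\infty_{n=1} I_n$; the intersection may still contain an irrational number, so it need not be empty. (2) NIP is a statement about closed intervals of real numbers; it does not hold for intervals restricted to $\Bbb Q$.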

Source: https://math.stackexchange.com/questions/1914901/false-proofs-claiming-that-mathbbq-is-uncountable

Exercise 1.5.4. (a) Show $(a, b) \sim \Bbb R$ for any interval $(a, b)$. We know from Example 1.4.9 that the function $f(x) = x/(x^2 − 1)$ takes the interval $(−1, 1)$ onto $\Bbb R$ in a 1–1 fashion. We map $(a,b)$ onto $(-1,1)$ by the bijective linear function $g(x)=2x/(b-a)-(b+a)/(b-a)$, which satisfies $g(a)=-1$ and $g(b)=1$. The composition $f \circ g$ is then a 1–1 map from $(a,b)$ onto $\Bbb R$.

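(b) Show that an unbounded interval like $(a,\infty) = \{x : x > a\}$ has the same cardinality as $\Bbb R$ as well. We know from Example 1.4.9 that the function $f(x) = x/(x^2 − 1)$ takes the interval $(−1, 1)$ onto $\Bbb R$ in a 1–1 fashion. We map $(a,\infty)$ onto $(-1,1)$ by the bijection $g(x)=2/(x-a+1)-1$: it is strictly decreasing, tends to $1$ as $x \to a$, and tends to $-1$ as $x \to \infty$. (A linear function cannot work here, since it maps an unbounded interval to an unbounded interval.) The composition $f \circ g$ is a 1–1 map from $(a,\infty)$ onto $\Bbb R$.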

(c) Using open intervals makes it more convenient to produce the required 1–1, onto functions, but it is not really necessary. Show that $[0, 1) \sim (0, 1)$ by exhibiting a 1–1 onto function between the two sets. $f:[0,1) \rightarrow (0,1)$ by $f(0)=1/2$, $f(1/n)=1/(n+1)$ for integer $n \geq 2$, and $f(x)=x$ otherwise.

Source: https://math.stackexchange.com/questions/1425492/explicit-bijection-between-0-1-and-0-1
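The map in part (c) can be sanity-checked on rational sample points. The following is a minimal sketch, not a proof: it only tests fractions with small denominators (irrational points are fixed by $f$ anyway), and the helper name `f_inverse` is introduced here purely for illustration.

```python
from fractions import Fraction

def f(x: Fraction) -> Fraction:
    """Map [0, 1) to (0, 1): shift 0 -> 1/2 -> 1/3 -> 1/4 -> ..., fix everything else."""
    if not (0 <= x < 1):
        raise ValueError("x must lie in [0, 1)")
    if x == 0:
        return Fraction(1, 2)
    if x.numerator == 1:          # x = 1/n with n >= 2
        return Fraction(1, x.denominator + 1)
    return x

def f_inverse(y: Fraction) -> Fraction:
    """Inverse map from (0, 1) back to [0, 1) (hypothetical helper for the check)."""
    if not (0 < y < 1):
        raise ValueError("y must lie in (0, 1)")
    if y == Fraction(1, 2):
        return Fraction(0)
    if y.numerator == 1:          # y = 1/n with n >= 3
        return Fraction(1, y.denominator - 1)
    return y

# Rational sample points k/n in [0, 1) with small denominators.
samples = {Fraction(k, n) for n in range(1, 30) for k in range(n)}
images = {f(x) for x in samples}

print(len(samples) == len(images))                  # no two samples collide
print(all(f_inverse(f(x)) == x for x in samples))   # f_inverse undoes f
```

The existence of a two-sided inverse is what makes $f$ a bijection; the check above only exercises it on finitely many points.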

Exercise 1.5.5. (a) Why is $A \sim A$ for every set $A$? Trivial. By definition $f(x)=x$ will do the job.

(b) Given sets $A$ and $B$, explain why $A \sim B$ is equivalent to asserting $B \sim A$. If $f: A \rightarrow B$ is a bijection, then its inverse $f^{-1}: B \rightarrow A$ is a bijection as well.

(c) For three sets $A,B,$ and $C$, show that $A \sim B$ and $B \sim C$ implies $A \sim C$. These three properties are what is meant by saying that $\sim$ is an equivalence relation. If $f: A \rightarrow B$ and $g: B \rightarrow C$ are bijections, then the composition $g(f(x))$ is a bijection from $A$ to $C$.

Exercise 1.5.6. (a) Give an example of a countable collection of disjoint open intervals. $A_n = (n, n+1)$, $n\in \Bbb N$

(b) Give an example of an uncountable collection of disjoint open intervals, or argue that no such collection exists. No such collection exists. Every collection of disjoint open intervals in $\Bbb R$ is countable: by the density of $\Bbb Q$ we can choose a rational number in each interval, disjointness makes these choices distinct, and the rationals are countable.

Exercise 1.5.7. Consider the open interval $(0,1)$, and let $S$ be the set of points in the open unit square; that is, $S = \{(x, y) : 0 < x,y < 1\}$.

(a) Find a 1–1 function that maps $(0, 1)$ into, but not necessarily onto, $S$. (This is easy.) $f(x) = (x,x),x \in (0,1)$

(b) Use the fact that every real number has a decimal expansion to produce a 1–1 function that maps $S$ into $(0, 1)$. Discuss whether the formulated function is onto. (Keep in mind that any terminating decimal expansion such as $.235$ represents the same real number as $.234999 \dots$)

For any point with two coordinates $(0.d_1d_2\dots,0.e_1e_2\dots)$, we map it to the real number $(0.d_1e_1d_2e_2\dots)$. We restrict the choice of point in its simplest form so that $(0.2,0.5)$ will be chosen for $0.25$ instead of $(0.2999\dots,0.4999\dots)$, which is equal to $(0.3,0.5)$, corresponding to $0.35$.

This mapping, however, is not onto. Consider $1/11=0.090909\dots$, which could only be produced by the pair $(0.000\dots,0.999\dots)$. This pair equals $(0,1)$, which is not a point of the open unit square $S$. Therefore no point in the unit square is mapped to $1/11$.
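The digit-interleaving map can be sketched on finite digit strings. This is only an illustration under the assumption that each coordinate is given by the digits of its terminating expansion (when one exists); the function name `interleave` is hypothetical.

```python
def interleave(x_digits: str, y_digits: str) -> str:
    """Interleave the decimal digits of x = 0.d1d2... and y = 0.e1e2...

    Both arguments are the digit strings after the decimal point, taken from
    the terminating expansion; the shorter string is padded with zeros.
    """
    n = max(len(x_digits), len(y_digits))
    xs = x_digits.ljust(n, "0")
    ys = y_digits.ljust(n, "0")
    return "0." + "".join(d + e for d, e in zip(xs, ys))

print(interleave("2", "5"))      # 0.25
print(interleave("3", "5"))      # 0.35
print(interleave("125", "75"))   # 0.172550
```

Reading the digits in odd and even positions of the output recovers the two inputs, which is why the map is 1–1 once a single expansion is fixed for every coordinate.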

Exercise 1.5.8. Let $B$ be a set of positive real numbers with the property that adding together any finite subset of elements from $B$ always gives a sum of $2$ or less. Show $B$ must be finite or countable.

For each $n\in \Bbb N$, let $$B_n=\left\{b\in B \,\middle|\, b\geqslant\frac2n\right\}\subset B.$$

Each $B_n$ can have at most $n$ distinct elements; otherwise, the sum of $n+1$ distinct elements of $B_n$ would be at least $2(n+1)/n$, which is greater than $2$.

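But $$B=\bigcup_{n\in\Bbb N}B_n,$$ because every $b \in B$ is positive and therefore satisfies $b \geqslant \frac2n$ for some $n$. Since $\Bbb N$ is countable and each $B_n$ is finite, $B$ is finite or countable.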

Source: https://math.stackexchange.com/questions/2446630/showing-a-set-is-finite-or-countable

Exercise 1.5.10. (a) Let $C \subseteq [0,1]$ be uncountable, show there exists $a \in (0,1)$ such that $C \cap [a,1] $ is uncountable.

Suppose that $C\cap [\tfrac{1}{n}, 1]$ is countable for all $n$. Then $$C\cap [0,1] = C\cap\big({0}\cup \bigcup_{n=1}^\infty [\tfrac{1}{n},1]\big) = (C\cap {0}) \cup \bigcup_{n=1}^\infty (C\cap [\tfrac{1}{n}, 1])$$ would be countable too.

Source: https://math.stackexchange.com/questions/1452550/let-c-subseteq-0-1-be-uncountable-show-there-exists-a-in-0-1-such-tha

(b) Now let $A$ be the set of all $a \in (0, 1)$ such that $C \cap [a,1]$ is uncountable, and set $\alpha = \sup A$. Is $C \cap [\alpha,1]$ an uncountable set?

No. We want to show: suppose $C\subseteq [0,1]$ is uncountable, let $A = \{a\in (0,1)\mid C\cap[a,1] \text{ is uncountable}\}$, and let $\alpha = \sup A$. Then $C\cap [\alpha,1]$ is countable.

First, $A$ is nonempty: for $n\in\Bbb N$ let $C_n = C\cap [\frac 1 n, 1]$. Some $C_n$ must be uncountable, otherwise $C= \bigcup_n C_n$ is a countable union of countable sets and therefore countable. So for some $n$, $1/n \in A$.

Clearly $0 \lt \alpha \le 1$.

If $\alpha =1$, then $C\cap[\alpha,1] \subseteq \{1\}$ and the claim is true. If $\alpha \lt 1$, let $(b_n)$ be a decreasing sequence in $(\alpha, 1)$ with $\alpha = \inf_n b_n$. For every $n$, $C\cap[b_n,1]$ is countable, for otherwise $b_n\in A$ and hence $b_n \le \alpha$, contradicting $b_n \gt \alpha$. Since $\bigcup_n [b_n,1] = (\alpha,1]$, $$\begin{align} C\cap [\alpha,1] &= (C\cap\{\alpha\}) \cup \Big(C\cap \bigcup_n [b_n, 1]\Big) \\
&= (C\cap\{\alpha\}) \cup \bigcup_n (C\cap [b_n, 1]) \end{align}$$ is a countable union of countable sets, so it is countable.

Source: https://math.stackexchange.com/questions/1639608/intersection-of-uncountable-sets

Exercise 1.5.11 (Schröder–Bernstein Theorem). Assume there exists a 1–1 function $f: X \rightarrow Y$ and another 1–1 function $g: Y \rightarrow X$. Then there exists a 1–1, onto function $h: X \rightarrow Y$ and hence $X \sim Y$.

The strategy is to partition $X$ and $Y$ into components $X = A \cup A'$ and $Y = B \cup B'$ with $A \cap A' = \emptyset$ and $B \cap B' = \emptyset$, in such a way that $f$ maps $A$ onto $B$ and $g$ maps $B'$ onto $A'$.

(a) Explain how achieving this would lead to a proof that $X \sim Y$. The restriction $f: A \rightarrow B$ is a 1–1, onto function, and so is $g: B' \rightarrow A'$. Then $h(x)=f(x)$ if $x \in A$ and $h(x)=g^{-1}(x)$ if $x \in A'$ defines a 1–1, onto function $X \rightarrow Y$, and hence $X \sim Y$.

(b) Set $A_1 = X \setminus g(Y)$ (what happens if $A_1 = \emptyset$?) and inductively define a sequence of sets by letting $A_{n+1} = g(f(A_n))$. Show that $\{A_n : n \in \Bbb{N}\}$ is a pairwise disjoint collection of subsets of $X$, while $\{f(A_n) : n \in \Bbb{N} \}$ is a similar collection in $Y$.

For $k \ge 2$, since $A_k = g(f(A_{k-1})) \subseteq g(Y)$, $A_k$ and $A_1$ are disjoint.

For $2 \le m \lt n$, if there exists $a \in A_m \cap A_n$, then for some $a_{m-1} \in A_{m-1}$ and $a_{n-1} \in A_{n-1}$, $g(f(a_{m-1})) = a = g(f(a_{n-1}))$. Since both $f$ and $g$ are injective, $a_{m-1} = a_{n-1}$. Hence $A_m \cap A_n \ne \emptyset$ implies $A_{m-1} \cap A_{n-1} \ne \emptyset$. By induction, we conclude that $A_1 \cap A_{n-m+1} \ne \emptyset$, which contradicts the first step. Therefore $A_m$ and $A_n$ are disjoint for $2 \le m \lt n$. Finally, because $f$ is injective and the sets $A_n$ are pairwise disjoint, the sets $f(A_n)$ are pairwise disjoint subsets of $Y$.

Source: https://math.stackexchange.com/questions/1726578/understanding-a-proof-of-schr%C3%B6der-bernstein-theorem

(c) Let $A = \cup_{n=1}^\infty A_n$ and $B = \cup_{n=1}^\infty f(A_n)$. Show that $f$ maps $A$ onto $B$. Trivial because for every $b \in B$, $b = f(a_n)$ for some $a_n \in A_n \subseteq A $.

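(d) Let $A' = X\setminus A$ and $B' = Y \setminus B$. Show $g$ maps $B'$ onto $A'$. First, $g(B') \subseteq A'$: if $g(b') \in A_n$ for some $b' \in B'$, then $n \ge 2$ because $A_1$ is disjoint from $g(Y)$, so $g(b') = g(f(a))$ for some $a \in A_{n-1}$, and injectivity of $g$ gives $b' = f(a) \in B$, a contradiction. For surjectivity, suppose there is an element $a' \in A'$ with $a' \notin g(B')$. Since $a'$ is not in $A_1$, we have $a' = g(b)$ for some $b \in Y$, and $b \notin B'$ forces $b \in f(A_n)\subseteq B$ for some $n$. Writing $b = f(a)$ with $a \in A_n$, we get $a'=g(f(a))\in A_{n+1} \subseteq A$. This contradicts $a' \in A'$.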

Source: https://math.stackexchange.com/questions/1726578/understanding-a-proof-of-schr%C3%B6der-bernstein-theorem
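To see the construction of parts (a) to (d) at work, here is a small sketch on one concrete pair of injections. The choice $X = Y = \Bbb N$ with $f(x) = 2x$ and $g(y) = 2y$ is an assumption made only for this illustration, and `in_A` and `h` are hypothetical helper names.

```python
def f(x: int) -> int:
    """Hypothetical injection X -> Y (here X = Y = positive integers)."""
    return 2 * x

def g(y: int) -> int:
    """Hypothetical injection Y -> X."""
    return 2 * y

def in_A(x: int) -> bool:
    """Decide whether x lies in A = union of the A_n by tracing x backwards."""
    while True:
        if x % 2 == 1:        # x is not in g(Y), so x is in A_1
            return True
        y = x // 2            # y = g^{-1}(x)
        if y % 2 == 1:        # y is not in f(X), so x is in no A_n
            return False
        x = y // 2            # x = f^{-1}(y); x in A_n iff the new x is in A_{n-1}

def h(x: int) -> int:
    """The bijection of part (a): f on A, g^{-1} on A' (the complement of A)."""
    return f(x) if in_A(x) else x // 2

values = [h(x) for x in range(1, 2001)]
print(len(set(values)) == len(values))        # h is 1-1 on the sample
print(set(range(1, 1001)) <= set(values))     # every y <= 1000 is hit
```

In this example $A_1$ is the set of odd numbers and $A_{n+1} = 4A_n$, so $h$ doubles numbers with an even power of two in their factorization and halves the others.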

Exercise 1.6.9. Using the various tools and techniques developed in the last two sections (including the exercises from Section 1.5), give a compelling argument showing that $\cal P(\Bbb N) \sim \Bbb R$.

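First note that $\Bbb R$ injects into $\cal P(\Bbb Q)$ by mapping $r$ to $\{q\in\Bbb Q\mid q \lt r\}$; distinct reals give distinct sets because $\Bbb Q$ is dense in $\Bbb R$. Since $\Bbb Q$ is countable there is a bijection between $\cal P(\Bbb Q)$ and $\cal P(\Bbb N)$. So $\Bbb R$ injects into $\cal P(\Bbb N)$.

Conversely, identify each subset of $\Bbb N$ with its indicator sequence $x\in 2^{\Bbb N}$. We can map $x$ to the continued fraction defined by the sequence $x$, or, with a somewhat easier proof of injectivity, to the point in $[0,1]$ defined by $\sum\frac{x(n)}{3^{n+1}}$. So $\cal P(\Bbb N)$ injects into $\Bbb R$, and the Schröder–Bernstein Theorem (Exercise 1.5.11) gives $\cal P(\Bbb N) \sim \Bbb R$.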
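The injectivity of the base-3 sum $\sum\frac{x(n)}{3^{n+1}}$ can be checked by brute force on a truncated version. This is a sketch under two assumptions made for the illustration: indices run over $0, \dots, 7$ only, and the function name `encode` is hypothetical. It says nothing about infinite subsets.

```python
from fractions import Fraction
from itertools import combinations

def encode(subset, terms=8):
    """Partial sum of x(n) / 3^(n+1) over n = 0, ..., terms - 1,
    where x is the indicator sequence of `subset`."""
    return sum(Fraction(1, 3 ** (n + 1)) for n in subset if n < terms)

universe = range(8)
subsets = [frozenset(c) for r in range(9) for c in combinations(universe, r)]
codes = {encode(s) for s in subsets}

print(len(subsets), len(codes))   # 256 256
```

Distinct subsets receive distinct codes because a number has at most one base-3 expansion that uses only the digits 0 and 1.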

Supplement Solutions for New Questions in Chapter 1.2 to 1.4 in Understanding Analysis Second Edition

Thu, 05 Aug 2021 01:00:00 +0000

Exercise 1.2.2. Show that there is no rational number r satisfying $2^r=3$.

Since $2^r = 3 \gt 1$, any such $r$ must be positive, so suppose $r=\frac{a}{b}$ with positive integers $a,b$.

Then, we get $$2^{\frac{a}{b}}=3$$

which can be expressed as

$$2^a=3^b$$

This is clearly a contradiction because the left side is even and the right side is odd.

Source: https://math.stackexchange.com/questions/1427219/prove-there-is-no-rational-r-satisfying-2r-3

Exercise 1.2.4. Expressing $\Bbb N$ as an infinite union of disjoint infinite subsets.

Let $A_{i}$ consist of all the numbers of the form $2^{i-1}m$ where $2\nmid m$. That is, $A_i$ consists of all the numbers whose largest power-of-two factor is exactly $2^{i-1}$. Every natural number lies in exactly one $A_i$, and each $A_i$ is infinite. So $$\begin{align} A_1 &= \{1,3,5,7,9,11, \dots\}\\
A_2 &= \{2, 6 =2^1\cdot 3, 10 = 2^1\cdot 5, 14 = 2^1\cdot 7, \dots\}\\
A_3 &= \{4 = 2^2, 12=2^2\cdot 3, 20=2^2\cdot 5, \dots\}\\
A_4 &= \{8 = 2^3, 24=2^3\cdot 3, 40=2^3\cdot 5, \dots\}\\
&\;\;\vdots \end{align} $$

Source: https://math.stackexchange.com/questions/847465/expressing-bbb-n-as-an-infinite-union-of-disjoint-infinite-subsets

As pointed out in the link above, any prime number works in place of $2$. For example, with $3$:

$A_1 = \Bbb N \setminus \{x: x = 3b, b \in \Bbb N\}$

$A_2 = \{3a : a\in A_1\}$

$A_3 = \{3^2a : a\in A_1\}$

$A_4 = \{3^3a : a\in A_1\}$

$\vdots$
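The partition can be reproduced with a short script. This is a minimal sketch: the function name `block_index` is hypothetical, and the indexing follows the listed sets, where $A_1$ holds the numbers not divisible by the chosen prime.

```python
def block_index(n: int, p: int = 2) -> int:
    """Return i such that n = p**(i - 1) * m with p not dividing m."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    i = 1
    while n % p == 0:     # strip one factor of p per step
        n //= p
        i += 1
    return i

blocks = {}
for n in range(1, 41):
    blocks.setdefault(block_index(n), []).append(n)

for i in sorted(blocks):
    print(f"A_{i}: {blocks[i]}")
```

Because `block_index` returns exactly one value for every positive integer, the blocks are automatically disjoint and their union is all of $\Bbb N$; calling `block_index(n, 3)` gives the variant built from the prime $3$.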

Exercise 1.2.8.

Give an example of each or state that the request is impossible:

(a) $f : \Bbb N \rightarrow \Bbb N$ that is 1–1 but not onto. $f(x) = x^2+2$ because $1 \in \Bbb N$, but $f(x)>1$ $\forall x \in \Bbb N$

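(b) $f : \Bbb N \rightarrow \Bbb N$ that is onto but not 1–1. $f(1) = 1$ and $f(x) = x-1$ for $x \geq 2$. This is onto because every $n \in \Bbb N$ equals $f(n+1)$, and it is not 1–1 because $f(1)=f(2)$ while $1 \neq 2$.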

Exercise 1.2.10. Decide which of the following are true statements. Provide a short justification for those that are valid and a counterexample for those that are not:

(a) Two real numbers satisfy $a < b$ if and only if $a < b + \epsilon$ for every $\epsilon > 0$. FALSE. The forward direction holds, but the converse fails if we take $a=b=5$.

(b) Two real numbers satisfy $a < b$ if $a < b + \epsilon$ for every $\epsilon > 0$. FALSE, again by taking $a=b=5$.

(c) Two real numbers satisfy $a \le b$ if and only if $a < b + \epsilon$ for every $\epsilon > 0$. TRUE. Forward (trivial): $a \le b \lt b + \epsilon$. Reverse: Suppose $a \lt b + \epsilon$ for all $\epsilon \gt 0$, and let $\delta = a - b$. If $\delta \gt 0$, then taking $\epsilon = \delta$ gives $a \lt b + \delta = a$, which is impossible. So $\delta = a - b \le 0$, and hence $a \le b$.

Source: https://math.stackexchange.com/questions/1633992/if-true-prove-that-2-real-numbers-satisfy-ab-iff-ab-epsilon-forall-e/1633997

Exercise 1.2.12. Let $y_1 = 6$, and for each $n\in \Bbb N$ define $y_{n+1} = (2y_n − 6)/3$.

(a) Use induction to prove that the sequence satisfies $y_n > −6$ for all $n \in \Bbb N$.

Base Case: $y_1 = 6 > -6$
Inductive case. Assume $y_k>-6$.
$y_{k+1}=\frac{2y_k}{3}-2>\frac{2\times(-6)}{3}-2=-4-2=-6$
By induction our original claim is proved.

(b) Use another induction argument to show the sequence $(y_1, y_2, y_3, \dots)$ is decreasing.

Base Case: $y_2 = 2 < 6 = y_1$
Inductive case. Assume $y_{k+1}<y_k$. By part (a), $y_{k+1} \gt -6$, so $\frac{-6}{3} \lt \frac{y_{k+1}}{3}$ and
$y_{k+2}=\frac{2y_{k+1}}{3}-2 =\frac{2y_{k+1}}{3}+\frac{-6}{3} <\frac{2y_{k+1}}{3}+\frac{y_{k+1}}{3} =y_{k+1}$
By induction our original claim is proved.
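Both claims can be checked numerically on the first few terms. The sketch below uses exact fractions; the function name `sequence` and the cutoff of 30 terms are arbitrary choices for the illustration, and a finite check does not replace the induction.

```python
from fractions import Fraction

def sequence(n_terms: int):
    """First n_terms of y_1 = 6, y_{n+1} = (2*y_n - 6) / 3, as exact fractions."""
    y = Fraction(6)
    terms = []
    for _ in range(n_terms):
        terms.append(y)
        y = (2 * y - 6) / 3
    return terms

ys = sequence(30)
print([str(y) for y in ys[:5]])                   # ['6', '2', '-2/3', '-22/9', '-98/27']
print(all(y > -6 for y in ys))                    # bounded below by -6
print(all(a > b for a, b in zip(ys, ys[1:])))     # strictly decreasing
```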

Exercise 1.3.2. Give an example of each of the following, or state that the request is impossible.

(a) A set $B$ with $\inf B \geq \sup B$. $B=\{1\}$

(b) A finite set that contains its infimum but not its supremum. Impossible. A nonempty finite set contains its largest and smallest elements, which are its supremum and infimum, and the empty set contains neither.

(c) A bounded subset of $\Bbb Q$ that contains its supremum but not its infimum. $C=\{1/x \mid x\in\Bbb N\}$ contains its supremum $1$ but not its infimum $0$.

Exercise 1.3.4. Let $A_1,A_2,A_3,\dots$ be a collection of nonempty sets, each of which is bounded above.

(a) Find a formula for $\sup(A_1 \cup A_2)$. Extend this to $\sup(\cup^n_{k=1}A_k)$. $\sup(A_1 \cup A_2) = \max\{\sup A_1, \sup A_2\}$ and $\sup(\cup^n_{k=1}A_k) = \max\{\sup A_1, \sup A_2, \dots, \sup A_n\}$.

(b) Consider $\sup(\cup^{\infty}_{k=1}A_k)$. Does the formula in (a) extend to the infinite case?

No. Consider $A_i = \{i\}$: we have $\sup(\cup^n_{k=1}A_k)=n$ for every $n$, but $\cup^{\infty}_{k=1}A_k = \Bbb N$ is not bounded above, so $\sup(\cup^{\infty}_{k=1}A_k)$ does not exist.

Exercise 1.3.6. Given sets $A$ and $B$, define $A+B = \{a+b : a \in A \text{ and } b \in B\}$. Follow these steps to prove that if $A$ and $B$ are nonempty and bounded above then $\sup(A + B) = \sup A + \sup B$.

(a) Let s = sup A and t = sup B. Show s + t is an upper bound for A + B. Take $a \in A$ and $b \in B$, by definition, $a\leq s$ and $b \leq t$ and $a+b \in A+B$. So $a+b \leq s+t$.

(b) Now let $u$ be an arbitrary upper bound for $A + B$, and temporarily fix $a \in A$. Show $t \leq u − a$. By definition of $A + B$ and $\sup(A + B)$, for all $a \in A$ and $b \in B$, $$a + b \leq \sup (A + B) \leq u.$$

Fix $a \in A$ and subtract $a$ throughout: for every $b \in B$,

$$b = (a + b) - a \leq \sup (A + B) - a \leq u - a.$$

So $\sup (A + B) - a$ and $u - a$ are upper bounds for $B$, and by definition of $\sup B$, for every $a \in A$, $$\sup B =t \leq \sup (A+ B) − a\leq u - a.$$

(c) Finally, show $\sup(A + B) = s + t$. Rearrange the previous inequality in (b): $a \leq \sup(A +B) − \sup B$ for all $a \in A$.

Hence, $\sup(A +B) − \sup B$ is an upper bound for $A$.

By the definition of supremum, this means $\sup A \leq \sup(A + B) − \sup B$, i.e. $\sup A + \sup B \leq \sup(A + B)$, or

$$s+t \leq \sup(A+B).$$

Also, by the inequality $a+b \leq s+t$ in (a), $s+t$ is an upper bound for $A+B$, so by the definition of supremum $$\sup(A+B)\leq s+t.$$

We conclude that $$\sup(A+B)= s+t.$$

(d) Construct another proof of this same fact using Lemma 1.3.8.

Let $\epsilon \gt 0.$ Then there exists $a \in A$ and $b \in B$ such that $a \gt \sup A − \frac{\epsilon}{2}$ and $b \gt \sup B − \frac{\epsilon}{2}.$ Then $a + b \in A + B$. We have $${\sup(A + B)} \geq a + b {\gt \sup A + \sup B - \epsilon} \implies { \sup(A + B) \gt \sup A + \sup B - \epsilon }.$$ Since $\epsilon$ is arbitrary, $\sup(A + B) \geq \sup A + \sup B=s+t$

For the reverse inequality, take $a \in A$ and $b \in B$. By definition, $a\leq s$ and $b \leq t$, so $a+b \leq s+t$. Thus $s+t$ is an upper bound for $A+B$, and by the definition of supremum $$\sup(A+B)\leq s+t.$$

We conclude that $$sup(A+B)= s+t.$$

Source: https://math.stackexchange.com/questions/4551/how-can-i-prove-supab-sup-a-sup-b-if-ab-ab-mid-a-in-a-b-in-b

Exercise 1.3.8. Compute, without proofs, the suprema and infima (if they exist) of the following sets:

(a) $\{m/n : m, n \in \Bbb N \text{ with } m < n\}$. sup: $1$ inf: $0$

(b) $\{(−1)^m/n : m, n \in \Bbb N\}$. sup: $1$ inf: $-1$

(d) $\{m/(m+ n) : m, n \in \Bbb N\}$. sup: $1$ inf: $0$

Exercise 1.3.9.

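(a) If $\sup A < \sup B$, show that there exists an element $b \in B$ that is an upper bound for $A$. Take $\epsilon=\sup B-\sup A \gt 0$. By Lemma 1.3.8 there is $b \in B$ with $b>\sup B-\epsilon = \sup A$, so $b \geq a$ for every $a \in A$, as desired.

(b) Give an example to show that this is not always the case if we only assume $\sup A \leq \sup B$. Take $A=\{0\}$ and $B=\{-1/n : n \in \Bbb N\}$. Then $\sup A = \sup B = 0$, but every $b \in B$ is negative and so is not an upper bound for $A$.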

Exercise 1.3.10 (Cut Property).

(a) Use the Axiom of Completeness to prove the Cut Property. Suppose we have the axiom of completeness and assume you have $A$ and $B$ as in the statement of the cut property. Then, as $B$ is nonempty, $A$ has an upper bound. Let $c$ be the least upper bound for $A$.

For $a\in A$, $a\le c$, because $c$ is an upper bound for $A$; For $b\in B$, $c\le b$, because $b$ is an upper bound for $A$ and $c$ is the least upper bound.

Source: https://math.stackexchange.com/questions/1616583/use-the-axiom-of-completeness-to-prove-the-cut-property

(b)Show that the implication goes the other way. Suppose we know the Cut Property. Consider a nonempty set $E$ with an upper bound. Then let

$B=\{x\in\mathbb{R}: x\geq e \ \forall e\in E\}$, i.e. $B$ is the set of all upper bounds of $E$,

and let $A$ be the complement of $B$: $A=\{x\in\mathbb{R}: x\lt e \text{ for some } e\in E\}$.

Since $E$ is nonempty and bounded above, both $A$ and $B$ are nonempty. The union of $A$ and $B$ is $\mathbb{R}$ by construction. Suppose $a\in A$ and $b\in B$. If $b\le a$, we have $e\leq b \leq a$ for all $e\in E$, so $a\in B$: a contradiction. Hence $a \lt b$.

Since $b>a$ for all $a \in A$ and $b \in B$, we know there exists $d$ such that $a \leq d$ and $d \leq b$ by Cut Property. We want to show that $d$ is the supremum for E.

To show that $d$ is an upper bound of $E$, suppose some $s$ in $E$ exceeds $d$. Since $(s + d)/2$ exceeds $d$, it belongs to $B$, so by the definition of $B$ it must be an upper bound of $E$, which is impossible since $s > (s + d)/2$. To show that $d$ is a least upper bound of $E$, suppose that some $a < d$ is an upper bound of $E$. But $a$ (being less than $d$) is in $A$, so it can’t be an upper bound of $E$.

Source: https://arxiv.org/abs/1204.4483

(c) Give a concrete example showing that the Cut Property is not a valid statement when $\Bbb R$ is replaced by $\Bbb Q$. Hint: Place the break point at an irrational number such as $\sqrt{5},\sqrt{3},\dots$ Consider the sets of rationals $A = \{x \in \Bbb Q : x \lt 0\} \cup \{x \in \Bbb Q : x \ge 0, x^2 \le 3\}$ and $B = \{x \in \Bbb Q : x \ge 0, x^2 \gt 3\}$. If a rational cut point $c$ existed, we would have $c^2 = 3$. But there is no rational number for which this is true.

Exercise 1.3.11. Without worrying about formal proofs for the moment, decide if the following statements about suprema and infima are true or false. For any that are false, supply an example where the claim in question does not appear to hold.

(a) TRUE. Since $A \subseteq B$, $\sup B$ is an upper bound for $A$. Since $\sup A$ is the least upper bound for $A$ by definition, it must be less than or equal to $\sup B$.

(b) TRUE. Taking $c=(\sup A + \inf B)/2$ works for nonempty sets $A$ and $B$.

Exercise 1.4.2. Let $A \subseteq \Bbb R$ be nonempty and bounded above, and let $s \in \Bbb R$ have the property that for all $n \in \Bbb N$, s + 1/n is an upper bound for A and s − 1/n is not an upper bound for A. Show s = supA.

Suppose s is not an upper bound for A. Then $\exists a \in A$ such that $s \lt a$. Take $\delta = a - s$ and $n_0 \in \Bbb N$ to be large enough so that $1/\delta < n_0$ i.e. $1/n_0 < \delta$. By definition, $s+1/n_0$ is an upper bound for $A$, but $s+1/n_0<s+\delta=a\in A$: a contradiction.

Let $\epsilon>0$. Take $n_1 \in \Bbb N$ to be large enough so that $1/\epsilon < n_1$ i.e. $1/n_1 < \epsilon$. By definition, $\exists a \in A$ such that $ s-\epsilon \lt s-1/n_1 \lt a$. Hence s = sup A.

Exercise 1.4.4. Let $a \lt b$ be real numbers and consider the set $T=\mathbb{Q}\cap[a,b]$. Show $\sup T=b$

If $x\in T$, then $x\in [a,b]$, and if $x\in [a,b]$, then $x\leq b$ i.e. $b$ is an upper bound for T.

To show that $b$ is the least upper bound of $T$, suppose that some $c \lt b$ is an upper bound of $T$. Since the rationals are dense in $\Bbb R$, there exists a rational $t$ such that $\max\{a,c\} \lt t \lt b$. This means $t \in T$ and $t \gt c$, which contradicts the assumption that $c$ is an upper bound of $T$.

Exercise 1.4.6. Which of the following sets are dense in $\Bbb R$? Take $p \in \Bbb Z$ and $q \in \Bbb N$ in every case.

(a) The set of all rational numbers $p/q$ with $q \leq 10$. Not dense in $\Bbb R$. For any distinct $\frac pq$ and $\frac{p'}{q'}$ with $q,q'\le 10$ the difference $$ \frac pq-\frac{p'}{q'}=\frac{pq'-p'q}{qq'}$$ is a fraction with nonzero numerator and denominator $\le 10^2$, hence is $\ge \frac{1}{10^2}$ in absolute value. For example, no element in this set can be found between $1/500$ and $2/500$.

Source: https://math.stackexchange.com/questions/1638526/how-do-you-show-a-set-is-dense-for-example-is-the-set-of-all-rational-numbers
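The gap estimate in part (a) can be confirmed by enumeration. The sketch below lists every $p/q$ with $q \le 10$ in the window $[0,1]$; restricting to this window is an assumption justified by the fact that the set is unchanged by adding integers.

```python
from fractions import Fraction

# All p/q with 1 <= q <= 10 lying in [0, 1]; the set is invariant under
# integer shifts, so this window shows every gap that can occur.
points = sorted({Fraction(p, q) for q in range(1, 11) for p in range(q + 1)})
gaps = [b - a for a, b in zip(points, points[1:])]

print(min(gaps))                          # 1/90
print(min(x for x in points if x > 0))    # 1/10
```

The smallest gap, $1/90 = 1/9 - 1/10$, is indeed at least $1/10^2$, and the smallest positive element $1/10$ shows that nothing lies between $1/500$ and $2/500$.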

(b) The set of all rational numbers $p/q$ with $q$ a power of $2$. Dense in $\Bbb R$. Consider two arbitrary real numbers $a,b$ with $a\lt b$. By the Archimedean Property there exists $n \in \mathbb N$ such that $$0\lt \frac{1}{n} \lt b-a \;\;\text{ which implies} \;\; 0\lt \frac{1}{2^{n}}\lt \frac{1}{n}\lt b-a.$$ Thus we have $1\lt b2^n-a2^n$. As the distance between $a2^n$ and $b2^n$ is greater than $1$, there exists $m \in \mathbb Z$ such that $a2^{n}\lt m\lt b2^{n}$, which implies that $a \lt \frac{m}{2^{n}} \lt b$. Since $a$ and $b$ were arbitrary, the claim is proved.

Source: https://math.stackexchange.com/questions/3968925/proof-of-dyadic-rational-numbers-are-dense-in-mathbb-r
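The density argument in part (b) is constructive, so it can be turned into a small routine. This is a sketch of a slight variation: instead of first choosing $n$ with $1/n < b-a$, it takes the smallest $n$ with $2^n(b-a) > 1$. The function name `dyadic_between` is hypothetical, and the inputs are assumed to be exact fractions.

```python
from fractions import Fraction
from math import floor

def dyadic_between(a: Fraction, b: Fraction) -> Fraction:
    """Return m / 2**n with a < m / 2**n < b."""
    if not a < b:
        raise ValueError("need a < b")
    n = 0
    while 2 ** n * (b - a) <= 1:      # grow n until the scaled gap exceeds 1
        n += 1
    m = floor(a * 2 ** n) + 1         # smallest integer strictly above a * 2**n
    return Fraction(m, 2 ** n)

print(dyadic_between(Fraction(1, 3), Fraction(3, 8)))   # 11/32
```

Since $m \le a2^n + 1 < b2^n$, the returned value always lies strictly between $a$ and $b$, including for negative inputs, which is why $m$ has to range over the integers.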

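(c) The set of all rational numbers $p/q$ with $10|p| \geq q$. Not dense in $\Bbb R$. Every element satisfies $|p/q| \geq 1/10$, so the rational numbers in $(-1/10,1/10)$ are missing. For example, no element in this set can be found between $-1/20$ and $-1/30$.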

Source: https://www.reddit.com/r/HomeworkHelp/comments/7ruu7u/real_analysis_density_of_subsets_of_q_in_r/

Exercise 1.4.8. Give an example of each or state that the request is impossible. When a request is impossible, provide a compelling argument for why this is the case.

(a) Two sets $A$ and $B$ with $A \cap B = \emptyset$, $\sup A = \sup B$, $\sup A \notin A$ and $\sup B \notin B$. $A=\{x \mid x\in \Bbb I, x\in (0,1)\}$ and $B=\{x \mid x\in \Bbb Q, x\in (0,1)\}$, where $\Bbb I$ denotes the irrationals.

(b) A sequence of nested open intervals $J_1 \supseteq J_2 \supseteq J_3 \supseteq \dots $ with $\cap^\infty_{n=1}J_n$ nonempty but containing only a finite number of elements. $J_n = (5-1/n,5+1/n), n \in \Bbb N$, with $\cap^\infty_{n=1}J_n=\{5\}$.

(c) A sequence of nested unbounded closed intervals $L_1 \supseteq L_2 \supseteq L_3 \supseteq \dots $ with $\cap^\infty_{n=1}L_n=\emptyset$ (An unbounded closed interval has the form $[a,\infty) = \{x \in \Bbb R : x \geq a\}$.) $L_n = [n,\infty), n \in \Bbb N$, with $\cap^\infty_{n=1}L_n=\emptyset$.

(d) A sequence of closed bounded (not necessarily nested) intervals $I_1, I_2, I_3, \dots$ with the property that $\cap^N_{n=1} I_n \neq \emptyset$ for all $N \in \Bbb N$, but $\cap^\infty_{n=1} I_n = \emptyset$. Impossible. The sets $K_N = \cap^N_{n=1} I_n$, $N \in \Bbb N$, form a decreasing sequence of nonempty closed and bounded intervals, so by NIP their intersection $\cap^\infty_{N=1} K_N = \cap^\infty_{n=1} I_n$ is nonempty.

Source: https://math.stackexchange.com/questions/2619781/intersection-of-a-sequence-of-closed-intervals

Appendix for unused sources

By definition, $d$ is an upper bound for A. So it is an upper bound for $E$, because if there exists $e \in E$ with $d<e$, then $d<\frac{d+e}{2}$. $\frac{d+e}{2}$ cannot be in $B$ (indeed, it’s not an upper bound for $E$, because it’s less than $e$) so it must be in $A$, but this contradicts that $d$ is an upper bound for $A$.

Source: https://math.stackexchange.com/questions/2228772/assume-mathbbr-possesses-the-cut-property-and-let-e-be-a-nonempty-that-is-b

If possible, suppose $A$ has a greatest member, say $a'$. Then $a' \in A$, so $a' \notin B$, and there exists $s \in E$ such that $a' \lt s$. Since $a'<(a'+s)/2$ and $a'$ is the greatest member of $A$, we have $(a'+s)/2 \in B$, so $(a'+s)/2$ is an upper bound of $E$; but $(a'+s)/2 \lt s \in E$. This contradiction shows that $A$ has no greatest member. And so, $B$ has a least member. Hence, the set of upper bounds of a nonempty set $E$ bounded above has a least member, which is the completeness axiom in $\Bbb R$. Hence the theorem is proved.

Mathematics Theorems and Proofs in Applied Multivariate Statistical Analysis (CH.1)

Wed, 21 Jul 2021 01:00:00 +0000

Details in Chapter 1 (Johnson & Wichern, 2002)

P78 (2-48)

Cauchy-Schwarz Inequality. Let $\mathbf{b}$ and $\mathbf{d}$ be any two $p\times 1$ vectors. Then $$ \left(\mathbf{b}^{\prime} \mathbf{d}\right)^{2}\leq(\mathbf{b}^{\prime} \mathbf{b}){(\mathbf{d}^{\prime} \mathbf{d})} $$ with equality if and only if $\mathbf{b}=c\mathbf{d}$ (or $c\mathbf{d}=\mathbf{b}$) for some constant c.

Proof. The inequality is obvious if either $\mathbf{b}=\mathbf{0}$ or $\mathbf{d}=\mathbf{0}$. Excluding this possibility, consider the vector $\mathbf{b}-x \mathbf{d}$, where $x$ is an arbitrary scalar. Since the length of $\mathbf{b}-x \mathbf{d}$ is positive for $\mathbf{b}-x \mathbf{d} \neq \mathbf{0}$, in this case $$ \begin{aligned} 0<(\mathbf{b}-x \mathbf{d})^{\prime}(\mathbf{b}-x \mathbf{d}) &=\mathbf{b}^{\prime} \mathbf{b}-x \mathbf{d}^{\prime} \mathbf{b}-\mathbf{b}^{\prime}(x \mathbf{d})+x^{2} \mathbf{d}^{\prime} \mathbf{d} \\
&=\mathbf{b}^{\prime} \mathbf{b}-2 x\left(\mathbf{b}^{\prime} \mathbf{d}\right)+x^{2}\left(\mathbf{d}^{\prime} \mathbf{d}\right) \end{aligned} $$ The last expression is quadratic in $x$. If we complete the square by adding and subtracting the scalar $\left(\mathbf{b}^{\prime} \mathbf{d}\right)^{2} / \mathbf{d}^{\prime} \mathbf{d}$, we get $$ \begin{gathered} 0<\mathbf{b}^{\prime} \mathbf{b}-\frac{\left(\mathbf{b}^{\prime} \mathbf{d}\right)^{2}}{\mathbf{d}^{\prime} \mathbf{d}}+\frac{\left(\mathbf{b}^{\prime} \mathbf{d}\right)^{2}}{\mathbf{d}^{\prime} \mathbf{d}}-2 x\left(\mathbf{b}^{\prime} \mathbf{d}\right)+x^{2}\left(\mathbf{d}^{\prime} \mathbf{d}\right) \\
=\mathbf{b}^{\prime} \mathbf{b}-\frac{\left(\mathbf{b}^{\prime} \mathbf{d}\right)^{2}}{\mathbf{d}^{\prime} \mathbf{d}}+\left(\mathbf{d}^{\prime} \mathbf{d}\right)\left(x-\frac{\mathbf{b}^{\prime} \mathbf{d}}{\mathbf{d}^{\prime} \mathbf{d}}\right)^{2} \end{gathered} $$ The term in brackets is zero if we choose $x=\mathbf{b}^{\prime} \mathbf{d} / \mathbf{d}^{\prime} \mathbf{d}$, so we conclude that $$ 0<\mathbf{b}^{\prime} \mathbf{b}-\frac{\left(\mathbf{b}^{\prime} \mathbf{d}\right)^{2}}{\mathbf{d}^{\prime} \mathbf{d}} $$ or $\left(\mathbf{b}^{\prime} \mathbf{d}\right)^{2}<\left(\mathbf{b}^{\prime} \mathbf{b}\right)\left(\mathbf{d}^{\prime} \mathbf{d}\right)$ if $\mathbf{b} \neq x \mathbf{d}$ for every $x$. Note that if $\mathbf{b}=c \mathbf{d}$, then $0=(\mathbf{b}-c \mathbf{d})^{\prime}(\mathbf{b}-c \mathbf{d})$, and the same argument produces $\left(\mathbf{b}^{\prime} \mathbf{d}\right)^{2}=\left(\mathbf{b}^{\prime} \mathbf{b}\right)\left(\mathbf{d}^{\prime} \mathbf{d}\right)$.

Extended Cauchy-Schwarz Inequality. Let $\mathbf{b}$ and $\mathbf{d}$ be any two $p\times 1$ vectors, and let $\mathbf{B}$ be a positive definite matrix. Then $$ \left(\mathbf{b}^{\prime} \mathbf{d}\right)^{2}\leq(\mathbf{b}^{\prime} \mathbf{B} \mathbf{b}){(\mathbf{d}^{\prime} \mathbf{B}^{-1} \mathbf{d})} $$

with equality if and only if $\mathbf{b}=c\mathbf{B}^{-1}\mathbf{d}$ or $\mathbf{d}=c\mathbf{B}\mathbf{b}$ for some constant c.

Proof. The inequality is obvious when $\mathbf{b}=\mathbf{0}$ or $\mathbf{d}=\mathbf{0}$. For cases other than these, consider the square-root matrix $\mathbf{B}^{1 / 2}$ defined in terms of its eigenvalues $\lambda_{i}$ and the normalized eigenvectors $\mathbf{e}_{i}$ as $\mathbf{B}^{1 / 2}=\sum_{i=1}^{p} \sqrt{\lambda_{i}} \mathbf{e}_{i} \mathbf{e}_{i}^{\prime} .$ If we set $$ \mathbf{B}^{-1 / 2}=\sum_{i=1}^{p} \frac{1}{\sqrt{\lambda_{i}}} \mathbf{e}_{i} \mathbf{e}_{i}^{\prime} $$ it follows that $$ \mathbf{b}^{\prime} \mathbf{d}=\mathbf{b}^{\prime} \mathbf{I} \mathbf{d}=\mathbf{b}^{\prime} \mathbf{B}^{1 / 2} \mathbf{B}^{-1 / 2} \mathbf{d}=\left(\mathbf{B}^{1 / 2} \mathbf{b}\right)^{\prime}\left(\mathbf{B}^{-1 / 2} \mathbf{d}\right) $$ and the proof is completed by applying the Cauchy-Schwarz inequality to the vectors $\left(\mathbf{B}^{1 / 2} \mathbf{b}\right)$ and $\left(\mathbf{B}^{-1 / 2} \mathbf{d}\right)$

Let $\mathbf{u}=\mathbf{B}^{1 / 2} \mathbf{b}$ and $\mathbf{v}=\mathbf{B}^{-1 / 2} \mathbf{d}$. Since $\mathbf{B}^{1 / 2}$ and $\mathbf{B}^{-1 / 2}$ are symmetric, we have $$ \left(\mathbf{b}^{\prime} \mathbf{d}\right)^{2}=\left(\mathbf{u}^{\prime} \mathbf{v}\right)^{2}\leq(\mathbf{u}^{\prime} \mathbf{u})(\mathbf{v}^{\prime} \mathbf{v})=(\mathbf{B}^{1 / 2} \mathbf{b})^{\prime}(\mathbf{B}^{1 / 2} \mathbf{b})\,(\mathbf{B}^{-1 / 2} \mathbf{d})^{\prime}(\mathbf{B}^{-1 / 2} \mathbf{d})=(\mathbf{b}^{\prime} \mathbf{B} \mathbf{b})(\mathbf{d}^{\prime} \mathbf{B}^{-1} \mathbf{d}) $$

The extended Cauchy-Schwarz inequality gives rise to the following maximization result.

Maximization Lemma. Let $\underset{(p \times p)}{\mathbf{B}}$ be positive definite and $\underset{(p \times 1)}{\mathbf{d}}$ be a given vector. Then, for an arbitrary nonzero vector $\underset{(p \times 1)}{\mathbf{x}}$, $$ \max _{\mathbf{x} \neq \mathbf{0}} \frac{\left(\mathbf{x}^{\prime} \mathbf{d}\right)^{2}}{\mathbf{x}^{\prime} \mathbf{B} \mathbf{x}}=\mathbf{d}^{\prime} \mathbf{B}^{-1} \mathbf{d} $$ with the maximum attained when $\underset{(p \times 1)}{\mathbf{x}}=c \underset{(p \times p)}{\mathbf{B}^{-1}} \underset{(p \times 1)}{\mathbf{d}}$ for any constant $c \neq 0$.

Proof. By the extended Cauchy-Schwarz inequality, $\left(\mathbf{x}^{\prime} \mathbf{d}\right)^{2} \leq\left(\mathbf{x}^{\prime} \mathbf{B} \mathbf{x}\right)\left(\mathbf{d}^{\prime} \mathbf{B}^{-1} \mathbf{d}\right)$. Because $\mathbf{x} \neq \mathbf{0}$ and $\mathbf{B}$ is positive definite, $\mathbf{x}^{\prime} \mathbf{B} \mathbf{x}>0$. Dividing both sides of the inequality by the positive scalar $\mathbf{x}^{\prime} \mathbf{B} \mathbf{x}$ yields the upper bound $$ \frac{\left(\mathbf{x}^{\prime} \mathbf{d}\right)^{2}}{\mathbf{x}^{\prime} \mathbf{B} \mathbf{x}} \leq \mathbf{d}^{\prime} \mathbf{B}^{-1} \mathbf{d} $$

Taking the maximum over $\mathbf{x}$ gives Equation $(2-50)$ because the bound is attained for $\mathbf{x}=c \mathbf{B}^{-1} \mathbf{d} .$
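The lemma can be checked exactly on a small example. The matrix $\mathbf{B}$, its inverse, and the vector $\mathbf{d}$ below are made-up values chosen for this sketch (they do not come from the book), and `dot`, `matvec` and `ratio` are hypothetical helper names.

```python
from fractions import Fraction
from itertools import product

# Hypothetical 2 x 2 example: B is positive definite, B_inv is its inverse.
B = [[Fraction(2), Fraction(1)], [Fraction(1), Fraction(2)]]
B_inv = [[Fraction(2, 3), Fraction(-1, 3)], [Fraction(-1, 3), Fraction(2, 3)]]
d = [Fraction(1), Fraction(3)]

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

def matvec(M, v):
    return [dot(row, v) for row in M]

def ratio(x):
    """(x'd)^2 / (x'Bx) for a nonzero vector x."""
    return dot(x, d) ** 2 / dot(x, matvec(B, x))

bound = dot(d, matvec(B_inv, d))      # d' B^{-1} d
x_star = matvec(B_inv, d)             # claimed maximizer, with c = 1

grid = [[Fraction(i), Fraction(j)]
        for i, j in product(range(-5, 6), repeat=2) if (i, j) != (0, 0)]

print(bound, ratio(x_star), max(ratio(x) for x in grid))   # 14/3 14/3 14/3
```

The grid happens to contain $(-1, 5) = 3\,\mathbf{B}^{-1}\mathbf{d}$, so the bound is attained there as well, in line with the statement that any nonzero multiple of $\mathbf{B}^{-1}\mathbf{d}$ is a maximizer.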

A final maximization result will provide us with an interpretation of eigenvalues.

Maximization of Quadratic Forms for Points on the Unit Sphere. Let $\mathbf{B}$ be a positive definite matrix with eigenvalues $\lambda_{1} \geq \lambda_{2} \geq \cdots \geq \lambda_{p} \geq 0$ and associated normalized eigenvectors $\mathbf{e}_{\mathbf{1}}, \mathbf{e}_{2}, \ldots, \mathbf{e}_{p}$. Then

$$ \begin{aligned} \max _{\mathbf{x} \neq \mathbf{0}} \frac{\mathbf{x}^{\prime} \mathbf{B} \mathbf{x}}{\mathbf{x}^{\prime} \mathbf{x}} &=\lambda_{1} \quad \text {(attained when } \mathbf{x}=\mathbf{e}_{1} \text {)} \\ \min _{\mathbf{x} \neq \mathbf{0}} \frac{\mathbf{x}^{\prime} \mathbf{B} \mathbf{x}}{\mathbf{x}^{\prime} \mathbf{x}} &=\lambda_{p} \quad \text {(attained when } \mathbf{x}=\mathbf{e}_{p} \text {)} \end{aligned} \tag{2-51} $$

Moreover,

$$ \max _{\mathbf{x} \perp \mathbf{e}_{1}, \ldots, \mathbf{e}_{k}} \frac{\mathbf{x}^{\prime} \mathbf{B} \mathbf{x}}{\mathbf{x}^{\prime} \mathbf{x}}=\lambda_{k+1} \quad \text {(attained when } \mathbf{x}=\mathbf{e}_{k+1} \text {, } k=1,2, \ldots, p-1 \text {)} \tag{2-52} $$

where the symbol $\perp$ is read “is perpendicular to.”

Proof. Let $\underset{(p \times p)}{\mathbf{P}}$ be the orthogonal matrix whose columns are the eigenvectors $\mathbf{e}_{1}, \mathbf{e}_{2}, \ldots, \mathbf{e}_{p}$ and $\mathbf{\Lambda}$ be the diagonal matrix with eigenvalues $\lambda_{1}, \lambda_{2}, \ldots, \lambda_{p}$ along the main diagonal. Let $\mathbf{B}^{1 / 2}=\mathbf{P} \mathbf{\Lambda}^{1 / 2} \mathbf{P}^{\prime}$ and $\underset{(p \times 1)}{\mathbf{y}}=\underset{(p \times p)}{\mathbf{P}^{\prime}} \underset{(p \times 1)}{\mathbf{x}}$. Consequently, $\mathbf{x} \neq \mathbf{0}$ implies $\mathbf{y} \neq \mathbf{0}$. Thus, $$ \begin{aligned} \frac{\mathbf{x}^{\prime} \mathbf{B} \mathbf{x}}{\mathbf{x}^{\prime} \mathbf{x}} &=\frac{\mathbf{x}^{\prime} \mathbf{B}^{1 / 2} \mathbf{B}^{1 / 2} \mathbf{x}}{\mathbf{x}^{\prime} \underbrace{\mathbf{P} \mathbf{P}^{\prime}}_{\underset{(p \times p)}{\mathbf{I}}} \mathbf{x}}=\frac{\mathbf{x}^{\prime} \mathbf{P} \mathbf{\Lambda}^{1 / 2} \mathbf{P}^{\prime} \mathbf{P} \mathbf{\Lambda}^{1 / 2} \mathbf{P}^{\prime} \mathbf{x}}{\mathbf{y}^{\prime} \mathbf{y}}=\frac{\mathbf{y}^{\prime} \mathbf{\Lambda} \mathbf{y}}{\mathbf{y}^{\prime} \mathbf{y}} \\
&=\frac{\sum_{i=1}^{p} \lambda_{i} y_{i}^{2}}{\sum_{i=1}^{p} y_{i}^{2}} \leq \lambda_{1} \frac{\sum_{i=1}^{p} y_{i}^{2}}{\sum_{i=1}^{p} y_{i}^{2}}=\lambda_{1} \end{aligned} \tag{2-53} $$

Setting $\mathbf{x}=\mathbf{e}_{1}$ gives $$ \mathbf{y}=\mathbf{P}^{\prime} \mathbf{e}_{1}=\left[\begin{array}{c} 1 \\ 0 \\ \vdots \\ 0 \end{array}\right] $$ since $$ \mathbf{e}_{k}^{\prime} \mathbf{e}_{1}= \begin{cases}1, & k=1 \\ 0, & k \neq 1\end{cases} $$ For this choice of $\mathbf{x}$, we have $\mathbf{y}^{\prime} \mathbf{\Lambda} \mathbf{y} / \mathbf{y}^{\prime} \mathbf{y}=\lambda_{1} / 1=\lambda_{1}$, or $$ \frac{\mathbf{e}_{1}^{\prime} \mathbf{B} \mathbf{e}_{1}}{\mathbf{e}_{1}^{\prime} \mathbf{e}_{1}}=\mathbf{e}_{1}^{\prime} \mathbf{B} \mathbf{e}_{1}=\lambda_{1} $$ A similar argument produces the second part of $(2-51)$.

Now, $\mathbf{x}=\mathbf{P y}=y_{1} \mathbf{e}_{1}+y_{2} \mathbf{e}_{2}+\cdots+y_{p} \mathbf{e}_{p}$, so $\mathbf{x} \perp \mathbf{e}_{1}, \ldots, \mathbf{e}_{k}$ implies $$ 0=\mathbf{e}_{i}^{\prime} \mathbf{x}=y_{1} \mathbf{e}_{i}^{\prime} \mathbf{e}_{1}+y_{2} \mathbf{e}_{i}^{\prime} \mathbf{e}_{2}+\cdots+y_{p} \mathbf{e}_{i}^{\prime} \mathbf{e}_{p}=y_{i}, \quad i \leq k $$ Therefore, for $\mathbf{x}$ perpendicular to the first $k$ eigenvectors $\mathbf{e}_{i}$, the left-hand side of the inequality in $(2-53)$ becomes $$ \frac{\mathbf{x}^{\prime} \mathbf{B} \mathbf{x}}{\mathbf{x}^{\prime} \mathbf{x}}=\frac{\sum_{i=k+1}^{p} \lambda_{i} y_{i}^{2}}{\sum_{i=k+1}^{p} y_{i}^{2}} $$ This ratio is a weighted average of $\lambda_{k+1}, \ldots, \lambda_{p}$, so it is at most $\lambda_{k+1}$. Taking $y_{k+1}=1, y_{k+2}=\cdots=y_{p}=0$ attains this value and gives the asserted maximum.

For a fixed $\mathbf{x}_{0} \neq \mathbf{0}$, $\mathbf{x}_{0}^{\prime} \mathbf{B} \mathbf{x}_{0} / \mathbf{x}_{0}^{\prime} \mathbf{x}_{0}$ has the same value as $\mathbf{x}^{\prime} \mathbf{B} \mathbf{x}$, where $\mathbf{x}=\mathbf{x}_{0} / \sqrt{\mathbf{x}_{0}^{\prime} \mathbf{x}_{0}}$ is of unit length. Consequently, Equation (2-51) says that the largest eigenvalue, $\lambda_{1}$, is the maximum value of the quadratic form $\mathbf{x}^{\prime} \mathbf{B} \mathbf{x}$ for all points $\mathbf{x}$ whose distance from the origin is unity. Similarly, $\lambda_{p}$ is the smallest value of the quadratic form for all points $\mathbf{x}$ one unit from the origin.
The largest and smallest eigenvalues thus represent extreme values of $\mathbf{x}^{\prime} \mathbf{B} \mathbf{x}$ for points on the unit sphere. The “intermediate” eigenvalues of the $p \times p$ positive definite matrix $\mathbf{B}$ also have an interpretation as extreme values when $\mathbf{x}$ is further restricted to be perpendicular to the earlier choices.
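The eigenvalue interpretation above can be illustrated numerically. This is a rough sketch, assuming NumPy; the matrix `B`, the dimensions `p = 5` and `k = 2`, and the helper `rayleigh` are hypothetical choices made for illustration. It evaluates the ratio $\mathbf{x}^{\prime}\mathbf{B}\mathbf{x}/\mathbf{x}^{\prime}\mathbf{x}$ at the eigenvectors and at random vectors, with and without the restriction $\mathbf{x} \perp \mathbf{e}_{1}, \ldots, \mathbf{e}_{k}$.

```python
import numpy as np

rng = np.random.default_rng(1)
p, k = 5, 2

# Hypothetical example data: a symmetric positive definite matrix.
A = rng.standard_normal((p, p))
B = A @ A.T + np.eye(p)

# eigh returns eigenvalues in ascending order; reverse them so that
# lam[0] >= lam[1] >= ... >= lam[p-1], matching lambda_1 >= ... >= lambda_p.
vals, vecs = np.linalg.eigh(B)
lam = vals[::-1]
E = vecs[:, ::-1]   # column i holds the eigenvector e_{i+1}


def rayleigh(x, B):
    """Return x'Bx / x'x for a nonzero vector x."""
    return (x @ B @ x) / (x @ x)


# Unrestricted random vectors: the ratio stays between lambda_p and lambda_1.
X = rng.standard_normal((2000, p))
q = np.array([rayleigh(x, B) for x in X])

# Remove the components along e_1, ..., e_k so each row is perpendicular to them.
E_k = E[:, :k]
X_perp = X - (X @ E_k) @ E_k.T
q_perp = np.array([rayleigh(x, B) for x in X_perp])

print(np.isclose(rayleigh(E[:, 0], B), lam[0]))    # ratio at e_1 vs lambda_1
print(np.isclose(rayleigh(E[:, -1], B), lam[-1]))  # ratio at e_p vs lambda_p
print(q.max() <= lam[0] + 1e-9, q.min() >= lam[-1] - 1e-9)
print(q_perp.max() <= lam[k] + 1e-9)               # lam[k] is lambda_{k+1}
print(np.isclose(rayleigh(E[:, k], B), lam[k]))    # ratio at e_{k+1}
```

Random sampling only probes the bounds and does not prove them; the equalities at the eigenvectors are what correspond to "attained when" in the statement.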

An Example of the Application of Cauchy-Schwarz Inequality (Cramér, 1946)

In statistical problems, large amounts of data are collected to study a phenomenon. In seeking a mathematical model to describe it, we may find, numerically, a function $\widetilde{\phi}$ of the data to approximate a parameter $\phi$. The function $\widetilde{\phi}$ is called an unbiased estimator of $\phi$ if $E(\widetilde{\phi})=\phi$, that is, if $$ \int_{-\infty}^{\infty} \widetilde{\phi}(x) f_{\theta}(x) d x=\phi(\theta) $$ Here $f_{\theta}$ is the probability density of the observation $x$, and $\theta$ and $x$ are independent variables.

Differentiating this with respect to $\theta$ and interchanging integration and differentiation (provided of course that this is permissible) gives: $$ \int_{-\infty}^{\infty} \widetilde{\phi}(x) \frac{\partial f_{\theta}}{\partial \theta}(x) d x=\phi^{\prime}(\theta) $$ The function $$ S(x):=\frac{\partial}{\partial \theta} \log f_{\theta}(x) $$ which measures the rate of change of the log-density with respect to $\theta$, is called the score statistic. Plainly, $S(x)=\frac{1}{f_{\theta}(x)} \frac{\partial f_{\theta}}{\partial \theta}(x)$, so that we can write $$ \int_{-\infty}^{\infty} \widetilde{\phi}(x) S(x) f_{\theta}(x) d x=\phi^{\prime}(\theta) \tag{4.1} $$

Also, the expectation of $S(x)$ is $$ E(S(x))=\int_{-\infty}^{\infty} S(x) f_{\theta}(x) d x=\int_{-\infty}^{\infty} \frac{\partial f_{\theta}}{\partial \theta}(x) d x=\frac{\partial}{\partial \theta} \int_{-\infty}^{\infty} f_{\theta}(x) d x=0 $$ since $$ \int_{-\infty}^{\infty} f_{\theta}(x) d x=1 $$ because the total probability is $1$. Thus, subtracting $\phi(\theta) E(S(x))=0$ from the left-hand side, (4.1) can be rewritten as $$ \int_{-\infty}^{\infty}(\widetilde{\phi}(x)-\phi(\theta)) S(x) f_{\theta}(x) d x=\phi^{\prime}(\theta) $$

Applying the Cauchy-Schwarz inequality to the functions $(\widetilde{\phi}(x)-\phi(\theta)) \sqrt{f_{\theta}(x)}$ and $S(x) \sqrt{f_{\theta}(x)}$, we obtain $$ \phi^{\prime}(\theta)^{2} \leq\left(\int_{-\infty}^{\infty}(\widetilde{\phi}(x)-\phi(\theta))^{2} f_{\theta}(x) d x\right)\left(\int_{-\infty}^{\infty} S(x)^{2} f_{\theta}(x) d x\right) $$

Writing $$ I(\theta):=\int_{-\infty}^{\infty}\left(\frac{\partial \log f_{\theta}}{\partial \theta}\right)^{2} f_{\theta}(x) d x $$ (called Fisher information in statistical parlance), we can write our inequality as:

Theorem 5 (The Cramér-Rao inequality). For an unbiased estimator $\widetilde{\phi}$ of $\phi$, we have $$ \int_{-\infty}^{\infty}(\widetilde{\phi}(x)-\phi(\theta))^{2} f_{\theta}(x) d x \geq \frac{\phi^{\prime}(\theta)^{2}}{I(\theta)} $$

Often, this is applied with $\phi(\theta)=\theta$ so that $\phi^{\prime}(\theta)=1$. The inequality then limits the accuracy of any unbiased estimator of the parameter $\theta$: its variance is at least $1 / I(\theta)$. It is sometimes referred to as the information inequality. It was discovered independently by C. R. Rao [10] and H. Cramér [2] in 1945 and has played a pivotal role in statistical inference. An enlightening survey of the Cramér-Rao inequality was written by K. R. Parthasarathy [9], where the reader can find a discussion of Riemannian metrics used to study population models.

Regarding Theorem 5, there is a lot of interest in estimators that actually achieve the Cramér-Rao lower bound. Such estimators are said to be efficient, and estimators that achieve the bound in the limit of large samples are said to be asymptotically efficient. Under certain regularity conditions the maximum likelihood estimators are asymptotically efficient. In such cases the Fisher information about $\theta$ in the data is equal to the inverse of the asymptotic variance of the estimator.
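As a concrete illustration of an estimator that attains the bound, consider a single observation from a normal density with mean $\theta$ and known standard deviation $\sigma$, with $\phi(\theta)=\theta$ and $\widetilde{\phi}(x)=x$. The sketch below assumes NumPy; the normal model, the values `theta = 1.0` and `sigma = 2.0`, the integration grid, and the helper `log_density` are all illustrative assumptions rather than part of the original text. It approximates the integrals from the derivation by Riemann sums and compares the variance of $\widetilde{\phi}$ with $1 / I(\theta)$.

```python
import numpy as np

theta, sigma = 1.0, 2.0   # hypothetical parameter values

# Integration grid wide enough that the normal tails are negligible.
x = np.linspace(theta - 12 * sigma, theta + 12 * sigma, 200001)
dx = x[1] - x[0]


def log_density(x, theta, sigma):
    """Log of the N(theta, sigma^2) density evaluated at x."""
    return -0.5 * ((x - theta) / sigma) ** 2 - np.log(sigma * np.sqrt(2 * np.pi))


f = np.exp(log_density(x, theta, sigma))

# Score S(x) = d/dtheta log f_theta(x), approximated by a central difference.
h = 1e-5
score = (log_density(x, theta + h, sigma) - log_density(x, theta - h, sigma)) / (2 * h)

total = np.sum(f) * dx                         # total probability
mean_score = np.sum(score * f) * dx            # E(S(x))
cross = np.sum((x - theta) * score * f) * dx   # left side of rewritten (4.1); phi'(theta) = 1 here
fisher = np.sum(score ** 2 * f) * dx           # Fisher information I(theta)
variance = np.sum((x - theta) ** 2 * f) * dx   # variance of the estimator phi~(x) = x

print(total, mean_score, cross)
print(fisher, 1 / sigma ** 2)
print(variance, 1 / fisher)
```

For this model the score is $(x-\theta)/\sigma^{2}$, so $I(\theta)=1/\sigma^{2}$ and the variance $\sigma^{2}$ of $\widetilde{\phi}(x)=x$ equals the lower bound; the two numbers on each of the last two printed lines should agree up to discretization error.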

References

Cramér, H. (1946). Mathematical Methods of Statistics. Princeton, NJ: Princeton University Press.

Johnson, R. A., & Wichern, D. W. (2002). Applied Multivariate Statistical Analysis. Upper Saddle River, NJ: Prentice Hall. ISBN: 0130925535