Chapter Twenty Exercises
Abstract and Keywords
This chapter contains a list of exercises based on the lessons from all the previous chapters, along with possible solutions. It also contains supplementary information, or references to places where the reader can obtain the supplementary information, necessary for a better understanding of the concepts introduced in this volume. In addition, the chapter contains brief bibliographies to further assist the reader or instructor in completing the exercises. Note that some of the problems and exercises listed in this chapter are far more accessible after the later parts of this book have been read.
Keywords: exercises, Benford's law, bibliography, supplementary information, mathematical problems
20.1 A Quick Introduction to Benford’s Law
A couple of important points:
• There are many problems that would fit in multiple chapters. To help both the instructors and the readers, we have decided to collect them here. Thus, some of the exercises in this chapter will be far more accessible after reading later parts of the book.
• In Mathematica, if you define the following function you can then use it to find the first digit: firstdigit[x_] := Floor[10^Mod[Log[10,x],1]] (a similar function is definable in other languages, but the syntax will differ slightly).
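The same first-digit function can be sketched in Python (an adaptation of the Mathematica snippet above; the name `first_digit` is ours):

```python
import math

def first_digit(x: float) -> int:
    """Leading base-10 digit of x > 0: the integer part of the
    significand 10^(log10(x) mod 1), mirroring the Mathematica code."""
    return int(10 ** (math.log10(x) % 1))
```

The mod-1 reduction shifts every positive number onto [1, 10), so the function works for values below 1 as well.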
Exercise 20.1.1. If X is Benford base 10, find the probability that its significand starts with 2.789.
Exercise 20.1.2. If X is Benford base 10, find the probability that its significand starts with 7.5 (in other words, its significand is in [7.5, 7.6)).
Exercise 20.1.3. If X is Benford base 10, find the probability that its significand has no 7s in the first k digits (thus a significand of 1.701 would have no 7 in its first digit, but it would have a 7 in its first two digits).
Exercise 20.1.4. Consider α^n for various α and various ranges of n; for example, take α ∈ {2, 3, 5, 10, √2, √5, √10, π, e, γ} (here γ is the Euler–Mascheroni constant; see http://en.wikipedia.org/wiki/Euler-Mascheroni_constant for a description and properties), and let n go from 1 to N, where N ∈ {10^3, 10^5, 10^7}. Which of these data sets do you expect to be Benford? Why or why not? Read up on chi-square goodness of fit tests (see for example http://en.wikipedia.org/wiki/Pearson_chi_square) and compare the observed frequencies with the Benford probabilities.
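For instance, a minimal chi-square comparison for the first digits of 2^n can be sketched as follows (Python rather than the chapter's Mathematica; the cutoff N = 1000 and the use of exact integer arithmetic are choices of this sketch):

```python
import math
from collections import Counter

def chi_square_vs_benford(digits):
    """Chi-square statistic of observed first-digit counts against the
    Benford expectations n * log10(1 + 1/d), d = 1, ..., 9."""
    n = len(digits)
    counts = Counter(digits)
    return sum((counts.get(d, 0) - n * math.log10(1 + 1 / d)) ** 2
               / (n * math.log10(1 + 1 / d)) for d in range(1, 10))

# First digits of 2^n, n = 1..1000; str() keeps the computation exact.
digits = [int(str(2 ** n)[0]) for n in range(1, 1001)]
stat = chi_square_vs_benford(digits)
```

With 8 degrees of freedom the 5% critical value is about 15.5, and the powers of 2 land well below it.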
Exercise 20.1.5. Revisit the previous problem with more values of N. The problem is that there we looked at only three snapshots of the behavior; it is far more interesting to plot the chi-square values as a function of N, for N ranging from, say, 100 to 10^7 or more. You will see especially interesting behavior if you look at the first digits of π^n.
Exercise 20.1.6. We have seen that the Benford behavior of a sequence is related to equidistribution of its logarithm, so in the previous problem it may be useful to look at a log-log plot: instead of plotting the chi-square value against the upper bound N, plot the logarithm of the chi-square value against log N.
Exercise 20.1.7. Frequently taking logarithms helps illuminate relationships. For example, Kepler’s third law (see http://www.physicsclassroom.com/class/circles/Lesson-4/Kepler-s-Three-Laws) says that the square of the time it takes a planet to orbit a sun is proportional to the cube of the semimajor axis. Find data for these quantities for the eight planets in our solar system (or nine if you count Pluto!) and plot them, and then do a log-log plot. A huge advantage of log-log plots is that linear relations are easy to observe and estimate; try to find the best fit line here, and note that the slope of the line should be close to 1.5 (if T is the period and L is the length of the semimajor axis, Kepler’s third law is that there is a constant C such that T^2 = CL^3; equivalently T = C^{1/2} L^{3/2}, or log T = (3/2) log L + (1/2) log C). Revisit the original plot, and try to see that it supports T^2 being proportional to L^3!
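A log-log fit for Kepler's third law can be sketched as follows (Python; the orbital elements are standard approximate values in astronomical units and years, rounded here):

```python
import math

# Semimajor axis L (AU) and orbital period T (years) for the eight planets.
planets = {
    "Mercury": (0.387, 0.241),  "Venus":   (0.723, 0.615),
    "Earth":   (1.000, 1.000),  "Mars":    (1.524, 1.881),
    "Jupiter": (5.203, 11.862), "Saturn":  (9.537, 29.457),
    "Uranus":  (19.19, 84.01),  "Neptune": (30.07, 164.8),
}

xs = [math.log10(L) for L, T in planets.values()]
ys = [math.log10(T) for L, T in planets.values()]

# Least-squares slope of the log-log plot; Kepler's third law predicts 3/2.
n = len(xs)
xbar, ybar = sum(xs) / n, sum(ys) / n
slope = (sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
         / sum((x - xbar) ** 2 for x in xs))
```

In these units the proportionality constant is essentially 1, so the fitted slope sits very close to 1.5.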
Exercise 20.1.8. Prove the log laws: if log_b x_i = y_i and r > 0 then
• log_b b = 1 and log_b 1 = 0 (note log_b x = y means x = b^y);
• log_b(x^r) = r log_b x;
• log_b(x_1 x_2) = log_b x_1 + log_b x_2 (the logarithm of a product is the sum of the logarithms);
• log_b(x_1/x_2) = log_b x_1 − log_b x_2 (the logarithm of a quotient is the difference of the logarithms; this follows directly from the previous two log laws);
• log_c x = log_b x / log_b c (this is the change of base formula).
Exercise 20.1.9. The last loglaw (the change of base formula) is often forgotten, but is especially important. It tells us that if we can compute logarithms in one base then we can compute them in any base. In other words, it suffices to create just one table of logarithms, so we only need to find one base where we can easily compute logarithms. What base do you think that is, and how would you compute logarithms of arbitrary positive real numbers?
Exercise 20.1.10. The previous problem is similar to issues that arise in probability textbooks. These books only provide tables of probabilities for random variables drawn from the standard normal distribution,^{1} as one can convert from such a table to probabilities for any other normal random variable. One such table is online here: http://mathsisfun.com/data/standard-normal-distribution-table.html. Use a standard table to determine the probability that a normal random variable with mean μ = 5 and variance σ^2 = 16 (so the standard deviation is σ = 4) takes on a value between −3 and 7. Thus, similarly to the change of base formula, there is an enormous computational saving, as we only need to compute probabilities for one normal distribution.
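The standardization step can be checked numerically; here is a Python sketch that uses the error function in place of a printed table:

```python
import math

def normal_cdf(x: float, mu: float = 0.0, sigma: float = 1.0) -> float:
    """CDF of a Normal(mu, sigma^2) random variable, via the error function."""
    return 0.5 * (1 + math.erf((x - mu) / (sigma * math.sqrt(2))))

# Standardizing: P(-3 < X < 7) for X ~ Normal(5, 16) equals
# P(-2 < Z < 0.5) for the standard normal Z.
p = normal_cdf(7, 5, 4) - normal_cdf(-3, 5, 4)
```

Standardizing gives Φ(0.5) − Φ(−2) ≈ 0.6687, matching what a table lookup produces.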
Exercise 20.1.11. Prove $\frac{d}{dx}\log_b x = \frac{1}{x \log b}$. Hint: First do this when b = e, the base of the natural logarithms; use e^{log x} = x and the chain rule.
Exercise 20.1.12. Revisit the first two problems, but now consider some other sequences, such as n!, cos(n) (in radians, of course, as otherwise the sequence is periodic), n^2, n^3, n^{log n}, n^{log log n}, n^{log log log n}, n^n. In some situations log_4 does not mean the logarithm base 4, but rather four iterations of the logarithm function. It might be interesting to investigate n^{log_{f(n)} n} under this definition for various integer-valued functions f.
Exercise 20.1.13. Revisit the previous problem but for some recurrence relations. For example, try the Fibonacci numbers (F_{n+2} = F_{n+1} + F_{n} with F_{0} = 0 and F_{1} = 1) and some other relations, such as the following.
• Catalan numbers: $C_n = \frac{1}{n+1}\binom{2n}{n}$; these satisfy a more involved recurrence (see http://en.wikipedia.org/wiki/Catalan_number).
• Squaring Fibonaccis: ${G}_{n+2}={G}_{n+1}^{2}+{G}_{n}^{2}$ with G_{0} = 0 and G_{1} = 1.
• F_{p} where p is a prime (i.e., only look at the Fibonaccis at a prime index).
• The logistic map: x_{n+1} = rx_{n}(1−x_{n}) for various choices of r and starting values x_{0} (see http://en.wikipedia.org/wiki/Recurrence_relation).
• Newton’s method for the difference between the nth prediction and the true value. For example, to find the square root of α we use $x_{n+1} = \frac{1}{2}\left(x_n + \frac{\alpha}{x_n}\right)$, and thus we would study the distribution of leading digits of $\sqrt{\alpha} - x_n$. One could also look at other roots, other numbers, or more complicated functions. For more on Newton’s method, see http://mathworld.wolfram.com/NewtonsMethod.html.
• The 3x + 1 Map: x_{n+1} = 3x_{n} + 1 if x_{n} is odd and x_{n}/2 if x_{n} is even (though some authors use a slightly different definition, where for x_{n} even, one instead lets x_{n+1} = x_{n}/2^{d}, where d is the highest power of 2 dividing x_{n}). It is conjectured that no matter what positive starting seed x_{0} you take, eventually x_{n} cycles among 4, 2, and 1 for n sufficiently large (or is identically 1 from some point onward if we use the second definition). We return to this problem in Chapter 3.
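As a starting point for experimenting with these recurrences, here is a check of the Fibonacci numbers' first digits against Benford (Python rather than the chapter's Mathematica; the cutoff N = 3000 is an arbitrary choice of this sketch):

```python
import math
from collections import Counter

# First-digit frequencies of the Fibonacci numbers F_1..F_N compared
# with the Benford probabilities log10(1 + 1/d).
N = 3000
a, b = 0, 1
digits = []
for _ in range(N):
    a, b = b, a + b
    digits.append(int(str(a)[0]))

counts = Counter(digits)
max_err = max(abs(counts[d] / N - math.log10(1 + 1 / d)) for d in range(1, 10))
```

The maximum deviation should already be small at this cutoff, consistent with the Fibonacci numbers being a Benford sequence.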
For the remaining problems, whenever a data set satisfies Benford’s Law we mean the strong version of the law. This means the cumulative distribution function of the significand is F_X(s) = log_{10}(s) for s ∈ [1, 10), which implies that the probability of a first digit of d is log_{10}(1 + 1/d).
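The strong form is easy to simulate: if U is uniform on [0, 1), then X = 10^U has exactly this significand law. A Python sketch (seed and sample size are arbitrary choices):

```python
import math
import random

# Sampling the strong Benford distribution: X = 10^U for U ~ Uniform[0, 1)
# has significand CDF log10(s) on [1, 10).
random.seed(0)
n = 100_000
samples = [10 ** random.random() for _ in range(n)]

# Empirical P(first digit = 1) should be near log10(2) ≈ 0.30103.
p1 = sum(1 <= s < 2 for s in samples) / n
```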
Exercise 20.1.14. If a data set satisfies (the strong version of) Benford’s Law base 10, what are the probabilities of all pairs of leading digits? In other words, what is the probability the first two digits are d_{1}d_{2} (in that order)? What if instead our set were Benford base b?
Exercise 20.1.15. Let X be a random variable that satisfies (the strong version of) Benford’s Law. What is the probability that the second digit is d? Note here that the possible values of d range from 0 to 9.
Exercise 20.1.16. Building on the previous problem, compute the probability that a random variable satisfying the strong version of Benford’s Law has its kth digit equal to d. If we denote these probabilities by p_{k}(d), what is lim_{k→∞} p_{k}(d)? Prove your claim.
Exercise 20.1.17. Find a data set that is spread over several orders of magnitude, and investigate its Benfordness (for example, stock prices or volume traded on a company that has been around for decades).
Exercise 20.1.18. Look at some of the data sets from the previous exercises that were not Benford, and see what happens if you multiply them together. For example, consider n^2 · cos(n) (in radians), or n^2 · √(10^n) · cos(n), or even larger products. Does this support the claim in the chapter that products of random variables tend to converge to Benford behavior?
Exercise 20.1.19. Let μ_{k;b} denote the mean of the k-digit (truncated) significands of random variables perfectly satisfying Benford’s Law, and let μ_b denote the mean of the significands of random variables perfectly following Benford’s Law. What is μ_{k;b} for k ∈ {1, 2, 3}? Does μ_{k;b} converge to μ_b? If yes, bound |μ_{k;b} − μ_b| as a function of k.
Exercise 20.1.20. Benford’s Law can be viewed as the distribution on significands arising from the density $p(x)=\frac{1}{x\mathrm{log}(10)}$ on [1, 10) (and 0 otherwise). More generally, consider densities p_{r}(x) = C_{r}/x^{r} for x ∈ [1, 10) and 0 otherwise with r ∈ (−∞,∞), where C_{r} is a normalization constant so that the density integrates to 1. For each r, calculate the probability of observing a first digit of d, and calculate the expected value of the first digit.
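A sketch of the first-digit computation for this family (Python; the closed form simply integrates C_r/x^r over [d, d+1), where the normalization constant cancels, and reduces to Benford at r = 1):

```python
import math

def first_digit_prob(d: int, r: float) -> float:
    """P(first digit = d) under the density C_r / x^r on [1, 10),
    obtained by integrating over [d, d + 1). At r = 1 this is Benford."""
    if abs(r - 1) < 1e-12:
        return math.log10(1 + 1 / d)
    return ((d + 1) ** (1 - r) - d ** (1 - r)) / (10 ** (1 - r) - 1)
```

The expected first digit is then `sum(d * first_digit_prob(d, r) for d in range(1, 10))`; note r = 0 recovers the uniform density, where each digit has probability 1/9.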
20.2 A Short Introduction to the Mathematical Theory of Benford’s Law
For a more detailed development of this material, see An Introduction to Benford’s Law [BerH5] by Berger and Hill.
(p.377) 20.3 Fourier Analysis and Benford’s Law
20.3.1 Problems from Introduction to Fourier Analysis
The following exercises are from the chapter “An Introduction to Fourier Analysis,” from the book An Invitation to Modern Number Theory (Princeton University Press, Steven J. Miller and Ramin Takloo-Bighash). This chapter is available online on the web page for this book (go to the links for Chapter 3).
Exercise 20.3.1. Prove the series for e^x converges for all x ∈ ℝ (even better, for all x ∈ ℂ). Show the series for e^x also equals
lim_{n→∞} (1 + x/n)^n,
which you may remember from compound interest problems.
Exercise 20.3.2. Prove, using the series definition, that e^{x+y} = e^{x}e^{y} and calculate the derivative of e^{x}.
Exercise 20.3.3. Let f, g, and h be continuous functions on [0, 1], and a, b ∈ ℂ. Prove
1. ⟨f, f⟩ ≥ 0, and equals 0 if and only if f is identically zero;
2. $\langle f, g\rangle = \overline{\langle g, f\rangle}$;
3. ⟨af + bg, h⟩ = a⟨f, h⟩ + b⟨g, h⟩.
Exercise 20.3.4. Find a vector $\vec{v} = \binom{v_1}{v_2} \in \mathbb{C}^2$ such that $v_1^2 + v_2^2 = 0$ but $\langle \vec{v}, \vec{v}\rangle \neq 0$.
Exercise 20.3.5. Prove x^{n} and x^{m} are not perpendicular on [0, 1]. Find a c ∈ ℝ such that x^{n} − cx^{m} is perpendicular to x^{m}; c is related to the projection of x^{n} in the direction of x^{m}.
Exercise 20.3.6 (Important). Show for m, n ∈ ℤ that ⟨e_m, e_n⟩ = 1 if m = n and 0 otherwise, where e_n(x) := e^{2πinx}.
Exercise 20.3.7. Let f and g be periodic functions with period a. Prove αf(x) + βg(x) is periodic with period a.
Exercise 20.3.8. Prove any function can be written as the sum of an even and an odd function.
Exercise 20.3.9. Show
This agrees with our intuition: after removing the projection in a certain direction, what is left is perpendicular to that direction.
Exercise 20.3.10. Prove
1. ⟨f(x) − S_N(x), e_n(x)⟩ = 0 if |n| ≤ N;
2. |f̂(n)| ≤ ∫_0^1 |f(x)| dx;
3. Bessel’s Inequality: if ⟨f, f⟩ < ∞ then Σ_{n=−∞}^{∞} |f̂(n)|^2 ≤ ⟨f, f⟩;
4. Riemann–Lebesgue Lemma: if ⟨f, f⟩ < ∞ then lim_{|n|→∞} f̂(n) = 0 (this holds for more general f; it suffices that ∫_0^1 |f(x)| dx < ∞);
5. assume f is differentiable k times; integrating by parts, show |f̂(n)| ≪ 1/|n|^k, where the implied constant depends only on f and its first k derivatives.
Exercise 20.3.11. Let h(x) = f(x) + g(x). Does f̂(n) + ĝ(n) = ĥ(n)? Let k(x) = f(x)g(x). Does k̂(n) = f̂(n)ĝ(n)?
Exercise 20.3.12. If ⟨f, f⟩, ⟨g, g⟩ < ∞ then the dot product of f and g exists: ⟨f, g⟩ < ∞ (see Remark 11.2.4 of [MiTB]). Do there exist f, g : [0, 1] → ℂ such that ∫_0^1 |f(x)| dx, ∫_0^1 |g(x)| dx < ∞ but ∫_0^1 |f(x)ḡ(x)| dx = ∞? Is f ∈ L^2([0, 1]) a stronger or an equivalent assumption to f ∈ L^1([0, 1])?
Exercise 20.3.13. Define
Prove A_N is an approximation to the identity on [−1/2, 1/2]. If f is continuously differentiable and periodic with period 1, calculate
Exercise 20.3.14. Let A(x) be a nonnegative function with ∫_ℝ A(x) dx = 1. Prove that A_N(x) = N · A(Nx) is an approximation to the identity on ℝ.
Exercise 20.3.15 (Important). Let A_N(x) be an approximation to the identity on [−1/2, 1/2]. Let f(x) be a continuous function on [−1/2, 1/2]. Prove
Exercise 20.3.16. Prove the two formulas above. The geometric series formula will be helpful:
Exercise 20.3.17. Show that the Dirichlet kernels are not an approximation to the identity. How large are ∫_0^1 |D_N(x)| dx and ∫_0^1 D_N(x)^2 dx?
Exercise 20.3.18. Prove the Weierstrass Approximation Theorem implies the original version of the Weierstrass Theorem.
Exercise 20.3.19. Let f(x) be a periodic function with period 1. Show
Exercise 20.3.20. Let f̂(n) = 1/2^{|n|}. Does Σ_{n=−∞}^{∞} f̂(n) e_n(x) converge to a continuous, differentiable function? If so, is there a simple expression for that function?
Exercise 20.3.21. Fill in the details for the above proof. Prove the result for all f satisfying ∫_0^1 |f(x)|^2 dx < ∞.
Exercise 20.3.22. If ∫_0^1 |f(x)|^2 dx < ∞, show Bessel’s Inequality implies there exists a B such that |f̂(n)| ≤ B for all n.
Exercise 20.3.23. Though we used |a + b|^2 ≤ 4|a|^2 + 4|b|^2, any bound of the form c|a|^2 + c|b|^2 would suffice. What is the smallest c that works for all a, b ∈ ℂ?
Exercise 20.3.24. Let f(x) = 1/2 − |x| on [−1/2, 1/2]. Calculate $\sum_{n=0}^{\infty}\frac{1}{(2n+1)^2}$. Use this to deduce the value of $\sum_{n=1}^{\infty}\frac{1}{n^2}$. This sum is often denoted ζ(2) (see Exercise 3.1.7 of [MiTB]). See [BoPo]^{2} for connections with continued fractions, and [Kar]^{3} for connections with quadratic reciprocity.
Exercise 20.3.25. Let f(x) = x on [0, 1]. Evaluate $\sum_{n=1}^{\infty}\frac{1}{n^2}$.
Exercise 20.3.26. Let f(x) = x on [−1/2, 1/2]. Prove $\frac{\pi}{4} = \sum_{n=1}^{\infty}\frac{(-1)^{n+1}}{2n-1}$. See also Exercise 3.3.29, and see Chapter 11 of [BorB]^{4} or [Schum]^{5} for a history of calculations of π.
Exercise 20.3.27. Find a function to determine $\sum_{n=1}^{\infty}\frac{1}{n^4}$.
Exercise 20.3.28. Show the Gaussian $f(x) = \frac{1}{\sqrt{2\pi\sigma^2}} e^{-(x-\mu)^2/2\sigma^2}$ is in S(ℝ) for any μ ∈ ℝ and σ > 0.
Exercise 20.3.29. Let f(x) be a Schwartz function with compact support contained in [−σ, σ] and denote its Fourier transform by f̂(y). Prove for any integer A > 0 that |f̂(y)| ≤ c_f |y|^{−A}, where the constant c_f depends only on f, its derivatives, and σ. As such a bound is useless at y = 0, one often derives bounds of the form |f̂(y)| ≤ c̃_f/(1 + |y|)^A.
Exercise 20.3.30. Show f(x) is continuous but F(0) is undefined. Show F(x) converges and is well defined for any x ≠ 0.
Exercise 20.3.31. If g(x) decays like |x|^{−(1+η)} for some η > 0, then G(x) = Σ_{n∈ℤ} g(x + n) converges for all x, and is continuous.
Exercise 20.3.32. For what weaker assumptions on f, f′, f″ does Σ_{n∈ℤ} f(n) = Σ_{n∈ℤ} f̂(n)?
Exercise 20.3.33. One cannot always interchange orders of integration. For simplicity, we give a sequence a_{m,n} such that Σ_m(Σ_n a_{m,n}) ≠ Σ_n(Σ_m a_{m,n}). For m, n ≥ 0 let
Show that the two different orders of summation yield different answers (the reason for this is that the sum of the absolute value of the terms diverges).
Exercise 20.3.34. Find a family of functions f_n(x) such that
and each f_n(x) and f(x) is continuous and |f_n(x)|, |f(x)| ≤ M for some M and all x.
Exercise 20.3.35. Let f, g be continuous functions on I = [0, 1] or I = ℝ. Show that if ⟨f, f⟩, ⟨g, g⟩ < ∞ then h = f ∗ g exists. Hint: Use the Cauchy–Schwarz inequality. Show further that ĥ(n) = f̂(n)ĝ(n) if I = [0, 1], and similarly if I = ℝ. Thus the Fourier transform converts convolution to multiplication.
Exercise 20.3.36. Let X_1, X_2 be independent random variables with density p. Prove
Exercise 20.3.37 (Important). If for all i = 1, 2, … we have ⟨f_i, f_i⟩ < ∞, prove for all i and j that ⟨f_i ∗ f_j, f_i ∗ f_j⟩ < ∞. What about f_1 ∗ (f_2 ∗ f_3) (and so on)? Prove f_1 ∗ (f_2 ∗ f_3) = (f_1 ∗ f_2) ∗ f_3. Therefore convolution is associative, and we may write f_1 ∗ ··· ∗ f_N for the convolution of N functions.
Exercise 20.3.38. Suppose X_1, …, X_N are i.i.d.r.v. from a probability distribution p on ℝ. Determine the probability that X_1 + ··· + X_N ∈ [a, b]. What must be assumed about p for the integrals to converge?
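A Monte Carlo sketch of such a probability (Python; here N = 2 and X_i ~ Uniform(0, 1), whose sum has the triangular density, so P(S ∈ [0.5, 1.5]) = 3/4 exactly; seed and trial count are arbitrary):

```python
import random

# Estimate P(X_1 + X_2 ∈ [0.5, 1.5]) for independent Uniform(0, 1) draws.
random.seed(1)
trials = 200_000
hits = sum(0.5 <= random.random() + random.random() <= 1.5
           for _ in range(trials))
p = hits / trials
```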
Exercise 20.3.39. One useful property of the Fourier transform is that the derivative of ĝ is the Fourier transform of −2πix g(x); thus, differentiation (hard) is converted to multiplication (easy). Explicitly, show
If g is a probability density, note ĝ′(0) = −2πi 𝔼[x] and ĝ″(0) = −4π^2 𝔼[x^2].
Exercise 20.3.40. If B(x) = A(cx) for some fixed c ≠ 0, show B̂(y) = (1/|c|) Â(y/c).
Exercise 20.3.41. Show that if the probability density of X_1 + ··· + X_N is (p ∗ ··· ∗ p)(x) (i.e., the distribution of the sum is given by p ∗ ··· ∗ p), then the probability density of (X_1 + ··· + X_N)/√N at x is √N (p ∗ ··· ∗ p)(√N x). By Exercise 20.3.40, show
Exercise 20.3.42. Show for any fixed y that
Exercise 20.3.43. Show that the Fourier transform of e^{−2π2y2} at x is $\frac{1}{\sqrt{2\pi}}\text{}{e}^{{x}^{2}/2}.$
Hint: This problem requires contour integration from complex analysis.
Exercise 20.3.44. Modify the proof to deal with the case of p having mean μ and variance σ^{2}.
Exercise 20.3.45. For reasonable assumptions on p, estimate the rate of convergence to the Gaussian.
Exercise 20.3.46. Let p_1, p_2 be two probability densities satisfying
Consider S_N = X_1 + ··· + X_N, where for each i, X_i is equally likely to be drawn randomly from p_1 or p_2. Show the Central Limit Theorem is still true in this case. What if we instead had a fixed, finite number of such distributions p_1, …, p_k, and for each i we draw X_i from p_j with probability q_j (of course, q_1 + ··· + q_k = 1)?
Exercise 20.3.47 (Gibbs Phenomenon). Define a periodic function with period 1 by
Prove that the Fourier coefficients are
Show that the Nth partial Fourier series S_N(x) converges pointwise to f(x) wherever f is continuous, but overshoots and undershoots for x near 0. Hint: Express the series expansion for S_N(x) as a sum of sines. Note $\frac{\sin(2m\pi x)}{2m\pi} = \int_0^x \cos(2m\pi t)\,dt$. Express this as the real part of a geometric series of complex exponentials, and use the geometric series formula. This will lead to
which is about 1.179 (or an overshoot of about 18%) when $x=\frac{1}{4n\pi}.$ What can you say about the Fejér series T_{N}(x) for x near 0?
Exercise 20.3.48 (Nowhere Differentiable Function). Weierstrass constructed a continuous but nowhere differentiable function! We give a modified example and sketch the proof. Consider
Show f is continuous but nowhere differentiable. Hint: First show that |a| < 1 implies f is continuous. Our claim on f follows from noting that if a periodic continuous function g is differentiable at x_0 and ĝ(n) = 0 unless n = ±2^m, then there exists a C such that |ĝ(2^n)| ≤ Cn2^{−n} for all n. To see this, it suffices to consider x_0 = 0 and g(0) = 0. Our assumptions imply that ⟨g, e_m⟩ = 0 if 2^{n−1} < m < 2^{n+1} and m ≠ 2^n. We have ĝ(2^n) = ⟨g, e_{2^n} F_{2^{n−1}}(x)⟩, where F_N is the Fejér kernel. The claim follows from bounding the integral ⟨g, e_{2^n} F_{2^{n−1}}(x)⟩. In fact, more is true: Baire showed that, in a certain sense, “most” continuous functions are nowhere differentiable! See, for example, [Fol].^{6}
Exercise 20.3.49 (Isoperimetric Inequality). Let γ(t) = (x(t), y(t)) be a smooth closed curve in the plane; we may assume it is parametrized by arc length and has length 1. Prove the enclosed area A is largest when γ(t) is a circle. Hint: By Green’s Theorem
The assumptions on γ(t) imply that x(t), y(t) are periodic functions with Fourier series expansions and $(\frac{dx}{dt})^2 + (\frac{dy}{dt})^2 = 1$. Integrate this equality from t = 0 to t = 1 to obtain a relation among the Fourier coefficients of dx/dt and dy/dt (which are related to those of x(t) and y(t)); (20.21) gives another relation among the Fourier coefficients. These relations imply 4π·Area(A) ≤ 1, with strict inequality unless the Fourier coefficients vanish for |n| > 1. After some algebra, one finds this implies we have a strict inequality unless γ is a circle.
Exercise 20.3.50 (Applications to Differential Equations). One reason for the introduction of Fourier series was to solve differential equations. Consider the vibrating string problem: a unit string with endpoints fixed is stretched into some initial position and then released; describe its motion as time passes. Let u(x, t) denote the vertical displacement from the rest position x units from the left endpoint at time t. For all t we have u(0, t) = u(1, t) = 0 as the endpoints are fixed. Ignoring gravity and friction, for small displacements Newton’s laws imply
where c depends on the tension and density of the string. Guessing a solution of the form
solve for a_n(t). One can also study problems on ℝ by using the Fourier transform. Its use stems from the fact that it converts multiplication to differentiation, and vice versa: if g(x) = f′(x) and h(x) = xf(x), prove that ĝ(y) = 2πiy f̂(y) and df̂(y)/dy = −2πi ĥ(y). This and Fourier inversion allow us to solve problems such as the heat equation
with initial conditions u(x, 0) = f(x).
20.3.2 Problems from Chapter 1: Revisited
Many of the problems from Chapter 1 are appropriate here as well. In addition to reexamining those problems, consider the following.
Exercise 20.3.51. Is the sequence a_{n} = n^{log n} Benford?
Exercise 20.3.52. In some situations log_4 does not mean the logarithm base 4, but rather four iterations of the logarithm function. Investigate n^{log_{f(n)} n} under this definition for various integer-valued functions f.
20.3.3 Problems from Chapter 3
Exercise 20.3.53. Assume an infinite sequence of real numbers {x_n} has its logarithms modulo 1, {y_n = log_{10} x_n mod 1}, satisfying the following property: as n → ∞ the proportion of y_n in any interval [a, b] ⊂ [0, 1] converges to b − a if b − a > 1/2. Prove or disprove that {x_n} is Benford.
Exercise 20.3.54. As √2 is irrational, the sequence {x_n = n√2} is uniformly distributed modulo 1. Is the sequence $\{x_n^2\}$ uniformly distributed modulo 1?
Exercise 20.3.55. Does there exist an irrational α such that α is a root of a quadratic polynomial with integer coefficients and the sequence ${\left\{{\alpha}^{n}\right\}}_{n=1}^{\infty}$ is Benford base 10?
Exercise 20.3.56. We showed a geometric Brownian motion is a Benford-good process; is the sum of two independent geometric Brownian motions Benford-good?
The next few questions are related to a map we now describe. We showed that, suitably viewed, the 3x + 1 map leads to Benford behavior (or is close to Benford for almost all large starting seeds). Consider the following map. Let R(x) be the number formed by writing the digits of x in reverse order. If R(x) = x we say x is palindromic. If x is not a palindromic number set P(x) = x + R(x), and if x is palindromic let P(x) = x. For a given starting seed x_0, consider the sequence where x_{n+1} = P(x_n). It is not known whether there are any x_0 such that the resulting sequence diverges to infinity, though it is believed that almost all such numbers do. The first candidate to escape is 196; for more see http://en.wikipedia.org/wiki/Lychrel_number (this process is also called “reverse-and-add,” and the candidates are called Lychrel numbers).
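The map is easy to experiment with; here is a Python sketch (the 1000-step cap is arbitrary, and 196 reaching no palindrome within it is expected but of course proves nothing):

```python
def steps_to_palindrome(x0: int, max_steps: int = 1000):
    """Iterate x -> x + R(x) until a palindrome appears; return the
    number of steps taken, or None if none appears within max_steps."""
    x = x0
    for k in range(max_steps + 1):
        s = str(x)
        if s == s[::-1]:
            return k
        x += int(s[::-1])
    return None

n89 = steps_to_palindrome(89)    # 89 is famously slow to resolve
n196 = steps_to_palindrome(196)  # the first Lychrel candidate
```

Here 89 reaches a palindrome after 24 steps, while 196 reaches none within the cap.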
Exercise 20.3.57. Consider the reverse-and-add map described above applied to a large starting seed. Find as good a lower bound as you can for the number of seeds between 10^n and 10^{n+1} such that the resulting sequence stabilizes (i.e., we eventually hit a palindrome).
Exercise 20.3.58. Come up with a model to estimate the probability that a given starting seed between 10^n and 10^{n+1} has its iterates under the reverse-and-add map diverge to infinity. Hint: x plus R(x) is a palindrome if and only if there are no carries when we add; thus you must estimate the probability of having no carries.
Exercise 20.3.59. Investigate the Benfordness of sequences arising from the reverse-and-add map for various starting seeds. Of course the calculation is complicated by our lack of knowledge about this map; specifically, we don’t know even one starting seed that diverges! Look at what happens with various Lychrel numbers. For each N, can you find a starting seed x_0 such that it iterates to a palindrome after N or more steps?
Exercise 20.3.60. Redo the previous three problems in different bases. Your answer will depend now on the base; for example, much more is known base 2 (there we can give specific starting seeds that iterate to infinity).
Exercise 20.3.61. Use the Erdős–Turán Inequality to calculate upper bounds for the discrepancy of various sequences, and use those results to prove Benford behavior. Note you need to find a sequence where you can do the resulting computation. For example, earlier we investigated a_n = n^{log n}; are you able to do the summation for this case?
Exercise 20.3.62. Consider the analysis of products of random variables. Fix a probability p (maybe p = 1/2) and independent identically distributed random variables X_1, …, X_n. Assume as n → ∞ the product of the X_i’s becomes Benford. What if now we let X̃_n be the random variable where we toss n independent coins, each with probability p of heads, and if the ith toss is a head then X_i is in the product (if the product is empty we use the standard convention that it is then 1). Is this process Benford?
Exercise 20.3.63. Redo the previous problem, but drop the assumption that the random variables are identically distributed.
Exercise 20.3.64. Redo the previous two problems, but now allow the probability that the ith toss is a head to depend on i.
Exercise 20.3.65. Consider
this is the function from Example 3.3.5 that led to non-Benford behavior for the product. Can you write down the density for the product?
Exercise 20.3.66. In the spirit of the previous problem, find other random variables where the product is not Benford.
Exercise 20.3.67. Consider a Weibull random variable with scale parameter α = 1 and translation parameter β = 0, so f(x; γ) = γx^{γ−1} exp(−x^γ) for x ≥ 0 and zero otherwise. Investigate the Benfordness of chaining random variables here, where the shape parameter γ is the output of the previous step.
Exercise 20.3.68. The methods of [JaKKKM] led to good bounds for chaining exponential and uniform random variables. Can you obtain good, explicit bounds in other cases? For example, consider a binomial process with fixed parameter p.
Exercise 20.3.69. Apply the methods of Cuff, Lewis, and Miller (for the Weibull distribution) to other random variables. Consider the generalized Gamma distribution (see http://en.wikipedia.org/wiki/Generalized_gamma_distribution for more information), where the density is
for x > 0 and 0 otherwise, where a, d, p are positive parameters.
For the next few problems, let f_r(x) = 1/(1 + x^r) with r > 1.
Exercise 20.3.70. Show that for
Exercise 20.3.71. Verify the Fourier transform identity used in our analysis:
where b ∈ [0, 1].
20.4 Benford’s Law Geometry
Exercise 20.4.1. Perform a chi-square goodness-of-fit test on the data values in Table 4.1.
Exercise 20.4.2. Let the random variable X have the Benford distribution as defined in this chapter. Find E[X]. Next, generate one million Benford random variates and compute their sample mean. Perform this Monte Carlo experiment several times to ensure that the sample means are near E[X].
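A sketch of the suggested Monte Carlo experiment (Python; here X = 10^U with U uniform, and E[X] = 9/ln 10 follows from the density 1/(x ln 10) on [1, 10); the seed is arbitrary):

```python
import math
import random

# E[X] for the Benford distribution: the density is 1/(x ln 10) on [1, 10),
# so E[X] = ∫_1^10 x · dx/(x ln 10) = 9 / ln(10) ≈ 3.9087.
exact = 9 / math.log(10)

# One Monte Carlo run with a million Benford variates X = 10^U.
random.seed(2)
n = 1_000_000
sample_mean = sum(10 ** random.random() for _ in range(n)) / n
```

Rerunning with different seeds gives sample means clustered around 3.9087.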
Exercise 20.4.3. Let T ∼ exponential(1). Find the probability mass function of the leading digit to three-digit accuracy. Compare your results to those in Table 4.2.
Exercise 20.4.4. Redo the previous exercise, but instead of finding the probability mass function of the leading digit, find the cumulative distribution function of the significand (i.e., find the probability of observing a significand of at most s).
Exercise 20.4.5. Determine the set of conditions on a, b, and c associated with W ∼ triangular(a, b, c) which result in T = 10^{W} following Benford’s Law.
Exercise 20.4.6. Use R to confirm that the cumulative distribution function F_X(x) = Prob(X ≤ x) = log_{10}(x + 1) results in a probability mass function that gives the distribution specified in Benford’s Law. What is the range of x?
Exercise 20.4.7. Use R to determine whether the cumulative distribution function F_X(x) = Prob(X ≤ x) = x^2 (for some range of x) results in a probability mass function that gives the distribution specified in Benford’s Law. If yes, what is the range for x?
Exercise 20.4.8. Which of the following distributions of W follow Benford’s Law?
• f_{W}(w) ∼ U(0, 3.5).
• f_{W}(w) ∼ U(17, 117).
• f_W(w) = w^3 − w^2 + w for 0 ≤ w ≤ 1, and 1 − w^3 + w^2 − w for 1 ≤ w ≤ 2.
• f_W(w) = √w for 0 ≤ w ≤ 1, and 1 − √(w − 1) for 1 ≤ w ≤ 2.
Exercise 20.4.9. Let b_{1} and b_{2} be two different integers exceeding 1. Is there a probability density p on an interval I such that if a random variable X has p for its probability density function then X is Benford in both base b_{1} and b_{2}? What if the two bases are allowed to be real numbers exceeding 1? Prove your claims.
20.5 Explicit Error Bounds Via Total Variation
Exercise 20.5.1. Find TV(sin(x), [−π, π]).
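The answer to Exercise 20.5.1 can be checked numerically (a Python sketch; approximating the total variation from below by a fine uniform partition, which converges for piecewise-monotone functions):

```python
import math

# Approximate TV(f, [a, b]) = sup over partitions of sum |f(x_{i+1}) - f(x_i)|
# using a fine uniform partition.
def total_variation(f, a, b, n=100_000):
    xs = [a + (b - a) * i / n for i in range(n + 1)]
    return sum(abs(f(xs[i + 1]) - f(xs[i])) for i in range(n))

# sin on [-pi, pi] falls 0 -> -1, rises -1 -> 1, falls 1 -> 0: TV = 1 + 2 + 1 = 4.
tv = total_variation(math.sin, -math.pi, math.pi)
print(tv)
```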
Exercise 20.5.2. Confirm that TV(h, J) = TV^{+}(h, J) + TV^{−}(h, J).
(p.387) Exercise 20.5.3. Let Y_{o} and Z be independent random variables such that Y_{o} has a density f_{o} with TV(f_{o}) < ∞ and Z has distribution π. Verify that Y := Y_{o} + Z has density f(y) = ∫ f_{o}(y − z) π(dz) with TV(f) ≤ TV(f_{o}).
Exercise 20.5.4. Show that an absolutely continuous probability density f on R satisfies
Exercise 20.5.5. Let γ_{a,σ} be the density of the Gamma distribution Gamma(a, σ) with shape parameter a > 0 and scale parameter σ > 0, i.e.,
1. Show that for a ≥ 1, TV(γ_{a,σ}) = σ^{−1} TV(γ_{a,1}) and TV(γ_{a,1}) = 2((a − 1)/e)^{a−1}/Γ(a).
2. It is well known that Γ(t + 1) = (t/e)^{t} √(2πt) (1 + o(1)) as t → ∞ (this is Stirling’s formula). What does this imply for TV(γ_{a,1})? Show that TV(γ_{a,σ}) → 0 as √a σ → ∞ and a ≥ 1.
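The closed form in part 1 of Exercise 20.5.5 can be sanity-checked numerically (a Python sketch; the grid cutoff and step size are ad hoc choices):

```python
import math

# Gamma(a, 1) density; for a >= 1 it is unimodal with mode a - 1, so
# TV(gamma_{a,1}) = 2 * (max density) = 2 * ((a - 1)/e)**(a - 1) / Gamma(a).
def gamma_density(x, a):
    return x ** (a - 1) * math.exp(-x) / math.gamma(a) if x > 0 else 0.0

# Grid lower bound for the total variation on [0, hi]; hi = 60 is an
# arbitrary cutoff (the tail mass beyond it is negligible for these a).
def tv_numeric(a, hi=60.0, n=200_000):
    vals = [gamma_density(hi * i / n, a) for i in range(n + 1)]
    return sum(abs(vals[i + 1] - vals[i]) for i in range(n))

results = {a: (2 * ((a - 1) / math.e) ** (a - 1) / math.gamma(a), tv_numeric(a))
           for a in (2.0, 5.0)}
for a, (closed, numeric) in results.items():
    print(a, closed, numeric)  # the two columns agree
```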
Exercise 20.5.6. Let X be a strictly positive random variable with density h on (0,∞). Verify that Y := log_{B}(X) has density f given by f(y) = log(B)B^{y}h(B^{y}) for y ∈ ℝ.
Exercise 20.5.7. Let X be a random variable with distribution Gamma(a, σ) for some a, σ > 0; see Exercise 20.5.5.
1. Determine the density f_{a,σ} of Y := log_{B}(X). Here you should realize that f_{a,σ}(y) = f_{a,1}(y − log_{B}(σ)). Show then that
What happens as a→∞?
2. To understand the behavior of the leading digits of X for large a, write X = σ(a + √a Z_{a}) for a random variable Z_{a} with mean zero and variance one. (Indeed, the density of Z_{a} converges uniformly to the standard Gaussian density as a → ∞.) Now investigate the distribution of Y = log_{B}(X) as a → ∞.
20.6 Lévy Processes and Benford’s Law
Exercise 20.6.1. Provide an example of a non-continuous càdlàg function.
Exercise 20.6.2. Prove that a Wiener process is also a Lévy process.
Exercise 20.6.3. Prove that a Poisson process is also a Lévy process.
(p.388) Exercise 20.6.4. Prove that the exponential Lévy process {exp(X_{t})} (t ≥ 0) is a martingale with respect to F_{t} := σ{X_{s} : s ≤ t} if and only if E[exp(X_{t})] = 1 for all t ≥ 0.
Exercise 20.6.5. Let f(t) = E[exp(itξ)] and g(t) = E[exp(itη)] (t ∈ ℝ) be the characteristic functions of the real-valued random variables ξ, η (here i = √−1). Recall that exp(it) = cos t + i sin t (t ∈ ℝ) and E[exp(itξ)] := E[cos(tξ)] + iE[sin(tξ)] (t ∈ ℝ). Finally, $\overline{a+ib} := a - ib$ (a, b ∈ ℝ) denotes the complex conjugate of a + ib. Note that |f(t)|^{2} = f(t) · f̄(t). Show the following.
1. f is continuous, f(0) = 1, and |f(t)| ≤ 1 for all t ∈ ℝ.
2. f̄ is a characteristic function.
3. f · g is a characteristic function. Hence |f|^{2} is a characteristic function.
4. Let h_{1}, h_{2}, … be characteristic functions. If a_{1} ≥ 0, a_{2} ≥ 0, … are real numbers such that a_{1}+a_{2}+· · · = 1, then a_{1}h_{1}+a_{2}h_{2}+· · · is a characteristic function.
5. Show that every characteristic function h is nonnegative definite, i.e., for all n ≥ 2, real t_{1}, …, t_{n}, and complex a_{1}, …, a_{n} we have that Σ_{j,k=1}^{n} a_{j} ā_{k} h(t_{j} − t_{k}) ≥ 0.
Exercise 20.6.6. Show that, for each real number p > 0, f(z) := cos(2πpz) (z ∈ ℝ) is a characteristic function. Deduce that g(z) := (cos(2πpz))^{2} (z ∈ ℝ) is a characteristic function.
Exercise 20.6.7. (This exercise gives an example of a characteristic function which “wildly fluctuates.”) It follows from Exercises 20.6.6 and 20.6.5(4) that
is a characteristic function. Show that h is of infinite total variation over each nondegenerate interval [a, b], i.e.,
the supremum taken over all n ≥ 1 and real numbers a ≤ z_{1} < z_{2} < · · · < z_{n+1} ≤ b.
Hint: It suffices to prove the claim for intervals [r + 7^{−N}, r + 2 · 7^{−N}] (being convenient for calculations!) where N ≥ 1 is an integer and r ≥ 0 a real number. Let k ≥ N + 1 and denote by I(k) the set of integers j such that 1 + (r + 7^{−N})7^{k} < j ≤ (r + 2 · 7^{−N})7^{k}. For j ∈ I(k) put t_{2j−1}(k) = (j − 1/4)7^{−k} and t_{2j}(k) = j · 7^{−k}. Show, by using the inequalities |a + b| ≥ |a| − |b| and |(cos b)^{2} − (cos a)^{2}| ≤ 2|b − a| (a, b ∈ ℝ), that
(p.389) Exercise 20.6.8. 1. Try to guess how the integral ∫_{a}^{b} f(z) exp(itz) dz behaves as t → ∞ if f : [a, b] → ℝ is a step function of the form f(t) = Σ_{j=1}^{m} c_{j} 𝟙_{[b_{j−1},b_{j})}(t), where a ≤ b_{0} < b_{1} < · · · < b_{m} ≤ b.
2. Verify your guess when f is an indicator function of an interval.
3. How does the above integral behave when f is continuous on [a, b]?
Exercise 20.6.9. Show that a Lévy measure Q satisfies Q(ℝ ∖ (−α, α)) < ∞ for all α > 0.
Exercise 20.6.10. Let X be a Lévy process having Lévy measure Q. Show that, for fixed c > 0 and s ≥ 0, the process X^{∗} given by X^{∗}_{t} = X_{ct+s} − X_{s} (t ≥ 0) is a Lévy process having Lévy measure Q^{∗} = cQ.
Exercise 20.6.11. Let N = (N_{t}) (t ≥ 0) be a Poisson process with parameter λ > 0.
1. Verify that the generating triple of N is given by (λ, 0, Q^{∗}), where Q^{∗} has total mass λ concentrated on {1}.
2. Verify (6.15) directly for X = N, i.e.,
holds for all c > 0, s ≥ 0, and every Borel set A ⊂ ℝ.
Exercise 20.6.12. Let T_{t} = Σ_{j=1}^{N_{t}} ζ_{j} (t ≥ 0) denote the compound Poisson process of Example 6.1.21. (Here, (N_{t}) is a Poisson process with parameter λ > 0; ζ_{1}, ζ_{2}, … are independent random variables with a common distribution Q_{1} such that Q_{1}({0}) = 0. Furthermore, the processes (ζ_{n}) and (N_{t}) are independent of each other.)
1. Show that the characteristic function g_{t} of T_{t} (t ≥ 0) is given by
for all z ∈ ℝ and t ≥ 0.
2. It can be shown (see the reference in Example 6.1.21) that (T_{t}) is a Lévy process. Determine its generating triple (β, σ^{2},Q).
Exercise 20.6.13. Let W be a (standard) Brownian motion (BM). Show that, for each c > 0, W^{∗} = (cW_{t/c²}) (t ≥ 0) is a BM (scaling property).
Exercise 20.6.14. Let ξ ∼ N(μ, σ^{2}) where μ ∈ ℝ and σ > 0.
1. Deduce from (6.26) that the characteristic function of ξ is given by
and
Exercise 20.6.15. Let W = (W_{t}) be a BM. Put
1. Show that S_{t,u} is a random variable. (This requires a little argument since the definition of S_{t,u} involves uncountably many random variables!) Hint: Recall that all sample paths of W are continuous.
2. Show that W_{n}/n → 0 (n→∞) a.s.
3. Since, for each fixed t ≥ 0, (W_{u+t} − W_{t}) (u ≥ 0) is a BM, it follows that for each t > 0, S_{t,1} has the same distribution as S_{0,1}. (∗) Furthermore, we have that
(see, e.g., [KaSh]). Use (2) as well as (∗) and (∗∗) to show that W_{t}/t → 0 (t→∞) a.s.
Hint: Use the Borel–Cantelli Lemma.
Exercise 20.6.16. Let ξ_{1}, ξ_{2}, … be independent random variables defined on some probability space (Ω, F, P), which have a common distribution given by P(ξ_{n} = +1) = p, P(ξ_{n} = −1) = 1 − p =: q (n ≥ 1), where 0 < p < 1. Put S_{n} := ξ_{1} + · · · + ξ_{n}, n ≥ 0 (S_{0} = 0), and let (F_{n}) (n ≥ 0) be the filtration generated by (ξ_{n}). (Note that F_{0} = {∅, Ω}.)
1. Show that Y_{n} := (q/p)^{S_{n}} (n ≥ 0) is an (F_{n})-martingale.
2. Put c(α) := E[exp(αξ_{1})] = p exp(α) + q exp(−α) (α ∈ ℝ). Show that, for every fixed α ∈ ℝ, Z_{n} := exp(αS_{n})/(c(α))^{n} (n ≥ 0) is an (F_{n})-martingale.
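For Exercise 20.6.16(1) the key identity is E[(q/p)^{ξ_1}] = p(q/p) + q(p/q) = q + p = 1; one can also confirm E[Y_n] = 1 exactly by summing over the binomial distribution of the number of +1 steps. A Python sketch (the choice p = 0.3 is arbitrary):

```python
from math import comb

# S_n = 2k - n when k of the n steps equal +1, and k ~ Binomial(n, p), so
#   E[(q/p)**S_n] = sum_k C(n, k) p**k q**(n-k) (q/p)**(2k - n),
# which should equal 1 for every n -- consistent with Y_n being a martingale.
def mean_Y(n, p):
    q = 1 - p
    return sum(comb(n, k) * p ** k * q ** (n - k) * (q / p) ** (2 * k - n)
               for k in range(n + 1))

for n in (1, 5, 20):
    print(n, mean_Y(n, 0.3))  # each value is 1 up to rounding
```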
Exercise 20.6.17. Let ξ_{1}, ξ_{2}, … be independent random variables defined on the same probability space, which have a common distribution given by P(ξ_{n} = +1) = P(ξ_{n} = −1) = 1/2. Put S_{0} = 0 and S_{n} = ξ_{1} + · · · + ξ_{n} (n ≥ 1), which means that (S_{n}) is a simple symmetric random walk on ℤ, starting at 0. Let (F_{n}) be the filtration generated by (ξ_{n}). Show that the following two sequences are (F_{n})-martingales:
Hint: Note that E[ξ_{n} | F_{n−1}] = E[ξ_{n}] = 0 a.s. (since ξ_{n} is independent of F_{n−1}), and that E[S_{n−1}^{2} ξ_{n} | F_{n−1}] = S_{n−1}^{2} E[ξ_{n}] = 0 a.s. (since S_{n−1} is F_{n−1}-measurable). Note that S_{n} = S_{n−1} + ξ_{n}.
Exercise 20.6.18. Let (Ω, F, P) be a probability space and let (F_{n}) (n ≥ 0) be any filtration on (Ω, F). In the sequel let Z = (Z_{n}) (n ≥ 0) and H = (H_{n}) (n ≥ 1) be sequences of random variables defined on (Ω, F) such that Z is adapted and H is predictable, which means that, for all n ≥ 1, H_{n} is F_{n−1}-measurable. The sequence H • Z given by (H • Z)_{n} := Σ_{j=1}^{n} H_{j}(Z_{j} − Z_{j−1}) (n ≥ 1) is called the H-transform of Z or the (discrete) stochastic integral of H with respect to Z. Now let Z be an (F_{n})-martingale and assume that H_{j}(Z_{j} − Z_{j−1}) ∈ L^{1}, j = 1, 2, …. Show that H • Z is an (F_{n})-martingale.
Hint: Use the iteration property of conditional expectations (see Example 6.1.29).
Exercise 20.6.19. Let W = (W_{t}) be a BM and let (F_{t}) be the filtration generated by W. Show that the following processes are (F_{t})-martingales:
1. (W_{t}).
2. (W_{t}^{2} − t).
3. (W_{t}^{4} − 6tW_{t}^{2} + 3t^{2}).
Hint: Note that W_{t} −W_{s} is independent of F_{s} (0 ≤ s ≤ t).
Exercise 20.6.20. Let (N_{t}) be a Poisson process with parameter λ > 0, and put M_{t} = N_{t} − λt (t ≥ 0). Let (F_{t}) be the filtration generated by (N_{t}).
1. Show that (M_{t}) is an (F_{t})-martingale.
Hint: N_{t} − N_{s} is independent of F_{s} (0 ≤ s < t).
2. Show that (M_{t}^{2} − λt) is an (F_{t})-martingale.
Hint: Write M_{t}^{2} − M_{s}^{2} = (M_{t} − M_{s})^{2} + 2M_{s}(M_{t} − M_{s}) (0 ≤ s < t).
Exercise 20.6.21. Let (N_{t}) be a Poisson process with parameter λ > 0, and let c > 0 be any constant.
1. Determine the constant μ(c) such that the process (exp(cN_{t} + μ(c)t)) (t ≥ 0) is a martingale with respect to the filtration (F_{t}) generated by (N_{t}). Hint: Use Theorem 6.1.30 and Exercise 20.6.11.
2. Verify directly that the process obtained in (1) is an (F_{t})-martingale.
Hint: Use that E[exp(c(N_{t} − N_{s})) | F_{s}] = E[exp(c(N_{t} − N_{s}))] a.s. (0 ≤ s < t) since N_{t} − N_{s} is independent of F_{s}.
(p.392) Exercise 20.6.22. Let ξ have a binomial distribution with parameters n ≥ 1 and 0 ≤ p ≤ 1, i.e.,
1. Use Azuma’s inequality (Theorem 6.3.1) to prove the following inequality which is due to H. Chernoff (Ann. Math. Statist. 23 (1952), 493–507):
Hint: ξ has the same distribution as a sum of suitable 0–1 random variables ξ_{1}, … , ξ_{n}.
2. Verify (∗) directly for n = 1.
Exercise 20.6.23. Prove (6.147).
Hint: First note that g(z) =: exp(I(z)), where
Then (6.147) says that
In order to prove (∗) note that the cosine is ≤ 0 on the intervals J(k) := [(2k − 1)π − π/2, (2k − 1)π + π/2], and that
Hence
Using (∗∗) and comparing with a certain Riemann integral finally yields (∗).
Exercise 20.6.24. A process Z_{t} = Z_{0} exp(X_{t}), t ≥ 0 (Z_{0} > 0) is observed at time points t = 0, 1, 2, …, T, where (X_{t}) is a Lévy process of jump-diffusion type as in Example 6.5.2. Let H_{0}(2) denote the null hypothesis which says that there exist α ∈ ℝ, c ≥ 2, λ ≥ 0, and a distribution Q_{1} on ℝ satisfying Q_{1}({0}) = 0 such that (X_{t}) is associated with α, c, λ, and Q_{1}. (Note that H_{0}(2) has a meaning different from that at the beginning of Section 6.5!) Let H_{0}(2) be rejected if |L̃_{T}/T − p_{10}(1)| ≥ 0.1 (see (6.100) and (6.150)). Let the level of significance be 0.1. (Note that the rejection of H_{0}(2) entails the rejection of the null hypothesis that (Z_{t}) is a Black–Scholes process having volatility ≥ 2; see (6.27).) How large does T have to be? (Answer: T ≥ 1715.)
Exercise 20.6.25. A process Z_{t} = Z_{0} exp(X_{t}), t ≥ 0 (Z_{0} > 0) is observed at the time points t = 0, 1, 2, …, T, where X_{t} = αt + T_{t}, t ≥ 0. Here, α ∈ ℝ; (T_{t}) is a compound Poisson (or CP) process associated with λ > 0 and Q_{1} = N(μ, σ^{2}) (see (p.393) Example 6.1.21). Suppose that the null hypothesis H_{0}(λ^{∗}, σ^{∗}) (λ^{∗} > 0, σ^{∗} > 0) is to be tested, which says that there exist α ∈ ℝ, μ ∈ ℝ, λ ≥ λ^{∗}, and σ ≥ σ^{∗} such that X_{t} = αt + T_{t} (t ≥ 0), and (T_{t}) is a CP process associated with λ and Q_{1}. Verify that the test outlined in Exercise 20.6.24, which rejects H_{0}(λ^{∗}, σ^{∗}) if |L̃_{T}/T − p_{10}(1)| ≥ 0.1, is not applicable no matter how the level of significance 0 < p_{0} < 1 is chosen.
Hint: Show that there does not exist any (finite) constant Σ^{∗} satisfying (6.153) (g being the characteristic function of X_{1}, (X_{t}) being an arbitrary Lévy process satisfying H_{0}(λ^{∗}, σ^{∗})). Use Exercise 20.6.14(2).
Exercise 20.6.26. Suppose we observe a process Z_{t} = Z_{0} exp(μt + cX_{t}), t ≥ 0 (Z_{0} > 0) at time points t = 0, 1, …, T. Let (X_{t}) be a gamma process with parameters α and Δ, and consider (as in Example 6.5.5) the null hypothesis H_{0}(c^{∗}, α^{∗}, Δ^{∗}) where B = 10, c^{∗} = α^{∗} = 1, Δ^{∗} = 2, p_{0} = v = 0.1, m = 1, d_{1} = 1, and λ(10) = (2π/log 10)^{2} (recall that log is the natural logarithm).
1. Show that in this special case we can choose Σ^{∗} = (log 10)^{2}/24.
2. How large does the time horizon T have to be? (Answer: T ≥ 2129 (instead of T ≥ 2582 as in Example 6.5.5!).)
Exercise 20.6.27. Prove the following elementary result (Lemma 6.6.7): Let a_{1}, a_{2}, … be real numbers such that 0 ≤ a_{n} < 1 (n ≥ 1) and Σ_{n=1}^{∞} a_{n} < ∞. Then
Exercise 20.6.28. Prove the claim in Example 6.1.28.
Exercise 20.6.29. Prove the iteration property of conditional expectations (see Example 6.1.29).
Exercise 20.6.30. Prove Lemma 6.2.1.
20.7 Benford’s Law as a Bridge Between Statistics and Accounting
An auditor decides to run a Benford’s Law test on a data set that consists of 1000 legitimate expense records from a business, plus a number of fraudulent transactions that an employee is making to a front business set up in a relative’s name. Because the employees of the business have to obtain special approval for expenditures over $10,000, the fraudulent transactions are all for amounts between $9000 and $9999. For the 1000 legitimate expenditures, we have the following data: (p.394)
First Digit   Observed
1             314
2             178
3             111
4              92
5              88
6              59
7              56
8              56
9              46
Exercise 20.7.1. Using the Benford Law test at http://web.williams.edu/Mathematics/sjmiller/public_html/benford/chapter01/MillerNigriniExcelBenfordTester_Ver401.xlsx (or any other suitable software), verify that the data conforms reasonably well to Benford’s Law.
Exercise 20.7.2. Use trial and error (or some more clever approach) to determine how many fraudulent transactions with first digit 9 would need to be added to the 1000 legitimate observations above in order for the hypothesis that the data follows Benford’s Law to be rejected at a five percent significance level. Does this seem plausible?
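A Python sketch for Exercises 20.7.1 and 20.7.2, in place of the linked spreadsheet (the 5% chi-square critical value for 8 degrees of freedom, 15.507, is taken as given rather than computed):

```python
import math

# Observed first-digit counts for the 1000 legitimate expenditures above.
observed = {1: 314, 2: 178, 3: 111, 4: 92, 5: 88, 6: 59, 7: 56, 8: 56, 9: 46}

# Chi-square statistic against the Benford expected counts n*log10(1 + 1/d).
def chi_square(obs):
    n = sum(obs.values())
    return sum((obs[d] - n * math.log10(1 + 1 / d)) ** 2
               / (n * math.log10(1 + 1 / d)) for d in range(1, 10))

CRITICAL_8DF_5PCT = 15.507  # chi-square critical value, 8 df, alpha = 0.05

base = chi_square(observed)
print(base)  # well below the critical value: the legitimate data conforms

# Exercise 20.7.2: add fraudulent first-digit-9 records until rejection.
m = 0
padded = dict(observed)
while chi_square(padded) < CRITICAL_8DF_5PCT:
    m += 1
    padded[9] = observed[9] + m
print(m)  # smallest number of added 9s that triggers rejection
```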
Exercise 20.7.3. What is the role of sample size in the sensitivity of Benford’s Law? Suppose there are 10,000 legitimate observations instead of 1000, but the ratios for legitimate observations remain the same, i.e., the number of observations for each digit is multiplied by 10. Try the problem again. What changes?
Exercise 20.7.4. In which of the following situations is an auditor most likely to use Benford’s Law?
• An analysis of a fast food franchise’s inventory of hamburgers.
• An audit of a Fortune 500 company’s monthly total revenue over the fiscal year.
• An analysis of a multibillion dollar technology company’s significant assets.
Exercise 20.7.5. Give an additional example of a way that including Benford’s Law in an introductory-level statistics class will meet the four goals of the GAISE report of 2005.
Exercise 20.7.6. Determine whether the following situations are Type I errors, Type II errors, or neither.
• An auditor uses Benford’s Law to analyze the values of canceled checks by a business in the past fiscal year. The auditor finds that there are significant spikes in the data set, with 23 and 37 appearing as the first two digits more often than expected. After further investigation, it was found that there were valid nonfraudulent explanations for the variations in the first digits.
(p.395) • An auditor finds that a company’s reported revenue does not follow Benford’s Law. Further investigation is undertaken, and it is found that a manager has been rounding up her weekly sales to the nearest thousand to earn an incentive based on a weekly sales benchmark. The manager claims that the inflated sales were an accounting error.
• An owner of a business has falsely claimed to give his employees bonuses on each paycheck based on their monthly sales in order to lower his income taxes. An auditor examines the data, but is unable to confidently claim that the data does not follow Benford’s Law. Rather than waste funds on a costly investigation, the auditor chooses not to investigate the owner.
Exercise 20.7.7. What are the negative effects of a Type I error in an audit? A Type II error? In what situations might one be more dangerous than the other?
Exercise 20.7.8. What are some of the reasons listed in the chapter that might explain why a data set should not be expected to follow Benford’s Law?
Exercise 20.7.9. Give an example of a reason other than fraud that explains why a data set that is expected to conform to Benford’s Law does not.
20.8 Detecting Fraud and Errors Using Benford’s Law
Exercise 20.8.1. Do the following data sets meet the requirements described by Nigrini in order to be expected to follow Benford’s Law? Explain why or why not.
• The 4-digit PINs chosen by clients of a local bank.
• The annual salaries of graduates from a public university.
• Numeric student ID numbers assigned by a school.
• The distances in miles between Washington, DC and the 500 most populated cities in the United States (excluding Washington, DC).
• Results of a survey of 1000 students asked to provide a number between 1 and 1,000,000.
• The number of tickets bought for all events held in a particular stadium over the past five years.
Exercise 20.8.2. Take a company which has been at the heart of a scandal (for example, Enron) and investigate some of its publicly available data.
Exercise 20.8.3. An audit of a small company reveals a large number of transactions starting with a 5. Come up with some explanations other than fraud. Hint: There are two cases: it is the same amount to the same source each time, and it isn’t.
(p.396) 20.9 Can Vote Counts’ Digits and Benford’s Law Diagnose Elections?
Exercise 20.9.1. If X satisfies Benford’s Law, then the mean of its second digit is 4.187. What is the mean of the kth digit?
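For Exercise 20.9.1, the distribution of the kth digit is P(D_{k} = d) = Σ_{j} log_{10}(1 + 1/(10j + d)), where j runs over all (k − 1)-digit leading blocks. A Python sketch (the values of k tried are arbitrary; the mean tending to 4.5 reflects the kth digit flattening toward uniform):

```python
import math

# Mean of the kth significant digit under Benford's Law, k >= 2:
# sum over leading blocks j = 10**(k-2), ..., 10**(k-1) - 1 and digits d = 0..9
# of d * log10(1 + 1/(10*j + d)).
def kth_digit_mean(k):
    lo, hi = 10 ** (k - 2), 10 ** (k - 1)
    return sum(d * sum(math.log10(1 + 1 / (10 * j + d)) for j in range(lo, hi))
               for d in range(10))

m2 = kth_digit_mean(2)
m4 = kth_digit_mean(4)
print(m2, m4)  # 4.187... for k = 2; very close to 4.5 already for k = 4
```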
Exercise 20.9.2. If X satisfies Benford’s Law, multiply by an appropriate power of 10 so that it has k integer digits. What is the probability the last digit is d? What is the probability the last two digits are equal? What is the probability the last two digits differ by 1?
Exercise 20.9.3. Find some recent voting data (say city or precinct totals) and investigate the distribution of the first and second digits.
20.10 Complementing Benford’s Law for Small N: A Local Bootstrap
Exercise 20.10.1. Do you agree with the assessment that Nigrini’s conditions for applying Benford’s Law are mostly satisfied? Why or why not?
Exercise 20.10.2. Why does having a large σ(log_{10} x_{i}) and a large σ(log_{10} w_{i,j}) ensure that the v_{i,j} firstdigit distribution approaches Benford’s Law?
Exercise 20.10.3. What does it mean for bootstrap methods to be considered “conservative”? Identify some of the ways in which bootstrap methods are conservative.
Exercise 20.10.4. There are many conservative statistics. Look up the Bonferroni adjustment for multiple comparisons, as well as alternatives to that.
Exercise 20.10.5. How would a local bootstrap realization change if the value of Δ were changed?
Exercise 20.10.6. Confirm that if c_{bK7} > 99.924%, then c_{eK7} > 99.99960%.
20.11 Measuring the Quality of European Statistics
Exercise 20.11.1. In which of the following two scenarios would χ^{2} be larger?
• The firstdigit frequencies are mostly identical to the expected Benford distribution, but the digit 1 appears 31.1% of the time and the digit 2 appears 16.6% of the time (compared with the expected values of approximately 30.1% and 17.6%, respectively).
• The firstdigit frequencies are mostly identical to the expected Benford distribution, but the digit 8 appears 6.12% of the time and the digit 2 appears 3.58% of the time (compared with the expected values of approximately 5.12% and 4.58%, respectively).
(p.397) Exercise 20.11.2. What is μ_{b}, the value of the mean of the Benford distribution of first digits base b?
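Exercise 20.11.2 generalizes the familiar base-10 value μ_{10} ≈ 3.44. A quick numerical check (Python sketch; the base-2 case, where the first digit is always 1, is a useful sanity check):

```python
import math

# Mean first digit of a base-b Benford variable:
#   mu_b = sum_{d=1}^{b-1} d * log_b(1 + 1/d).
def benford_first_digit_mean(b):
    return sum(d * math.log(1 + 1 / d, b) for d in range(1, b))

mu10 = benford_first_digit_mean(10)
print(mu10)  # ≈ 3.4402
print(benford_first_digit_mean(2))  # base 2: the leading digit is always 1
```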
Exercise 20.11.3. What is the value of a^{∗} if μ_{e} = 3.5?
Exercise 20.11.4. Using Figure 11.1, confirm the values of χ^{2}, χ^{2}/n, and d^{∗} for the distribution of first digits for Greek social statistics in the year 2004.
Exercise 20.11.5. Using Figure 11.1 and the formula for the distance measure a^{∗} used by Judge and Schechter, calculate the value of the mean of the data set (μ_{e}) in the year 2004. Confirm this value by using the formula μ_{e} = (Σ_{i=1}^{9} i · n · Prob(D_{1} = i))/n.
The final problem uses data on two fictitious countries, which is available online at http://web.williams.edu/Mathematics/sjmiller/public_html/benford/chapter11/ (some of the additional readings on that web page may be useful as well).
Exercise 20.11.6. Calculate the values χ^{2}, χ^{2}/n, d^{∗}, and a^{∗} and compare the results for both countries. Which one of these two countries should be examined more closely? Are the outcomes consistent?
20.12 Benford’s Law and Fraud in Economic Research
Exercise 20.12.1. Use (12.1) to find f(6) and F(6) for Benford’s Law.
Exercise 20.12.2. If X is a Benford variable defined on [1, 10), then what is the probability that the second digit is 5 given that the first digit is also 5?
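For Exercise 20.12.2 the answer can be checked directly: P(D_{2} = 5 | D_{1} = 5) = log_{10}(5.6/5.5)/log_{10}(6/5). A one-line check (Python sketch):

```python
import math

# P(D1 = 5, D2 = 5) = P(significand in [5.5, 5.6)) = log10(5.6/5.5), and
# P(D1 = 5) = log10(6/5), so the conditional probability is their ratio.
p_cond = math.log10(5.6 / 5.5) / math.log10(6 / 5)
print(p_cond)  # ≈ 0.0988 -- second digits are far closer to uniform
```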
Exercise 20.12.3. Use (12.4) to confirm that when using Benford’s Law for Rounded Figures, Prob(D_{1} = 8) = 0.054.
Exercise 20.12.4. If X is a Benford variable defined on [1, 10), given that the first digit is 8, what is the probability that the second digit is 0 when rounding to two significant digits? What is the probability that the second digit is 2?
Exercise 20.12.5. Using Benford’s Law for Rounded Figures as the frequencies of first digits for a data set of 300 observed values, calculate Q_{1}, Q_{2}, M_{1}, and M_{2} using (12.6) and (12.7).
Exercise 20.12.6. Should the Q_{1} test or the M_{1} test be used for attempting to detect variations in Benford’s Law?
• What if the data set in question has a mean of 3.44?
• Which test should be used for detecting variations in the Generalized Benford Law?
(p.398) Exercise 20.12.7. The Federal Tax Office (FTO) knows that Ω = 10% of tax declarations of small and medium enterprises are falsified. The FTO checks the first digits using Benford’s Law. Random samples of tax declarations are drawn and the null hypothesis (H_{o}) “Conformity to Benford’s Law” is tested at the α = 5% level of significance.
• Using (12.9), what rejection rate of H_{o}(θ) would you expect if the probability of a Type II error β lies in the interval [0.05, 0.75]?
• The FTO obtained the rejection rate θ = 0.12. Use (12.9) to calculate the probability β of a Type II error.
• The FTO arranges for an audit at the taxable enterprise if the Benford test rejects H_{o} for a certain tax declaration at the α = 5% level. What is the probability that such an audit will be provoked erroneously? And what is the probability of erroneously forgoing an audit?
Exercise 20.12.8. A sample of scientific articles is taken, and 17% are found to have regression coefficients with a doubtful distribution of first digits. Use (12.10) to calculate Ω̂.
20.13 Testing for Strategic Manipulation of Economic and Financial Data
Exercise 20.13.1. What are some of the potential reasons given in Section 13.1 for why data sets that are expected to follow Benford’s Law fail to do so?
Exercise 20.13.2. Did Benford’s Law prove financial misreporting during the financial crisis? Justify your assertion.
Exercise 20.13.3. What are some of the potential motives that banks have for manipulating VAR data?
20.14 Psychology and Benford’s Law
Exercise 20.14.1. Using (11.1) in Section 11.3, find χ^{2} for the elaborated and unelaborated data from Scott, Barnard, and May’s study found in Table 14.1.
Exercise 20.14.2. What distribution of leading digits would you expect if people were asked to randomly give an integer from 1 to N? How does your answer depend on N? Try an experiment with some of your friends and family.
(p.399) 20.15 Managing Risk in Numbers Games: Benford’s Law and the Small-Number Phenomenon
Exercise 20.15.1. What are the risks associated with a high liability limit in a fixedodds lottery game? What if the limit is too small?
Exercise 20.15.2. From the data obtained in Table 15.1, determine the probability that a given number on a ticket for the UK powerball game is a single digit.
Exercise 20.15.3. Figure 15.1 shows the proportion of tickets in a Pennsylvania Pick 3 game with a given first digit. Explain why there are several outliers larger than the mean proportion and no outliers smaller than the mean proportion.
Exercise 20.15.4. What is the probability that a Type I player chooses the number 345 in a Pick 3 game?
Exercise 20.15.5. Let Alice be a Type II player in a Pick 3 game who bets on a number with three significant digits 80% of the time, a number with two significant digits 15% of the time, and a number with one significant digit 5% of the time. What is the probability that Alice bets on the number 345? The number 45? The number 5?
Exercise 20.15.6. In the Pennsylvania Pick 3 game, the least squares model indicates that 60.42% of the players are Type I players and 39.58% of the players are Type II players. Based on this model, use (15.4) to calculate the expected proportion of the betting volume on a three-digit number with first significant digit 4.
Exercise 20.15.7. Let Bob be a Type II player in a Pick 3 game who bets on a number with three significant digits 80% of the time, but also has a tendency to exhibit switching behavior; that is, he will switch later digits with probability 0.9105, and switch the digit to 0 with probability 0.1054. What is the probability that Bob bets on the number 345?
Exercise 20.15.8. Use (15.5) to calculate the probability that Bob chooses a three-digit number between 520 and 529 inclusive.
Exercise 20.15.9. Calculate the variance using the equation in Section 15.4.1 under the scenario that all players randomly select a three-digit number.
20.16 Benford’s Law in the Natural Sciences
Exercise 20.16.1. Demonstrate that (16.3) holds for α = 2.
Exercise 20.16.2. Rewrite the lognormal distribution density function (16.5) as the lognormal density function (16.6).
Exercise 20.16.3. Show that as σ grows larger, the lognormal density function approaches the power law p(x) = C_{σ}x^{−1}, where C_{σ} is a constant depending on σ.
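The effect behind Exercises 20.16.2 and 20.16.3 can be seen empirically: as σ grows, log_{10} X spreads over many integer periods, its fractional part flattens, and the first digits approach Benford. A seeded Python simulation sketch (the sample size and the two σ values are arbitrary choices):

```python
import math
import random

# Maximum deviation of the empirical first-digit frequencies of a lognormal
# sample X = exp(sigma * Z), Z ~ N(0, 1), from the Benford probabilities.
def lognormal_first_digit_dev(sigma, n=200_000, seed=1):
    rng = random.Random(seed)
    counts = {d: 0 for d in range(1, 10)}
    for _ in range(n):
        x = math.exp(sigma * rng.gauss(0, 1))
        counts[int(10 ** (math.log10(x) % 1))] += 1  # first digit of x
    return max(abs(counts[d] / n - math.log10(1 + 1 / d)) for d in range(1, 10))

dev_small = lognormal_first_digit_dev(0.3)
dev_large = lognormal_first_digit_dev(3.0)
print(dev_small, dev_large)  # large sigma is far closer to Benford
```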
(p.400) Exercise 20.16.4. Provide examples not mentioned in the chapter of scientific data sets that are not effectively scale invariant.
Exercise 20.16.5. Explain the intuition behind why the following distributions are approximately Benford:
• The Boltzmann–Gibbs distribution (16.8).
• The Fermi–Dirac distribution (16.9).
• The Bose–Einstein distribution (16.10).
Exercise 20.16.6. Obtain a physics textbook (or a CRC handbook, or …) and find a list of physical constants. Perform a chi-square test to determine whether the list of constants follows Benford’s Law as expected.
Exercise 20.16.7. Sandon found agreement between Benford’s Law and population and surface area data for the countries of the world. Find a source that provides the population density of each country. Then determine if population density follows Benford’s Law. This can be done using a chi-square test. In general, should the ratio of two Benford random variables be Benford?
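The last question of Exercise 20.16.7 has a clean answer for independent Benford variables: if X = 10^{U} and Y = 10^{V} with U, V independent and uniform on [0, 1), then the mantissa of X/Y is (U − V) mod 1, which is again uniform, so X/Y is Benford. A seeded simulation sketch (Python; the sample size is arbitrary):

```python
import math
import random

# If X, Y are independent Benford variables with uniform mantissae U, V, then
# log10(X/Y) mod 1 = (U - V) mod 1 is uniform again, so X/Y is Benford.
rng = random.Random(7)
n = 200_000
counts = {d: 0 for d in range(1, 10)}
for _ in range(n):
    ratio = 10 ** rng.random() / 10 ** rng.random()  # X / Y
    counts[int(10 ** (math.log10(ratio) % 1))] += 1  # first digit of X / Y

max_dev = max(abs(counts[d] / n - math.log10(1 + 1 / d)) for d in range(1, 10))
print(max_dev)  # small: the ratio's first digits follow Benford
```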
20.17 Generalizing Benford’s Law: A Reexamination of Falsified Clinical Data
Exercise 20.17.1. Use (17.1) to calculate the average frequency of first digits in Stigler’s distribution of first significant digits. Check to see that the distribution matches the values displayed in Table 17.1.
Exercise 20.17.2. Verify (17.3), (17.5), and (17.6). Then verify that the sum of the three subsets matches (17.7).
Exercise 20.17.3. Calculate the mean of the Stigler FSD distribution and the Benford FSD distribution to confirm that they are equal to 3.55 and 3.44, respectively.
Exercise 20.17.4. For the Estimated Maximum Entropy FSD distribution for data with an FSD mean of 3.44 shown in Table 17.3, find H(p) and ensure that the criteria from (17.13) and (17.14) are met.
• If the Estimated Maximum Entropy FSD distribution is accurate, then the listed probabilities will maximize H(p). First, determine whether replacing p̂_{1} with 0.231 and p̂_{2} with 0.2 still allows (17.13) and (17.14) to hold. Now find H(p). Is H(p) larger or smaller than before?
Exercise 20.17.5. If the FSD mean is 5, what will be the estimated maximum entropy FSD distribution? What is Var(d) according to (17.18)?
Exercise 20.17.6. Examining the Poehlman data in Table 17.4, calculate the difference for each digit FSD distribution given by Benford’s Law.
(p.401) Exercise 20.17.7. The estimated empirical likelihood distributions given an FSD mean will maximize $\sum _{i=1}^{9}{p}_{i}}.$ . To test this, ensure that the product of the p_{i}’s from Table 17.5 are greater than the empirical data found in Table 17.4.
Exercise 20.17.8. A researcher is trying to decide whether a data set follows Benford’s Law or Stigler’s Law. What values of the mean of the leading digit suggest Benford over Stigler? What values suggest Stigler over Benford?
20.18 Partial Volume Modeling of Medical Imaging Systems Using the Benford Distribution
Exercise 20.18.1. What is the PV effect? What implications does the PV effect have for medical imaging?
Exercise 20.18.2. Prove Corollary 18.3.4.
Exercise 20.18.3. What advantages are there to describing the PV effect using matrices as in (18.11)?
Exercise 20.18.4. What are the differences between a Rician noise model described by (18.12) and a Gaussian noise model described in (18.13)?
Exercise 20.18.5. Use (18.22) to calculate p(α) for α = 0.50, where α has two digits of precision.
Exercise 20.18.6. How is the contrast to noise ratio (CNR) affected if both the distance between the signal levels of two components and the standard deviation of each class is doubled?
20.19 Application of Benford’s Law to Images
Exercise 20.19.1. In (19.9) one of the factors is $\Gamma\left(\frac{j2\pi n+\log 10}{c\log 10}\right)$, where j = √−1. Estimate how rapidly this tends to zero as n → ∞ as a function of c (if you wish, choose some values of c to get a feel for the behavior).
Exercise 20.19.2. In (19.19) we find that a_{n}(c, σ) ≤ a_{n}(c^{+}) for all n; investigate how close these can be for various choices of c and σ.
Exercise 20.19.3. In Example 19.5.3 we found four zero-mean Gaussians with shaping parameter c = 1 and four different standard deviations such that a_{1} = 0. Can you find six zero-mean Gaussians with shaping parameter c = 1 and six different standard deviations with a_{1} = 0? What about eight? More generally, can you find 2m such Gaussians for m a positive integer?
Notes:
(^{1}) The random variable X is normally distributed with mean μ and variance σ^{2} if its probability density function is $f(x;\mu,\sigma)=\exp\left(-(x-\mu)^2/(2\sigma^2)\right)/\sqrt{2\pi\sigma^2}.$
(^{2}) E. Bombieri and A. van der Poorten, Continued fractions of algebraic numbers. Pages 137–152 in Computational Algebra and Number Theory (Sydney, 1992), Mathematical Applications, Vol. 325, Kluwer Academic, Dordrecht, 1995.
(^{3}) A. Karlsson, Applications of heat kernels on Abelian groups: ζ(2n), quadratic reciprocity, Bessel integral, preprint.
(^{4}) J. Borwein and P. Borwein, Pi and the AGM: A Study in Analytic Number Theory and Computational Complexity, John Wiley and Sons, New York, 1987.
(^{5}) P. Schumer, Mathematical Journeys, Wiley-Interscience, John Wiley & Sons, New York, 2004.
(^{6}) G. Folland, Real Analysis: Modern Techniques and Their Applications, 2nd edition, Pure and Applied Mathematics, Wiley-Interscience, New York, 1999.