Unit Fractions

3 Deduction of the main results

This section contains the deductions of the headline results from the main technical proposition, Propostion 6.6.

Theorem 3.1 Solution in sets of positive density
#

If \(A\subset \mathbb {N}\) has positive upper density then there is a finite \(S\subset A\) such that \(\sum _{n\in S}\frac{1}{n}=1\).

Proof

Suppose \(A\subset \mathbb {N}\) has upper density \(\delta {\gt}0\). Let \(y=C_1/\delta \) and \(z=\delta ^{-C_2\delta ^{-2}}\), where \(C_1,C_2\) are two absolute constants to be determined later. It suffices to show that there is some \(d\in [y,z]\) and finite \(S\subset A\) such that \(R(S)=1/d\). Indeed, given such an \(S\) we can remove it from \(A\) and still have an infinite set of upper density \(\delta \), so we can find another \(S'\subset A\backslash S\) with \(R(S')=1/d'\) for some \(d'\in [y,z]\), and so on. After repeating this process at least \(\lceil z-y\rceil ^2\) times there must be some \(d\in [y,z]\) with at least \(d\) disjoint \(S_1,\ldots ,S_d\subset A\) with \(R(S_i)=1/d\). Taking \(S=S_1\cup \cdots \cup S_d\) yields \(R(S)=1\) as required.

By definition of the upper density, there exist arbitrarily large \(N\) such that \(\left\lvert A\cap [1,N]\right\rvert \geq \frac{\delta }{2}N\). The number of \(n\in [1,N]\) divisible by some prime power \(q\geq N^{1-6/\log \log N}\) is

\[ \ll N \sum _{N^{1-6/\log \log N}{\lt}q\leq N}\frac{1}{q}\ll \frac{N}{\log \log N} \]

by Mertens’ estimate Lemma 2.6. Further, by Turán’s estimate Lemma 2.3

\[ \sum _{n\leq N}(\omega (n)-\log \log N)^2 \ll N \log \log N, \]

the number of \(n\in [1,N]\) that do not satisfy

\begin{equation} \label{divs} \tfrac {99}{100}\log \log N\leq \omega (n) \leq 2\log \log N \end{equation}
1

is \(\ll N/\log \log N\). Finally, provided we choose \(C_2\) sufficiently large in the definition of \(z\), Lemma 3.5 ensures that the proportion of all \(n\in \{ 1,\ldots ,N\} \) not divisible by at least two distinct primes \(p_1,p_2\in [y,z]\) with \(4p_1{\lt}p_2\) is at most \(\frac{\delta }{8}N\), say.

In particular, provided \(N\) is chosen sufficiently large (depending only on \(\delta \)), we may assume that \(\left\lvert A_N\right\rvert \geq \frac{\delta }{4}N\), where \(A_N\subset A\) is the set of those \(n\in A\cap [N^{1-1/\log \log N},N]\) which satisfy conditions (2)-(4) of Proposition 6.6. Since \(\left\lvert A_N\right\rvert \geq \frac{\delta }{4}N\),

\[ R(A_N) \gg -\log (1-\delta /4)\gg \delta . \]

In particular, since \(y=C_1/\delta \) for some suitably large constant \(C_1{\gt}0\), we have that \(R(A_N)\geq 4/y\), say. All of the conditions of Proposition 6.6 are now satisfied (provided \(N\) is chosen sufficiently large in terms of \(\delta \)), and hence there is some \(S\subset A_N\subset A\) such that \(R(S)=1/d\) for some \(d\in [y,z]\), which suffices as discussed above.

Theorem 3.2 Solution in sets of positive logarithmic density, quantitative version
#

There is a constant \(C{\gt}0\) such that the following holds. If \(A\subset \{ 1,\ldots ,N\} \) and

\[ \sum _{n\in A}\frac{1}{n}\geq C \frac{\log \log \log N}{\log \log N}\log N \]

then there is an \(S\subset A\) such that \(\sum _{n\in S}\frac{1}{n}=1\).

Proof

Let \(C\geq 2\) be an absolute constant to be chosen shortly, and for brevity let \(\epsilon = \log \log \log N/\log \log N\), so that we may assume that \(R(A)\geq C\epsilon \log N\). Since \(\sum _{n\leq X}\frac{1}{n}\ll \log X\), if \(A'=A\cap [N^\epsilon ,N]\) we have (assuming \(C\) is sufficiently large) \(R(A')\geq \frac{C}{2}\epsilon \log N\).

Let \(X\) be those integers \(n\in [1,N]\) not divisible by any prime \(p\in [5,(\log N)^{1/1200}]\). Lemma 3.4 implies that, for any \(x\geq \exp (\sqrt{\log N})\),

\[ \left\lvert X\cap [x,2x)\right\rvert \ll \frac{x}{\log \log N} \]

and hence, by partial summation,

\[ \sum _{\substack {n\in X\\ n\in [\exp (\sqrt{\log N}),N]}}\frac{1}{n}\ll \frac{\log N}{\log \log N}. \]

Similarly, if \(Y\) is the set of those \(N\in [1,N]\) such that \(\omega (n) {\lt}\frac{99}{100}\log \log N\) or \(\omega (n)\geq \frac{101}{100}\log \log N\) then Turán’s estimate Lemma 2.3

\[ \sum _{n\leq x}(\omega (n)-\log \log n)^2\ll x\log \log x \]

implies that \(\left\lvert Y\cap [x,2x)\right\rvert \ll x/\log \log N\) for any \(N\geq x\geq \exp (\sqrt{\log N})\), and so

\[ \sum _{\substack {n\in Y\\ n\in [\exp (\sqrt{\log N}),N]}}\frac{1}{n}\ll \frac{\log N}{\log \log N}. \]

In particular, provided we take \(C\) sufficiently large, we can assume that \(R(A'\backslash (X\cup Y))\geq \frac{C}{4}\epsilon \log N\), say.

Let \(\delta =1-1/\log \log N\), and let \(N_i=N^{\delta ^i}\), and \(A_i=(A'\backslash (X\cup Y))\cap [N_{i+1},N_i]\). Since \(N_i\leq N^{e^{-i/\log \log N}}\) and \(A'\) is supported on \(n\geq N^\epsilon \), the set \(A_i\) is empty for \(i{\gt} \log (1/\epsilon )\log \log N\), and hence by the pigeonhole principle there is some \(i\) such that

\[ R(A_i)\geq \frac{C}{8}\frac{\epsilon \log N}{(\log \log N)\log (1/\epsilon )}. \]

By construction, \(A_i\subset [N_{i+1},N_i]\subset [N_i^{1-1/\log \log N_i},N_i]\), and every \(n\in A_i\) is divisible by some prime \(p\) with \(5\leq p\leq (\log N)^{1/1200}\leq (\log N_i)^{1/500}\). Furthermore, every \(n\in A_i\) satisfies \(\omega (n)\geq \frac{99}{100}\log \log N\geq \frac{99}{100}\log \log N_i\) and \(\omega (n)\leq \frac{101}{99}\log \log N\leq 2\log \log N_i\).

Finally, it remains to discard the contribution of those \(n\in A_i\) divisible by some large prime power \(q{\gt} N_i^{1-6/\log \log N_i}\). The contribution to \(R(A_i)\) of all such \(n\) is at most

\[ \sum _{N_i^{1-6/\log \log N_i}{\lt} q\leq N_i}\sum _{\substack {n\leq N_i\\ q\mid n}}\frac{1}{n} \ll \sum _{N_i^{1-6/\log \log N_i}{\lt} q\leq N_i}\frac{\log (N_i/q)}{q} \]
\[ \ll \frac{\log N_i}{\log \log N_i}\sum _{N_i^{1-6/\log \log N_i}{\lt}q\leq N_i}\frac{1}{q}\ll \frac{\log N}{(\log \log N)^2}, \]

using Lemma 2.6. Provided we choose \(C\) sufficiently large, this is \(\leq R(A_i)/2\), and hence, if \(A_i'\subset A_i\) is the set of those \(n\) divisible only by prime powers \(q\leq N_i^{1-6/\log \log N_i}\), then \(R(A_i')\geq (\log N)^{1/200}\), say. All of the conditions of Corollary 3.3 are now met, and hence there is some \(S\subset A_i'\subset A\) such that \(R(S)=1\), as required.

Corollary 3.3 Useful Technical Corollary
#

Suppose \(N\) is sufficiently large and \(A\subset [N^{1-1/\log \log N},N]\) is such that

  1. \(R(A)\geq 2(\log N)^{1/500}\),

  2. every \(n\in A\) is divisible by some prime \(p\) satisfying \(5 \leq p \leq (\log N)^{1/500}\),

  3. every prime power \(q\) dividing some \(n\in A\) satisfies \(q\leq N^{1-6/\log \log N}\), and

  4. every \(n\in A\) satisfies

    \[ \tfrac {99}{100}\log \log N\leq \omega (n) \leq 2\log \log N. \]

There is some \(S\subset A\) such that \(R(S)=1\).

Proof

Let \(k\) be maximal such that there are disjoint \(S_1,\ldots ,S_k\subset A\) where, for each \(1\leq i\leq k\), there exists some \(d_i\in [1,(\log N)^{1/500}]\) such that \(R(S_i)=1/d_i\). Let \(t(d)\) be the number of \(S_i\) such that \(d_i=d\). If there is any \(d\) with \(t(d)\geq d\) then we are done, taking \(S\) to be the union of any \(d\) disjoint \(S_j\) with \(R(S_j)=1/d\). Otherwise,

\[ \sum _i R(S_i)= \sum _{1\leq d\leq (\log N)^{1/500}} \frac{t(d)}{d}\leq (\log N)^{1/500}, \]

and hence if \(A'=A\backslash (S_1\cup \cdots \cup S_k)\) then \(R(A')\geq (\log N)^{1/500}\).

We may now apply Proposition 6.6 with \(y=1\) and \(z=(\log N)^{1/500}\) – note that condition (2) of Proposition 6.6 follows from condition (2) of the hypotheses with \(d_1=1\) and \(d_2=p\in [5,(\log N)^{1/500}]\) some suitable prime divisor. Thus there exists some \(S'\subset A'\) such that \(R(S')=1/d\) for some \(d\in [1,(\log N)^{1/500}]\), contradicting the maximality of \(k\).

3.1 Sieve Lemmas

Lemma 3.4 Sieve Estimate 1
#

Let \(N\) be sufficiently large and \(z,y\) be two parameters such that \(\log N \geq z{\gt}y\geq 3\). If \(X\) is the set of all those integers not divisible by any prime in \(p\in [y,z]\) then

\[ \left\lvert X\cap [N,2N)\right\rvert \ll \frac{\log y}{\log z}N. \]

Proof

Lemma 2.9 yields

\[ \left\lvert X\cap [N,2N)\right\rvert = \prod _{y\leq p\leq z}\left( 1-\frac{1}{p}\right)N+ O(2^z). \]

Mertens’ estimate 2.8 yields

\[ \prod _{p\leq z}\left( 1-\frac{1}{p}\right)^{-1} \gg \log z \]

and

\[ \prod _{p\leq y}\left( 1-\frac{1}{p}\right)^{-1} \ll \log y, \]

whence

\[ \prod _{y\leq p\leq z}\left( 1-\frac{1}{p}\right)^{-1}\gg \frac{\log z}{\log y}, \]

and hence

\[ \prod _{y\leq p\leq z}\left( 1-\frac{1}{p}\right)\ll \frac{\log y}{\log z}. \]

Therefore the first term above is \(\ll \frac{\log y}{\log z}N\). The second term is

\[ \ll 2^z \leq 2^{\log N}=N^{\log 2}\ll \frac{N}{\log N}\ll \frac{\log y}{\log z}N, \]

and the result follows.

Lemma 3.5 Sieve Estimate 2
#

Let \(N\) be sufficiently large and \(z,y\) be two parameters such that \((\log N)^{1/2}\geq z{\gt}4y\geq 8\). If \(Y\subset [1,N]\) is the set of all those integers divisible by at least two distinct primes \(p_1,p_2\in [y,z]\) where \(4p_1{\lt}p_2\) then

\[ \left\lvert \{ 1,\ldots ,N\} \backslash Y\right\rvert \ll \left( \frac{\log y}{\log z}\right)^{1/2}N. \]

Proof

Let \(w\in (4y,z)\) be some parameter to be chosen later. Lemma 3.4 implies that the number of \(n\in \{ 1,\ldots ,N\} \) not divisible by any prime \(p\in [w,z]\) is \(\ll \frac{\log w}{\log z}N\).

Similarly, for any \(p\in [w,z]\), the number of those \(n\in [1,N]\) divisible by \(p\) and no prime \(q\in [y,p/4)\) is

\[ \ll \frac{\log y}{\log p}\frac{N}{p}. \]

It follows that the number of \(n\in \{ 1,\ldots ,N\} \backslash Y\) is

\[ \ll \left( \frac{\log w}{\log z}+ \log y\sum _{p\geq w}\frac{1}{p\log p}\right)N. \]

By partial summation, \(\sum _{p\geq w}\frac{1}{p\log p}\ll 1/\log w\), and hence

\[ \left\lvert \{ 1,\ldots ,N\} \backslash Y\right\rvert \ll \left( \frac{\log w}{\log z}+\frac{ \log y}{\log w}\right)N. \]

Choosing \(w=\exp \left( \sqrt{(\log y)(\log z)}\right)\) completes the proof.