I recently taught an introduction to real analysis. I assigned a (covid-induced) take-home final, which included the question:
Define the set $S$ by $$ \bigcup_{n=1}^\infty \left\{ \frac{a}{2^n}\colon 0\le a\le 4^n, a\in\mathbb{Z} \right\}. $$ Identify $\bar S$, showing that $\bar S$ is what you claim. [Hint: for any real number $x$ and any positive integer $k$, there exists $a \in \mathbb Z$ such that $kx \in [a,a+1)$, so that $|x-\frac a k|<\frac 1 k$.]
Shockingly (to me), I received 9 essentially identical solutions that all contained the same serious and somewhat subtle error. I first suspected a "homework help" site, and indeed found that the question had been posted and answered on Chegg (a colleague who subscribes to that website showed me the solution posted there, which was identical to the incorrect solution).
However, that's not the end of the story! Another student, when challenged, said she "had never heard of Chegg" and claimed she had found the solution in some materials she had been studying "before the exam". She then gave me a link to a GitHub repository (see question 19), apparently dated 3 years ago, that contained almost exactly my question (which I thought I was the first to invent!), with the same incorrect solution that appears on Chegg. Given the quite distinctive formatting of the solutions and the distinctive error that they all contained, I am convinced this is not a case of two people independently making the same mistake.
My question: How were the "cheggspert" (i.e. the person employed by Chegg to solve students' exam questions) and my student able to locate the GitHub repository containing this question?
Needless to say, I harbour the hope that understanding how this kind of cheating takes place will give me ways to prevent it in the future.