Define the element-wise thresholding operator $T_\tau(\cdot)$ with threshold $\tau$ as $$ [T_\tau(X)]_{i,j} = \begin{cases} X_{i,j} &\mbox{if } |X_{i,j}| \ge \tau, \\ 0 & \mbox{if } |X_{i,j}| < \tau. \end{cases} $$ Clearly, $T_\tau(X)$ can be computed in quadratic $O(n^2)$ time for some $n\times n$ matrix $X$.
Question: Suppose that the thresholded matrix $T_\tau(X)$ contains only $O(n)$ nonzero elements. Is it possible to compute $T_\tau(X)=T_\tau(UV^T)$ in linear $O(n)$ time (on a serial computer) from a low-rank factorization $U,V\in\mathbb{R}^{n\times r}$ of $X=UV^T$? (Take the rank $r$ to be an absolute constant, i.e. $r=O(1)$)
Some insights: Let the rows of $U$ and $V$ be written as $u_{1},\ldots,u_{n}$ and $v_{1},\ldots,v_{n}$, as in $$ U=\begin{bmatrix}u_{1}^{T}\\ u_{2}^{T}\\ \vdots\\ u_{n}^{T} \end{bmatrix},\qquad V=\begin{bmatrix}v_{1}^{T}\\ v_{2}^{T}\\ \vdots\\ v_{n}^{T} \end{bmatrix}. $$ Then we have $X_{i,j}=u_{i}^{T}v_{j}$. By Hölder's inequality, we have $$ |u_{i}^{T}v_{j}|\le\|u_{i}\|_{p}\|v_{j}\|_{q}\qquad\frac{1}{p}+\frac{1}{q}=1. $$ So it seems like we can just compute the $p$-norm of all rows of $U$ and $V$ in $O(n)$ time, and threshold the rows directly. However, this approach seems to be extremely conservative.