Consider a formulation of LWE where we are given either $(x, Sx+e)$ or $(x,u)$ --- where $S$ is an $m \times n$ secret/hidden matrix, $x$ is a randomly sampled $n \times 1$ vector, $e$ is an $m \times 1$ Gaussian error vector, and $u$ is a uniformly random $m \times 1$ vector --- and are asked to distinguish between the two cases. This should be hard for classical algorithms, according to the post here. Call this problem "reverse-LWE."
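Just to make my setup concrete, here is a minimal sketch of the two distributions. The toy parameters ($q = 3329$, $\sigma = 3$, the dimensions) and the rounded continuous Gaussian standing in for a discrete Gaussian error are illustrative assumptions on my part, not part of the problem statement.

```python
# Illustrative sampler for the two cases (toy parameters; the modulus q and
# Gaussian width sigma are assumptions, and a rounded continuous Gaussian
# stands in for a discrete Gaussian error).
import numpy as np

q, n, m, sigma = 3329, 8, 16, 3.0
rng = np.random.default_rng(0)

S = rng.integers(0, q, size=(m, n))       # the m x n hidden matrix

def reverse_lwe_sample():
    """One structured sample (x, S x + e mod q)."""
    x = rng.integers(0, q, size=n)
    e = np.rint(rng.normal(0, sigma, size=m)).astype(int)
    return x, (S @ x + e) % q

def uniform_sample():
    """One uniform sample (x, u)."""
    x = rng.integers(0, q, size=n)
    u = rng.integers(0, q, size=m)
    return x, u
```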
I had a few questions about the setting.
Is the distinguishing problem hard without $e$?
Note that in standard LWE, where we are given $(A, As+e)$ or $(A, u)$ and asked to distinguish between the two cases, the problem is easy without the error: we just solve a system of linear equations to recover the $n$ entries of $s$ and check that the solution is consistent with the given samples.
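To make that concrete, here is a rough sketch of the attack over $\mathbb{Z}_q$ (the toy parameters, the prime modulus $q$, and the helper `solve_mod_q` are just illustrative assumptions, not taken from the linked post): Gaussian elimination recovers $s$ exactly from $(A, As)$, and checking whether $As = b$ is solvable at all distinguishes the two cases when $m > n$.

```python
# Sketch: without the error term, standard LWE collapses to linear algebra.
# (Toy parameters; q is assumed prime so modular inverses exist.)
import numpy as np

def solve_mod_q(A, b, q):
    """Gaussian elimination over Z_q (q prime): return a candidate s with A s = b."""
    M = np.hstack([A % q, b.reshape(-1, 1) % q])
    row = 0
    for col in range(A.shape[1]):
        piv = next((r for r in range(row, M.shape[0]) if M[r, col]), None)
        if piv is None:
            continue                                   # no pivot in this column
        M[[row, piv]] = M[[piv, row]]                  # move the pivot row up
        M[row] = (M[row] * pow(int(M[row, col]), -1, q)) % q
        for r in range(M.shape[0]):
            if r != row:
                M[r] = (M[r] - M[r, col] * M[row]) % q
        row += 1
    return M[:A.shape[1], -1]                          # candidate secret s

q, n, m = 3329, 8, 16
rng = np.random.default_rng(1)
s = rng.integers(0, q, size=n)
A = rng.integers(0, q, size=(m, n))

b = (A @ s) % q                                        # second component, no error
s_cand = solve_mod_q(A, b, q)
print(np.array_equal(s_cand, s))                       # True: s is recovered exactly

u = rng.integers(0, q, size=m)                         # uniform second component
s_u = solve_mod_q(A, u, q)
print(np.array_equal((A @ s_u) % q, u))                # False w.h.p.: u is not of the form A s
```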
However, here we need to find the $mn$ entries of our secret matrix $S$, and I do not see how to do that with just the $m$ equations a single sample provides.
Consider a variant of the problem, where we are given either $$ \{ (x_1, Sx_1 + e_1), (x_2, Sx_2 + e_2), \ldots, (x_k, Sx_k + e_k)\} ~~\text{or}$$
$$\{ (x_1, u_1), (x_2, u_2), \ldots, (x_k, u_k)\},$$
and asked to distinguish between the two cases, where $k$ is polynomially large in $n$. Call this problem "reverse LWE with a repeated secret." Is this problem hard?
Note that a hybrid argument (like the one used in one of the answers here) indicates that the problem remains hard. Here is the hybrid $H_i$:
$$H_i = \{ (x_1, Sx_1 + e_1), (x_2, Sx_2 + e_2), \ldots, (x_i, Sx_i + e_i), (x_{i+1}, u_{i+1}) \ldots, (x_{k}, u_{k}) \} .$$
Then, since distinguishing consecutive hybrids $H_{i-1}$ and $H_i$ amounts to distinguishing a single reverse-LWE sample, it seems that anyone who solves "reverse LWE with a repeated secret" can also solve reverse LWE. Since reverse LWE is hard, our problem should also be hard.
However, I am having difficulty wrapping my head around this fact.
Note that if we do not have the error term, there is a very easy way of distinguishing between the two cases when $k \geq n+1$. Since the $x_i$ are $n$-dimensional, at most $n$ of them can be linearly independent. So the distinguisher collects $n$ linearly independent $x_i$'s from the given samples, records their second components, writes some additional $x_j$ as a linear combination of those $x_i$'s, uses linearity to predict what $S x_j$ would be, and then checks whether that prediction matches the second component that was actually given.
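Here is a rough sketch of that error-free distinguisher (again with illustrative toy parameters; `solve_mod_q` and `distinguish` are my own names, and I assume the first $n$ of the $x_i$'s are linearly independent, which happens with overwhelming probability for random $x_i$'s):

```python
# Sketch of the linear-dependence distinguisher for the ERROR-FREE repeated-secret
# variant (toy parameters; assumes the first n of the x_i are linearly independent,
# which holds with overwhelming probability for uniformly random x_i).
import numpy as np

q, n, m, k = 3329, 8, 16, 12
rng = np.random.default_rng(2)

def solve_mod_q(A, b, q):
    """Gaussian elimination over Z_q (q prime), assuming A has full column rank."""
    M = np.hstack([A % q, b.reshape(-1, 1) % q])
    for col in range(A.shape[1]):
        piv = next(r for r in range(col, M.shape[0]) if M[r, col])
        M[[col, piv]] = M[[piv, col]]                 # move the pivot row up
        M[col] = (M[col] * pow(int(M[col, col]), -1, q)) % q
        for r in range(M.shape[0]):
            if r != col:
                M[r] = (M[r] - M[r, col] * M[col]) % q
    return M[:A.shape[1], -1]

def distinguish(samples):
    """Return 'structured' if the samples look like (x_i, S x_i), else 'uniform'."""
    xs = np.array([x for x, _ in samples])
    ys = np.array([y for _, y in samples])
    X, Y = xs[:n], ys[:n]                    # n samples acting as a basis
    x_test, y_test = xs[n], ys[n]            # the (n+1)-th sample to check
    c = solve_mod_q(X.T, x_test, q)          # coefficients with x_test = sum_i c_i x_i
    y_pred = (c @ Y) % q                     # by linearity, equals S x_test if structured
    return "structured" if np.array_equal(y_pred, y_test) else "uniform"

S = rng.integers(0, q, size=(m, n))
structured = [(x, (S @ x) % q) for x in rng.integers(0, q, size=(k, n))]
uniform = [(rng.integers(0, q, size=n), rng.integers(0, q, size=m)) for _ in range(k)]
print(distinguish(structured), distinguish(uniform))   # structured uniform
```

In the structured case the check passes for every additional sample, while in the uniform case it fails except with probability $q^{-m}$.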
Why do the error terms make this distinguisher fail? Even with Gaussian error, because of the linear dependence, shouldn't the second component of that $(n+1)^{\text{th}}$ sample be sufficiently concentrated around the predicted value for my distinguisher to succeed?
The same tension appears in the error-free case: consider distinguishing $$\{ (x_1, Sx_1), (x_2, Sx_2), \ldots, (x_k, Sx_k)\} ~~\text{from}~~ \{ (x_1, u_1), (x_2, u_2), \ldots, (x_k, u_k)\}.$$ This should also be hard then, by a hybrid argument. However, we know this problem is easy for $k \geq n+1$ by the distinguisher I outlined (checking for linear dependence, using the fact that there can be only $n$ linearly independent $x_i$'s). How can both be true?