This answer is based on the idea of Dana in her answer above.
I think you can construct such a matrix using two-source lossy condensers. Fix $\delta = 0.001$ and say $N=2^n$. Suppose you have an explicit function $f(x,y)$ that takes any two independent random sources $(X, Y)$, each of length $n$ and having min-entropy at least $k = n(1/2 - \delta)$, and outputs a sequence of $n' = n/2$ bits that is $\epsilon$-close to a distribution with min-entropy at least $k'=n(1/2-3\delta)$. I think you can use standard probabilistic arguments to show that a random function satisfies these properties (with overwhelming probability) if $2k > k'+\log(1/\epsilon)+O(1)$. The probabilistic argument should be similar to the one used in the following paper for lossless condensers and more general conductors:
M. Capalbo, O. Reingold, S. Vadhan, A. Wigderson. Randomness Conductors and Constant-Degree Expansion Beyond the Degree/2 Barrier
In our case, we set $\epsilon = 2^{-k'}$. Then $k'+\log(1/\epsilon) = 2k' = n(1-6\delta)$, which is smaller than $2k = n(1-2\delta)$ by $4\delta n$, so the condition above holds for large enough $n$ and we are sure about the existence of the function that we need. Now, an averaging argument shows that there is an $n'$-bit string $z$ such that the number of $(x,y)$ with $f(x,y)=z$ is at least $2^{2n}/2^{n'} = 2^{1.5 n}$. Suppose you know such a $z$ and fix it (you can pick an arbitrary $z$ if you additionally know that your function maps the fully uniform distribution to a distribution that is $O(2^{-n/2})$-close to uniform). Now index the entries of your $N \times N$ matrix by the pairs $(x,y)$ and put a $1$ at position $(x,y)$ iff $f(x,y)=z$. By our choice of $z$, this matrix has at least $2^{1.5n}$ ones.
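To make the construction concrete, here is a minimal Python sketch for tiny $n$. It assumes a random lookup table as a stand-in for $f$ (an explicit condenser is exactly what is missing), so it only illustrates the averaging step and the definition of the matrix, not the submatrix guarantee. The names `build_matrix` and `ones_in_submatrix` are hypothetical.

```python
import random
from collections import Counter

def build_matrix(n, seed=0):
    """Return (matrix, z, ones) with matrix[x][y] = 1 iff f(x, y) == z.

    Assumption: f is a random table, standing in for the condenser.
    n should be even so that n' = n/2 is an integer.
    """
    rng = random.Random(seed)
    N = 1 << n                 # the matrix is N x N with N = 2^n
    out_size = 1 << (n // 2)   # f outputs n' = n/2 bits
    f = [[rng.randrange(out_size) for _ in range(N)] for _ in range(N)]
    # Averaging: the most frequent output has at least N^2 / 2^(n/2) preimages.
    counts = Counter(v for row in f for v in row)
    z, ones = counts.most_common(1)[0]
    matrix = [[1 if f[x][y] == z else 0 for y in range(N)] for x in range(N)]
    return matrix, z, ones

def ones_in_submatrix(matrix, rows, cols):
    """Count the ones in the submatrix picked out by the given rows and columns."""
    return sum(matrix[x][y] for x in rows for y in cols)
```

For example, `build_matrix(8)` gives a $256 \times 256$ matrix, and `ones_in_submatrix` can then be used to count the ones in any particular choice of rows and columns.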
Now take any $2^k \times 2^k$ submatrix and let $X, Y$ be uniform distributions on the picked rows and columns, respectively. By the choice of $f$, we know that $f(X,Y)$ is $\epsilon$-close to having min-entropy $k'$. Therefore, if we pick a uniformly random entry of the submatrix, the probability of having a $1$ is at most $2^{-k'}+\epsilon\leq 2^{-k'+1}$. This means that you have at most $2^{2k-k'+1} = O(2^{n(1/2 + \delta)}) = O(N^{1/2+\delta})$ ones in the submatrix, as desired.
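Spelling out the exponent in this count with the parameters fixed earlier ($k = n(1/2-\delta)$, $k' = n(1/2-3\delta)$):

$$\begin{aligned}
2k - k' + 1 &= n(1-2\delta) - n(1/2-3\delta) + 1 \\
&= n(1/2+\delta) + 1 ,
\end{aligned}$$

so every $2^k \times 2^k$ submatrix has at most $2 \cdot 2^{n(1/2+\delta)} = 2N^{1/2+\delta}$ ones. With $\delta = 0.001$ this says that every $N^{0.499} \times N^{0.499}$ submatrix contains at most $2N^{0.501}$ ones, while the whole matrix contains at least $N^{1.5}$.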
Of course, coming up with an explicit $f$ with the desired parameters (in particular, nearly optimal output length) is a very challenging task, and no such function is known so far.