One common approach would be to use an autoencoder (neural network) with an $n$-dimensional bottleneck. As you say, the output would then be $2^n$-dimensional again, and one would apply the softmax function to it. I'd assume you could train this relatively easily with gradient descent and e.g. a cross-entropy loss. Autoencoders are of course a pretty common idea; the network would essentially engineer its own $n$ features out of the original inputs via the layers leading up to the bottleneck.
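To make the idea concrete, here's a minimal sketch in PyTorch, assuming $n = 4$ (so a $2^n = 16$-dimensional input distribution) and arbitrarily chosen hidden-layer widths; everything here is a placeholder rather than a recommended architecture:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

n = 4                 # hypothetical number of bottleneck features
dim = 2 ** n          # dimension of the input/output distribution

# Autoencoder: 2^n -> ... -> n (bottleneck) -> ... -> 2^n, with a softmax on the output
class Autoencoder(nn.Module):
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Linear(dim, 8), nn.ReLU(),   # layer widths chosen arbitrarily
            nn.Linear(8, n),
        )
        self.decoder = nn.Sequential(
            nn.Linear(n, 8), nn.ReLU(),
            nn.Linear(8, dim),
        )

    def forward(self, x):
        code = self.encoder(x)              # n-dimensional bottleneck
        logits = self.decoder(code)
        return F.softmax(logits, dim=-1)    # back to a 2^n-dimensional distribution

model = Autoencoder()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

# One gradient-descent step with a cross-entropy loss between the input
# distribution p and its reconstruction q (both sum to 1 along the last axis).
p = torch.rand(32, dim)
p = p / p.sum(dim=-1, keepdim=True)         # fake batch of distributions
q = model(p)
loss = -(p * torch.log(q + 1e-12)).sum(dim=-1).mean()
optimizer.zero_grad()
loss.backward()
optimizer.step()
```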
Of course, it's probably not clear what the best architecture is: how many layers before the bottleneck (e.g. 2), how many neurons in each layer (e.g. decreasing in some sensible way), and then the same questions on the way back up to the $2^n$-dimensional output. The same goes for which activation functions (e.g. ReLU) to use in which layers, how to regularize (I'd assume dropout might not be so great for a regression-type task, but maybe weight decay in some form), and whether to perhaps use something like a denoising autoencoder. All of these are probably best experimented with via a suitable cross-validation set-up.
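As a rough sketch of the last two knobs (reusing the model from above): weight decay is just an optimizer argument, and a denoising variant only corrupts what the encoder sees while the cross-entropy target stays the clean input. The noise level 0.05 and the weight decay 1e-4 are arbitrary values one would tune via cross-validation:

```python
# Weight decay via the optimizer; denoising via a corrupted input.
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3, weight_decay=1e-4)

noisy_p = p + 0.05 * torch.randn_like(p)               # corrupt the input ...
noisy_p = noisy_p.clamp(min=0)
noisy_p = noisy_p / noisy_p.sum(dim=-1, keepdim=True)  # ... keep it a valid distribution
q = model(noisy_p)
loss = -(p * torch.log(q + 1e-12)).sum(dim=-1).mean()  # ... but reconstruct the clean p
```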