I want to remap the values of an integer 1D tensor for each batch, so it should be pretty fast. The mapping is from the tensor's unique values to 0:(n_unique - 1).
If it were numpy, I could do something like this:
x = np.array([1,2,4,4])
rep = dict(enumerate(np.unique(x)))
rep_inv = dict(zip(rep.values(), rep.keys()))
x_map = np.vectorize(rep_inv.get)(x)
x_map
array([0, 1, 2, 2])
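(Side note: in NumPy the same mapping can be obtained in a single call via the `return_inverse` argument, which is essentially what I'm trying to replicate in TF:)

```python
import numpy as np

x = np.array([1, 2, 4, 4])
# return_inverse gives, for each element of x, its index into the
# sorted unique values -- the 0:(n_unique - 1) mapping in one call
_, x_map = np.unique(x, return_inverse=True)
print(x_map)  # [0 1 2 2]
```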
I found this solution for tensorflow, which works a single time:
x = tf.constant([1,2,4,4], dtype = tf.int64)
def get_table(x):
    x_unique, _ = tf.unique(x)
    x_mapto = tf.range(tf.shape(x_unique)[0], dtype=tf.int64)
    table = tf.lookup.StaticVocabularyTable(
        tf.lookup.KeyValueTensorInitializer(
            x_unique,
            x_mapto,
            key_dtype=tf.int64,
            value_dtype=tf.int64,
        ),
        num_oov_buckets=1,
    )
    return table
table = get_table(x)
x_map = table.lookup(x)
x_map
<tf.Tensor: shape=(4,), dtype=int64, numpy=array([0, 1, 2, 2], dtype=int64)>
But when the second batch comes, I get an error:
OP_REQUIRES failed at lookup_table_op.cc:964 : Failed precondition: Table was already initialized with different data.
Is there a way to bypass this (is it a bug?), fix it, or achieve what I need with a different approach altogether? (I'm using TF 2.4.0.)
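(For reference, one alternative I'm considering: the second output of `tf.unique` appears to give exactly this index mapping, except that unique values come out in order of first appearance rather than sorted order. A minimal sketch of what I mean:)

```python
import tensorflow as tf

# tf.unique returns the unique values and, for every element of x,
# its index into those unique values -- a 0:(n_unique - 1) mapping.
# Unlike np.unique, the unique values are in order of first
# appearance, not sorted order.
def remap(x):
    _, x_map = tf.unique(x)
    return x_map

x1 = tf.constant([1, 2, 4, 4], dtype=tf.int64)
x2 = tf.constant([7, 7, 3, 9], dtype=tf.int64)
print(remap(x1).numpy())  # [0 1 2 2]
print(remap(x2).numpy())  # [0 0 1 2]
```

Would relying on that be equivalent for the per-batch case, since each batch builds its mapping from scratch anyway?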