As Steffen pointed out, the example matrix encodes the number of times a word appears in a text. The position of each count in the matrix is given by the word (its column) and by the text (its row).
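For concreteness, here is a minimal sketch of that dictionary-based encoding in Python (the texts are made-up examples):

```python
# Dictionary-based encoding: build the vocabulary first, then count.
texts = ["john likes movies", "mary likes movies too"]

# Every distinct word gets its own column index.
vocab = {}
for text in texts:
    for word in text.split():
        vocab.setdefault(word, len(vocab))

# rows = texts, columns = words; each cell counts occurrences.
matrix = [[0] * len(vocab) for _ in texts]
for row, text in enumerate(texts):
    for word in text.split():
        matrix[row][vocab[word]] += 1
```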
Now, the hashing trick works the same way, except that you don't have to define a dictionary mapping each word to its column position beforehand.
Instead, the hashing function determines both the range of possible column positions (its output is bounded by a minimum and a maximum value) and the exact column for the word you want to encode. For example, if the word "likes" is hashed to the number 5674, then column 5674 will hold the counts for the word "likes".
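A minimal sketch of that word-to-column mapping, assuming MD5 as the hashing function and 2**20 columns (both arbitrary choices; the 5674 above is just an illustration):

```python
import hashlib

N_COLUMNS = 2 ** 20  # fixes the range of possible column positions

def column_of(word):
    # Stable hash of the word into a column index. (Python's built-in
    # hash() is salted per process, so a cryptographic digest is used.)
    digest = hashlib.md5(word.encode("utf-8")).hexdigest()
    return int(digest, 16) % N_COLUMNS

print(column_of("likes"))  # always the same column in [0, N_COLUMNS)
```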
This way you don't need to build a dictionary before analyzing the text. If you use a sparse matrix as your text matrix, you don't even have to worry about the exact matrix size in advance. Just by scanning the text, on the fly, you convert words into column positions with the hashing function, and the text matrix gets populated with data (frequencies, for example) according to which document you are currently analyzing (the row position).
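Here is a minimal sketch of that on-the-fly population, assuming scipy is available and reusing the MD5-based column_of() from above (scipy does ask for a declared shape, but the huge column count costs nothing because the matrix is sparse):

```python
import hashlib
from scipy.sparse import lil_matrix

N_COLUMNS = 2 ** 20  # range of the hashing function, as above

def column_of(word):
    # Stable hash of the word into a column index.
    return int(hashlib.md5(word.encode("utf-8")).hexdigest(), 16) % N_COLUMNS

texts = ["john likes movies", "mary likes movies too"]  # made-up documents
matrix = lil_matrix((len(texts), N_COLUMNS), dtype=int)

for row, text in enumerate(texts):            # row position = document
    for word in text.split():
        matrix[row, column_of(word)] += 1     # column position = hashed word
```

This is essentially what scikit-learn's HashingVectorizer does, with some extra refinements (e.g. a signed hash to mitigate collisions).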