Each rebus works as follows:
A string X is derived from the rebus itself, and a string Y from the overall typesetting (font size, color, etc.) of the rebus. The answer is then X placed inside Y, since the image literally shows X in (the style) Y. The cute figures below each of the top four images act as definitions for the answers, allowing us to confirm them as we progress.
Rebus #1:
The rebus is L IN DF written in BOLD, so the answer is BLINDFOLD.
Rebus #2:
P OR E in CORAL, giving CORPOREAL.
Rebus #3:
T IF IC in ARIAL, giving ARTIFICIAL.
Rebus #4:
E TO T in TEAL, giving TEETOTAL.
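The "X placed inside Y" rule can be checked mechanically. Below is a minimal Python sketch, assuming the rule means inserting X (with its spaces removed) as one contiguous block at some position in Y; the helper name `insert_positions` is made up for this illustration and is not part of the puzzle.

```python
def insert_positions(inner, outer, answer):
    """Return every index i where outer[:i] + inner + outer[i:] spells answer."""
    return [
        i
        for i in range(len(outer) + 1)
        if outer[:i] + inner + outer[i:] == answer
    ]


# (rebus text X, style Y, claimed answer), as given in the solution above
rebuses = [
    ("L IN DF", "BOLD", "BLINDFOLD"),
    ("P OR E", "CORAL", "CORPOREAL"),
    ("T IF IC", "ARIAL", "ARTIFICIAL"),
    ("E TO T", "TEAL", "TEETOTAL"),
]

for text, style, answer in rebuses:
    inner = text.replace(" ", "")  # X is read without its spaces
    positions = insert_positions(inner, style, answer)
    print(f"{text} in {style} -> {answer}: insert at {positions}")
```

Each of the four answers has exactly one valid insertion point, so the decompositions are unambiguous.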
And there's more!
As hinted by the numbers below, we can take the 3rd, 7th, 7th and 4th letters of these answers (I, E, C and T respectively) to form a new rebus:

So the final answer is:
CT IF IE in RED, giving RECTIFIED.
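The letter-extraction step and the final answer can be double-checked the same way. This is only a sketch: the helper name `pick_letters` is hypothetical, the indices are treated as 1-based as in the text, and the arrangement "CT IF IE" is taken from the solution as written rather than derived by the code.

```python
def pick_letters(words, positions):
    """Pick the positions[k]-th letter (1-based) from words[k]."""
    return [word[pos - 1] for word, pos in zip(words, positions)]


answers = ["BLINDFOLD", "CORPOREAL", "ARTIFICIAL", "TEETOTAL"]
indices = [3, 7, 7, 4]  # the numbers shown below the images

letters = pick_letters(answers, indices)
print(letters)

# Final rebus as stated in the solution: "CT IF IE" written in RED.
inner = "CT IF IE".replace(" ", "")
outer = "RED"
candidates = [outer[:i] + inner + outer[i:] for i in range(len(outer) + 1)]
print("RECTIFIED" in candidates)
```

The extracted letters are exactly the ones that appear around IF in the final rebus, and inserting CTIFIE into RED after the second letter spells RECTIFIED.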