First off, actors, fashion designers, athletes, etc. are not generally noted for their scrupulous attention to grammatical niceties.
Second, speech is a relatively spontaneous process in which expression and thought do not always keep in sync.
Third, you are correct:
There is two snakes in the ambulance
would not be said by any native speaker unless in a panic about finding such unwelcome guests among the first-aid supplies, under which circumstances some, perhaps even many, would be in an excusably confused frame of mind.
But that, of course, is not what Cage is thinking of. He is not thinking about actual snakes at all. He is thinking of a single representation that portrays two snakes twined around a single staff, the caduceus. It is a single symbol. As I said, when someone is speaking extemporaneously, they are also thinking, and the specific words uttered may not match the complexity of the thought. The thought here concerns the single representation, painted on the side of an ambulance, of a single, unified symbol, one that includes three components: a single staff and two snakes. Whether singular or plural verb forms are more apt is not likely to be an issue that the mind even perceives as it formulates that example.