There are many ways to compare them. Which one is most useful depends not only on what the data look like, but also on which differences you find most interesting. Your preferences may change, and if they do, the answer you find most useful may change with them.
One of the first things you need to decide is whether you care about differences in ratios or differences in absolute counts. If set C has four thousand pizzas and one thousand cakes, is that a perfect match for B, because the ratios are equal, or almost no match at all, because the total counts are so far apart? That is a matter of preference, and it can vary with how you want to look at the data. If you think ratios are what matter most, the other answer is fine, but it is far from the only option. In such cases I usually use the mutual information or the Bhattacharyya divergence, or their alternate forms, information distance and Hellinger distance. I avoid the Kullback–Leibler divergence because it is not symmetric, but that may not bother you the way it does me.
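If ratios are what you care about, one way to see how these behave is to normalize each multiset's counts into a probability distribution and compute the Bhattacharyya coefficient, and from it the Hellinger distance. Here is a minimal sketch in Python (my choice of language; the counts I give B are an assumption, chosen only so its ratios match C's):

```python
from collections import Counter
from math import sqrt

def hellinger(a: Counter, b: Counter) -> float:
    """Hellinger distance between the normalized (ratio-only) versions of a and b."""
    total_a, total_b = sum(a.values()), sum(b.values())
    # Bhattacharyya coefficient: sum over all items of sqrt(p_i * q_i)
    bc = sum(sqrt((a[k] / total_a) * (b[k] / total_b)) for k in set(a) | set(b))
    return sqrt(max(0.0, 1.0 - bc))   # max() guards against tiny floating-point overshoot

B = Counter(pizza=4, cake=1)          # assumed counts for B, with a 4:1 ratio
C = Counter(pizza=4000, cake=1000)    # the four thousand pizzas and one thousand cakes
print(hellinger(B, C))                # ~0.0: identical ratios, so a "perfect match" here
```

Because this only looks at ratios, the enormous difference in total counts is invisible to it, which is exactly the preference question above.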
If absolute counts do matter, then none of those methods is appropriate, because they all assume you are comparing probability distributions, which only describe relative ratios. Here is an easy method that does work for absolute counts: define the similarity $S=|A\cap B|/|A\cup B|$, the size (total number of objects contained, or "cardinality") of the intersection divided by the size of the union. Since we are talking about sets with multiplicity, intersection and union are a bit trickier to define than for ordinary yes-or-no sets, but it is still straightforward. In particular, let $A\cap B$ be the largest multiset that is a subset of both A and B (take the smaller of the two counts for each item), and let $A\cup B$ be the smallest multiset that contains both A and B as subsets (take the larger of the two counts). Then in your example, the intersection is {pizza: 3, cake: 1} and the union is {pizza: 4, soda: 5, cake: 2}, so the similarity is 4/11, on a scale from 0 to 1. Zero happens when there are no items in common, and one happens when the two multisets are identical.
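As a concrete sketch (in Python, again my choice), `collections.Counter` happens to implement exactly these multiset operations: `&` takes the elementwise minimum of counts and `|` takes the elementwise maximum. The two multisets below are an assumption on my part, one pair consistent with the intersection and union worked out above; your actual data may differ.

```python
from collections import Counter

def similarity(a: Counter, b: Counter) -> float:
    """|A ∩ B| / |A ∪ B| for multisets; 0 = nothing in common, 1 = identical."""
    union = a | b                      # elementwise max of counts
    if not union:
        return 1.0                     # both empty: identical by convention
    return sum((a & b).values()) / sum(union.values())

A = Counter(pizza=3, soda=5, cake=2)   # assumed multiset A
B = Counter(pizza=4, cake=1)           # assumed multiset B
print(A & B)                           # Counter({'pizza': 3, 'cake': 1})
print(A | B)                           # Counter({'soda': 5, 'pizza': 4, 'cake': 2})
print(similarity(A, B))                # 4/11 ≈ 0.364
```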
A very flexible method is to consider your multisets as column vectors of counts, and define measures as row vectors of weights to dot-product with those column vectors; comparing two sets then amounts to comparing their weighted totals. Different weight vectors give different measures; common examples are price, mass, volume, calories, grams of sugar, and so on. This leads naturally to classic optimization concepts like the knapsack problem, for instance figuring out how to obtain the largest number of calories for a fixed number of dollars. Which weight vector is best depends on the question you want your data to answer, and that is entirely up to you.
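A minimal sketch of that idea, once more in Python; the item names and weight values are made up purely for illustration:

```python
from collections import Counter

def score(counts: Counter, weights: dict) -> float:
    """Dot product of a multiset's count vector with a row vector of weights."""
    return sum(n * weights.get(item, 0.0) for item, n in counts.items())

A = Counter(pizza=3, soda=5, cake=2)                     # assumed multiset
price    = {"pizza": 12.0, "soda": 2.0, "cake": 15.0}    # dollars per item (made up)
calories = {"pizza": 2200, "soda": 150, "cake": 1800}    # kcal per item (made up)

print(score(A, price))      # total cost under the "price" weight vector
print(score(A, calories))   # total calories under a different weight vector
```

Two multisets that look similar under one weight vector (say, price) can look very different under another (say, calories), which is the sense in which the choice of weights is a choice of what you consider interesting.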