I'm trying to model Y ~ x0 + x1 + x2 + x3 + x4, where Y is a continuous variable (cost), x0 is the intercept, x1 is a continuous variable (days), and x2-x4 are categorical variables with multiple levels. The categorical variable x2 has 156 levels, each representing a different diagnosis code (e.g. lung cancer, migraine). I want to include x2 in the model, but I don't want 156 separate dummy variables, one per diagnosis code.
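For concreteness, here is a minimal sketch of the setup on simulated data. Only the variable names come from the model above; the data frame `df`, the level labels, the sample size, and the coefficients are all made up for illustration. It shows how a plain `lm()` fit expands x2 into one dummy column per non-reference level.

```r
# Simulated stand-in data: every value here is hypothetical, not the real data set.
set.seed(1)
n <- 2000
codes <- sprintf("dx%03d", 1:156)                 # 156 made-up diagnosis codes

df <- data.frame(
  x1 = rpois(n, 5),                               # days
  x2 = factor(sample(rep(codes, length.out = n))),  # every code occurs at least once
  x3 = factor(sample(c("a", "b", "c"), n, replace = TRUE)),
  x4 = factor(sample(c("u", "v"), n, replace = TRUE))
)
df$Y <- 100 + 20 * df$x1 + rnorm(n, sd = 50)      # cost

# R adds the intercept (x0) automatically, so it is not written in the formula.
fit <- lm(Y ~ x1 + x2 + x3 + x4, data = df)

# Default treatment coding turns x2 into one 0/1 column per non-reference level.
X <- model.matrix(fit)
n_x2_dummies <- sum(grepl("^x2", colnames(X)))
n_x2_dummies
```

With default treatment contrasts, one level of x2 serves as the reference, so the design matrix carries 155 columns for x2 rather than 156.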
Here is a picture of the frequencies of each level (censored):

When I fit Y ~ x2 alone, about 2/3 of the levels are significant at the 0.05 level.
What is the best way to deal with this kind of problem in R?