Would it be unwise to scale the Sex variable?

I'm teaching myself clustering and will probably attempt UMAP, t-SNE, and k-means soon.

I saw someone else scale this dataset in Python. I understand why all the other variables were scaled, but not the first one, Sex, since it only takes the values 0/1 (Male/Female).
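
To make the question concrete, here is a minimal sketch of what standardizing a 0/1 column does. The toy `sex` and `age` columns and the `standardize` helper are made up for illustration and are not the actual dataset or the code I saw. The helper assumes plain z-scoring with the population standard deviation, which, as far as I know, is also what scikit-learn's `StandardScaler` computes.

```python
from statistics import mean, pstdev

def standardize(values):
    """Z-score a column: subtract the mean, divide by the population std."""
    m = mean(values)
    s = pstdev(values)
    return [(v - m) / s for v in values]

# Hypothetical toy columns (assumption: not the real data).
sex = [0, 1, 1, 0, 1, 1, 1, 0]           # 0/1 indicator
age = [22, 35, 58, 41, 29, 63, 47, 33]   # continuous variable

sex_z = standardize(sex)
age_z = standardize(age)

# After scaling, the binary column still has only two distinct values.
# The gap between them is 1 / std, so it depends on the class balance.
sex_levels = sorted(set(round(z, 6) for z in sex_z))
gap = sex_levels[1] - sex_levels[0]
print("scaled Sex levels:", sex_levels)
print("gap between levels:", gap)
```

So scaling does not change the fact that Sex has two levels. It only changes how far apart those levels sit relative to the other variables, which is what a distance-based method such as k-means would actually see.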

Thoughts?

Also, how would you go about choosing the scales for the other categorical variables? Is it by gut feeling?

Antonio
  • The issues discussed on the linked page with respect to penalized regression are essentially the same as those involved in clustering. There is no single correct answer to how to put categorical variables on comparable scales with each other or with continuous variables. You have to make careful choices based on your understanding of the subject matter. If you have more specific questions after you read that page and its links, please post a new question specific to what is still unclear to you. – EdM Dec 02 '22 at 17:38
  • @EdM Awesome. Thank you! – Antonio Dec 02 '22 at 19:08
