I am currently working with feature vectors made up of continuous attributes, so I can use the Euclidean distance for things like kNN classification and clustering. Now I want to add a nominal attribute that has a special distance function defined. What options do I have for combining these distance functions, so that I still obtain one distance between two vectors?
I can think of three:
- Combine them in a linear manner ($d = d_1 + \alpha d_2$) and find the best $\alpha$ by some optimization, say minimizing the cross-validation error for kNN or maximizing the silhouette score for clustering (a sketch of this is given after the list).
- Train separate classifiers / cluster the data a few times, once per distance, and then blend the results. This may not work too well because you only have two base methods.
- For classification only, you may use "klNN" -- get $k$ neighbors based on the first metric and $l$ based on the second, then combine the two neighbor sets, e.g. by a majority vote (see the second sketch below).
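For concreteness, here is a minimal sketch of the first option with scikit-learn. It assumes (purely for illustration) that the nominal attribute is stored as an integer code in the last column of `X` and that your special distance is available as a precomputed lookup table `nominal_dist`; adapt both to your actual data.

```python
# Sketch of option 1: d = d_euclidean + alpha * d_nominal, with alpha chosen
# by minimizing the cross-validation error of kNN (assumed data layout:
# continuous features first, integer nominal code in the last column).
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(0)

# Toy data: 3 continuous features + 1 nominal code in {0, 1, 2}.
X = np.hstack([rng.normal(size=(100, 3)), rng.integers(0, 3, size=(100, 1))])
y = rng.integers(0, 2, size=100)

# Hypothetical distance table for the nominal attribute (symmetric, zero diagonal).
nominal_dist = np.array([[0.0, 1.0, 2.0],
                         [1.0, 0.0, 1.5],
                         [2.0, 1.5, 0.0]])

def make_metric(alpha):
    """Combined distance d = d_1(continuous part) + alpha * d_2(nominal part)."""
    def d(u, v):
        d1 = np.linalg.norm(u[:-1] - v[:-1])          # Euclidean on continuous part
        d2 = nominal_dist[int(u[-1]), int(v[-1])]     # table lookup on nominal part
        return d1 + alpha * d2
    return d

# Grid-search alpha by maximizing mean CV accuracy (= minimizing CV error).
best_alpha, best_score = None, -np.inf
for alpha in [0.0, 0.25, 0.5, 1.0, 2.0, 4.0]:
    knn = KNeighborsClassifier(n_neighbors=5, metric=make_metric(alpha),
                               algorithm="brute")
    score = cross_val_score(knn, X, y, cv=5).mean()
    if score > best_score:
        best_alpha, best_score = alpha, score

print(f"best alpha = {best_alpha}, CV accuracy = {best_score:.3f}")
```

A callable metric forces brute-force neighbor search, so this is fine for moderate data sizes but slow for large ones; for clustering you would plug the same combined metric into a pairwise-distance matrix and score candidate $\alpha$ values with the silhouette instead.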
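And here is a sketch of the klNN idea. The original answer only says to take $k$ neighbors under the first metric and $l$ under the second; pooling the two neighbor sets and taking a majority vote is one plausible way to combine them, used here for illustration. The data layout and `nominal_dist` table are the same assumptions as in the previous sketch.

```python
# Sketch of "klNN": k neighbors under the Euclidean metric on the continuous
# part, l neighbors under the nominal metric, majority vote over the pooled
# (possibly overlapping) neighbor labels.
import numpy as np
from collections import Counter

def kl_nn_predict(X_train, y_train, x_query, nominal_dist, k=5, l=3):
    # Distances under the two metrics, computed separately.
    d_cont = np.linalg.norm(X_train[:, :-1] - x_query[:-1], axis=1)
    d_nom = nominal_dist[X_train[:, -1].astype(int), int(x_query[-1])]

    # k nearest under the Euclidean metric, l nearest under the nominal one.
    idx_cont = np.argsort(d_cont)[:k]
    idx_nom = np.argsort(d_nom)[:l]

    # Majority vote over the pooled neighbor labels.
    votes = np.concatenate([y_train[idx_cont], y_train[idx_nom]])
    return Counter(votes).most_common(1)[0][0]

# Toy usage: 3 continuous features + 1 nominal code in {0, 1, 2}.
rng = np.random.default_rng(0)
X = np.hstack([rng.normal(size=(50, 3)), rng.integers(0, 3, size=(50, 1))])
y = rng.integers(0, 2, size=50)
nominal_dist = np.array([[0.0, 1.0, 2.0],
                         [1.0, 0.0, 1.5],
                         [2.0, 1.5, 0.0]])
print(kl_nn_predict(X, y, X[0], nominal_dist))
```

Note that the nominal metric alone produces many ties, so which $l$ neighbors are returned among equally distant points is arbitrary; weighting the votes or breaking ties with the continuous distance are obvious refinements.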