Questions tagged [distance-functions]

Distance functions refer to functions used for quantifying the notion of distance between members of a set, or between objects.

344 questions
21
votes
1 answer

When to use weighted Euclidean distance and how to determine the weights to use?

I have a data set where each data point consists of $n$ different measures. For each measure, I have a benchmark value. I would like to know how close each data point is to the benchmark values. I thought of using the Weighted Euclidean Distance like…
Sara
  • 1,487
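
For readers unfamiliar with the measure named in this question, here is a minimal sketch of a weighted Euclidean distance between one data point and a vector of benchmark values. The function name `weighted_euclidean` and the example weights are hypothetical. How to choose the weights is the open part of the question and is not answered here.

```python
import math


def weighted_euclidean(x, benchmark, weights):
    """Return sqrt(sum_i w_i * (x_i - b_i)^2) for equally long sequences."""
    if not (len(x) == len(benchmark) == len(weights)):
        raise ValueError("x, benchmark and weights must have the same length")
    if any(w < 0 for w in weights):
        raise ValueError("weights must be non-negative")
    return math.sqrt(
        sum(w * (a - b) ** 2 for a, b, w in zip(x, benchmark, weights))
    )


# Illustrative values only: the second measure is down-weighted.
distance = weighted_euclidean([0.0, 4.0], [0.0, 0.0], [1.0, 0.25])
```

With all weights equal to 1 this reduces to the ordinary Euclidean distance.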
7
votes
1 answer

What options are there to combine different distance functions?

I am currently working with feature vectors that are made up of continuous attributes, so I can use the Euclidean distance for things like KNN classification and clustering. Now I want to add a nominal attribute that has a special distance function…
4
votes
1 answer

square root missing in code?

I'm using Pairwise Mahalanobis distance in R as code to calculate the Mahalanobis distance: # express difference (X1-X2) as atomic row vector d <- as.matrix(X1-X2)[1,] # solve (covariance matrix) %*% x = d for x x <- solve(cov(R),d) #…
Ben
  • 3,443
4
votes
1 answer

Is there a name for the function | L1 sect L2 | / | L1 |

I am not able to find an existing name for the comparison function I wrote. I have two lists of values which I want to compare. At first I used Jaccard, but then I realized I need to eliminate the length of the second list as it does not seem to…
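
The function in this question's title can be sketched as follows, assuming "sect" stands for set intersection and that both lists are treated as sets (duplicates ignored). The name `overlap_fraction` is hypothetical and is not offered as the established name the asker is looking for.

```python
def overlap_fraction(l1, l2):
    """Return |L1 ∩ L2| / |L1|, treating both inputs as sets."""
    s1, s2 = set(l1), set(l2)
    if not s1:
        raise ValueError("l1 must contain at least one element")
    return len(s1 & s2) / len(s1)


# Unlike Jaccard, the result depends on which list comes first.
forward = overlap_fraction([1, 2, 3, 4], [3, 4, 5])
backward = overlap_fraction([3, 4, 5], [1, 2, 3, 4])
```

Because only the size of the first list appears in the denominator, the measure is asymmetric, so it is not a distance in the metric sense.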
3
votes
0 answers

Similarity Amongst Recipes Using Ingredients and Reviews/Descriptions

I'm still toying with things and just learning this, so please forgive any incorrect terminology. My toy data set is a collection of recipes with a fairly significant overlap in ingredients. I'm using these as my features, and using Pearson squared…
Peck
  • 131
3
votes
0 answers

What is a good similarity measure to use when missing data is a significant issue?

I have a list of cities that I want to compare in terms of their similarity. Each city can be described by a large but finite number of characteristics, but most of them will have missing data for some random number of characteristics. If I consider…
2
votes
0 answers

Way to Measure Groupings Using Distances Between Individuals?

I am working on a problem that requires me to measure groupings of people. I have the location of every individual in my sample at every point in time. It's therefore trivial to calculate the distances between each individual for a certain point in…
user48944
  • 121
1
vote
0 answers

Negative Mahalanobis Distance

I would like to calculate a compound score of several normally distributed, continuous, standardized (z-score) variables. Some of these measures are correlated, some are not. Hence, I would like to take into account the correlation among them. If I…
Vincent
  • 281
1
vote
0 answers

covariance matrix of individuals or of the pool?

At first: I have individuals represented by vectors with four entries/properties: Individual1: Height Age Blood Gender 171 24 A w Individual2: Height Age Blood Gender 179 21 B m Individual3: Height Age Blood Gender 181 33 B…
Ben
  • 3,443
1
vote
2 answers

Mahalanobis distance as measure of dissimilarity between strings (sequences)

I'm doing some research about methods for distance-based comparison of composition of biological sequences (genes, proteins). Suppose I have two strings (named X and Y) of different lengths, but from a finite alphabet (A, C, T, G): X = 'ACGT' Y =…
s_sherly
1
vote
1 answer

Mahalanobis Distance and feature scaling

I've been using Mahalanobis distance to look for outliers. This link: https://www.cs.princeton.edu/courses/archive/fall08/cos436/Duda/PR_Mahal/M_metric.htm says that feature scaling is addressed in the computation of the Mahalanobis distance. Am I…
AlanD
  • 13
  • 4
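
The claim in this question, that the Mahalanobis distance already accounts for feature scaling, can be checked numerically. Below is a minimal sketch restricted to two features so the covariance matrix can be inverted by hand. The helper names and the data are made up for illustration.

```python
import math


def mean_and_cov_2d(points):
    """Sample mean and sample covariance entries (sxx, sxy, syy) of 2-D points."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mx) ** 2 for p in points) / (n - 1)
    syy = sum((p[1] - my) ** 2 for p in points) / (n - 1)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / (n - 1)
    return (mx, my), (sxx, sxy, syy)


def mahalanobis_2d(point, points):
    """Mahalanobis distance of `point` from the sample mean of `points`."""
    (mx, my), (sxx, sxy, syy) = mean_and_cov_2d(points)
    det = sxx * syy - sxy ** 2
    if det == 0:
        raise ValueError("covariance matrix is singular")
    dx, dy = point[0] - mx, point[1] - my
    # d^T S^{-1} d, using the closed-form inverse of a 2x2 symmetric matrix.
    squared = (syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det
    return math.sqrt(squared)


# Made-up data; the first feature is then rescaled by a factor of 1000.
data = [(1.0, 2.0), (2.0, 1.0), (3.0, 5.0), (4.0, 3.0), (5.0, 6.0)]
scaled = [(x * 1000.0, y) for x, y in data]

d_raw = mahalanobis_2d(data[0], data)
d_scaled = mahalanobis_2d(scaled[0], scaled)
```

Up to floating-point error, `d_raw` and `d_scaled` agree, because rescaling a feature rescales the covariance matrix in a way that cancels out. This holds when the covariance is estimated from the same rescaled data.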
0
votes
0 answers

Help calculating the Bhattacharyya Coefficient for measuring tracking efficiency

I'm working with OpenCV which has some tracking algorithms (BOOSTING, MIL, KCF, MEDIANFLOW, TLD, ...). I've read many papers where they use the Bhattacharyya Coefficient to measure the efficiency of each tracker. I know the formula is: But I don't…
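
The formula in this excerpt did not survive, so the following is only a sketch of the usual discrete form of the coefficient, $BC(p, q) = \sum_i \sqrt{p_i q_i}$, applied to two histograms with the same bins. Whether this is the exact variant the asker had in mind is an assumption. The function name is hypothetical and does not refer to any OpenCV call.

```python
import math


def bhattacharyya_coefficient(hist_p, hist_q):
    """Sum of sqrt(p_i * q_i) after normalising both histograms to sum to 1."""
    if len(hist_p) != len(hist_q):
        raise ValueError("histograms must have the same number of bins")
    total_p, total_q = sum(hist_p), sum(hist_q)
    if total_p <= 0 or total_q <= 0:
        raise ValueError("histograms must have positive total mass")
    return sum(
        math.sqrt((p / total_p) * (q / total_q))
        for p, q in zip(hist_p, hist_q)
    )


# Illustrative bin counts, e.g. colour histograms of two image patches.
same = bhattacharyya_coefficient([2, 6], [1, 3])
disjoint = bhattacharyya_coefficient([5, 0], [0, 7])
```

The value lies between 0 (no overlapping bins) and 1 (identical normalised histograms).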