Distance functions are functions that quantify the notion of distance between members of a set, or between objects.
Questions tagged [distance-functions]
344 questions
21
votes
1 answer
When to use weighted Euclidean distance and how to determine the weights to use?
I have a data set in which each observation consists of $n$ different measures. For each measure, I have a benchmark value. I would like to know how close each observation is to the benchmark values.
I thought of using the Weighted Euclidean Distance like…
Sara
- 1,487
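For the weighted Euclidean distance question above, a minimal sketch of the computation may help. It assumes the distance to the benchmark is $\sqrt{\sum_i w_i (x_i - b_i)^2}$ and, purely for illustration, uses inverse-variance weights. The function name `weighted_euclidean`, the sample values, and the weighting scheme are all assumptions, not taken from the question.

```python
import math


def weighted_euclidean(x, benchmark, weights):
    """Return sqrt(sum_i w_i * (x_i - b_i)^2) for equal-length sequences."""
    if not (len(x) == len(benchmark) == len(weights)):
        raise ValueError("x, benchmark, and weights must have the same length")
    if any(w < 0 for w in weights):
        raise ValueError("weights must be non-negative")
    return math.sqrt(
        sum(w * (xi - bi) ** 2 for xi, bi, w in zip(x, benchmark, weights))
    )


# Hypothetical data: three measures with very different scales.
benchmark = [10.0, 200.0, 0.5]
variances = [4.0, 400.0, 0.01]          # assumed per-measure variances
weights = [1.0 / v for v in variances]  # inverse-variance weighting (an assumption)

observation = [12.0, 180.0, 0.6]
distance = weighted_euclidean(observation, benchmark, weights)
print(distance)
```

With inverse-variance weights each measure contributes in units of its own spread, so a measure recorded on a large scale does not dominate the result. Other weighting schemes (for example, weights reflecting domain importance) fit the same function.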
7
votes
1 answer
What options are there to combine different distance functions?
I am currently working with feature vectors that are made up of continuous attributes, so I can use the Euclidean distance for things like KNN classification and clustering. Now I want to add a nominal attribute that has a special distance function…
Björn Pollex
- 1,383
4
votes
1 answer
square root missing in code?
I'm using the code from "Pairwise Mahalanobis distance in R" to calculate the Mahalanobis distance:
# express difference (X1-X2) as atomic row vector
d <- as.matrix(X1-X2)[1,]
# solve (covariance matrix) %*% x = d for x
x <- solve(cov(R),d)
#…
Ben
- 3,443
4
votes
1 answer
Is there a name for the function $|L_1 \cap L_2| / |L_1|$?
I am not able to find an existing name for the comparison function I wrote.
I have two lists of values which I want to compare. At first I used the Jaccard index, but then I realized I need to eliminate the length of the second list as it does not seem to…
aufziehvogel
- 213
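The ratio in the question above can be sketched next to the Jaccard index for comparison. This is an illustration only: it assumes the two lists are treated as sets (duplicates ignored), and the name `containment` is a label chosen here, since the ratio is sometimes described as the containment of the first set in the second.

```python
def containment(l1, l2):
    """Share of distinct values in l1 that also occur in l2: |L1 n L2| / |L1|."""
    s1, s2 = set(l1), set(l2)
    if not s1:
        raise ValueError("l1 must contain at least one value")
    return len(s1 & s2) / len(s1)


def jaccard(l1, l2):
    """Jaccard index: |L1 n L2| / |L1 u L2| (defined as 1.0 for two empty lists)."""
    s1, s2 = set(l1), set(l2)
    union = s1 | s2
    return len(s1 & s2) / len(union) if union else 1.0


# Hypothetical lists: the second one is much longer than the first.
first = ["a", "b", "c"]
second = ["b", "c", "d", "e", "f"]

print(containment(first, second))  # depends only on the size of `first`
print(jaccard(first, second))      # shrinks as `second` grows
print(containment(second, first))  # differs from containment(first, second)
```

Unlike the Jaccard index, this ratio is not symmetric, so it behaves as a directed similarity score rather than as a distance.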
3
votes
0 answers
Similarity Amongst Recipes Using Ingredients and Reviews/Descriptions
I'm still toying with things and just learning this, so please forgive any incorrect terminology.
My toy data set is a collection of recipes with a fairly significant overlap in ingredients. I'm using these as my features, and using Pearson squared…
Peck
- 131
3
votes
0 answers
What is a good similarity measure to use when missing data is a significant issue?
I have a list of cities that I want to compare in terms of their similarity. Each city can be described by a large but finite number of characteristics, but most of them will have missing data for some random number of characteristics. If I consider…
KumaKuma
- 31
2
votes
0 answers
Way to Measure Groupings Using Distances Between Individuals?
I am working on a problem that requires me to measure groupings of people. I have the location of every individual in my sample at every point in time. It's therefore trivial to calculate the distances between each individual for a certain point in…
user48944
- 121
1
vote
0 answers
Negative Mahalanobis Distance
I would like to calculate a compound score of several normally distributed, continuous, standardized (z-score) variables.
Some of these measures are correlated, some are not. Hence, I would like to take into account the correlation among them.
If I…
Vincent
- 281
1
vote
0 answers
Covariance matrix of individuals or of the pool?
First: I have individuals represented by vectors with four entries/properties:
Individual1:
Height Age Blood Gender
171 24 A w
Individual2:
Height Age Blood Gender
179 21 B m
Individual3:
Height Age Blood Gender
181 33 B…
Ben
- 3,443
1
vote
2 answers
Mahalanobis distance as measure of dissimilarity between strings (sequences)
I'm doing some research about methods for distance-based comparison of composition of biological sequences (genes, proteins).
Suppose I have two strings (named X and Y) of different lengths, but from a finite alphabet (A, C, T, G):
X = 'ACGT'
Y =…
s_sherly
1
vote
1 answer
Mahalanobis Distance and feature scaling
I've been using Mahalanobis distance to look for outliers. This link: https://www.cs.princeton.edu/courses/archive/fall08/cos436/Duda/PR_Mahal/M_metric.htm says that feature scaling is addressed in the computation of the Mahalanobis distance. Am I…
AlanD
- 13
- 4
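The statement quoted in the question above, that the Mahalanobis distance already accounts for feature scaling, can be checked numerically. The sketch below is a minimal two-feature version under stated assumptions: the data are made up, and `covariance_2d` and `mahalanobis_2d` are illustrative helper names rather than library functions.

```python
import math


def covariance_2d(points):
    """Return the mean and the sample covariance entries (sxx, sxy, syy)."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mx) ** 2 for p in points) / (n - 1)
    syy = sum((p[1] - my) ** 2 for p in points) / (n - 1)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / (n - 1)
    return (mx, my), (sxx, sxy, syy)


def mahalanobis_2d(x, mean, cov):
    """Return sqrt(d' S^-1 d) for a 2x2 covariance S, with d = x - mean."""
    sxx, sxy, syy = cov
    det = sxx * syy - sxy * sxy
    if det <= 0:
        raise ValueError("covariance matrix must be positive definite")
    dx, dy = x[0] - mean[0], x[1] - mean[1]
    # Closed-form inverse of a 2x2 matrix applied to the quadratic form.
    quad = (syy * dx * dx - 2.0 * sxy * dx * dy + sxx * dy * dy) / det
    return math.sqrt(quad)


# Made-up sample and query point.
data = [(1.0, 2.0), (2.0, 1.5), (3.0, 3.5), (4.0, 3.0), (5.0, 5.5)]
query = (3.5, 2.0)

# Same data with the second feature expressed in units 1000 times smaller.
scaled_data = [(x, 1000.0 * y) for x, y in data]
scaled_query = (query[0], 1000.0 * query[1])

mean, cov = covariance_2d(data)
scaled_mean, scaled_cov = covariance_2d(scaled_data)

d_raw = mahalanobis_2d(query, mean, cov)
d_scaled = mahalanobis_2d(scaled_query, scaled_mean, scaled_cov)
print(d_raw, d_scaled)  # the two values agree up to floating-point error
```

Rescaling a feature rescales the covariance matrix in the same way, so the factor cancels in the quadratic form. The final square root is part of the distance; omitting it yields the squared Mahalanobis distance.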
0
votes
0 answers
Help calculating the Bhattacharyya Coefficient for measuring tracking efficiency
I'm working with OpenCV which has some tracking algorithms (BOOSTING, MIL, KCF, MEDIANFLOW, TLD, ...).
I've read many papers where they use the Bhattacharyya Coefficient to measure the efficiency of each tracker.
I know the formula is:
But I don't…