8

Let's say the task is to determine an element's position in an image. The first and most important step is correct detection of the object; then some algorithm is used to calculate the position (for example, blob analysis). Everything depends on multiple factors (detection correctness, the algorithms used, etc.).

Let's assume we have a calibrated image and know the error introduced by calibration. What methods are there to reliably calculate the precision of computer (and machine) vision algorithms? Can it be done analytically, or only by experiments and tests?

The question addresses cases where we detect element position, as well as other computer vision problems.

I want references to problems related to computer/machine vision, especially element position detection, that present correctness computations, using either an analytical or an experimental approach, to demonstrate this precision.

Suggestions on how to improve this question are also welcome.

krzych

2 Answers

5

For example, Hartley & Zisserman suggest preconditioning (normalizing the point coordinates) prior to homography estimation, because taking a direct matrix inverse can lead to huge errors or instabilities. This applies to any numerical method that involves matrix inversion.
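To make this concrete, here is a minimal sketch of the Hartley-style point normalization that H&Z recommend before DLT estimation: translate the points to their centroid and scale them so the mean distance from the origin is sqrt(2). The function name and interface are illustrative, not from any particular library.

```python
import numpy as np

def normalize_points(pts):
    """Translate 2-D points to their centroid and scale so the mean
    distance from the origin is sqrt(2) (Hartley normalization).
    Returns the normalized points and the 3x3 similarity transform T."""
    pts = np.asarray(pts, dtype=float)
    centroid = pts.mean(axis=0)
    mean_dist = np.linalg.norm(pts - centroid, axis=1).mean()
    s = np.sqrt(2) / mean_dist
    T = np.array([[s, 0, -s * centroid[0]],
                  [0, s, -s * centroid[1]],
                  [0, 0, 1]])
    pts_h = np.column_stack([pts, np.ones(len(pts))])
    return (T @ pts_h.T).T[:, :2], T
```

The homography is then estimated from the normalized points and denormalized afterwards, which keeps the linear system well-conditioned.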

Feature detection algorithms often use a sub-pixel approximation of the interest point location.
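As an illustration, one common approach is to fit a parabola through the detector response at an integer peak and its immediate neighbors and take the parabola's vertex. The 1-D sketch below shows the idea (applied per axis on 2-D response maps); it is illustrative, not taken from any specific detector.

```python
import numpy as np

def subpixel_peak(response, i):
    """Refine an integer peak index i of a 1-D response array to
    sub-pixel accuracy via quadratic interpolation of the three
    samples around the peak."""
    left, center, right = response[i - 1], response[i], response[i + 1]
    denom = left - 2.0 * center + right
    if denom == 0.0:
        return float(i)  # flat neighborhood, nothing to refine
    return i + 0.5 * (left - right) / denom  # vertex of the fitted parabola
```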

Most books discussing numerical methods also deal with their stability analysis.

Sometimes you need to do some statistics to analyze the precision and accuracy of your estimator (be it a least-squares or maximum-likelihood estimator). This is useful in algorithms like RANSAC, which deal with outliers. You would also like to know how well the estimated transform fits your data, and possibly discard results that are too inaccurate.
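A hedged sketch of this, using OpenCV's RANSAC homography estimator: compute the per-point reprojection residuals and summarize them, so results with too large an error can be discarded. `src_pts` and `dst_pts` are assumed to be matched Nx2 arrays from your own detector.

```python
import numpy as np
import cv2

def homography_with_precision(src_pts, dst_pts, reproj_thresh=3.0):
    """Estimate H with RANSAC and report the inlier reprojection RMSE
    in pixels as a precision statistic."""
    src = np.asarray(src_pts, dtype=np.float32)
    dst = np.asarray(dst_pts, dtype=np.float32)
    H, mask = cv2.findHomography(src, dst, cv2.RANSAC, reproj_thresh)
    projected = cv2.perspectiveTransform(src.reshape(-1, 1, 2), H).reshape(-1, 2)
    residuals = np.linalg.norm(projected - dst, axis=1)   # pixel errors
    inliers = mask.ravel().astype(bool)
    rmse = float(np.sqrt(np.mean(residuals[inliers] ** 2)))
    return H, rmse, residuals
```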

When working with finite differences or doing some filtering, a slight Gaussian blurring is typically applied first to remove noise, which would otherwise cause huge errors in the second derivatives.
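For instance, a minimal sketch using SciPy (the sigma value is an assumption to be tuned to your noise level):

```python
import numpy as np
from scipy.ndimage import gaussian_filter, laplace

def smoothed_laplacian(image, sigma=1.0):
    """Gaussian-smooth first, then take the finite-difference Laplacian;
    differentiating the raw image would amplify pixel noise."""
    return laplace(gaussian_filter(image.astype(float), sigma=sigma))
```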

Some problems in computer vision are ill-posed, and a regularization method (such as Tikhonov regularization) is necessary in order to solve them. Examples where this is necessary include computing anisotropic diffusion.
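As a minimal sketch of the idea, Tikhonov regularization of an ill-posed linear system A x = b replaces the unstable normal equations with a damped version:

```python
import numpy as np

def tikhonov_solve(A, b, lam=1e-3):
    """Solve (A^T A + lam * I) x = A^T b. The damping term lam * I keeps
    the system invertible even when A^T A is (near-)singular; lam is a
    problem-dependent choice, not a universal constant."""
    n = A.shape[1]
    return np.linalg.solve(A.T @ A + lam * np.eye(n), A.T @ b)
```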

Libor
  • So this applies when we have detected some features and match them to model features using statistics (and this matching gives an error we can compute). How about computing feature detection errors, for example when the features are blobs extracted by thresholding? – krzych Sep 23 '12 at 11:50
  • I think you cannot compute a "detection error" given only the image. There needs to be some context in which you can say the feature is erroneous. – Libor Sep 23 '12 at 12:28
  • Exactly, but what context? How does one design tests to figure out feature detection correctness? – krzych Sep 23 '12 at 12:45
  • As H&Z note in their book, "This is a chicken and egg problem...": we cannot say which features are "good" and which are "bad" without matching them first. There have been some developments in designing feature descriptors so that they match well across larger datasets. Given a measurement of descriptor 'quality', you can discriminate features that are not likely to be matched. – Libor Sep 23 '12 at 12:51
  • But there must be some method to evaluate the correctness of the whole system. I think it is very important for machine vision applications, especially when we talk about element positioning. As I said in the question, I am also interested in ways of testing this correctness. – krzych Sep 24 '12 at 20:43
  • Depends on the application. For example, I am using feature detection for image stitching. Here the correctness of matched images can be measured by the ratio of the total number of interest points (features) found in the overlap region to the interest points actually used for matching. David Lowe (author of SIFT) also suggests taking the ratio of the 1st-nearest-neighbor distance to the 2nd-nearest-neighbor distance to discard potential false matches between interest points (see the sketch after these comments). These ratios form a distribution which can be well separated. – Libor Sep 25 '12 at 06:44
  • I am interested in different applications, as I said, mainly machine vision; I'm mostly detecting the position of an element. For one example, see my question here: http://stackoverflow.com/questions/12560339/calculating-element-position-by-computing-transformation/ – krzych Sep 25 '12 at 07:08
  • To my knowledge, you can compute the correct/false match ratio or the similarity with a transformed sample (Dave suggested cross-correlation in a comment on the linked question). – Libor Sep 25 '12 at 07:43
  • But there I'm not asking about a correctness ratio but about precision in pixels, so I can convert it to real-world measures. I added the reference to that question to show the range of problems I'm interested in. – krzych Sep 25 '12 at 10:46
  • Sorry, I didn't understand you correctly. Why not sum the distances between source and transformed points? – Libor Sep 25 '12 at 11:25
  • But then you get the precision of a particular match. The question is how to compute the precision of the algorithm used (and it does not address the question I've linked; that is only one example algorithm). So I have some algorithm extracting features in some way and matching them to an image, and I want to compute the precision of this algorithm, or the maximal error in pixels it gives. – krzych Sep 25 '12 at 11:51
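A minimal sketch of the ratio test mentioned in the comments above, assuming `desc1` and `desc2` are descriptor arrays (e.g. from SIFT) for the two images; the 0.75 threshold is a commonly used value, not a universal constant:

```python
import cv2

def ratio_test_matches(desc1, desc2, ratio=0.75):
    """Keep a match only when its nearest neighbor is clearly better
    than the second-nearest (Lowe's ratio test)."""
    matcher = cv2.BFMatcher(cv2.NORM_L2)
    knn = matcher.knnMatch(desc1, desc2, k=2)  # two nearest neighbors each
    return [m for m, n in knn if m.distance < ratio * n.distance]
```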
4

This doesn't answer the whole question, but it addresses part of what the OP asks about.

It can only be done experimentally. To do it analytically would require knowing what the algorithm should have returned, but to know that you would need a known, always-correct computer vision algorithm to compare against (as well as detailed analytical descriptions of the images being tested). Analytical solutions require a ground truth that is analytical rather than hand-generated on a case-by-case basis, but we have no analytical way to generate a ground truth -- that is precisely what we are trying to develop.
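One practical pattern is to manufacture the ground truth yourself: render synthetic images with a known object position, run the localization algorithm, and report the distribution of pixel errors. Below is a hedged sketch under stated assumptions (synthetic Gaussian blobs, a toy centroid estimator; all names are illustrative, and you would substitute your own detector):

```python
import numpy as np

rng = np.random.default_rng(0)

def render_blob(shape, center, sigma=3.0, noise=0.02):
    """Synthetic image: a Gaussian blob at a known sub-pixel center plus noise."""
    y, x = np.mgrid[0:shape[0], 0:shape[1]]
    img = np.exp(-((x - center[0])**2 + (y - center[1])**2) / (2 * sigma**2))
    return img + rng.normal(0.0, noise, shape)

def estimate_position(img):
    """Toy estimator: intensity-weighted centroid after thresholding."""
    img = np.where(img > 0.1, img, 0.0)
    y, x = np.mgrid[0:img.shape[0], 0:img.shape[1]]
    total = img.sum()
    return np.array([(x * img).sum() / total, (y * img).sum() / total])

errors = []
for _ in range(500):
    true_pos = rng.uniform(20, 44, size=2)            # known ground truth (x, y)
    est = estimate_position(render_blob((64, 64), true_pos))
    errors.append(np.linalg.norm(est - true_pos))     # localization error in px

print(f"mean error: {np.mean(errors):.3f} px, max error: {np.max(errors):.3f} px")
```

The maximum and spread of these errors are the "precision in px" figures asked for, valid under the assumptions baked into the synthetic data.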

Given that it can only be done experimentally, you may want to look at Google Scholar. If you are after person location, there will be many papers dedicated to locating a person or parts of a person, such as the head or hands. Car location has also received a lot of specialized attention. Other objects will have to rely on generic algorithms.

John Robertson