If you go to FFFFOUND! and click on some image you will notice that on the new page, under the image, there is a section called "You may like these images." which suggests 10 images that look similar to the original.

What would be a good algorithm to achieve this functionality for a collection of images?

Any documentation, books, etc. related to such algorithms would be much appreciated. Algorithms for finding similar images that yield better results than those seen on the FFFFOUND! website are also welcome.

Sadeq Dousti
  • Maybe that behaviour is achieved by tagging every image with some keywords rather than performing a complex image-reasoning task. Two images are then similar if they share at least one keyword (the more keywords they share, the more similar they are). – Giorgio Camerani Nov 03 '10 at 10:25
  • Have you seen this StackOverflow post? – Sadeq Dousti Nov 03 '10 at 10:52
  • this is far too broad a question: as Walter points out, there are many ways this could be framed. If you were to narrow things down to specific problem formulations that involve a discussion of algorithms, and theory in general, your question might be better suited for this forum. Else, it's likely to get closed. – Suresh Venkat Nov 04 '10 at 05:18
  • So the problem is that I need an algorithm that is able to find similar images using just the information contained within the images themselves. In what way similar? That's why I mentioned the FFFFOUND! website so that you can take a look and see what I'm talking about. Also for the people that are saying that tags and context are used in order to achieve this, I conclude that they didn't pay enough attention to the question. If you look at that website you will notice that there are no tags whatsoever. The only data that is used to suggest the images is the data within the images themselves. – please delete me Nov 04 '10 at 09:06
  • @Sadeq Dousti: Thank you for the post. Yes, I have seen it but I'm trying to gather as much information as possible on the subject. If you read my question you will see I also asked about books, documentation etc. info which isn't present anywhere in that question's answers. – please delete me Nov 04 '10 at 09:08
  • @Robert: Let me clarify my previous comment. I've just been on FFFFOUND! for the second time. Looking at the similar images suggested, I see that in most cases such similarity is a very high-level semantic similarity, rather than a similarity in colors, brightness, contrast... (for example, on page 2 there is a picture of Barbie together with a picture of a McDonald restaurant at night). Discovering such abstract similarities involves many complex AI tasks (both low-level such as edge detection, and high-level such as knowledge representation and reasoning), it is probably AI-complete and, ... – Giorgio Camerani Nov 04 '10 at 11:08
  • @Robert: ...as far as I know, it is out of reach for modern AI. That's why I suspected that the grouping of images on that site is based on something simpler such as tagging or (as Jeffrey suggested below) analysis of the users' behaviour. – Giorgio Camerani Nov 04 '10 at 11:16
  • Thank you Mr. Walter, this looks like a good direction to investigate. What do you know about Wavelet Transforms (Haar)? Do you think I could achieve better results compared with the techniques you have mentioned? There is also another technique that people talk about, which goes like this: resize the images to some very small size (16x16, for example) and then compute the Manhattan distance between two given images. Would this be appropriate? Thank you for your time. (You should move your comments to the answers section so that I can vote on them.) – please delete me Nov 04 '10 at 11:33
  • @Robert: You are most welcome. My personal sensation is that, if machines will ever be able to achieve complex intelligent behaviours like the one described above, this would be thanks to a combination of both low-level processing and high-level processing (see the Connectionism vs. Computationalism debate on Wikipedia). Wavelet Transforms would certainly be helpful for the low-level processing (the information they provide could be used as some of the inputs to higher level tasks such as shape detection, object recognition, object classification), but... – Giorgio Camerani Nov 04 '10 at 13:45
  • @Robert: ...I doubt they would solve the problem alone (of course, if you are looking for low-level similarities they could be enough to satisfy your needs). Trying to make progress on complex problems like that is a very broad affair, which spans several topics (and which could itself be the subject of an entire PhD experience). Concerning that technique of scaling an image down to a small size, although certainly interesting, it is also certainly very basic (what if you have 2 copies of the same image, the second being rotated 90 degrees?): it may work only for color similarities. – Giorgio Camerani Nov 04 '10 at 14:02
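The tiny-thumbnail idea discussed in these comments (shrink both images, then compare them pixel by pixel with a Manhattan distance) can be sketched in a few lines. This is a minimal illustration on toy grayscale grids with invented data; a real implementation would first decode actual image files into such grids:

```python
def downsample(pixels, size=4):
    """Average-pool a square grayscale image (list of rows) down to size x size."""
    n = len(pixels)
    block = n // size  # assumes n is divisible by size, for simplicity
    small = []
    for by in range(size):
        row = []
        for bx in range(size):
            total = 0
            for y in range(by * block, (by + 1) * block):
                for x in range(bx * block, (bx + 1) * block):
                    total += pixels[y][x]
            row.append(total / (block * block))
        small.append(row)
    return small

def manhattan(a, b):
    """Sum of absolute per-pixel differences between two equal-size thumbnails."""
    return sum(abs(pa - pb) for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))

# Two toy 8x8 "images": one dark, one a slightly brighter copy.
img1 = [[10] * 8 for _ in range(8)]
img2 = [[12] * 8 for _ in range(8)]
d = manhattan(downsample(img1), downsample(img2))
```

As the comment above points out, this captures only coarse color/brightness similarity and breaks down under rotation or cropping.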

4 Answers

You could take a collective-intelligence approach and try to determine similarity between various users based on their taste in images. If I like a bunch of images and another user likes a majority of the same images, then it could be said that I will most likely like the other pictures that user likes.

You might want to look for information on Collaborative Filtering and on Euclidean distance / Pearson correlation (within the context of social networks).
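As a hedged sketch of the Pearson-correlation part: represent each user as a mapping from images to ratings and correlate two users over the images both have rated. The user names and ratings below are invented for illustration:

```python
from math import sqrt

def pearson(ratings_a, ratings_b):
    """Pearson correlation over items both users have rated (dicts: item -> score)."""
    shared = [item for item in ratings_a if item in ratings_b]
    n = len(shared)
    if n == 0:
        return 0.0
    xs = [ratings_a[i] for i in shared]
    ys = [ratings_b[i] for i in shared]
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    if sd_x == 0 or sd_y == 0:
        return 0.0
    return cov / (sd_x * sd_y)

# Bob's ratings track Alice's exactly (one point lower), so they correlate perfectly.
alice = {"img1": 5, "img2": 3, "img3": 4}
bob   = {"img1": 4, "img2": 2, "img3": 3}
score = pearson(alice, bob)
```

A recommender would then weight each neighbour's liked images by this score when ranking suggestions.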

Jeffrey

It depends on how you define similarity.

Two images may be similar in context. But that also depends on how you define the context.

Another metric of similarity may be some mathematical distance function, such as Euclidean distance.
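For instance (a minimal sketch with made-up feature vectors): represent each image by a feature vector such as a color histogram, and rank candidate images by their Euclidean distance to the query's vector.

```python
from math import sqrt

def euclidean(u, v):
    """Euclidean distance between two equal-length feature vectors."""
    return sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

# Made-up 4-bin color histograms for a query image and two candidates.
hist_query = [0.5, 0.3, 0.1, 0.1]
hist_a     = [0.4, 0.4, 0.1, 0.1]
hist_b     = [0.1, 0.1, 0.4, 0.4]

# The smaller the distance, the more "similar" the pair under this metric.
closest = min([("a", hist_a), ("b", hist_b)],
              key=lambda item: euclidean(hist_query, item[1]))[0]
```

The choice of feature vector matters far more than the metric itself: histograms capture only color distribution, not shape or semantics.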

I guess you want algorithms which categorize an image using some information about the context of the image, such as tags. The field of Machine Learning deals with this type of problem.

A more interesting approach, but much harder and not so well developed, is to also learn the tags. That also falls in the field of Machine Learning.

George

This is a recent breakthrough Google Research result in which a large distributed network, running stochastic gradient descent (SGD) optimization on unlabeled images (frames sampled from random YouTube videos), developed highly meaningful emergent feature detectors, including detectors for faces and other objects (cats, human bodies, etc.). As the researchers note in the article, this was previously considered impossible by conventional wisdom; most prior experiments focused on labelled data samples or supervised algorithms. Note, however, that the training phase is extremely CPU-intensive.

It seems likely the technology will have wide-ranging applications in the future, including, most basically, detecting similarity of images, but also much more advanced possibilities such as cutting-edge problems in AI.

Based on its operational similarity to an old body of biological evidence/observations/speculation sometimes referred to as "grandmother neurons" (i.e., high-level "feature detectors"), it is seen as not merely ad hoc and may have some real relation to the actual "algorithms" used by the human brain for image recognition (or perhaps even deeper or more general cerebral processing).

Presented at ICML 2012: the 29th International Conference on Machine Learning, Edinburgh, Scotland, June 2012.

[1] Le et al., "Building High-level Features Using Large Scale Unsupervised Learning", ICML 2012.

[2] John Markoff, "How Many Computers to Identify a Cat? 16,000", The New York Times.

vzn

One possible approach would be to see who has saved the image (shown below it) and use that as the basis for finding other images, similar to how Facebook suggests friends.

Upon examination, this theory holds for at least a few of the images I checked.
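This co-save idea can be sketched with a Jaccard similarity over the sets of users who saved each image: two images are similar to the extent that the same people saved both. All names and data below are invented for illustration:

```python
def jaccard(users_a, users_b):
    """Overlap of two saver sets: |intersection| / |union|."""
    if not users_a and not users_b:
        return 0.0
    return len(users_a & users_b) / len(users_a | users_b)

# Hypothetical save data: image -> set of users who saved it.
saved_by = {
    "img1": {"ann", "bob", "cat"},
    "img2": {"ann", "bob"},
    "img3": {"dee"},
}

# Rank candidate images by how much their saver sets overlap with img1's.
ranked = sorted(["img2", "img3"],
                key=lambda i: jaccard(saved_by["img1"], saved_by[i]),
                reverse=True)
```

Like the collaborative-filtering answer above, this uses no pixel data at all, only user behaviour, so it can surface the kind of high-level semantic matches the comments describe.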