Although image search has become a popular feature in many search engines, including Yahoo, MSN, and Google, most image searches use little, if any, image information to rank the images. Instead, they commonly rely only on the text of the page in which the image is embedded (body text, anchor text, image name, etc.).
In this paper, Yushi Jing and Shumeet Baluja propose an algorithm similar to PageRank that uses the similarity between images as implicit votes. "We cast the image-ranking problem into the task of identifying authority nodes on an inferred visual similarity graph and propose an algorithm to analyze the visual link structure that can be created among a group of images. Through an iterative procedure based on the PageRank computation, a numerical weight is assigned to each image; this measures its relative importance to the other images being considered." The paper assumes that people are more likely to go from an image to other similar images. "By treating images as web documents and their similarities as probabilistic visual hyperlinks, we estimate the likelihood of images visited by a user traversing through these visual-hyperlinks. Those with more estimated visits will be ranked higher than others."
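The iterative procedure described above can be sketched in a few lines of Python. This is only a minimal illustration, not the authors' implementation: the function name `visual_rank`, the damping factor of 0.85 (a common PageRank convention), and the tiny hand-written similarity matrix are all assumptions made for the example. How the pairwise image similarities are actually computed is outside the scope of this sketch.

```python
def visual_rank(similarity, damping=0.85, iterations=100, tol=1e-9):
    """Power iteration over a similarity graph (hypothetical helper).

    similarity[i][j] is an assumed non-negative similarity score between
    image i and image j. Returns one score per image, summing to 1.
    """
    n = len(similarity)

    # Column-normalize so that each image splits its "vote" among the
    # images it resembles, in proportion to their similarity.
    col_sums = [sum(similarity[i][j] for i in range(n)) for j in range(n)]
    transition = [
        [similarity[i][j] / col_sums[j] if col_sums[j] > 0 else 1.0 / n
         for j in range(n)]
        for i in range(n)
    ]

    rank = [1.0 / n] * n  # start from a uniform distribution
    for _ in range(iterations):
        # Follow a visual hyperlink with probability `damping`,
        # otherwise jump to a random image.
        new_rank = [
            damping * sum(transition[i][j] * rank[j] for j in range(n))
            + (1.0 - damping) / n
            for i in range(n)
        ]
        delta = sum(abs(a - b) for a, b in zip(new_rank, rank))
        rank = new_rank
        if delta < tol:  # stop once the scores have stabilized
            break
    return rank


# Made-up example: images 0-2 resemble each other, image 3 is an outlier.
similarity = [
    [0.0, 0.9, 0.8, 0.1],
    [0.9, 0.0, 0.7, 0.1],
    [0.8, 0.7, 0.0, 0.1],
    [0.1, 0.1, 0.1, 0.0],
]
ranks = visual_rank(similarity)
# Sort image indices from highest to lowest score.
ordering = sorted(range(len(ranks)), key=lambda i: ranks[i], reverse=True)
print(ordering)
```

In this toy graph the three mutually similar images reinforce one another, while the outlier receives few "votes" and ends up last, which mirrors the intuition that images with more estimated visits are ranked higher.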
The system was tested on the 2000 most popular queries from Google Image Search on July 23rd, 2007, by applying the algorithm to the top 1000 results produced by Google's search engine. The results are promising: users found 83% fewer irrelevant images in the top 10 results, down from an average of 2.83 with the current Google search engine to 0.47.
Shown here is a similarity graph generated from the top 1000 search results for "Mona-Lisa". The two largest images have the highest rank.