Content-based image search engine

A Google image search for “tiger” yields lots of tiger photos, but it also returns images of a tiger pear cactus stuck in a tire, a racecar, Tiger Woods, and many others. Why? Today's large Internet search engines find images by matching the text associated with them rather than by analyzing what is actually in the picture.

Electrical engineers from the Jacobs School are making progress on a different kind of search engine, one that analyzes the images themselves. This approach may be folded into next-generation image search engines for the Internet and, in the shorter term, could be used to annotate and search commercial and private image collections.

At the core of this Supervised Multiclass Labeling (SML) system is a set of simple yet powerful algorithms developed at UCSD by a team led by Nuno Vasconcelos, a professor in the Department of Electrical and Computer Engineering. Once the system is trained, it can be set loose on a database of unlabeled images. The system calculates the probability that various objects or "classes" it has been trained to recognize are present, and labels the images accordingly. After labeling, images can be retrieved via keyword searches.

The accuracy of the UCSD system has outpaced that of other content-based image labeling and retrieval systems in the literature. The SML system also splits up images based on content, for example separating a landscape photo into mountain, sky and lake regions.
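The article describes this pipeline only at a high level: train one model per class, score unlabeled images against every class model, attach the best-matching labels, and then answer keyword queries from those labels. The sketch below is a hypothetical illustration of that idea, not the team's actual SML algorithms. It assumes scikit-learn Gaussian mixture models fitted to synthetic per-class feature vectors; the function names (synthetic_features, label_image), the class list (mountain, sky, lake, echoing the segmentation example above), and all numeric settings are invented for the example.

```python
# Sketch of content-based labeling and retrieval in the spirit of SML.
# Assumption: each image is summarized by a set of local feature vectors
# (here random stand-ins for real patch features such as DCT coefficients).

import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(42)
DIM = 8  # dimensionality of each local feature vector


def synthetic_features(mean, n_patches=300):
    """Stand-in for a real feature extractor applied to one image."""
    return rng.normal(loc=mean, scale=1.0, size=(n_patches, DIM))


# 1. Training: fit one mixture model per class from labeled example images.
class_means = {"mountain": 0.0, "sky": 3.0, "lake": -3.0}
class_models = {}
for label, mean in class_means.items():
    train_feats = np.vstack([synthetic_features(mean) for _ in range(5)])
    gmm = GaussianMixture(n_components=4, covariance_type="diag", random_state=0)
    gmm.fit(train_feats)
    class_models[label] = gmm


# 2. Labeling: score an unlabeled image under every class model and keep the
#    classes whose models give its features the highest average log-likelihood.
def label_image(features, models, top_k=1):
    scores = {label: m.score(features) for label, m in models.items()}
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:top_k]


# 3. Retrieval: once images carry labels, a keyword query is a dictionary lookup.
database = {
    "photo_001": synthetic_features(0.0),  # resembles the "mountain" class
    "photo_002": synthetic_features(3.0),  # resembles the "sky" class
}
index = {}
for image_id, feats in database.items():
    for label in label_image(feats, class_models):
        index.setdefault(label, []).append(image_id)

print(index.get("mountain", []))  # keyword search driven by image content
```

In this toy setup the query "mountain" retrieves photo_001 because its features score highest under the mountain model, which mirrors the behavior described above: labels are assigned from learned class probabilities rather than from any text attached to the image.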