Saturday, July 23, 2011 - 19:24 SGT
Posted By: Gilbert

All The Same

Following hot on the heels of Google+, Google has released Search By Image (GSBI), which now lets users search using pictures. Got a historically relevant scene that you can't put your finger on? Baffled by which minor landmark you captured in your collection of holiday snapshots? No problem, just upload it and let Google figure it out.

Eh wait, isn't that exactly what TinEye does, as mentioned here, like, a year and a half ago?

Let us evaluate the search and leave what this means for TinEye till a little later. First up are some specimens from my European travels [N.B. Images of search results as presented below were rearranged for succinctness]:



GSBI cleverly ignored alvin's head at the bottom left and correctly identified the tower as belonging to the Plaza de Espana in Seville. Trying again with a wide shot, GSBI again got it right:



But wait ah, the visually similar images are not of the Plaza, but of random buildings/monuments sticking out into a bluish sky (I suppose there are many of these in existence). So how did they make the positive identification? Turns out that somebody took a shot (inset, below) at almost exactly the same angle as I did (main picture, below), so much so that it was designated a "matching image" by GSBI:





Less famous curiosities like a statue of Don Juan (above) fared worse, with totally irrelevant results. It should be stated that the only other image of it that I could find on the web after a brief (textual) search resides on a now-defunct blog, but I was slightly puzzled at the seeming dearth of bronze sculptures standing before trees in the entire world.

There are of course a ton of experiments that could be done to try and pry out the general algorithm used, but let's be content with just a couple more. Whatever the algo is, it appears great at detecting regions within images, as this search using Paul Scholes shows:



The matching image was, in fact, where I cropped the test image from. I found it slightly suspicious that no ginger-haired guys were deemed more similar to Scholes than (from left to right) Deschamps, Mourinho, Hiddink and (wait for it) Rooney, though! This led me to suspect that when returning "visually similar" images, GSBI uses more than pure image data.

To test that, I simply flipped the cropped Scholes photo horizontally, such that now he's looking to the right, and resubmitted it:



Who would have thought that it would have made that much of a difference? All football-related mugs are gone, replaced by a slew of random people (with a pensive-looking George Bush in the third row down).

And how is all this done? Someone new to all this might reason that similar-looking images can be detected by resizing them to the same dimensions, then summing the differences between corresponding pixels. This very simple idea is actually not all that unreasonable, and certainly works, albeit under rather limited conditions.
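As a minimal sketch of that naive idea (and not anything GSBI is known to do), the Python below treats an image as a list of rows of grayscale values from 0 to 255. The helper names `resize_nearest` and `pixel_distance` are made up for this illustration, and the 8x8 comparison size is an arbitrary assumption.

```python
def resize_nearest(img, width, height):
    """Resize a grayscale image (list of rows of ints) by nearest-neighbour sampling."""
    src_h, src_w = len(img), len(img[0])
    return [
        [img[y * src_h // height][x * src_w // width] for x in range(width)]
        for y in range(height)
    ]


def pixel_distance(a, b, size=(8, 8)):
    """Sum of absolute per-pixel differences after resizing both images to `size`."""
    w, h = size
    ra, rb = resize_nearest(a, w, h), resize_nearest(b, w, h)
    return sum(
        abs(pa - pb)
        for row_a, row_b in zip(ra, rb)
        for pa, pb in zip(row_a, row_b)
    )


# Toy data: a 4x4 gradient, the same picture at double resolution, and an inverted copy.
small = [[x * 60 + y * 10 for x in range(4)] for y in range(4)]
large = [[small[y // 2][x // 2] for x in range(8)] for y in range(8)]
inverted = [[255 - p for p in row] for row in small]

print(pixel_distance(small, large))     # 0: same content at another size
print(pixel_distance(small, inverted))  # large: very different content
```

The "limited conditions" show up quickly: crop, shift or rotate one of the images and the per-pixel sum balloons even though a human would call the pictures the same.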

Quite apart from the issue of matching is the issue of speed; even a low-resolution image routinely contains hundreds of thousands of pixels, and this can add up fast. One simple hack is to downsample the image to a much lower resolution and do initial lookups on such "fingerprints", which allows non-matching images to be quickly discarded.
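Here is one way the fingerprint trick could look, again as a sketch under my own assumptions: images are lists of rows of grayscale values at least 4 pixels on each side, the fingerprint is a 4x4 grid of block averages, and `fingerprint`, `shortlist` and the threshold of 10 are all invented for the example.

```python
def fingerprint(img, grid=4):
    """Downsample a grayscale image to grid x grid block averages.

    Assumes the image is at least `grid` pixels wide and tall.
    """
    h, w = len(img), len(img[0])
    cells = []
    for gy in range(grid):
        for gx in range(grid):
            y0, y1 = gy * h // grid, (gy + 1) * h // grid
            x0, x1 = gx * w // grid, (gx + 1) * w // grid
            block = [img[y][x] for y in range(y0, y1) for x in range(x0, x1)]
            cells.append(sum(block) // len(block))
    return tuple(cells)


def shortlist(query, database, threshold=10):
    """Names of images whose fingerprint differs by at most `threshold` per cell on average."""
    q = fingerprint(query)
    keep = []
    for name, img in database.items():
        f = fingerprint(img)
        mean_diff = sum(abs(a - b) for a, b in zip(q, f)) / len(q)
        if mean_diff <= threshold:
            keep.append(name)
    return keep


# Toy data: a horizontal gradient, a slightly brighter copy, and a flat grey image.
gradient = [[x * 16 for x in range(16)] for _ in range(16)]
brighter = [[p + 3 for p in row] for row in gradient]
flat = [[128] * 16 for _ in range(16)]

print(shortlist(gradient, {"brighter": brighter, "flat": flat}))  # ['brighter']
```

Only the survivors of the shortlist would then need the expensive full-resolution comparison; in a real system the fingerprints would of course be precomputed rather than rebuilt per query.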

But what about subimages? The most obvious way would be to resize the image to various scales, and test it against each image at each possible approximate position. Unfortunately, the number of combinations swiftly multiplies, and if we add considerations like rotation into the mix, this can be shown to quickly become impractical (today at least).
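To get a feel for how fast the combinations pile up, here is a brute-force sketch of that "obvious way". Everything in it is an assumption for illustration: the helper names, the three scales tried, and the toy 12x12 image.

```python
def resize_nearest(img, width, height):
    """Resize a grayscale image (list of rows of ints) by nearest-neighbour sampling."""
    src_h, src_w = len(img), len(img[0])
    return [
        [img[y * src_h // height][x * src_w // width] for x in range(width)]
        for y in range(height)
    ]


def find_subimage(image, template, scales=(0.5, 1.0, 2.0)):
    """Slide the template over the image at every scale and position.

    Returns ((mean_abs_diff, x, y, scale), number_of_windows_tested).
    """
    img_h, img_w = len(image), len(image[0])
    best, tests = None, 0
    for scale in scales:
        t_w = max(1, round(len(template[0]) * scale))
        t_h = max(1, round(len(template) * scale))
        if t_w > img_w or t_h > img_h:
            continue  # template would not fit at this scale
        scaled = resize_nearest(template, t_w, t_h)
        for y in range(img_h - t_h + 1):
            for x in range(img_w - t_w + 1):
                tests += 1
                score = sum(
                    abs(image[y + dy][x + dx] - scaled[dy][dx])
                    for dy in range(t_h)
                    for dx in range(t_w)
                ) / (t_w * t_h)
                if best is None or score < best[0]:
                    best = (score, x, y, scale)
    return best, tests


# Toy data: a 12x12 image and a 4x4 crop taken from position x=5, y=3.
image = [[7 * x + 13 * y for x in range(12)] for y in range(12)]
template = [row[5:9] for row in image[3:7]]

best, tests = find_subimage(image, template)
print(best)   # (0.0, 5, 3, 1.0)
print(tests)  # 227
```

That is 227 window comparisons for a 12x12 image, a single candidate and three scales. Multiply by realistic image sizes, a finer set of scales, a few dozen rotation angles and a few billion indexed images, and the impracticality is plain.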

A pretty good and publicly-known algorithm that addresses each of these concerns, the Scale Invariant Feature Transform (SIFT), has been out in the open for over a decade. Those interested in the details can read the Wikipedia article above (or even the original paper), but I believe the general idea is that key features are first detected (so that effort is not wasted on, say, the background), and these key features (or sets/clusters of them) are then transformed into a universal space to accommodate orientation.
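The "universal space" part can be illustrated with a toy version of one SIFT ingredient: assigning each patch a dominant gradient orientation, so that a descriptor can later be expressed relative to it. The sketch below is emphatically not SIFT itself (no scale space, no keypoint detection, no Gaussian weighting); the function names and the ramp-shaped test patch are my own assumptions, with only the 36-bin orientation histogram borrowed from the paper.

```python
import math


def dominant_orientation(patch, bins=36):
    """Peak of the magnitude-weighted gradient-direction histogram, in degrees.

    Angles are measured in image coordinates (x to the right, y downwards).
    """
    h, w = len(patch), len(patch[0])
    bin_width = 360.0 / bins
    hist = [0.0] * bins
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = patch[y][x + 1] - patch[y][x - 1]  # central differences
            gy = patch[y + 1][x] - patch[y - 1][x]
            angle = math.degrees(math.atan2(gy, gx)) % 360.0
            # Bins are centred on multiples of bin_width.
            hist[round(angle / bin_width) % bins] += math.hypot(gx, gy)
    peak = max(range(bins), key=hist.__getitem__)
    return peak * bin_width


def rotate90(patch):
    """Rotate a patch by 90 degrees clockwise as displayed on screen."""
    return [list(row) for row in zip(*patch[::-1])]


# Toy patch: brightness ramps up from left to right.
ramp = [[10 * x for x in range(8)] for _ in range(8)]

print(dominant_orientation(ramp))            # 0.0
print(dominant_orientation(rotate90(ramp)))  # 90.0
```

Because the measured orientation turns along with the patch, describing the patch relative to that orientation cancels the rotation out, which is the gist of how a feature can be matched regardless of which way up it is.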

So does GSBI do that? Interestingly, rotating the cropped Scholes image by a couple of degrees returns the same output as the unrotated copy, and rotating the face by as much as 45 degrees still maintains much the same results, although the source is no longer acknowledged as a "matching image". Astoundingly, rotating 180 degrees more (such that Scholes is now upside down), we still get "manchester united" as the best guess, with Rooney as the most visually similar (to an upside down face? I know he's no great looker but...)



Recalling that merely flipping horizontally breaks the recognition, one might suspect that whether or not GSBI uses SIFT (or an adaptation), both happen to tolerate some classes of transforms (e.g. rotation) but not others (e.g. flips) [N.B. It is trivial to adapt the algo to accommodate this if desired]. Additionally, GSBI appears to be ace at detecting major features too - try a simple, clear drawing (like this Hello Kitty), and you'll notice that GSBI returns many similar-but-not-quite examples, with variations such as:
  • Ribbon at ear replaced by a star
  • Colors not the same
  • Words in image
  • Lack of facial outline, only eyes, nose and ribbon
  • Non line drawing image, such as photo of object with Hello Kitty imprinted upon it
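On the earlier N.B. about flips being trivial to accommodate: one way to do it, sketched below, is simply to query with both the image and its mirror and keep the better hit. The distance function is a stand-in (mean absolute pixel difference on same-sized images), and all names and toy data are invented for the example.

```python
def mirror(img):
    """Flip a grayscale image (list of rows) horizontally."""
    return [row[::-1] for row in img]


def distance(a, b):
    """Mean absolute pixel difference between two same-sized images."""
    total = sum(abs(pa - pb) for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))
    return total / (len(a) * len(a[0]))


def best_match(query, database, try_mirror=True):
    """Return (distance, name, variant) for the closest database image."""
    variants = [("as-is", query)]
    if try_mirror:
        variants.append(("mirrored", mirror(query)))
    return min(
        (distance(variant, img), name, label)
        for label, variant in variants
        for name, img in database.items()
    )


# Toy data: an asymmetric "portrait" and an unrelated flat image.
portrait = [[10, 20, 30, 200] for _ in range(4)]
other = [[100] * 4 for _ in range(4)]
database = {"portrait": portrait, "other": other}
query = mirror(portrait)  # the same picture, flipped like the Scholes test

print(best_match(query, database, try_mirror=False))  # (85.0, 'other', 'as-is')
print(best_match(query, database))                    # (0.0, 'portrait', 'mirrored')
```

The cost is a doubling of lookups per query, which may be exactly why a search engine would not bother unless flipped images mattered to it.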
But is this the point? Certainly, GSBI very likely follows many of the principles that techniques like SIFT employ (as does TinEye, for that matter), but also most probably includes a huge amount of customization and tweaking; for instance, I suspect that they don't use the hashing method described in the original paper to look up similar images - heck, they're probably the best in the world at searching huge datasets.

This is just one step in the entire process, and who knows what else they have done? Perhaps they run other measures in parallel, such as histograms or Fourier-transform frequency comparisons? Who knows?
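For the histogram guess, a sketch of what such a measure might look like (pure speculation as far as GSBI goes; the 8 bins and the function names are arbitrary choices of mine): count how many pixels fall into each brightness band, then score two images by how much their normalised histograms overlap.

```python
def histogram(img, bins=8):
    """Normalised brightness histogram of a grayscale image with values 0..255."""
    counts = [0] * bins
    for row in img:
        for p in row:
            counts[min(p * bins // 256, bins - 1)] += 1
    total = sum(counts)
    return [c / total for c in counts]


def histogram_intersection(a, b, bins=8):
    """Overlap of two normalised histograms: 1.0 means identical distributions."""
    return sum(min(x, y) for x, y in zip(histogram(a, bins), histogram(b, bins)))


# Toy data: an image, its horizontal flip, and an all-black image.
img = [[0, 40, 80, 120], [160, 200, 240, 255]]
flipped = [row[::-1] for row in img]
dark = [[0] * 4 for _ in range(2)]

print(histogram_intersection(img, flipped))  # 1.0
print(histogram_intersection(img, dark))     # 0.125
```

Since a histogram throws away pixel positions, it is blind to flips and rotations, which also means it cannot by itself explain why the mirrored Scholes photo behaved so differently; at most it would be one signal among several.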

The point is that TinEye may well find it very difficult going forward from here. Can their algorithm outperform Google's significantly enough to carve out a niche? I'm not sure. Will their index of images (nearly two billion at last count, it has to be said) allow them to stay competitive? Rather unlikely, given that a random single-digit search on Google Images already returns on the order of sixty billion images.

A few days ago, Google searching for the next Google hit the news, but much of the time it would make more sense for Google to simply reimplement the idea themselves. Given that they have much of the world's top talent oozing out of their ears, that can't be too farfetched. It seems to come down to having something so wonderful that nobody at Google (or some other big company) can replicate it speedily, having it so out of left field that they don't recognize its potential, or having it small enough that they either don't bother or feel nice enough to leave it alone.

And good luck with patents too, as df noted on Facebook recently. While patents are of course a good idea, figuring out what is a minor adaptation of a patented idea, and what is actually original enough, is not easy. A famous example of a likely too-broad patent is Amazon's One-Click patent, though that has nothing on this. Heck, a local company got peanuts despite patenting the ThumbDrive, with many manufacturers basically ignoring their claims - maybe even legally, since patents aren't international by default.


Ms Robo feels that they can do better


I'll be on reservist the coming week, and taking this attitude with me:


Wish me luck
(Source: Somewhere on the Internet)









Copyright © 2006-2025 GLYS. All Rights Reserved.