[bert's blog - All The Same]

TCHS 4O 2000 [4o's nonsense]

alvinny [2] - csq - edchong
jenming - joseph - law
meepok - mingqi - pea
pengkian [2] - qwergopot - woof
xinghao - zhengyu

HCJC 01S60 [understated sixzero]

andy - edwin - jack
jiaqi - peter - rex
serena

SAF 21SA

khenghui - jiaming - jinrui [2]
ritchie - vicknesh - zhenhao

Others
Lwei [2] - shaowei

- website links -

Alien Loves Predator
BloggerSG
Cute Overload!
Cyanide and Happiness
Daily Bunny
Hamleto
Hattrick
Magic: The Gathering
The Onion
The Order of the Stick
Perry Bible Fellowship
PvP Online
Soccernet
Sluggy Freelance
The Students' Sketchpad
Talk Rock
Talking Cock.com
Tom the Dancing Bug
Wikipedia
Wulffmorgenthaler

bert's blog v1.21
Powered by glolg
Programmed with Perl 5.6.1
on Apache/1.3.27 (Red Hat Linux)

best viewed at 1024 x 768 resolution
on Internet Explorer 6.0+
or Mozilla Firefox 1.5+

entry views: 2023
today's page views: 87 (5 mobile)
all-time page views: 3400806

most viewed entry: 18739 views
most commented entry: 14 comments
number of entries: 1228

page created Sun Jul 13, 2025 03:23:17

- tagcloud -

academics [70]
art [8]
changelog [49]
current events [36]
cute stuff [12]
gaming [11]
music [8]
outings [16]
philosophy [10]
poetry [4]
programming [15]
rants [5]
reviews [8]
sport [37]
travel [19]
work [3]

miscellaneous [75]

- category tags -

academics art changelog current events cute stuff gaming miscellaneous music outings philosophy poetry programming rants reviews sport travel work

tags in total: 386

i am now probably: playing DotA! [?]
(status updated every 30 minutes)

name: Lim Yong San, Gilbert -

gender: Male
nationality: Singaporean
race: Chinese
dob: 25^th January 1984

height: 1.74m (5'8½")
weight: 67kg (147 pounds)
blood type: A+

Download full resume [PDF] [DOC]
currently: National University of Singapore
(studying Computer Science & Economics)
tertiary: Hwa Chong Junior College*
secondary: The Chinese High School*
* merged into Hwa Chong Institution in 2005
primary: Shuqun Primary School
pre: Jurong Christian Church Kindergarten

fav colour: Green
fav soccer clubs: Manchester United,
Brighton and Hove Albion,
English National Team
hobbies: Many (a few in no particular order:)
reading & writing
programming (sometimes)
webgame timesin ks
DotA
kicking ping-pong balls
all manner of sports involving balls
sleeping

Saturday, July 23, 2011 - 19:24 SGT
Posted By: Gilbert

All The Same
Following hot on the heels of Google+, Google has released Search By Image (GSBI), which lets users now search using pictures. Got a historically relevant scene that you can't put your finger on? Baffled at what minor landmark was taken in your collection of holiday snapshots? No problem, just upload it and let Google figure it out.

Eh wait, isn't that exactly what TinEye does, as mentioned here, like, a year and a half ago?

Let us evaluate the search and leave what this means for TinEye till a little later. First up are some specimens from my European travels [N.B. Images of search results as presented below were rearranged for succinctness]:

GSBI cleverly ignored alvin's head at the bottom left and correctly identified the tower as belonging to the Plaza de Espana in Seville. Trying again with a wide shot, GSBI again got it right:

But wait ah, the visually similar images are not of the Plaza, but of random buildings/monuments sticking out into a bluish sky (I suppose there are many of these in existence). So how did they make the positive identification? Turns out that somebody took a shot (inset, below) at almost exactly the same angle as me (main picture, below), so much so that it was designated a "matching image" by GSBI:

Less famous curiosities like a statue of Don Juan (above) fared worse, with totally irrelevant results. It should be stated the only other image of it that I could find on the web after a brief (textual) search resides on a now-defunct blog, but I was slightly puzzled at the seeming dearth of bronze sculptures standing before trees in the entire world.

There are of course a ton of experiments that could be done to try and pry out the general algorithm used, but let's be content with just a couple more. Whatever the algo is, it appears great at detecting regions within images, as this search using Paul Scholes shows:

The matching image was, in fact, where I cropped the test image from. I found it slightly suspicious that no ginger-haired guys were deemed more similar to Scholes than (from left to right) Deschamps, Mourinho, Hiddink and (wait for it) Rooney, though! This led me to suspect that when returning "visually similar" images, GSBI uses more than pure image data.

To test that, I simply flipped the cropped Scholes photo horizontally, such that now he's looking to the right, and resubmitted it:

Who would have thought that it would have made that much of a difference? All football-related mugs are gone, replaced by a slew of random people (with a pensive-looking George Bush in the third row down).

And how is all this done? Someone new to all this might reason that similar-looking images can be detected by resizing them to the same dimensions, then summing the difference between each pixel. This very simple idea is actually not all that unreasonable, and certainly works, albeit under rather limited conditions.

Quite apart from the issue of matching is the issue of speed; even a low-resolution image routinely contains hundreds of thousands of pixels, and this can add up fast. One simple hack is to reduce (desample) the image by reducing the resolution, and doing initial lookups on such "fingerprints", which would allow non-matching images to be quickly discarded.

But what about subimages? The most obvious way would be to resize the image to various scales, and test it against each image at each possible approximate position. Unfortunately, the number of combinations swiftly multiplies, and if we add considerations like rotation into the mix, this can be shown to quickly become impractical (today at least).

A pretty good and publicly-known algorithm, Scale Invariant Feature Transform (SIFT), that addresses each of these concerns has been out in the open for over a decade. Those interested in the details can read the Wikipedia article above (or even the original paper), but I believe the general idea is that key features are first detected (so that effort is not wasted on say the background) and transforming these key feature (sets/clusters) into a universal space to accomodate orientation.

So does GSBI do that? Interestingly, rotating the cropped Scholes image by a couple of degrees returns the same output as the unrotated copy, and rotating the face by as much as 45 degrees still maintains much the same results, although the source is no longer acknowledged as a "matching image". Astoundingly, rotating 180 degrees more (such that Scholes is now upside down), we still get "manchester united" as the best guess, with Rooney as the most visually similar (to an upside down face? I know he's no great looker but...)

Recalling that merely flipping horizontally breaks the recognition, one might suspect that whether or not GSBI uses SIFT (or an adaptation), they happen to have the same responses towards some classes of transforms (i.e. rotation) and not others (i.e. flips) [N.B. It is trivial to adapt the algo to accommodate this if desired]. Additionally, GSBI appears to be ace at detecting major features too - try a simple, clear drawing (like this Hello Kitty), and you'll notice that GSBI returns many similar-but-not-quite examples, with variations such as:

Ribbon at ear replaced by a star
Colors not the same
Words in image
Lack of facial outline, only eyes, nose and ribbon
Non line drawing image, such as photo of object with Hello Kitty imprinted upon it

But is this the point? Certainly, GSBI very likely follows many of the principles that techniques like SIFT employ (as does TinEye, for that matter), but also most probably includes a huge amount of customization and tweaking; for instance, I suspect that they don't use the hashing method described in the original paper to look up similar images - heck, they're probably the best in the world at searching huge datasets.

This is just one step in the entire process, and who knows what else they have done? Perhaps they have used other measures such as histogram or Fourier transform frequency running in parallel? Who knows?

The point is that TinEye may well find it very difficult going forward from here. Can their algorithm outperform Google's significantly to carve out a niche? I'm not sure. Will their index of images (nearly two billion at least count, it has to be said) allow them to stay competitive? Rather unlikely, given that a random single-digit search on Google Images already returns on the order of sixty billion images.

A few days ago, Google searching for the next Google hit the news, but much of the time it would make more sense for Google to simply reimplement the idea themselves. Given that they have much of the world's top talent oozing out of that ears, that can't be too farfetched. It seems to come down to having something so wonderful that nobody at Google (or some other big company) can replicate it speedily, having it so out of left field that they don't recognize its potential, or small enough that they don't bother or feel nice enough to ignore.

And good luck with patents too, as df noted on Facebook recently. While of course a good idea, figuring out what is a minor adaptation of a patented idea, and what is actually original enough, is not easy. A famous example of a likely too-broad is Amazon's One-Click patent, though that has nothing on this. Heck, a local company got peanuts despite patenting the ThumbDrive, with many manufacturers basically ignoring their claims - maybe even legally, since patents aren't international by default.

The urn, ok, but do I look like a plug to you?

The urn, ok, but do I look like a plug to you?

Ms Robo feels that they can do better

I'll be on reservist the coming week, and taking this attitude with me:

Just joking, it's probably be fun. I wish.