bert's blog v1.21 Powered by glolg Programmed with Perl 5.6.1 on Apache/1.3.27 (Red Hat Linux) best viewed at 1024 x 768 resolution on Internet Explorer 6.0+ or Mozilla Firefox 1.5+ entry views: 1013 today's page views: 198 (12 mobile) all-time page views: 3248345 most viewed entry: 18739 views most commented entry: 14 comments number of entries: 1215 page created Mon Apr 21, 2025 10:41:01 |
Has but fifty years to live under heaven.
Surely this world
Is nothing but a vain dream
Living but one life,
Is there anything that does not decay?
- Atsumori

It struck me that thirty years had been considered a midway mark of sorts among various cultures (plus-minus ten years or so), and for all the talk of it being the "new twenty", I still felt behooved to take stock of my life thus far. After some deliberation, I decided to organize this looking-back into three parts: what I have done (and what has happened to me), what I am doing (and what is happening to me), and a little on what I suppose could be done (not necessarily by me).

Up Till Now

There exists that mysterious sort of man, of whom nobody quite knows his background, which thus becomes the root of much idle speculation. Some of these types even manage to parlay this to their gain. For this, it helps if he is tall and dark. I have no interest in becoming that man.

Then there are those who, intent on preserving their privacy and foiling identity thieves, make it a point of pride to leave as little of a mark as possible online - or anywhere else, for that matter. Their names and thoughts grace neither Google nor Facebook, their few ministrations made under pseudonyms. Guess what? Not me either.

My own little tale, then, began in 1984; not being of the fairer gender, I see no call for my age to be obscured. While it has been suggested that "middle-class" is a badly-overused appellation, in this case I could find no other phrase to better convey my birth and circumstances. Home was a large HDB flat, which was just as well, as it housed no fewer than three new families, with my parents and uncle waiting for theirs to come online (the backlog was real, even back then). Before that, the maternal side of my family had been based in Lim Chu Kang, where an army camp now stands, or so I heard.
Oh, there were stories told by my grandma back then, including the one where she looted (but not very skilfully) a Japanese barracks at the end of the Occupation, but this is neither here nor there. My youngest years, then, were spent bunking out in the living room, but to a child one room was as good as any other. Those were happy times, and I was by all accounts a talkative one.

For some reason, my cousins and I were enrolled in a kindergarten run by a Lutheran church. The teachers were nice enough, and I quite enjoyed the activities, such as the play on Peter, James and John fishing, but I am afraid that I never bought into the more believe-y bits; I still recall one of the teachers sounding almost disgusted at the thought that man could have arisen from monkey-kind, and though I couldn't have known anything substantial about evolution then, it did not seem to me that unreasonable (and wouldn't a tail be fun?). It was something like that.

About this time, my parents shifted to their new place at the other end of the island, but I liked my current lodgings so much that I refused to leave. My grandma tried to get me into her alma mater Rulang Primary (which had some reputation, even then), but to no avail. Eventually, I got into Shuqun Primary after a ballot, and by all accounts did okay there. Of course, on reflection, it was maybe a tad unfair, since I actually liked doing assessments (I kid you not, I was an odd one).

You know how writing lines was generally considered a punishment? Well, my grandma's idea of teaching Chinese consisted mainly of jotting the new words of the week at the top of a page, and having me fill the rest of the page in by repetition. I suppose it did work. So, by the time my primary school career ended, I was designing my own examination papers on my mum's old typewriter, having (willingly!) seen all too many of them. In hindsight, I was far too eager to please.
In the middle, there was some test or other that ended with a couple of my classmates (girls in their I-love-ponies phase, if I recall rightly) being transferred out. I elected to stay as I liked it there - and who did the MOE think they were? Recess entertainment consisted of, in my first couple of years, disrupting the seniors' badminton games by snagging their shuttlecocks mid-rally, which eventually transitioned into kicking a tennis ball around the basketball court. I recall pleading to join the (in no particular order) Scouts, football and basketball teams as my ECA, all of which got vetoed by grandma as I could get hurt. Badminton - which I liked, but not very much - was allowed, on condition I only played singles. I wound up rather better at doubles, and never told.

Having done decently on the leaving examinations, I chose to apply to The Chinese High School, mainly as I had actually visited once before for some Art Olympiad and liked the look of the place. The competition was, understandably, tougher, and I finished in the middle of the pack my first term there, which was emphatically not good enough.

[image] (Source: knowyourmeme.com)

I'm not sure if I cared that much, but my grandma was not happy, having made it her business to pore over whatever grade data she could finagle. Well, it got better after I figured the new stuff out. Tried water polo for a while, but that ended after I found out that I could barely open my eyes in the pool without goggles. Didn't bother to ask about other sports, and settled back on badminton (not very enthusiastically). Interestingly, I never really considered joining the computer clubs, even though I was taken with QBasic and HTML during those heady days. Ran for class committees with what I daresay were relatively pure, but still probably misguided, motivations, which in practice amounted mostly to being designated debt collector for the teachers. I was not as wise as I thought I was.
Tried to remain a good student, and likely was considered one. I was almost always seated at the back of the class, which might have started out due to sorting by height, but would at some point become my rightful spot. I didn't bother the teachers much, and they repaid the compliment. Apparently, at least some of them knew that I was sometimes reading other stuff under the desk, but let it slide. There was that time when my grandma figured that my still less-than-ideal class rank was due to not paying (or being paid) enough attention, and personally saw me seated at the front. Cue quiet swapping back to normal once she left.

For all her efforts, my O-level grades weren't perfect-perfect, and neither were my junior college ones (days I do not like to recall too vividly), but yeah, wabi-sabi kintsukuroi, eh? But frankly, it's true - nobody gives a shit after a while, and in hindsight I wouldn't swap a grade upgrade for any of those wonderful October Blues afternoons spent at Beauty World fragging each other in CS. Grandma would have to be content with some consolation prizes.

In any case, not snagging a "big" scholarship probably had less to do with results, since a number of guys managed it with no more. On reconsideration, my interview skills could probably have been improved - I vaguely remember, on being asked about my impressions of Bill Gates during a round with Microsoft representatives, innocently informing them that he was considered the anti-Christ by some (which was, and remains, completely true), and then proudly regaling them with tips on how to install backdoors on servers (it was a pretty clever and original method, so I thought). It galled me that they picked some other fellows. Actually, there was some offer to study English abroad, if I recall rightly, but my heart had been set on computer science for some time by then.
As with my decision to stay at my primary school, my choice of high school, etc., I decided to go with my intuition, all the more as I figured I could more easily navigate literature by myself than computers. That, and actually having a good excuse for doing the equivalent of under-the-table reading, most of the time. After a two-year national service stint where I frankly had it easy, but did pick up some skills that I hope will never have to be used for real, I became a CS major at NUS, and then added Economics to that because why not? They then very generously accepted my application to grad school (no interview involved), which is where I'm still hanging out. Oh, and I made the acquaintance of some delightful hamsters too (Mr. Ham'd kill me if I neglected this attribution).

What I Do

Come to think of it, I've been asked this a couple of times in real life these past few months. To be sure, one's profession often ties in with one's identity, to the extent that many once (and still do?) named their families after their job - what you do affects who you are. So, in two words: pixel manipulator. Hmm, not specific enough. Conventionally, I might someday be considered a computer scientist. Personally, I rather prefer problem solver. This is because I perceive many problems as not properly lying within the field of computer science, yet ones I have no less of an urge to tinker with - indeed, I quite fancy the thought of renting a tiny room somewhere, cluttering it up and placing a pipe on the table for effect, then lazing about with feet up, waiting for customers drawn by the sign outside:
For now though, I have - mostly by blind chance - wound up getting involved with the practice of medical imaging, to the point of accidentally publishing some stuff on it. My current work centers on designing algorithms to inspect retinal fundus images, and automatically determine whether some particular eye is fine, or if the image should be forwarded to a specialist for further examination. This turned out to be slightly less straightforward than initially expected, and led to the examining (and discarding) of a fair number of techniques; my current practice is to first segment potential candidates by level-set analysis, with bells and whistles if required, before normalizing and running these candidates through a convolutional neural network [CNN] (the general approach of which has been popularized somewhat), before some domain-specific cleaning up of the results.

And as to why CNNs... well, they really do work. From a practical standpoint, i.e. getting results, which I suppose should be a consideration where predictions have real implications (up to and including blindness), it only makes sense to focus on the most promising known approaches. Sure, with more time, one could dream up any number of other methods (of which I had a number, all not quite as good), but this is not always advisable (and which reminds me of the likely not-uncommon practice of labs accepting grants for problems they already have plausible solutions for, then spending them on tangentially-related blue-sky investigations).

Moreover, it can't be bad to get a toe in on the new "deep learning" revolution that's lighting up the scene, no? Human-level face and object recognition, natural language scene description, all the way to flying cars and all that? Well, a dirty little secret - deep learning really isn't all that complicated, and not even all that new.
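The flow just described - segment candidates, score each one, escalate if anything looks suspicious - can be sketched in miniature. Everything below is a hypothetical stand-in (a 2D list for the fundus image, a brightness threshold in place of level sets, raw intensity in place of a trained CNN's score); only the control flow mirrors the pipeline:

```python
def segment_candidates(image, threshold=0.5):
    """Stand-in for level-set segmentation: return the coordinates of
    bright pixels as 'candidate' regions."""
    return [(r, c)
            for r, row in enumerate(image)
            for c, val in enumerate(row)
            if val > threshold]

def classify_candidate(image, coord):
    """Stand-in for the CNN: score a candidate by its local intensity.
    A trained network would return something like P(lesion) here."""
    r, c = coord
    return image[r][c]

def needs_specialist(image, score_cutoff=0.9):
    """Forward the image to a human grader if any candidate scores high."""
    return any(classify_candidate(image, xy) > score_cutoff
               for xy in segment_candidates(image))

# Tiny synthetic "fundus images" as 2D intensity grids in [0, 1]:
clear_eye = [[0.1, 0.2], [0.3, 0.1]]
suspect_eye = [[0.1, 0.95], [0.3, 0.1]]
print(needs_specialist(clear_eye))    # False: nothing to flag
print(needs_specialist(suspect_eye))  # True: refer to a specialist
```

The point of the two-stage design is that the cheap segmentation step prunes most of the image, so the expensive classifier only ever sees a handful of candidate patches.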
To be sure, it derived from early attempts at artificial intelligence, which tried to simulate a brain, in much the same way as aviation pioneers of antiquity tried to imitate birds. It was known that the brain was composed of neurons and their connections, so bar there actually being some incorporeal soul involved, would not reconstructing it also reproduce its potential for intelligence?

[image] Just waiting for its brain (Source: dragonage.wikia.com)

Problem one - the human brain has a (literally) astronomical number of neurons and connections, quite a bit more than computers then (or even now) could comfortably simulate. Nor were their interactions quite understood (see a recent Kaggle problem). No problem - the scientists did what they do best: abstracted all the biological complexities out, made a bunch of behavioural assumptions, and vastly reduced the scale of their models, winding up with something akin to a Boltzmann machine (a form of Markov random field).

Unfortunately, even these much-cut-down models were near-impossible to train in general, so researchers tended to work on models where the neuronal structure was further dumbed down, to a single intermediate hidden layer between inputs and outputs (the basic multilayer perceptron [MLP] - see a past example on this blog/autoencoder), or even none at all (the perceptron/restricted Boltzmann machine). And some of them actually worked! Aeroplanes don't have to flap their wings, after all.

Further, it was proven in 1989 that the basic MLP was a universal approximator - it could learn any (self-consistent) relation, given enough neurons (an idea possibly not too distinct from Fourier series approximation). However, once again, theory did not translate well to practice - "enough" neurons could mean "really too many", and the proof of existence of a configuration for any pattern gave no particular insight on how to arrive at that configuration - i.e. how to train the network - in the first place. [N.B.
In the other direction, some theoretical issues aren't a practical hindrance. For example, it has been noted that CNNs can be fooled by very unlikely-looking cases, but while intriguing to ruminate about, a practitioner could simply mitigate this by processing multiple slightly-perturbed versions of the input (though that wouldn't be very elegant).]

There were countless attempts at this, but most settled on some variant of gradient descent backpropagation, which was simply to adjust the network connection weights so as to reduce the error on the latest input(s). The possible tweaks are myriad - consider one input at a time, or many? To keep a memory of past examples, or not? To use the second derivative to further speed convergence, or not? - without even going into how to tune standard parameters such as the learning rate, but the basic concept was the same. So far so good.

Inevitably, researchers dabbled with adding in more intermediate layers, which as extensions go was a pretty obvious one. They largely discovered that this made the networks untrainable again, which was soon explained as being due to the vanishing gradient problem - learning becomes less and less effective down the stack of layers, as the error being propagated is in a sense less informative. Still, the use of hierarchies was never totally abandoned, and justification was found in that different layers could have additional constraints imposed, which conveniently reduced the number of free parameters. One particular success story was in handwritten character classification, where CNNs have never quite been dethroned; however, the theoretically-neater support vector machines made their appearance soon after, together with the easy-to-use random forests and other new techniques. Faced with this onslaught, connectionism retreated to that dusty backroom where old fads go to die.

Then, a turnaround of epic proportions.

[image] This happened.
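(A quick aside before picking the story back up: the gradient-descent update underlying all those backpropagation variants is small enough to show whole. A minimal sketch on the simplest possible "network" - a single linear neuron y = w*x + b, trained one example at a time; the target function, learning rate and epoch count are illustrative choices:)

```python
def sgd_fit(samples, lr=0.1, epochs=500):
    """Stochastic gradient descent on a single linear neuron."""
    w, b = 0.0, 0.0
    for _ in range(epochs):
        for x, y in samples:
            pred = w * x + b
            err = pred - y       # dE/dpred, for error E = (pred - y)^2 / 2
            w -= lr * err * x    # nudge w against dE/dw = err * x
            b -= lr * err        # nudge b against dE/db = err
    return w, b

# Train on points drawn from y = 2x + 1; the weights recover the line.
data = [(x, 2 * x + 1) for x in [-1.0, -0.5, 0.0, 0.5, 1.0]]
w, b = sgd_fit(data)
print(round(w, 3), round(b, 3))  # converges toward w = 2, b = 1
```

Each of the tweaks mentioned above is a variation on this inner loop: batching several samples per update, adding a momentum term that remembers past steps, or scaling the step by second-derivative information.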
Around 2009, exciting results began trickling in in earnest - it turned out that you could effectively train neural networks with many layers (hence, deep), whether convolutional or not, and get mindblowing results. Further, this was without significant advances in the architecture or training methods, both decades old (hence, not new - though refinements such as layer-wise pretraining were possible, with more to arrive soon).

But... but... vanishing gradient? To be honest, I haven't found an explicit explanation as to why it doesn't seem to apply any longer, but one guess could be that "can't converge" actually meant "can't converge within a reasonable timeframe", and what has changed is the definition of "reasonable timeframe". Consider: GPUs train up to fifty times faster than modern CPUs, which are in turn, by Moore's Law, on the order of a thousand times faster than those of two decades ago! Since backpropagation is not easily parallelizable, it could conceivably have taken years of supercomputer time back then to replicate a run-of-the-mill experiment of today.

True, there have been a number of innovations, such as dropout (randomly shutting off a certain proportion of neurons to reduce dependencies, previously covered) and the rectified linear/maxout activation functions, but the interesting observation is that they appear not so much to improve actual performance as to allow that performance to be attained more quickly (dropout simulates a committee of networks; maxout speeds convergence); attempts at optimizing the architecture, such as by manually designating connections, have been all but discarded. As one fellow contestant from the ICPR 2012 workshops remarked - it's all in having a bigger network, and more data. Now, this might be underselling recent advances a little, but on the whole, I cannot fault the sentiment.
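The vanishing-gradient effect, and why rectified linear units sidestep it, is easy to see numerically: backpropagation multiplies in one activation derivative per layer, and the sigmoid's derivative is at most 0.25, so the error signal shrinks geometrically with depth, while a ReLU passes a derivative of exactly 1 wherever it is active. A toy illustration (the ten-layer depth and the best-case inputs are arbitrary choices):

```python
import math

def sigmoid_grad(x):
    s = 1.0 / (1.0 + math.exp(-x))
    return s * (1.0 - s)          # at most 0.25, attained at x = 0

def relu_grad(x):
    return 1.0 if x > 0 else 0.0  # exactly 1 wherever the unit is active

# Backprop through `depth` layers multiplies one activation derivative
# per layer; take the best case for each activation at every layer.
depth = 10
sig_signal, relu_signal = 1.0, 1.0
for _ in range(depth):
    sig_signal *= sigmoid_grad(0.0)   # sigmoid at its steepest point
    relu_signal *= relu_grad(1.0)     # an active ReLU unit

print(sig_signal)   # 0.25**10, about 9.5e-7: the signal all but vanishes
print(relu_signal)  # 1.0: the signal passes through undiminished
```

Even in this best case for the sigmoid, ten layers attenuate the gradient by a factor of about a million, which is one reason the same architectures that stalled two decades ago can train today only with vastly more compute or friendlier activations.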
True, in a contest environment, every little enhancement and trick could well be important - but having sufficient size and depth in one's network has become practically a given. This is at once both exciting and disconcerting - exciting in that, if this is true, any old bugger can achieve (super)human-level A.I. on a wide variety of (mostly visual) tasks, simply by buying the hardware and running a CNN toolbox (which are becoming more common). It is disconcerting for the same reason: that any old bugger can achieve it. Indeed, I've seen a few recent papers from prestigious conferences starting off with impressive theoretical premises... only to wind up with rather less than impressive results on MNIST and other standard datasets. A case of brute force and raw data beating higher-level consideration, as in computer chess?

[image] FEAR THE MACHINE (Source: blastr.com)

Anyway, with Stephen Hawking the latest prominent personage to leap on the "fear artificial intelligence" bandwagon, I'd respond, as a guy who's been milling around the trenches a bit: relax, nothing that huge is happening - the latest big thing is effectively fiddling with chained multiplications of (relatively) large matrices. Though, come to think of it, that could turn out to be how brains function...

No, if A.I. is to destroy us, I'd gather it will be on an economic/social level. Recall when I mentioned that my work aimed to automate the clearest cases, thus reducing the workload of specialists? Well, assuming that it does work, this could mean that each busy human grader would have, say, 30% less to do (yay). However, it could also mean that 30% of them lose their jobs. Certainly, I am hoping that some combination of increased productivity/higher wages/lower examination prices/wider screening takeup happens, but realistically, it's out of my hands. Not that tanking it would help, though - somebody will get it right in this comparatively narrow domain. A.I.
might not turn the world into grey goo, but it'll certainly eat jobs, and I'm not sure which is worse.

From Here On

As usual, I'm not too certain of it myself, but there are a few things I had hoped to tinker a little on. For starters, one of my thesis committee members raised the possibility of non-fixed layer hierarchies (which reminded me a little of the octree structure, with selective resolution), which I thought was not without merit, but I doubted it would beat mere brute force without something extra... of which I've no clue what that might be.

One other thing I tried was to leverage any patterns in CNN output, in segmentation problems. The idea was that perhaps there is something systematic in the errors, and if so, it might be possible to correct for those errors by working on the initial output, to produce a better second-degree output (and perhaps even further). Well, it doesn't seem to offer an improvement yet, but I'd be glad if anyone made this work.

Haven't got going with Extreme Learning Machines either, though it seems that there may be something going for random transforms (again, I suppose, with enough data). And in a very recent paper, it has been suggested that there is actually nothing special about "deepness", in the sense that a single-hidden-layer neural network can store the same (or a similar-enough) function (which is actually not a big surprise); it is just that current training methods do not allow it to learn the function directly from the original data. Fascinating, and moreover practically useful, since a compressed model could be so much faster to execute... But yeah, I dunno.

Some 80's music then (maybe a repeat):

Next: An Xmas Very Special
Copyright © 2006-2025 GLYS. All Rights Reserved.