bert's blog v1.21
As my main line of research has gotten a little bogged down, I resolved to clear my mind by applying it to a new challenge, inspired by how #%*@! frustrating entering even relatively short sentences on a smartphone virtual keyboard can be.

A few first measurements: the HTC One V's screen is approximately 4.8cm by 8cm, but with only about 6.8cm of the length available for icons; this works out to a maximum of 2.04 square cm for each of the sixteen icon slots, which is not all that much to begin with (whip out a ruler and take a look). As unappealing as that is, the size of individual keys on the default virtual keyboard is far worse - just 0.4cm by 0.8cm each, or just 0.32 square cm.

Now, let us face it, mashing keys that tiny with any sort of speed and accuracy, and without any tactile feedback, is a right chore; whatever else one might say about traditional handphones, their raised keys at least made multitasking input possible, which has moreover brought some measure of fame to our nation. Sure, it still couldn't quite match up to a real keyboard (41 seconds for 160 characters [or 32 words, using the standard five-characters-per-word conversion] makes it only about a third as fast as the best typists), but it could be done. Such feats are no longer possible... or are they?

Fast or correct; choose one (Source: damnyouautocorrect.com)

Given the ubiquity of smartphones nowadays (around a billion exist either now or in a few years, depending on who you listen to), any way to input data more quickly (and comprehensibly) on them must have enormous potential market upside, both for the creator of the method and the platforms that adopt it - and these miserable 0.32 square centimetre keys are the best we have? Examining the currently-available options:
For the following explorations, Mr. Robo has gathered some resources. The full bigram statistics come from the supplementary material to "Case-sensitive letter and bigram frequency counts from large-scale English corpora" (Jones, 2004; maybe free to access someday?), Usenet column. The statistics on the ten thousand most popular English words (making up 80.59% of all word occurrences) were gathered from the Project Gutenberg list on Wiktionary, and while slightly dated should be good enough for our demonstrations.

While we have messed about with bigrams here before, a complete overview has yet to be presented. This deficiency shall be made up now:

All bigram frequencies

Each row represents the bigrams starting with the same letter; the darker the green, the more often the second letter follows the first. Note "ju" and "qu", of especial interest to Scrabble players, as well as the columns corresponding to vowels.

Notably, bigram statistics inferred from the Gutenberg corpus have far less detail than Jones': a full 221 of the 676 possible bigrams are not represented at all in the 10000 words, though it should be said that a similar proportion of Jones' bigrams have negligible counts (<0.005%), with "qz" taking the prize for rarest bigram. This discrepancy may underline the need for a truly comprehensive corpus in production apps. It should also be noted that the highly popular "th" occurs more than twice as often in practice - 4.56% in Gutenberg and 3.01% in Jones' Usenet count - as an unweighted tally over the 10000 words would suggest (1.52%). This might be attributable to the unweighted tally not taking into account the relative frequencies of the words themselves.

So might this information be exploited for autocorrection? Quite possibly. We can cast input as a classical noisy channel model, where the user's actual intention may not be captured properly by the touchscreen.
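As a minimal sketch of how such counts might be derived - with a made-up four-word list standing in for the real Gutenberg data - note that weighting each bigram by its word's frequency is exactly what separates the 4.56% figure from the unweighted 1.52% one:

```javascript
// Hypothetical stand-in for the Gutenberg word-frequency list;
// the real data has 10000 entries.
const wordCounts = { the: 330, this: 60, that: 110, try: 5 };

function bigramFrequencies(counts) {
  const totals = {};
  let grand = 0;
  for (const [word, n] of Object.entries(counts)) {
    for (let i = 0; i + 1 < word.length; i++) {
      const bg = word.slice(i, i + 2);
      totals[bg] = (totals[bg] || 0) + n; // weight by word frequency
      grand += n;
    }
  }
  const freqs = {};
  for (const bg of Object.keys(totals)) freqs[bg] = totals[bg] / grand;
  return freqs;
}

const f = bigramFrequencies(wordCounts);
console.log(f["th"]); // "th" dominates this toy list, as in the real corpora
```

Dropping the `n` weighting (counting each word once) reproduces the unweighted small-corpus behaviour described above.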
While actual readings are more complex, involving irregular shapes, we simplify each touch to produce a circular margin of error, with the screen then randomly picking any point within the circle. To reflect rapid typing being less precise, Mr. Robo and I have further set this margin of error to diminish over time to some minimum in the simulation to follow.

But first, some intro music

Reverse Hamgineering

But how then is correction of text achieved? While we do not pretend to replicate current professional implementations, we believe that some basics are sufficient for a workable demo, for which only two kinds of probabilities need be considered:
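The touch model just described can be sketched as follows; the starting radius, floor, and decay rate here are invented for illustration, not the demo's actual values:

```javascript
// Each touch is displaced to a uniformly random point within a
// circular margin of error centred on the intended position.
function noisyTouch(x, y, radius) {
  // sqrt on the radius gives a uniform spread over the disc,
  // rather than clustering samples near the centre
  const r = radius * Math.sqrt(Math.random());
  const theta = 2 * Math.PI * Math.random();
  return { x: x + r * Math.cos(theta), y: y + r * Math.sin(theta) };
}

// The margin diminishes with each keystroke down to some minimum
// (all three parameters are assumptions, in pixels).
function marginForKeystroke(k, start = 20, min = 6, decay = 0.9) {
  return Math.max(min, start * Math.pow(decay, k));
}
```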
When the first letter is entered, we have no a priori reason to suspect it is incorrect, since single letters all have non-negligible probabilities of appearing - the user might well intend to begin his word with "y", yes, y not? However, once the second letter is entered, we have a much better idea of what is going on. "yh" happens to be seen very infrequently - about 0.002% of the time, to be exact. We might then try to find more plausible alternatives, and "th", as it happens, has a far greater 3.01% probability.

But is "th" in fact an acceptable substitution? Well, "t" is right next to "y" on the keyboard, and we may assign a small probability, say 10%, that a received "y" was meant as a "t", against a 70% probability of its actually being "y". With this, the joint probability of "th" becomes 0.1*0.0301=0.00301, which is still much bigger than the 0.7*0.00002=0.000014 of "yh". This idea can be extended to longer strings by basic dynamic programming, and as a finishing step, the most probable candidates obtained after this can be polished up with a dictionary, since bigrams represent only extremely local information.

The live JavaScript demo follows - to use it, center the green circle over the intended letter and click, and errors simulating the fat finger effect will be introduced automatically. The demo is, of course, limited - for one, it uses only a static and slightly archaic 10000-word dictionary, so good luck trying to correct for words like "processor". Also, it was designed only to check against words of the same length as the input, instead of predicting. For example, the input "th" gives the alternatives "ty" and "ti", which while reasonable guesses are probably inferior to offering "the" and "this". Incorporating such predictions is not that difficult - just one more layer of probabilities - and is left as an exercise for the reader.

Blindly Hammering Away

Now, this is old news, and I gather the autocorrect in some word processors may do something like this.
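A toy version of the dynamic-programming extension might look like this; the adjacency map, the 70%/10% confusion probabilities, and the four-entry bigram table are invented stand-ins for the full Jones data, and we do not claim this mirrors the demo's internals:

```javascript
// Partial, assumed QWERTY adjacency - a real app would cover all keys.
const neighbours = { y: ["t", "u", "h"], h: ["g", "j", "y"], t: ["r", "y"] };
const P_HIT = 0.7, P_SLIP = 0.1; // invented confusion probabilities
const bigram = { th: 0.0301, yh: 0.00002, ty: 0.0007, tj: 0.00005 };

// Which letters could a received touch have meant, and how likely?
function candidates(ch) {
  const c = { [ch]: P_HIT };
  for (const n of neighbours[ch] || []) c[n] = P_SLIP;
  return c;
}

// Viterbi-style DP: track the most probable intended string ending
// in each candidate letter, combining touch and bigram probabilities.
function correct(received) {
  let paths = {};
  for (const [l, p] of Object.entries(candidates(received[0])))
    paths[l] = { p, s: l };
  for (let i = 1; i < received.length; i++) {
    const next = {};
    for (const [l2, pTouch] of Object.entries(candidates(received[i]))) {
      for (const { p, s } of Object.values(paths)) {
        const pBg = bigram[s[s.length - 1] + l2] || 1e-7; // unseen-bigram floor
        const joint = p * pTouch * pBg;
        if (!next[l2] || joint > next[l2].p) next[l2] = { p: joint, s: s + l2 };
      }
    }
    paths = next;
  }
  return Object.values(paths).sort((a, b) => b.p - a.p)[0].s;
}

console.log(correct("yh")); // → "th", as in the worked example above
```

A dictionary pass over the few most probable survivors would then supply the global information that bigrams alone cannot.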
What about some of the more exciting newer inventions? Take blind typing: how can a user just press at approximate positions, perhaps not even on any keys, and get his word correctly predicted? While we have no idea how the makers of BlindType and the like really do it in their apps, we can offer at least one suggestion.

Note that even when a user is typing without looking, assuming he has memorized the QWERTY keyboard layout (not too unlikely for most), there are still certain invariants that can be exploited. Say he touches some roughly central point - then, if we assume he is blind-typing, we cannot guess what letter that represents with any confidence. However, say that he next presses a point at a roughly four o'clock position, down and to the right of the original point. This immediately tells us that the second point is probably not on the top row, and also that the first point is probably not on the bottom row. As the user continues marking more points, assuming that his internal vision of the keyboard is roughly correct, we can thereby get an increasingly better idea of what he is trying to input.

For example, consider the word "this" again. It has four letters, and therefore three moves: "t" to "h", "h" to "i" and "i" to "s". Each of these moves can be represented by an angle, or bearing, for which we follow the standard mathematical convention of zero degrees due east, going counter-clockwise. Then, the bearings are 311°, 47° and 197°. It is easy to see that every word has a corresponding bearing signature; then, assuming that the user has touched the screen the correct number of times (as many as there are letters), it remains only to match the signature he has generated against a dictionary of precomputed signatures. There are of course many minor enhancements and considerations, such as having near-enough touches assumed to be the same letter, but this is the gist of it.

The algorithm may be tested with the prototype above.
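Computing a bearing signature is straightforward; the sketch below assumes a unit-pitch QWERTY grid with guessed row staggers, so its exact angles differ somewhat from the demo's (which depend on the real key geometry), though the quadrants agree:

```javascript
// Assumed key layout: unit spacing, with invented per-row offsets.
const ROWS = ["qwertyuiop", "asdfghjkl", "zxcvbnm"];
const OFFSET = [0, 0.25, 0.75];
const pos = {};
ROWS.forEach((row, r) =>
  [...row].forEach((ch, c) => { pos[ch] = { x: c + OFFSET[r], y: r }; }));

// One bearing per move between consecutive letters: 0° due east,
// counter-clockwise, with screen-down negated into mathematical down.
function signature(word) {
  const bearings = [];
  for (let i = 0; i + 1 < word.length; i++) {
    const a = pos[word[i]], b = pos[word[i + 1]];
    const deg = Math.atan2(-(b.y - a.y), b.x - a.x) * 180 / Math.PI;
    bearings.push((deg + 360) % 360);
  }
  return bearings;
}

console.log(signature("this").map(Math.round)); // down-right, up-right, down-left
```

Matching is then a nearest-neighbour search over precomputed signatures for every dictionary word of the right length, with angular distance taken modulo 360°.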
The three closest matches will be displayed once two or more clicks (touches) are made. The efficacy of typing blind can be tested by clicking the "Do it Blind!" button, which removes the keyboard display. You should note that it is not necessary to click anywhere near the actual keys for good predictions to be made - as long as the angles are right, large movements are not required.

Clearly, this method has its limitations too. For one, many words (particularly shorter ones) may share very similar signatures, in particular those with letters all on the same row - "try", for instance, is currently mispredicted as "has", since we consider only the angles, and not the relative distances, between touches. This should however be a fairly trivial fix. Have a try?

A bigger issue for blind typing is that the top prediction must be very accurate indeed, since the user is assumed to continue typing new words without stopping to correct them. Therefore, probability considerations at the word and sentence level should be a given in actual products.

Finally, it can be observed that this idea can be adapted to swipe-type input; the problem is complicated by there not being distinct touches to indicate the number of letters, but this is mitigated by being able to rely on actual key positions once more. Some letters can then be distinguished by a change in direction of the swiping motion, and the trick would be to decide which letters to keep when they are passed over in a straight line - for example, "pit" and "pot" would trace out exactly the same path, so again additional context has to be used to distinguish them.

Mr. Robo has informed me that he has a couple of promising new methods for keyboard input, which however will take a bit more time to come to fruition. More to look forward to for the weekend!

Next: Least Publishable Unit
Copyright © 2006-2025 GLYS. All Rights Reserved. |