
Startling Augmented Reality App

December 18th, 2010

A company called Quest Visual has come out with a rather startling iOS app called Word Lens. It translates printed material between English and Spanish just by processing a visual image, and what makes it amazing is that it then replaces the original printed words on the sign with words in the other language, so naturally and seamlessly that on the iPhone it can look like the sign was printed in the other language. It somehow manages to scan and read the text, translate it to the other language, and re-create the graphic image of the letters immediately, almost the moment the sign appears. And it seems to do a very good job of moving with the sign, making the text look natural. It even does a decent job of mimicking the font size and color, though not the exact font and thickness.


When I first saw this, I assumed it was a fake, a mock-up or a concept app, but everyone seems to be saying it’s for real and can be downloaded now. It’s “free” as in demo: the basic app only reverses or erases words to show you how it can work, and dictionaries cost $5 a pop ($5 per translation direction, meaning $10 if you want both English to Spanish and Spanish to English). If I lived in a Spanish-speaking country, or planned a trip to one soon, I would definitely buy the dictionary. I am not excited about the prospect of a Japanese version, however; computerized Japanese-English translation systems are still pretty horrific, when they are not hysterically funny.

Here’s the company’s video showing how the app works. A big “however” after the video below…

And the “however”? Well, I downloaded the app, it’s real, and it works… however, not nearly as well as the demo in the video suggests. While I found that it potentially can read and understand the text immediately, it does not always do so. With some text, it comes up more or less immediately, but with other text, it grinds for quite a while, “getting” some text before other text–even when they are words next to each other in exactly the same style and size.

Here are a few examples:

Catext

The app immediately got the rather large and easy title for an old Contemporary Astronomy text I have.

Iatext

This Intermediate Algebra text, however, caused it to hiccup. Notice that it got one word and missed one word in both the title and the edition line, and it had too much trouble reading the script font used for the author’s name.

Galtext

This is my favorite. It got the title Galaxies just fine… but the text in the top left corner is “A SIERRA CLUB BOOK,” and therefore should be shown here as “A ARREIS BULC KOOB.” Instead, it reads “WON WOW EIGOOB,” which read backwards is “NOW WOW BOOGIE.” I almost have to assume that’s an intentional Easter-egg substitution for when it can’t figure anything out…
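For anyone who wants to check the reversals, here is a tiny sketch of what the demo mode appears to do, assuming it simply reverses the letters of each word and keeps the word order. The function name `reverse_words` is mine, not anything from the app.

```python
def reverse_words(text):
    """Reverse the letters inside each word, keeping the words in order."""
    return " ".join(word[::-1] for word in text.split(" "))


# What the demo should have shown for the corner text:
print(reverse_words("A SIERRA CLUB BOOK"))  # A ARREIS BULC KOOB

# Undoing the reversal on what it actually showed:
print(reverse_words("WON WOW EIGOOB"))      # NOW WOW BOOGIE
```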

Clearly this should not be trusted 100% with translations. I can imagine using this as a translation aid going to a foreign country. “My nipples explode with delight!”

Categories: iPhone
  1. Troy
    December 18th, 2010 at 04:32 | #1

    I had this idea many many years ago, though not the part about REPLACING the old text with the new. That’s something that is pretty genius — obvious in retrospect but not when looking at the problem initially.

    The technology to do this isn’t that “hard”, really. Scanning for runs of text first requires detecting the horizontal “baselines,” that is, finding the gaps between lines.
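A rough sketch of that line-finding step, assuming the image has already been reduced to a grid of 0 (background) and 1 (ink) and that the text is already level. It uses a simple horizontal projection profile, which is one common way to do this and not necessarily what Word Lens does; `find_text_rows` and the toy `page` are made up for illustration.

```python
def find_text_rows(image):
    """Return (start, end) row ranges that contain ink; end is exclusive.

    `image` is a list of rows, each a list of 0 (background) or 1 (ink).
    """
    # Horizontal projection: count the ink pixels in every row.
    profile = [sum(row) for row in image]

    bands = []
    start = None
    for y, ink in enumerate(profile):
        if ink > 0 and start is None:
            start = y                  # entering a line of text
        elif ink == 0 and start is not None:
            bands.append((start, y))   # reached the gap below the line
            start = None
    if start is not None:              # text runs to the bottom edge
        bands.append((start, len(profile)))
    return bands


# Hypothetical 6x5 "page" with two lines of text separated by a blank row.
page = [
    [0, 0, 0, 0, 0],
    [1, 1, 0, 1, 1],
    [1, 0, 0, 0, 1],
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
]
print(find_text_rows(page))  # [(1, 3), (4, 5)]
```

On a real photo the rows would be tilted and noisy, so the "ink == 0" test would need a threshold rather than an exact zero.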

    Once you have that it’s a simple mathematical transformation to get things into “head on” again, and then you just scan the lines for gaps between letters.
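The letter-gap scan can be sketched the same way, this time looking down the columns of a single rectified line. This assumes letters never touch or overlap, which real fonts often violate; `split_letters` and the sample `line` are hypothetical.

```python
from itertools import groupby


def split_letters(line):
    """Return (start, end) column ranges of glyphs; end is exclusive.

    `line` is one text line as a list of rows of 0 (background) / 1 (ink).
    """
    columns = list(zip(*line))                # transpose rows into columns
    has_ink = [any(col) for col in columns]   # True where a column has ink

    spans = []
    x = 0
    for inked, run in groupby(has_ink):
        length = len(list(run))
        if inked:
            spans.append((x, x + length))     # a run of inked columns
        x += length
    return spans


# Hypothetical line holding three glyphs separated by blank columns.
line = [
    [1, 0, 1, 1, 0, 0, 1],
    [1, 0, 1, 0, 0, 0, 1],
    [1, 0, 1, 1, 0, 0, 1],
]
print(split_letters(line))  # [(0, 1), (2, 4), (6, 7)]
```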

    After that you quantize each letter and then match that quantization against a database. Accuracy can be greatly improved if this pattern matching is grammar and context-aware.
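As a toy illustration of quantize-and-match, the sketch below shrinks each glyph to a 3×3 grid and picks the database entry with the fewest differing cells. The grid size, the three-letter `DATABASE`, and the function names are all assumptions for the example; a real recognizer would use a much finer grid and, as noted, grammar and context.

```python
def quantize(glyph, size=3):
    """Downsample a binary glyph to a size x size grid of 0/1 cells.

    Assumes the glyph is at least `size` pixels in each dimension.
    """
    h, w = len(glyph), len(glyph[0])
    grid = []
    for gy in range(size):
        y0, y1 = gy * h // size, (gy + 1) * h // size
        row = []
        for gx in range(size):
            x0, x1 = gx * w // size, (gx + 1) * w // size
            cells = [glyph[y][x] for y in range(y0, y1) for x in range(x0, x1)]
            # A cell is "on" when at least half of its pixels are ink.
            row.append(1 if cells and 2 * sum(cells) >= len(cells) else 0)
        grid.append(tuple(row))
    return tuple(grid)


def distance(a, b):
    """Count the cells that differ between two quantized grids."""
    return sum(p != q for ra, rb in zip(a, b) for p, q in zip(ra, rb))


def recognize(glyph, database):
    """Return the database letter whose grid is closest to the glyph."""
    key = quantize(glyph)
    return min(database, key=lambda letter: distance(key, database[letter]))


# Hypothetical three-letter database of 3x3 patterns.
DATABASE = {
    "I": ((0, 1, 0), (0, 1, 0), (0, 1, 0)),
    "L": ((1, 0, 0), (1, 0, 0), (1, 1, 1)),
    "T": ((1, 1, 1), (0, 1, 0), (0, 1, 0)),
}

# A 6x6 "L" with one stray noise pixel in row 3.
glyph = [
    [1, 1, 0, 0, 0, 0],
    [1, 1, 0, 0, 0, 0],
    [1, 1, 0, 0, 0, 0],
    [1, 1, 0, 0, 1, 0],
    [1, 1, 1, 1, 1, 1],
    [1, 1, 1, 1, 1, 1],
]
print(recognize(glyph, DATABASE))  # L
```

The quantization is what absorbs the stray pixel: it is outvoted inside its 2×2 cell, so the glyph still lands exactly on the stored "L" pattern.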

    Replacing the old with the new is relatively simple: just erase the old lines and pop in the new, and then re-warp the image back (the mathematical transformation used before has an inverse that’s easy to calculate; it’s literally just inverting a 3×3 matrix).
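To make the "just invert a 3×3 matrix" remark concrete, here is a minimal sketch that treats the transformation as a 3×3 projective matrix (a homography), inverts it with the standard adjugate formula, and checks that a point survives the round trip. The matrix `H` is an arbitrary made-up example, and exact fractions are used only so the round trip comes back without rounding error.

```python
from fractions import Fraction


def invert_3x3(m):
    """Invert a 3x3 matrix using the adjugate (cofactor) formula."""
    (a, b, c), (d, e, f), (g, h, i) = [[Fraction(v) for v in row] for row in m]
    # Cofactors of the first row, reused for the determinant.
    A = e * i - f * h
    B = -(d * i - f * g)
    C = d * h - e * g
    det = a * A + b * B + c * C
    if det == 0:
        raise ValueError("matrix is singular and has no inverse")
    # Inverse = adjugate / det (adjugate = transposed cofactor matrix).
    return [
        [A / det, (c * h - b * i) / det, (b * f - c * e) / det],
        [B / det, (a * i - c * g) / det, (c * d - a * f) / det],
        [C / det, (b * g - a * h) / det, (a * e - b * d) / det],
    ]


def warp(m, point):
    """Apply a 3x3 projective transform to a 2D point."""
    x, y = Fraction(point[0]), Fraction(point[1])
    xh = m[0][0] * x + m[0][1] * y + m[0][2]
    yh = m[1][0] * x + m[1][1] * y + m[1][2]
    w = m[2][0] * x + m[2][1] * y + m[2][2]
    return (xh / w, yh / w)            # divide out the projective scale


# Hypothetical transform; the bottom row gives it a perspective component.
H = [
    [2, 1, 5],
    [0, 3, 2],
    [0, 1, 4],
]

flattened = warp(H, (3, 4))                    # sign pixel -> "head on" view
restored = warp(invert_3x3(H), flattened)      # and back again
print(restored == (3, 4))  # True
```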

    I’ve been capturing sample shots of signs for YEARS now to help me with this task.

    Never got around to actually doing it, alas.

    I am stupid.
