
OCR Workshop

An introductory text I wrote with OSP to invite participants to our workshop at Typojanchi, the international typography biennale in Seoul, in 2013.


The fancy reading machines and androids of science-fiction fantasies are embodied in our modern, lower-profile real world as OCR software packages. One of them, the free and open source Tesseract, is composed of two parts that we can study thanks to its license: the engine itself, and the training data for a language, partly based on what Tesseract calls 'prototypes'. We could compare this 'before type' (proto-type) to the culture a reader progressively gathers from the first lesson onward, going from novice to fully grown expert.

By following the limit between the blank surfaces and the dark pixels of the shapes of letters, Tesseract compares its journey with previous ones, made on images it has already followed in the past. It starts by learning the patterns and specificities of languages, their rhythms and irregularities. It goes on to recognize the body of a glyph, then works out, bit by bit, whether this glyph is a letter and whether this form is a word, and eventually it makes out phrases. Like all of us, Tesseract learns typography through this same process, in a completely intertwined way, as sentences, script and, eventually, language. More here.
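To make the idea of "following the limit between the blank surfaces and the dark pixels" a little more tangible, here is a minimal Python sketch. It is not Tesseract's own outline tracer: the function name `boundary_pixels`, the '#'/'.' encoding and the toy grid are all hypothetical, and the sketch only marks the dark pixels that touch a blank one.

```python
def boundary_pixels(grid):
    """Return the set of (row, col) dark pixels that touch a blank pixel.

    grid is a list of equal-length strings; '#' is dark, '.' is blank.
    Pixels outside the grid are treated as blank (an assumption).
    """
    rows, cols = len(grid), len(grid[0])

    def is_dark(r, c):
        return 0 <= r < rows and 0 <= c < cols and grid[r][c] == "#"

    edge = set()
    for r in range(rows):
        for c in range(cols):
            if not is_dark(r, c):
                continue
            # Look at the four direct neighbours: up, down, left, right.
            neighbours = [(r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)]
            if any(not is_dark(nr, nc) for nr, nc in neighbours):
                edge.add((r, c))
    return edge


# A toy 3x3 dark square on a blank 5x5 page (hypothetical data).
square = [
    ".....",
    ".###.",
    ".###.",
    ".###.",
    ".....",
]
outline = boundary_pixels(square)
```

Here `outline` holds the eight pixels on the rim of the square, while the centre pixel, surrounded by dark pixels on all four sides, is left out. A real engine would go further and link such pixels into ordered outlines before comparing them with known shapes.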

Tesseract follows rules by which it can make decisions. In a basic example from Latin script, if the software seems to be recognizing something resembling iii (three times the letter 'i'), specific rules kick in to suggest that it is most likely the letter 'm' and not a triple 'i'. Grammar and language come in at a later stage, as they did for us, still following this unusual idea of teaching software to read. The very specificities of typography, how each shape is drawn and how it can or cannot be told apart from another one, arrive just after: in the previous example, the small parts that protrude from the i could form the arches of the m more convincingly in a serif font than in a sans serif one.
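The 'iii' versus 'm' decision can be imitated with a toy rule. The sketch below is only an illustration under stated assumptions: `resolve_confusion` and the tiny `lexicon` are hypothetical, and checking candidates against a word list is a stand-in chosen for this example, not a description of how Tesseract actually applies its rules.

```python
def resolve_confusion(raw_word, lexicon):
    """Prefer a known word when 'iii' may be a misread 'm'.

    raw_word: the string as first recognized, glyph by glyph.
    lexicon:  a set of known words (hypothetical stand-in for language data).
    """
    if raw_word in lexicon:
        # The raw reading is already a known word: keep it.
        return raw_word
    # Try reading every run of three i's as a single m.
    candidate = raw_word.replace("iii", "m")
    if candidate in lexicon:
        return candidate
    # No rule helped: fall back to the raw reading.
    return raw_word


lexicon = {"home", "time", "minimum"}
print(resolve_confusion("hoiiie", lexicon))  # prints "home"
```

The raw reading is kept whenever no known word results from the substitution, which mirrors the idea that the rule only suggests the more likely letter rather than imposing it.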

This process becomes intertwined with the actual context: with time, the system grows familiar with, and extremely efficient at, some specificities of a typeface. Its shape, its overall form and its size now mean something. It would have to relearn an entirely new toolkit to be able to read a different typeface. With this in mind, could the relations binding shapes to their meanings be noticed?

At the young, naive, early stages of deciphering writing systems, while slowly working out the building blocks of a legible language, we wonder how synthetic constructions (like Hangul) compare to agglutinated ones (like Latin). More specifically, how do these methods influence OCR data?

On a more contemporary note, it would be hard to deny how much screens and screen text technologies have influenced typography these days. All languages carry different meanings and different cultures with their characters. These gri(d)tty displays do no favors to typographic heritage, but they have brought on some interesting conundrums. The hinting tool ttfautohint, for example, deliberately distorts the vector shapes of glyphs to optimize screen rendering. When the movement to follow the grid becomes a displacement to fit it, the boundary between the canvas-based, stable and territorial, and the flux-based, flexible and moving, blurs.

In this workshop, we propose to carefully replay some of the processes the OCR system uses to reread typography, starting from the departure point of any new learner, the one we all knew at first and have mostly, and definitively, forgotten by now... By patiently observing the various parameters at play when one letter is to be differentiated from another, the thin and variable line of separation between signification and shape, between letter and typography, begins to reveal itself. Could the different parts of letters that make up the bare bones of other letters be recreated in a kind of wild, reverse-engineered Metafont paradigm, where all the shapes of the glyphs are defined with geometrical equations?

We wonder how much we can learn from methods borrowed from OCR. What if we replay its methods, retracing only its first steps and relying on only some parameters, aiming not for full comprehension but for basic knowledge of how our different sets of characters work? Would the outcome be enough to go on to understand typographic subtleties, building a bridge between specificities in shape and specificities in language?

Finally, if we know that Hangul and Latin are organized differently, yet work along similar ideas, could we try to avoid the main pitfalls of forcing comparisons between them? Can we instead focus on the systems that the OCR-by-human must use to read both, in order to rethink the deeper specificities between the two composition methods, between these two typographies, between these languages?
