Ever wondered why letters look the way they do? Dr Hana Jee at York St John University in the UK investigates intriguing connections between how languages sound and how they are written. Using a number of methodologies, she has conducted innovative research to quantify these relationships. Her work began with Korean Hangul, a writing system intentionally designed to be highly logical and systematic. Dr Jee has since expanded her research to include diverse scripts like Arabic, English, Hebrew and many others. Her findings suggest unexpected patterns across writing systems, opening several fascinating future research avenues.
Hangul is the alphabet used to write the Korean language. Unlike many writing systems that naturally evolved over time, Hangul was deliberately designed. It was created in the 15th century by King Sejong the Great to make reading and writing more accessible to common people. It consists of 24 basic letters: 14 consonants and 10 vowels.
Hangul is praised for its scientific design. For example, the basic consonant letters are designed to represent the shape of the mouth or tongue when pronouncing them. The other letters were systematically created, following consistent design principles. This systematic approach makes Hangul highly logical and relatively easy to learn compared with many other writing systems.
Dr Hana Jee’s exploration of writing systems began at The University of Edinburgh, where she was awarded her PhD. She is particularly interested in designing creative methods to quantify linguistic qualities. Her research journey began with a curiosity about the systematicity of Hangul. Here, systematicity refers to the degree to which a writing system is consistent and logical in parallel with its sound system. Her work on this topic has led her into multiple other avenues of research.
While the Korean writing system is famous for its systematic design, Dr Jee asked a slightly different question that has been overlooked so far – just how systematic is it? She started a research project that would lead her and her colleagues to quantify, for the first time, the systematicity of Hangul.
The research team took the phonemes, or basic sound units, of the Korean language and represented them as binary vectors, or a series of 1s and 0s. Each 1 or 0 corresponded to a specific feature of how the sound is made in the mouth. They compared each sound to every other sound and used several mathematical techniques to calculate the differences between them. This difference is called the ‘phonemic distance’.
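To make the idea concrete, here is a minimal sketch of how a distance between binary feature vectors could be computed. The feature list, the feature values and the choice of Hamming distance (counting the positions where two vectors differ) are illustrative assumptions, not the coding or the techniques used in the study.

```python
from itertools import combinations

# Hypothetical feature order: [voiced, nasal, labial, alveolar, velar, aspirated].
# These values are made up for illustration; they are not the study's coding.
PHONEMES = {
    "p": (0, 0, 1, 0, 0, 0),
    "ph": (0, 0, 1, 0, 0, 1),
    "m": (1, 1, 1, 0, 0, 0),
    "t": (0, 0, 0, 1, 0, 0),
    "n": (1, 1, 0, 1, 0, 0),
    "k": (0, 0, 0, 0, 1, 0),
}


def hamming_distance(a, b):
    """Count the positions at which two equal-length binary vectors differ."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return sum(x != y for x, y in zip(a, b))


def pairwise_distances(vectors):
    """Return {(name1, name2): distance} for every unordered pair of names."""
    return {
        (p, q): hamming_distance(vectors[p], vectors[q])
        for p, q in combinations(sorted(vectors), 2)
    }


phonemic_distance = pairwise_distances(PHONEMES)
for pair, distance in phonemic_distance.items():
    print(pair, distance)
```

In this toy coding, two sounds that differ in a single feature, such as the plain and aspirated "p", end up closer together than sounds that differ in several features.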
They then compared how visually different each Hangul letter is from every other letter. They examined how many strokes each pair of letters has in common, reflecting the principles by which Hangul letters were created. These strokes were also represented as binary vectors. The differences between these letter shapes are called the ‘orthographic distance’.
Finally, they tested the correlation between the phonemic distances and the corresponding orthographic distances. This correlation was termed grapho-phonemic systematicity. Here, a positive correlation would mean that similar sounds tend to be represented by similar-looking letters.
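The correlation step can be sketched as follows. This is an illustrative example on made-up data: the four letters, their feature vectors, and the use of Pearson's correlation with a simple permutation test are all assumptions, not the study's actual data or statistics. A permutation test is shown because distances between pairs are not independent of one another, so shuffling which shape belongs to which sound is one common way to judge whether a correlation could have arisen by chance.

```python
import math
import random
from itertools import combinations


def hamming(a, b):
    """Count the positions at which two binary vectors differ."""
    return sum(x != y for x, y in zip(a, b))


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length number lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


def systematicity(items, sound_dist, shape_dist):
    """Correlate sound distances with shape distances over all letter pairs."""
    pairs = list(combinations(items, 2))
    return pearson(
        [sound_dist(a, b) for a, b in pairs],
        [shape_dist(a, b) for a, b in pairs],
    )


def permutation_p_value(items, sound_dist, shape_dist, n_perm=999, seed=0):
    """Share of random shape relabelings that correlate at least as strongly."""
    rng = random.Random(seed)
    items = list(items)
    observed = systematicity(items, sound_dist, shape_dist)
    hits = 0
    for _ in range(n_perm):
        shuffled = items[:]
        rng.shuffle(shuffled)
        relabel = dict(zip(items, shuffled))
        r = systematicity(
            items, sound_dist, lambda a, b: shape_dist(relabel[a], relabel[b])
        )
        if r >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)


# Made-up binary vectors for four hypothetical letters.
SOUNDS = {"a": (0, 0, 0), "b": (0, 0, 1), "c": (0, 1, 1), "d": (1, 1, 1)}
SHAPES = {"a": (0, 0), "b": (0, 1), "c": (1, 1), "d": (1, 1)}
LETTERS = sorted(SOUNDS)


def sound_dist(a, b):
    return hamming(SOUNDS[a], SOUNDS[b])


def shape_dist(a, b):
    return hamming(SHAPES[a], SHAPES[b])


r = systematicity(LETTERS, sound_dist, shape_dist)
p = permutation_p_value(LETTERS, sound_dist, shape_dist)
print(f"r = {r:.3f}, permutation p = {p:.3f}")
```

With only four letters the p-value means little; the sketch is meant to show the shape of the calculation rather than a realistic result.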
Dr Jee and her colleagues quantitatively confirmed that, in Hangul, letter shapes overlap with the sounds by up to 60%. As this writing system was intentionally designed, it is an excellent case study of a writing system that logically matches symbols to sounds.
Intrigued by this finding, Dr Jee wondered if this phenomenon was specific to Korean or represented a general tendency across writing systems, even where they hadn’t been intentionally designed. She decided to extend her approach to various writing systems to see if letters have any relation with the sounds in other languages.
The research team applied three methods to quantify letter shapes and measure orthographic distances. First, they used the Hausdorff distance, a mathematical measure of how different two shapes are: roughly, the furthest that any point on either shape lies from the nearest point on the other. It can be applied to many kinds of image, including letters. Second, they used pixel count, which is simply the number of pixels that make up a letter when it’s displayed as a digital image. Third, they used perimetric complexity, which measures how intricate a shape is. It is calculated as the square of the perimeter, or the length of the outline of the letter, divided by the ink area.
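The three measures can be sketched on tiny hand-drawn bitmaps. Everything here is an illustrative assumption: the 3×3 glyphs, the use of '#' for ink, and the crude perimeter estimate that counts pixel edges between ink and background. Perimetric complexity is computed using its common definition, the squared perimeter divided by the ink area, here also divided by 4π as a conventional scaling. The study's own image processing is not described in this article and may well differ.

```python
import math


def ink_pixels(bitmap):
    """Return the set of (row, col) positions where the glyph has ink ('#')."""
    return {
        (r, c)
        for r, row in enumerate(bitmap)
        for c, ch in enumerate(row)
        if ch == "#"
    }


def pixel_count(bitmap):
    """Number of ink pixels, a simple proxy for the size of a letter."""
    return len(ink_pixels(bitmap))


def hausdorff(bitmap_a, bitmap_b):
    """Symmetric Hausdorff distance between the ink pixels of two glyphs."""
    a, b = ink_pixels(bitmap_a), ink_pixels(bitmap_b)
    if not a or not b:
        raise ValueError("both glyphs need at least one ink pixel")

    def directed(src, dst):
        # Largest distance from any point in src to its nearest point in dst.
        return max(min(math.dist(p, q) for q in dst) for p in src)

    return max(directed(a, b), directed(b, a))


def perimeter(bitmap):
    """Count pixel edges separating ink from background (a crude outline length)."""
    ink = ink_pixels(bitmap)
    neighbours = ((1, 0), (-1, 0), (0, 1), (0, -1))
    return sum(
        (r + dr, c + dc) not in ink for r, c in ink for dr, dc in neighbours
    )


def perimetric_complexity(bitmap):
    """Squared perimeter divided by ink area, scaled by 4*pi."""
    return perimeter(bitmap) ** 2 / (4 * math.pi * pixel_count(bitmap))


# Two made-up glyphs: a vertical bar and an L shape.
BAR = ["#..", "#..", "#.."]
ELL = ["#..", "#..", "###"]

for name, glyph in (("BAR", BAR), ("ELL", ELL)):
    print(name, pixel_count(glyph), round(perimetric_complexity(glyph), 3))
print("Hausdorff distance:", hausdorff(BAR, ELL))
```

The three numbers capture different things: pixel count reflects only size, the Hausdorff distance compares the overall placement of two shapes, and perimetric complexity grows as the outline becomes longer relative to the inked area.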
They used these methods to examine Arabic, Cyrillic, English, Finnish, Greek, Hebrew, and two artificial writing systems that were designed as alternatives to standard English orthography: the Shavian alphabet, and Pitman’s shorthand. Strikingly, the researchers found a meaningful correlation between letter shapes and sounds in all of these systems. Even Mandarin Chinese showed systematicity between characters and their pronunciations. The writing systems of fictitious languages, such as Klingon, did not show any systematicity.
Interestingly, the measurement method that revealed systematicity most clearly differed from one writing system to another. This, Dr Jee hypothesises, may be due to the different ways in which the writing systems evolved.
Scripts that descend from early Semitic writing, such as those used for Arabic, English, and Hebrew, may have evolved in a way that links similar sounds with letter shapes that occupy a similar amount of space. Here, bigger letters often represent more complex sounds. Size differences between characters are captured most directly by pixel count, and this method showed the highest systematicity for these writing systems.
Hangul, the Shavian alphabet, and Pitman’s shorthand were intentionally designed, relatively recently in the history of writing, to exploit the systematicity between letters and sounds. In these systems, similar sounds are represented by similar shapes. This is best measured using the Hausdorff distance, which is useful for comparing the overall shapes of letters.
In Chinese, similar sounds are linked with similarly complex characters. This is best shown using the perimetric complexity method, which shows not just the size of the character but also takes into account the complexity of its outline.
Dr Jee says that this provides fertile ground for future studies. For instance, we can explore when such systematicity first appeared in human history by examining archaic writings. We could also design more psychologically realistic distance measures to maximise the level of systematicity. Alternatively, we could compare fonts when studying the effects of systematicity on different groups, including early readers, dyslexic readers, and skilled adult readers.
The most recent avenue of research Dr Jee is exploring is the behavioural basis of this systematicity. This encompasses questions such as “Do people prefer the systematicity present in their own writing system?” and “Does systematicity have a real impact on how people link arbitrary symbols with sounds?” These ideas are currently being explored in three countries: the UK, South Korea, and China.
Dr Hana Jee’s ground-breaking research quantifies the systematicity of various writing systems, revealing intriguing connections between letter shapes and sounds. Her work demonstrates that this systematicity exists not only in intentionally designed systems, such as Hangul, but also in naturally evolved scripts. This research opens up new avenues to understand language evolution, literacy acquisition, and the potential benefits of systematic writing systems in communication and learning.