w/ Dieuwke Hupkes


Is the relationship between the orthographic forms representing sounds and the acoustic or phonological features of those sounds somehow motivated? Is K a particularly good symbol for a voiceless plosive phoneme compared to, for example, the more rounded B for a voiced plosive?

The relationship between orthography and sound symbolism is underexplored. Some authors (e.g. Cuskley) have demonstrated that familiarity with orthography can modulate the strength of the Bouba-kiki effect. We suggest instead that orthography might itself reflect the kinds of cognitive and perceptual biases underlying the Bouba-kiki effect, rather than being responsible for the associations that human participants make between sounds and meanings.

Thus far, this hypothesis has received almost no serious consideration, especially across a wide range of languages. Here, we aim to explore the associations between orthography and phonology in a broad range of writing systems.


Combining computational and experimental approaches, we will examine these associations both within and between the world’s alphabets, abjads, and abugidas. Is it the case, for example, that orthographic representations of voiceless plosive consonants (e.g. /t/, /k/) are more likely to be complex, or to be made of straight rather than curved line segments?

Using a variety of computational methods, we aim to quantify features such as the complexity and curvature of both whole symbols (letters) and their constituent parts (line segments). These features can then be correlated with phonological and acoustic characteristics, so that we can determine whether orthographic features are predictive of acoustic ones (or vice versa), either within or between languages. Could a model trained on a subset of the English alphabet, for example, guess what sound the symbol K represents?
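As a rough illustration of what quantifying curvature could look like, here is a minimal Python sketch. It is not a method we have committed to. It assumes each stroke is available as a polyline (a list of sampled x, y points), and the measure `corner_share`, its 30-degree threshold, and the two synthetic strokes are all hypothetical choices made for this example, not real glyph data.

```python
import math


def turning_angles(points):
    """Absolute turning angle (radians) at each interior vertex of a polyline."""
    angles = []
    for (x0, y0), (x1, y1), (x2, y2) in zip(points, points[1:], points[2:]):
        heading_in = math.atan2(y1 - y0, x1 - x0)
        heading_out = math.atan2(y2 - y1, x2 - x1)
        diff = abs(heading_out - heading_in)
        if diff > math.pi:  # take the shorter way around the circle
            diff = 2 * math.pi - diff
        angles.append(diff)
    return angles


def corner_share(points, threshold=math.radians(30)):
    """Share of total turning that happens in sharp corners.

    Values near 1 suggest an angular stroke built from straight segments;
    values near 0 suggest a smoothly curved one. The threshold is an
    assumption and depends on how densely the stroke is sampled.
    """
    angles = turning_angles(points)
    total = sum(angles)
    if total == 0:
        return 0.0  # a perfectly straight stroke has no turning at all
    return sum(a for a in angles if a > threshold) / total


# Synthetic strokes for illustration only (not taken from any real alphabet).
zigzag = [(0, 0), (1, 2), (2, 0), (3, 2)]  # straight segments, sharp corners
arc = [(math.cos(i * math.pi / 40), math.sin(i * math.pi / 40))
       for i in range(41)]  # a semicircle sampled at 41 points

print(corner_share(zigzag))
print(corner_share(arc))
```

A per-symbol score of this kind could then be set against a phonological feature such as voicing. In practice the strokes would first have to be extracted from font outlines or handwriting data, which this sketch does not attempt.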

In addition to a computational exploration, we plan to pursue these same questions experimentally with human participants. When presented with ⵒ, for example, would participants be more likely to accept it as a symbol for the sound /p/ (correct), /b/ (similar), or /n/ (dissimilar)? What about individual line segments from the same character?