How ultrasound imaging helps us understand speech and accent variation

by Dr Patrycja Strycharczuk

3 May 2019

This blog post is part of our Summer Showcase series, celebrating our free festival of ideas for curious minds.


The year 1916 saw the publication of George Bernard Shaw’s Pygmalion, a play whose central theme is accent and the social capital it can carry. Pygmalion is also a lasting tribute to phonetics as a systematic scientific approach to studying human speech. This sentiment is expressed quite overtly in Shaw’s preface, which says “if the play makes the public aware that there are such people as phoneticians, and that they are among the most important people in England at present, it will serve its turn.” The scientific fascination is also reflected in a detailed description of equipment in Professor Higgins’s laboratory, including a phonograph, a laryngoscope, tuning forks, and a model of the human vocal tract. 

The fictional laboratory is believed to be modelled after Daniel Jones’s phonetics laboratory at University College London. Although Professor Higgins’s character is ostensibly based on another phonetician, Henry Sweet, there are reasons to believe that Daniel Jones was the real inspiration behind the character, and that Shaw had visited his laboratory while working on the play. While the fictional Professor Higgins has become a cultural reference for encyclopaedic knowledge of accents used in the service of elocution, the real Professor Jones gave phoneticians useful tools for studying and understanding the whole range of accents in a neutral and dispassionate way.

Audrey Hepburn as Eliza Doolittle and Rex Harrison as Professor Henry Higgins in a scene from the film 'My Fair Lady', 1964. (Photo by Warner Brothers / Getty Images)

Representing vowel sounds

Most people familiar with English accents could imitate the traditional Cockney pronunciation of ‘The rain in Spain stays mainly in the plain’. But how can we describe that characteristic ‘ai’ sound with any degree of precision? Among Daniel Jones’s lasting contributions to phonetic science is the vowel quadrilateral, first published in 1917, a theoretical model of articulatory vowel space, which serves as a reference frame for describing and codifying vowel sounds. The model is based on eight cardinal vowels delineating the physical limits of tongue movement in vowel production.

Within this reference frame, any vowel can be represented in terms of how it is produced, and it can subsequently be reproduced by any other phonetician familiar with the system. As such, the vowel quadrilateral provides a simple and highly useful model for representing vowels, which has been adopted, in a modified version, by the International Phonetic Association, of which Jones himself was president between 1950 and 1967. The same model has also been used widely in language and dialect descriptions, as well as in language teaching.

Whilst highly influential, the model requires a formidable set of skills from a phonetician, in terms of auditory transcription. One has to be able to hear very small differences between different vowel sounds, and to translate that auditory information into an articulatory tongue-based model that captures how the vowels are made. In Pygmalion, the character Colonel Pickering acknowledges the challenges of auditory vowel perception as ‘a fearful strain’, only to be told that it ‘comes with practice’. This is true, if understated, as phoneticians need years of ear training to completely master the system.

Imaging speech

Present-day phoneticians still learn the fundamentals of Jones’s model, but they no longer need to rely on their own hearing to study how speech is produced. Technological developments in speech imaging allow us to observe speech production directly, rather than reconstructing it from the resulting sound. These methods are neither new nor specific to phonetics. We have been able to image the vocal tract, and selected aspects of vowel production, since the invention of X-rays. However, it is only now that we can acquire large volumes of articulatory images relatively easily, using techniques such as ultrasound. With ease of data acquisition comes the real leap in understanding how speech works.

We now know, for instance, that certain aspects of speech cannot be heard. An example of this is the tongue movements underlying the production of the word ‘school’ by a young female speaker from London. The final ‘l’ sounds very vocalic, in a way that could be transliterated as ‘schoow’. However, the ultrasound shows that some consonantal aspects of the final ‘l’ are still present. The speaker raises the tip of her tongue at the end of ‘school’, but this happens so late that the gesture is not audible. A similar delay mechanism had previously been found for the final ‘r’ in Glasgow speech, as seen in the following ultrasound recordings. Visual evidence of such covert gestures provides an important piece of the puzzle for explaining sound change. The change of ‘l’ to ‘w’ and the loss of final ‘r’ are both common phenomena, affecting different accents of English at various points in time, and also affecting languages other than English (The Proclaimers even wrote a song about this). If we can understand how these changes happen, we can get one step closer to also understanding why.

One of the outcomes of sound change is accent variation, which we find today throughout the English-speaking world. Perceptions of this variation within the community may reinforce social divisions, especially when particular accents are stigmatised, as Shaw dramatised in Pygmalion. But it needn’t be that way. For linguists, all accents are equally interesting: they all evolve through complex interactions of intrinsic and extrinsic factors stimulating change. Indeed, modern-day linguistics celebrates accent diversity, in a radical departure from the tradition of accent softening and elocution coaching. In this sense, one aspect of Professor Higgins’s legacy, the idea that speech and accent can be approached as objects of scientific enquiry, lives on and continues to drive positive change.


Dr Patrycja Strycharczuk is a Lecturer in Linguistics and Quantitative Methods (Q-Step) at the University of Manchester, School of Arts, Languages and Cultures. She received a BA Postdoctoral Fellowship in 2013.

Additional resources:

The Seeing Speech website (Lawson, Stuart-Smith, Scobbie, & Nakai, 2018) provides an introduction to modern speech imaging techniques.

You can explore global accent variation in English through the following accent map (Source: Dynamic Dialects, Lawson, Stuart-Smith, Scobbie, & Nakai, 2018).

The Real Professor Higgins. The Life and Career of Daniel Jones by Beverly Collins and Inger Mees, published by Mouton De Gruyter, is a biography of Daniel Jones, exploring his contribution to phonetic science.
