Jelly and Bean

Are phonemes the basic units of speech?

The short answer to this question is no. Phonemes are not the basic units of speech. Phonemes are categories of sounds, abstracted from the speech of people in a given community, that change the meaning of utterances, but they are not separate sounds which are combined to form speech. They do not exist as separate units or blocks.

Mark Seidenberg shows in his book ‘Language at the Speed of Sight’ (2017) that the waveforms of words confirm that we speak in continuous articulatory gestures. There are no boundaries between separate phonemes because separate phonemes do not exist in spoken words. We only think there are phonemes because we see letters in written words and we have been trained to match letters to sounds. He concludes that: ‘The letters of a word are like beads on a string, but sounds are more like a cascading waterfall’ (page 27).

In a keynote speech to a linguistics conference in 2008, Professor Robert F Port first detailed evidence to support the notion that speech is not made up of coarticulated phonemes. His published paper is entitled ‘All is prosody: Phones and phonemes are the ghosts of letters’.

Professor Port studied the real-time production and perception of speech, together with how we store in memory the speech we hear.

  1. He asked whether, if a language were made up of separate phonemes, there would not be jumps in our speech as we moved from one phoneme to the next. Yet no distinguishable jumps appear in any continuous stream of speech shown on a spectrogram.
  2. He investigated the theory that phonemes merge into each other when we speak, which is called ‘coarticulation of phonemes’. Some ‘phonemes’ differ acoustically in ways we do not perceive: for example, the ‘d’ in ‘do’ and the ‘d’ in ‘day’ are acoustically different, but we do not notice this when we listen to someone speak. He therefore concluded that we are more likely to remember and store the words ‘do’ and ‘day’ in our memory as whole words than as sequences of phonemes that we have combined.
  3. He then considered the element of time in the singing of hymns and songs, where notes can be lengthened or shortened to fit the music. For example, any of the syllables of the word ‘hallelujah’ may be sung long or short, depending on the musical construction. He questioned how these durations could be accounted for in a theory of static units called phonemes, which have no timing element.

From this evidence Professor Port concluded that phonemes are not the basis of spoken language. Indeed, no one needs knowledge of them in order to speak or to listen to speech. There is plenty of evidence from both young children and illiterate adults to show this.

Mark Seidenberg came to the same conclusion. In his book he wrote: ‘This fact is crucial. Using spoken language does not require knowledge of phonemes.’ (page 28)


Phonemes are not needed until people begin to read and write. It was the invention of symbols to represent sounds that necessitated the identification and abstraction of these sounds from the spoken language.

The abstracted sounds, and there are many of them due to varying vocal tracts, dialects and accents, are called phonemes, but they are not the basic units of speech. They are simply abstractions from speech, which have been put into categories (because they vary so much from person to person) to match separate symbols so that spoken language may be represented in another medium, written language.

As Mark Seidenberg explains: ‘learning to read changes the representation of speech, promoting the emergence of an abstract unit, the phoneme. Representing spoken words this way makes it easier to read the alphabetic code, which in turn solidifies representing speech as phonemes.’ (page 28)

It is the separateness of the symbols which makes us think that the sounds are separate too. Mark Seidenberg shows with his description of puppets trying to blend separate sounds into actual words (pages 27-28) that this is never achieved.

He writes (of the sounds /b/, /a/, /t/ corresponding to the letters ‘b’, ‘a’, ‘t’):

‘The sounds do not meld into ‘bat’ no matter how rapidly in succession they are spoken because it does not consist of three discrete segments. A discontinuity always occurs at the very end when the rapidly but discretely enunciated phonemes are followed by the word pronounced as a whole. … The activity is useful because the child learns about letters and sounds. It encourages the fiction that words consist of discrete phonemes even as it demonstrates they do not.’


After 2008, Professor Port also published papers showing that the rate at which we speak is too fast for us to pay attention to the sounds in words. In real time we pay attention to the meaning of what is being said to us, not to the sounds within the words we hear. It is only after the event, when we think about the words we have heard, that we are able to pick out the sounds in these words, and this only happens after we have had literacy training. By learning about letters we are able to relate them to the sounds we hear in spoken words, and once we have been taught about vowels and consonants we are able to think of the sounds in those terms. But before literacy training we have no need for any of these conventions or inventions in our use of spoken language.


(2017) Language at the Speed of Sight, by Mark Seidenberg. ISBN 978 0 465 01932 8

(2008) All is prosody: Phones and phonemes are the ghosts of letters, by Professor Robert F Port
