Spoken English

'No-one understands me!' 

'I always have to repeat myself.'

'People speak too fast in English and I can't understand them!'

Does this sound familiar? If so, read on to understand some of the features of spoken English and how you can better use them to help you.

Language types

Languages can be broadly grouped into two main types: stress-timed and syllabic-timed. What does this mean? First, we need to understand what a syllable is. For example, the word 'syllable' has three syllables:

syll - a - ble

This shows that a syllable is a unit of sound in a word, built around a vowel sound. Here are some more examples: 'Malaysia' has three syllables (Ma - lay - sia) and 'country' has two (coun - try). Now we need to know what 'stress' means in pronunciation terms. It means to make something stronger (louder, clearer) when we say it. Got it? OK! Back to our types of languages:

Syllabic-timed languages

In these languages, each syllable takes a similar amount of time. Syllables may still be stressed, but each one is of roughly similar length and should be pronounced clearly. These languages include Chinese, Vietnamese, Korean, Indonesian, Spanish, Brazilian Portuguese, Turkish, South Asian languages, Italian, and French.

Stress-timed languages

In stress-timed languages, like English, it's very different. Syllables have different lengths because important (i.e. stressed) syllables are longer, and less important (i.e. unstressed) syllables are shorter. If a syllable isn't important, we don't need to hear it clearly and it's often 'reduced' (made very small). In fact, if you pronounce a syllable clearly when it should be unstressed, it sounds like you're stressing it, which can make the word very hard for an English speaker to understand. For example, 'photograph', 'photographer' and 'photographic' all have different stressed syllables, so if you say one of the unstressed syllables clearly, it might sound like you're saying a different word.

This is a picture representing the difference between the two language types:

Graphic representation of the two language types

Stress in English

In English, there are two types of stress: sentence and word stress. When we say some words louder or stronger than others in a sentence, this is called 'sentence stress'. We can also stress words by changing the pitch of our voice, making it go higher or lower. This is called 'intonation'. Here you can see an example of word stress and intonation and how it can change the meaning of the same sentence. 

Word stress and intonation can change meaning

Within each word, we also always say one syllable the loudest. This is known as 'word stress'. Sometimes more than one syllable is stressed, but there's always one that's the strongest. This syllable must have a vowel sound (we can't stress only consonants in English). You can see word stress here:

Example of word stress. Image via British Council

In a stress-timed language like English, the time between stressed syllables is usually similar. In other words, the stressed syllables have a regular 'beat' and happen at a similar speed. See the example below, where > marks the stress.

Example of sentence stress

As you can see, anything that isn't stressed has to be squashed into the space between the stressed syllables, whether there are a few syllables, as in the first sentence ('to get a'), or double that amount, as in the second sentence ('don't you come and have a'). These groups of words take roughly the same amount of time. This is possible because in spoken English we often make the unstressed syllables very small, or 'reduced', which lets us fit more in. This doesn't happen in other languages like Chinese (Mandarin and Cantonese), Arabic, Vietnamese, Korean, Turkish, or Spanish.
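If it helps to see the 'squashing' as simple arithmetic, here is a rough sketch. It assumes, purely for illustration, that the gap between two stressed syllables lasts 0.6 seconds and that the unstressed syllables share that gap equally. The 0.6 figure and the function name are made up for this example and are not measurements of real speech.

```python
# Hypothetical illustration: unstressed syllables sharing a fixed gap
# between two stressed syllables. The 0.6 s gap is an assumed value.
GAP_SECONDS = 0.6


def time_per_unstressed_syllable(syllable_count, gap=GAP_SECONDS):
    """Return the time each unstressed syllable gets if they share the gap equally."""
    if syllable_count <= 0:
        raise ValueError("syllable_count must be positive")
    return gap / syllable_count


few = time_per_unstressed_syllable(3)   # 'to get a'
many = time_per_unstressed_syllable(6)  # "don't you come and have a"
print(f"3 syllables: {few:.2f} s each")
print(f"6 syllables: {many:.2f} s each")
```

With twice as many syllables in the same gap, each one gets about half the time, which is why they need to be reduced.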

This is why English speakers might not understand you when you're speaking 'clearly' by pronouncing everything correctly. Has this ever happened to you? Keep reading! 

For all language learners, it's important to know about the following:

The 'schwa' sound /ə/

  • Unstressed sounds are often 'reduced' in spoken English, with the vowel pronounced as a 'schwa' sound. This is a very small, very short, hard-to-hear and 'lazy' sound, like 'uh'. Sometimes it sounds like there's no sound at all when there are actually one or two syllables.

e.g. 'How are you?' is pronounced 'howuhyou?', 'How are you going?' is 'howyuhgoing' and 'Bottle of water' is pronounced 'bottleuhwater'

Using one of our earlier examples, we can see how the schwas can change if the stressed syllable changes:

Different positions for schwas and stressed syllables for different word forms

Examples of connected speech

Connected speech

  • We like to connect words in spoken English. This means that when we have a word starting with a vowel, we often take the last sound of the previous word and connect it to the vowel sound of the following word. This way, words can flow together.
    • Examples:
      • 'Get it' is pronounced 'ge-tit' and sounds like 'geddit'
      • 'Take off' is pronounced 'tay-koff'
      • 'Hand it out' is pronounced 'han-di-dout'
  • If we have a vowel sound at the end of one word and at the start of the next word, we put in a 'linking' sound (whatever sound is closest for your mouth position).
    • Example:
      • In 'go away', your mouth is near the 'w' sound at the end of 'go', so we pronounce it 'gowuhway' and it sounds like one word rather than two. 'Say it' sounds like 'sayut' and 'flow into' sounds like 'flowinto'.
  • Sometimes sounds disappear. This is called 'elision'. This might happen because a sound isn't important or because it's easier to pronounce the word without the sound. 
    • Examples:
      • 'sandwich' sounds like 'samwich' (we lose the /d/)
      • 'handbag' sounds like 'hanbag' (we lose the /d/)
      • 'tell her' sounds like 'teller' (we lose the /h/)
  • If there are two similar sounds at the end of one word and the start of the next, we'll just use one sound and join them together.
    • Examples:
      • 'Get two' sounds like 'getoo' 
      • 'Make cookies' sounds like 'may cookies' with just one /k/ sound in the middle

See more examples of these features below:

How can this help me?

Good question. First, we'll consider what's important to remember in general, particularly if you have a syllabic-timed language background.

  • You won't hear all English words clearly. In fact, you won't even hear all syllables clearly - you will only hear the stressed ones. 
  • When you speak, you shouldn't pronounce all the syllables equally and clearly. Some should be longer and clearer, and the unstressed ones should all be shorter and quieter.
  • When speaking, you should stress words that carry meaning, or 'content' words. These are usually nouns, verbs, adverbs, or adjectives.
  • Any words that are not important for meaning, also known as 'form' or 'grammatical' words, should not be stressed in speech, unless they have a special meaning. They might include articles (a, an, the), pronouns, auxiliary verbs (be, do, have, and often modals), and prepositions. 
  • Words should be connected together. For example, don't separate words that start with vowels from the word before them, and don't pronounce the beginning of them really strongly.
  • English intonation and non-verbal communication often carry more meaning than words. It's really important that your facial expressions and voice match your meaning, and that your voice goes up and down and has slower and faster syllables.
  • Pausing is also important in English. Words that belong together (e.g. noun phrases or clauses within sentences) usually get connected; when words aren't grouped together, you should pause in between.
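The content-word and form-word rule above can be sketched as a small script. This is only a rough illustration: the word list is a tiny, hand-written and incomplete sample (an assumption for this example), the function name `mark_stress` and the test sentence are made up, and it can't tell when a form word has a special meaning and should be stressed after all. Words like 'have' or 'do' are always treated as auxiliaries here, even when they are main verbs.

```python
# Hypothetical, deliberately tiny list of form ('grammatical') words.
FUNCTION_WORDS = {
    "a", "an", "the",                                   # articles
    "i", "you", "he", "she", "it", "we", "they",        # pronouns
    "me", "him", "her", "us", "them",
    "am", "is", "are", "was", "were", "be", "been",     # auxiliary verbs
    "do", "does", "did", "have", "has", "had",
    "can", "will", "would", "should",                   # some modals
    "to", "of", "in", "on", "at", "for", "from", "with",  # prepositions
}


def mark_stress(sentence):
    """Upper-case likely content words; put form words in lower case."""
    marked = []
    for word in sentence.split():
        # Ignore surrounding punctuation when looking the word up.
        core = word.strip(".,!?'\"").lower()
        if core in FUNCTION_WORDS:
            marked.append(word.lower())
        else:
            marked.append(word.upper())
    return " ".join(marked)


print(mark_stress("I want to get a coffee"))
# prints: i WANT to GET a COFFEE
```

Try reading the output aloud, saying the capitalised words longer and louder and keeping the lower-case words short and quiet.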

Arabic's a little different, so Arabic speakers should also remember:

  • Arabic is stress-timed, but the stress patterns are very predictable (usually you can guess them). However, in English, stress patterns change depending on which words are important in the sentence as well as on each word. Some words have their first syllable stressed, while others have the second or third, like our 'photograph', 'photographer', 'photographic' example from earlier.
  • English has a lot of 'consonant clusters'. This means consonant sounds that are next to each other without a vowel in between them. This doesn't happen in Arabic, so be careful in English to keep these sounds close together, and don't stick a vowel in if there isn't one. Examples include 'spring', 'street', and 'lengths' - not 'sepering', 'setereet' or 'lengethes'.
  • Arabic doesn't have reduced vowel sounds (schwas), so you might find it difficult when you can't hear every syllable clearly in English. Don't worry - you're not alone! Remember that you only need to focus on the stressed syllables to understand or be understood. 
  • Arabic is phonetic, which means that the sound is the same as the spelling. English is not like this at all! English has three times as many vowel sounds as Arabic does, and different vowel spellings can sound the same. For example, the schwa sound can be made by any vowel (a, e, i, o, u). Keep this in mind when you're thinking about spelling or pronunciation.
  • The 'r' sound is rolled in Arabic but not in English. Make sure that if you roll your 'r's while speaking English, you aren't putting stress on the syllable with the 'r' accidentally. 

How can I practise?


To practise these features, you can try the pronunciation software programs at the RMIT Learning Lab.


You can also practise syllable stress (making one syllable in a word louder than the others) and individual English sounds in free phone apps like SpeakAP or ProPower, as shown below.


Free apps like these are helpful for practising syllable stress and individual sounds in English.

Recite or sing

You can also practise by getting a talk with a transcript, like a talk from TED.com or a famous speech from YouTube, and speaking it at the same time as the recording. You can then read it to someone like a Study Support Teacher and get feedback. Pop songs are also useful resources for practising pronunciation, because songs are easy for your brain to catch on to and they exaggerate many of the features of spoken English.

Use the phonemic chart

This chart can help you identify and work on individual sounds in English. It can also help you read the way a word should sound if you're not sure because this is how dictionaries will show the pronunciation. Most of the previously listed apps and software programs use it, too. You can find the chart easily by searching, or just open your e-course on Blackboard (screenshot below).

Find and listen to sounds from the phonemic chart in your e-course

Now it's time to practise! Post below if you have any questions or feedback. 

31 May 2021

