What is tone and what is intonation??
So I wanted to talk a bit about the overlap of tone and intonation from a linguistic perspective, as I see a lot of learners practicing in ways which might actually hinder them long term. This is a midnight post without looking at my notes lmao but here’s hoping it helps!!
So, tone: broadly speaking, without getting complicated, it’s the use of pitch, accent and other markers to distinguish INDIVIDUAL WORDS from each other. It’s LEXICALLY contrastive in the same way that /p/ and /b/ are two different phonemes, making ‘pat’ and ‘bat’ two different words. In tonal languages like Chinese, tone functions the same way, and, as we know, not all ‘ma’s are made equal: 吗妈麻骂 and so on.
Intonation is the use of pitch, ‘loudness’, stress (whatever that really is) and other features across a stretch of speech that’s larger than a syllable, usually a phrase or sentence. This is not LEXICALLY contrastive, but may make certain grammatical, structural or pragmatic (meaningful) properties of the phrase clear to the listener. In English, for instance, our subordinate clauses are often very low and flat - this helps the listener process the syntactic structure by giving them an extra heads up. The information can also be completely paralinguistic, such as telling the listener that the speaker is very angry or very pleased.
The problem for many learners, which makes their speech in any tonal language sound stilted and somehow off, is that they’re not taught about intonation and how it interacts with tone. They copy the individual words and are so focused on getting the ABSOLUTE pitch or tone right that they fail to place it in the context of the intonation of a sentence. And that matters, because in a tonal language intonation CANNOT exist independently of tone, nor tone independently of intonation.
Looking at English for a sec: it’s a stress-timed language, which means that words and syllables that aren’t stressed get ‘squashed’ - think about how we pronounce ‘can’ differently in the sentences ‘Yes, I can!’ and ‘Can you do it?’. In the first we pronounce it with a full vowel - in the second it’s reduced to a schwa. Stress doesn’t only matter across whole sentences in English, though - we also rely on it to recognise individual words. Have you ever been in a conversation with an L2 speaker of English and failed to understand a word they’re saying, much to their frustration? Many times, for speakers of languages that aren’t stress-timed, it’ll be because of word stress: it’s very hard for us to hear ‘BEcause’, with a full vowel in the first syllable and a schwa in the second, as ‘because’. This is why native English speakers often struggle, conversely, when speaking syllable-timed languages like Spanish or mora-timed languages like Japanese - we tend to be a bit too eager on the squashing front!!
There are two things to bear in mind here. Firstly, though English doesn’t have tone, it DOES use stress to distinguish between two words - the verb and noun ‘record’ being a classic example. ‘Stress’ is a particularly nasty phonological phenomenon that manifests itself in a variety of ways - it basically just means emphasising one syllable or segment over others, and can be achieved variously using the length of a syllable, ‘loudness’ or ‘acoustic energy’, pitch and so on. This varies language by language. In English, for instance, stressed syllables are often higher in pitch.
So this has implications for our acquisition of tone as speakers of non-tonal languages. One of the problems, I think, is that we are taught tones in isolation and given extreme examples where, especially in Mandarin, people just don’t speak like that in real life (the third tone??! I could go on about this forever, but anyway). What it means is that we are shown tone as something completely foreign and different and hopeless and how will we ever learn it… Sure, it’s a system we don’t have, but we have something similar-ish: we also use pitch, among other things, to signal where the stress in the word is. The difference is, of course, that it’s not contrastive in English, and that it’s only visible over multiple words. But my point is that if you try to memorise a phrase or a multisyllabic word in a tonal language by hearing the overall shape of the tones, the tone pattern, rather than specifically memorising the individual tones, this will not only appeal to your non-tonal L1 brain, which likes to use pitch to work out the stress of words, but it will also make it far easier to transfer your individually learned vocabulary into an actual, natural utterance. This also massively speeds up your process of tone acquisition - once you start hearing tone as an intrinsic part of the word, much like how the stress on the second syllable is intrinsically part of the word ‘fantastic’, you start to memorise tone without even noticing you’re doing it.
This brings us to my second point - English can use pitch as a marker for sentence intonation BECAUSE pitch is not a contrastive element in English. You can say ‘horse’ as weirdly as you want, it still means horse. Tonal languages CAN’T do it in the same way, but of course they still have sentence intonation, it’s just a little different. Instead of having sentence intonation patterns that are free to reshape the pitch of individual syllables, they have intonation that operates WITHIN the constraints of the tonal system of that language.
This is what trips many learners up. They try to add their non-tonal sentence intonation to a tonal language - so for instance a rising pitch on certain types of questions - and are then stuck saying something which really sounds more like a second tone when they wanted anything but. They are then frustrated - doesn’t Chinese have intonation?? Of course it does - but not in the way you’re used to.
I’m going to mention one thing here that is critically important. And that is that Mandarin Chinese, more so than some other tonal languages, uses RELATIVE and not absolute pitch to mark tones. This explains why it’s not difficult to understand a man’s fourth tone or a woman’s even though the absolute pitch of their voices is different - it’s the tone contour that matters. It’s not quite the same in languages like Cantonese, where some tones share a similar contour and differ mainly in how high they are.
Most languages, however, regardless of whether they are tonal or not, show a gradual lowering of pitch over the course of an utterance, often called ‘declination’ (or ‘downdrift’ when talking about tonal languages). This means that the high flat tones at the beginning of a clause or a phrase will be considerably higher, flatter and longer than those at the end - and the relative pitch of a low tone at the beginning of the sentence, in some languages, may even be higher than a high tone at the end of an utterance. Compare the various 他 in 他说他没看见他 - they should get progressively lower and shorter. This is just one of the ways intonation operates in a tonal language - often, tones are more explicit and ‘prototypical’ in main clauses or at the start of a clause than in subordinate clauses or at the end. It’s a way of giving the listener pragmatic information about the utterance, which is the job of all suprasegmental phenomena like intonation.
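If it helps to see the ‘relative, not absolute’ idea as numbers, here’s a rough Python sketch. Everything in it is an assumption made up for illustration, not a measurement: the pitch values in Hz, the +4 semitone target given to every syllable and the half-semitone-per-syllable slope are invented, and `relative_contour` and `apply_declination` are hypothetical helper names. It only shows two things: the same falling shape in two different voices looks identical once you measure it relative to each speaker’s own average, and a steadily falling baseline makes each later 他 sit lower than the one before.

```python
import math

def relative_contour(f0_track_hz):
    """Re-express a pitch track in semitones around the speaker's own mean."""
    # Averaging in the log domain gives the centre of the speaker's range
    # on the (roughly logarithmic) scale on which we hear pitch.
    log_mean = sum(math.log2(f) for f in f0_track_hz) / len(f0_track_hz)
    return [12 * (math.log2(f) - log_mean) for f in f0_track_hz]

def apply_declination(targets_st, slope_st_per_syllable=-0.5):
    """Shift each syllable's target down a steadily falling baseline."""
    # The slope is an invented illustrative value, not a measured one.
    return [t + slope_st_per_syllable * i for i, t in enumerate(targets_st)]

# Invented fourth-tone tracks (Hz): same falling shape, different registers.
low_voice = [150.0, 130.0, 110.0, 90.0]
high_voice = [300.0, 260.0, 220.0, 180.0]

# 他说他没看见他: every syllable gets the same invented +4 semitone target,
# purely to isolate what the falling baseline does to the three 他.
syllables = ["他", "说", "他", "没", "看", "见", "他"]
drifted = apply_declination([4.0] * len(syllables))
ta_heights = [h for s, h in zip(syllables, drifted) if s == "他"]

print([round(x, 2) for x in relative_contour(low_voice)])
print([round(x, 2) for x in relative_contour(high_voice)])
print(ta_heights)
```

The two printed contours come out the same even though one voice is an octave above the other, and the three 他 land on successively lower values. Real speech is messier (the real sentence has second and fourth tones in it too, and duration shrinks along with pitch), so treat this as a cartoon of the idea rather than a model of Mandarin.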
So, coming back to operating ‘within’ the ‘constraints’ of a tonal language, what does that mean? Well, Chinese can’t use pitch in the same way as English in intonation because, unlike in English, pitch is what distinguishes one monosyllabic word from another. So what it does instead is EXAGGERATE the tone of the particular word you want to stress. Let’s say you want to say 这件衣服不是我新买的 and you want to emphasise various things: THIS piece of clothing, this piece of CLOTHING, this piece of clothing is NOT, this piece of clothing is not MY recently… and so on. You can do this in English with stress, which for many speakers shows up as a high, long vowel falling sharply. Try it yourself with the sentence ‘She didn’t steal my handbag’, and watch how the intonation changes depending on which word you want to stress. In tonal languages, however, you’ll exaggerate the tone instead.
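Here’s one way to picture ‘exaggerating the tone’ as a toy Python sketch. It assumes each syllable’s tone can be written as a start and end pitch in semitones around the middle of the speaker’s range (0), and that emphasis simply stretches that contour away from the middle. The contour numbers, the stretch factor of 1.5 and the names `exaggerate` and `emphasise` are all invented for illustration.

```python
def exaggerate(contour_st, factor=1.5):
    """Stretch a contour away from the middle of the speaker's range (0)."""
    return [factor * p for p in contour_st]

def emphasise(utterance, focus_index, factor=1.5):
    """Return a copy of the utterance with one syllable's tone exaggerated."""
    return [
        (syllable, exaggerate(contour, factor) if i == focus_index else list(contour))
        for i, (syllable, contour) in enumerate(utterance)
    ]

# First four syllables of 这件衣服不是我新买的, with invented
# (start, end) pitch targets in semitones around mid-range.
utterance = [
    ("这", [3.0, -3.0]),   # fourth tone: falling
    ("件", [3.0, -3.0]),   # fourth tone: falling
    ("衣", [3.0, 3.0]),    # first tone: high and level
    ("服", [-1.0, -2.0]),  # neutral tone: short and lowish
]

this_focus = emphasise(utterance, 0)      # THIS piece of clothing
clothing_focus = emphasise(utterance, 2)  # this piece of CLOTHING

print(this_focus[0])
print(clothing_focus[2])
```

With the focus on 这 the fall gets steeper but is still a fall, so it’s still recognisably a fourth tone; with the focus on 衣 the high level tone just gets higher. The shape survives and only its size changes. Actual emphasis also involves length and loudness, which this sketch leaves out.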
So what this means is that all those nodding people in YouTube videos trying to teach you the third tone ONLY speak like that when they are emphasising something - like, 不是我的，是你的. They may not even speak that way when given an individual word in the third tone to read in isolation. In connected speech, it’s not really a falling-rising tone at all but just a low, short ‘zombie’ tone, often accompanied by ‘creaky voice’ or vocal fry. If you speak like they do - if you are the world’s greatest mimic and can copy them 100% - you’ll sound like a robot at best, and people will be utterly incapable of understanding you at worst.
WHAT DOES THIS MEAN FOR HOW WE PRACTICE TONE??
First of all, that tone and intonation are not separable in any language. Tone and intonation both use the same elements available to us in the process of speech production (pitch, vowel duration, vowel quality and so on) and you can’t isolate one from the other in any kind of natural speech. So when learning tone, it’s critically important to do it naturally. I don’t mean yeet yourself to China, that’s unrealistic and unnecessary, but copy native speakers saying phrases, not words. Watch how they speak when they sound excited, or sad, or angry, and copy them. Try to infuse your voice with emotion not how you think it should sound, but how it actually sounds. And above all, remember: intonation operates within the constraints of tone. You want to sound angry with the second tone but keep making it sound like the fourth? Listen to how short and abrupt second tones in words like 停 sound, and copy what you hear. Mimic native speakers. Don’t be afraid to sound silly - you are relearning a system of expressing yourself and expressing emotion. It takes time.
And you’re all doing great.