#i’m not articulating this well but the way music has a physiological effect on the body is fascinating and wonderful
dreamofbecoming · 10 months
music is one of my favorite things about being human. and not just like, the way my favorite bands can articulate shrimp emotions in their lyrics, although that too for sure for sure. but i also mean that feeling you get when you hear an orchestral arrangement start to swell or you’re at a festival and someone is doing a traditional drum performance and the words don’t matter, it’s just sound waves it shouldn’t mean anything but you feel a clenching in your chest and goosebumps on your skin and there are tears in your eyes and it’s just vibrations in the air but you feel it in your soul. i think that’s what being a person means
tanadrin · 4 years
Realistic phoneme inventories 1: Vowels
All of the old webpages that I used to rely on for at-a-glance information about vowel systems when designing new phoneme inventories for my conlangs seem to have succumbed to various forms of link rot; and I’ve never found a good overview of how to build consonant inventories in a systematic way. So I want to set out, for my own reference and for others, an overview of both vowel and consonant systems as they tend to exist in natural languages, with an eye to creating conlangs with natural-feeling distributions of sounds.
A phoneme inventory is, of course, only one of the most basic elements of a conlang. I won’t be dealing with phonotactics, with suprasegmental features like tones, and certainly not with grammar or syntax or anything like that. A good starting point for lots of those topics is the LCK, either online or in print.
Also, phonology is not my strong suit; I’m more of a morphology person by inclination, and all my knowledge of linguistics comes essentially from years spent conlanging as a hobby. So apologies in advance if I get any terminology messed up. My primary references for this post specifically are this paper on vowel systems, this paper on consonant systems, the relevant chapters from the WALS, and conlanging resources like the LCK.
For length reasons, I’ve broken this post into multiple parts. Part 1 will deal with vowels; part 2 will deal with consonants.
1. Background information
You could, when creating a conlang, select sounds arbitrarily based on what sounds pleasing to your ear, what you can easily pronounce, your favorite natural language, or your favorite IPA symbols. For conlanging as a purely artistic enterprise, those are all perfectly fine criteria; but if you want your conlang to reflect trends in natural languages--as you might for conlangs that are the putative natural languages of science fiction and fantasy settings--it’s helpful to understand why humans make the noises they do with their faces, and how.
The human vocal tract is a resonant column of air, stretching from the vocal cords to the lips (and nostrils, for nasal sounds), not unlike the pipe of a pipe organ, or the body of a flute. Air expelled from the lungs moves through the vocal tract and, thanks to our well-developed throat, mouth, and facial muscles, and the elaborate control over them provided by a region of the brain called Broca’s Area, we can rapidly reshape our vocal tract and manipulate the resonances of the air passing through it, producing speech.
In principle, we could make an almost infinite number of subtly different motions with our various speech-producing organs to produce an equally limitless quantity of different sounds. In practice, however, speech has to be an effective way of encoding information, or it’s useless as a communication tool. Therefore, we want sounds to be as different from one another as possible, as distinct to the ear as they can be, so they can be clearly distinguished from one another, and clearly heard over noises like wind and crackling fires and loud music. And because we talk constantly, we want speech to be as easy as possible; we are going to tend to restrict ourselves to the easiest sounds for the human vocal tract to produce.
So the IPA, the system for transcribing human speech sounds, only has about 107 basic symbols for consonants and vowels.
Tumblr media
The IPA consonant chart. Shaded areas are “articulations judged impossible.” White areas with no symbol are sounds that aren’t impossible, but which aren’t attested widely enough in the world’s languages to need a specific transcription.
Of all the sounds covered by the IPA which actually show up in natural languages, only a small subset are truly common. Some, like the plosives /p t k/, are nearly universal. In general, the easier to produce (and more acoustically distinct) a sound, the more common it is; and languages will tend to make use of commoner sounds first (like plain plosives) as their phoneme inventory grows, before they have recourse to less common sounds (like pharyngealized plosives).[1]
The other important thing to note about phonemes is that they’re not atomic. The human brain is an extremely powerful pattern-recognition machine, and whether we’re learning our L1 as babies or learning our sixth L2 as an adult, we break languages down into many different patterns and systems as we learn them. It’s easier to learn, for instance, the general pattern that “third-person present verbs in English end in -s” than to learn, as separate pieces of information, “okay, after ‘he, she, it’ the verb ‘put’ is ‘puts’ and the verb ‘see’ is ‘sees’ and the verb ‘run’ is ‘runs’...” etc.[2] But we usually do not learn these patterns explicitly, and even when sitting down in a classroom to learn a language, there’s only so much use you can get out of memorizing a table of conjugations or declensions--it’s hard to speak fluently if you have to pause in a conversation to go, “hmm, okay, but what’s the dative feminine form of the article?” Most language learning requires acquiring an intuitive grasp of the patterns of language; and you only acquire that intuition through lots of speaking and listening.
A consequence of our dependence on intuitive understanding is that two people’s intuition can differ. For instance, many people learn the rule in English that “I” is nominative, and “me” is objective; and so, unless the first person pronoun is the object of a verb or preposition, you must use “I”: “Gowron is better at Klingon politics than I.” Sometimes, this is analyzed as having an implied verb: “Gowron is better at Klingon politics than I [am].” But over the centuries of people speaking English, many people internalized a different rule. Because pronouns crop up as the objects of verbs or prepositions more than as the subjects of verbs, the objective forms came to be reanalyzed as the default forms. The nominative form became a special, marked form--one that only occurred in certain cases, which was, over time, simplified into “only when the subject of a verb.” Therefore, for these speakers of English (me included), the rule became: “when appearing to the left of a verb, use ‘I’; otherwise, use ‘me.’”
Out of the steady accumulation of such petty reanalyses, great changes in grammar are born.
The same process is at work in the sounds of a language. Just like grammar, sound is heavily systematized, and encapsulated by the brain as a set of patterns. Only instead of cases or persons or numbers, the components of sounds on which patterns are based are called “features.” And, as with any good Saussurian principle of sign-distinction,[3] we only care about a minimal set of features which, as a community of language-speakers, we all agree are the relevant ones for distinguishing sounds. For instance, if our language has the consonant sounds /p t k b d g m n/ (the unvoiced and voiced plosives, and some nasals), we might need only the features [voiced] and [nasal], on top of place of articulation, to distinguish the sounds of our language. But, because we learn these rules implicitly, and we don’t have to give our youngsters a background in up-to-date phonetics research before they can say “papa”, maybe later generations of speakers, or the next village over, notice a different set of features. After all, we have to physically produce these sounds with our mouths; they don’t exist in a perfectly idealized acoustic realm. Some speakers of our language may come to see the defining feature of the voiced stops as only [+voiced], and since there are no fricatives which they can be confused with, start pronouncing them with a less constricted airflow to make them sound even more distinct from the unvoiced stops. So gradually /b d g/ become /v ð ɣ/, the voiced fricatives produced at the same place in the mouth. As far as the speakers of the language are concerned, the sounds haven’t changed--[+fricative] is not a phonemic feature of their language! The pattern isn’t changed, at least not yet. But more sound changes will accrete over time, and they may affect the new series of fricatives differently than they do the stops; and with a few more changes like this, soon you may have a version of the language that sounds completely different and is entirely mutually unintelligible.
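To make the feature idea concrete, here is a minimal Python sketch. The tuple layout (place, voiced, nasal) and the place labels are my own illustrative assumptions, not a standard feature geometry. It checks that every phoneme in the example inventory has a unique bundle, and that swapping /b d g/ for /v ð ɣ/ leaves the bundles untouched, because manner of articulation is not one of the tracked features.

```python
# Hypothetical feature bundles for the example inventory /p t k b d g m n/.
# Layout is (place, voiced, nasal); the labels are illustrative assumptions.
INVENTORY = {
    "p": ("labial", False, False),
    "t": ("coronal", False, False),
    "k": ("dorsal", False, False),
    "b": ("labial", True, False),
    "d": ("coronal", True, False),
    "g": ("dorsal", True, False),
    "m": ("labial", True, True),
    "n": ("coronal", True, True),
}


def is_distinct(inventory):
    """Return True if no two phonemes share the same feature bundle."""
    bundles = list(inventory.values())
    return len(bundles) == len(set(bundles))


# The voiced stops drift into fricatives. Only the phonetic symbols change;
# the phonemic bundles stay exactly as they were.
LENITION = {"b": "v", "d": "ð", "g": "ɣ"}
shifted = {LENITION.get(sym, sym): feats for sym, feats in INVENTORY.items()}

print(is_distinct(INVENTORY), is_distinct(shifted))
```

Both inventories come out distinct, which is the point: as far as this feature system can tell, nothing has happened.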
Sound changes 1) are regular, and 2) have no memory. While a sound change may be triggered only in certain phonetic environments (say, the voicing of /p/ to /b/ between two vowels), if the conditions for a sound change are met, it will be triggered everywhere it applies.[4] And later speakers of the language won’t remember that /v ð ɣ/ used to be stops existing in opposition to /p t k/ (unless they take a class in historical linguistics); they’ll treat these sounds on their own terms.
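Both properties are easy to see if you treat sound changes as ordered rewrite rules. This is only a sketch: the five-vowel set, the ASCII spelling of words, and the function names are assumptions made up for illustration, not a model of any real language.

```python
import re

VOWELS = "aeiou"  # assumed vowel set for the toy language


def voice_intervocalic_p(word):
    """Regular change: /p/ > /b/ wherever it stands between two vowels."""
    # Lookarounds do not consume the vowels, so overlapping environments
    # such as "apapa" are all handled in a single pass.
    return re.sub(rf"(?<=[{VOWELS}])p(?=[{VOWELS}])", "b", word)


def spirantize_voiced_stops(word):
    """Later change: /b d g/ > /v ð ɣ/ in every position."""
    return word.translate(str.maketrans("bdg", "vðɣ"))


def evolve(word):
    """Apply the two changes in historical order."""
    return spirantize_voiced_stops(voice_intervocalic_p(word))


for w in ["apa", "aba", "pata", "apta", "tapapa"]:
    print(w, "->", evolve(w))
```

Note that "apa" and "aba" end up identical. The second rule cannot tell an original /b/ from one that used to be /p/, which is what "no memory" means in practice.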
When the exact production of a sound varies within a language, usually altered by context due to the physiological considerations surrounding making that particular face-noise, this phenomenon is called allophony. Different versions of the same underlying sound are allophones. The sound as a unit of the formalized pattern stored in your brain is a phoneme. A phonemic transcription (between slashes /like this/) is a transcription of a sound or sounds as these abstract phonemes. A phonetic transcription (between brackets [like this]) is a transcription of a sound or sounds as something like their actual acoustic realization.
2. The vowel space and vowel planes
Consonants involve obstructing or redirecting the flow of air through the vocal tract, often entirely (as with plosives), or turbulently (as with fricatives). Combined with the large number of distinct places of articulation, involving the teeth and tongue and palate, consonants can all sound very distinct from one another. As a consequence, small consonant inventories can restrict themselves to a small subset of the full space of possible consonants, and still be fairly distinct from one another. In fact, in languages like Hawaiian or Rotokas, with very small consonant inventories ( /m n p t~k ʔ h w~v l~ɾ/ for Hawaiian and only /p t k b~β d~ɾ g~ɣ/ for Central Rotokas; the ~ symbol indicates allophonic variation between two sounds, depending on speaker or context), it’s very unlikely there’s going to be any sound that’s really difficult for a speaker of a language with a more complicated consonant inventory like English to pronounce.[5]
Vowels, though, don’t involve specific points of contact between different parts of the vocal tract in the same way as consonants; vowels are produced by the relative position of the tongue in the mouth, with an unimpeded air flow and the vocal cords engaged. This means that vowels can vary subtly--and, as a consequence, that languages tend to spread the vowels they have out, throughout the entire articulatory and acoustic space available to them, in a way they don’t have to do with consonants.
Here’s the IPA vowel chart:
Tumblr media
The reason it’s longer at the top and on the left side is because there is more acoustic differentiation possible when the mouth is more closed versus more open, and when the tongue is more front than back. Languages will especially tend to have more close (or “high”) vowels than open (“low”) vowels. That’s not the only property that affects how vowels tend to be distributed though. Here’s a schematized diagram of the vowel space based on the actual acoustic components of the vowels: 
Tumblr media
Speech sounds are composed of different-frequency elements called “formants;” the lowest-pitch formant is F1, the next-lowest F2, and so forth. For most vowels most of the time, F1 and F2 are the really important formants. Open vowels have higher first formants, and close vowels lower first formants; front vowels have higher second formants, and back vowels have lower second formants. Here’s a similar chart, showing actual values, from the Hitch paper: 
Tumblr media
But we don’t recognize vowels just using their pitch. If we did, we could in theory have languages with hundreds of vowels: the ear and brain together can detect extremely subtle gradations of tone. Rather, what matters more is the relative value of vowels, the distinctive features like [+front] or [-high]. 
Most languages have small vowel inventories; in terms of the psychological perception of vowels, the vowel space is quite small. WALS classifies any language with 4 or fewer vowels as “small,” languages with 5-6 vowels as “average,” and any language with 7 or more as having a “large” vowel inventory. Germanic languages like English, which have anywhere from 10 to 17 (!) vowels, are monstrously bloated by global standards. Usually for larger vowel inventories, additional features will be added besides the spatial features so that vowels don’t have to compete for space: Latin doubles its vowel inventory (/a e i o u a: e: i: o: u:/) by adding a length feature, and Turkish (/i y ɯ u ɛ œ a o/) accomplishes something similar with rounding.
Additional sets of distinctions like these, which are not spatial distinctions, create different vowel planes, where vowels do not have to compete for space with one another directly. Vowel planes may be parallel (as in Latin or Turkish), or not. There may be phonological or grammatical processes that trigger vowels moving from one vowel plane to another, as in languages with vowel harmony, where vowels in a word must share a particular feature like frontness or roundedness, or vowel planes may simply exist to provide additional acoustic contrast within a language’s vowel inventory.
Traditionally, languages with large vowel inventories have been analyzed as having many degrees of front/back or height distinction: four, in languages like Danish, or even sometimes five, as in the case of one Bavarian dialect Hitch cites in his paper. However, Hitch argues that the psychological space available for the vowel plane is really divided by reference to a perceived “neutral” vowel, one that may not be phonemic in a language, but will still crop up in paralinguistic utterances (like English “ugh” or “uh-huh”). It is by comparison to this vowel that vowels acquire distinctive spatial features, and as such, there are really only nine ways, at most, to divvy up the vowel plane:
Tumblr media
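One way to picture the nine-cell idea is as a classifier that only asks whether a vowel’s F1 and F2 sit above, below, or near those of the neutral vowel. Everything numeric in this sketch is an assumption for illustration: the neutral vowel at roughly F1 = 500 Hz and F2 = 1500 Hz, the 20% tolerance band, and the sample formant values. It is not Hitch’s procedure, just the relative-to-neutral logic in code.

```python
# Assumed (F1, F2) in Hz for the neutral vowel; an illustrative figure,
# not a measurement taken from the Hitch paper.
NEUTRAL = (500.0, 1500.0)


def classify(f1, f2, neutral=NEUTRAL, band=0.2):
    """Place a vowel in one of nine cells relative to the neutral vowel.

    Low F1 means a high (close) vowel; high F2 means a front vowel.
    Anything within `band` (a made-up tolerance) of the neutral value
    counts as mid or central.
    """
    n1, n2 = neutral
    if f1 < n1 * (1 - band):
        height = "high"
    elif f1 > n1 * (1 + band):
        height = "low"
    else:
        height = "mid"
    if f2 > n2 * (1 + band):
        backness = "front"
    elif f2 < n2 * (1 - band):
        backness = "back"
    else:
        backness = "central"
    return height, backness


# Rough, hypothetical formant values for an [i]-like, [u]-like and [a]-like vowel.
for label, (f1, f2) in {"i": (280, 2250), "u": (310, 870), "a": (750, 1400)}.items():
    print(label, classify(f1, f2))
```

Because the function returns only a height label and a backness label, with three values each, there is no way for it to produce more than nine categories, however finely the input formants vary.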
No language, in Hitch’s analysis, really has more than three distinctions of height or backness. When you think you have more, as in Danish, it’s time to take a look at the possibility that some apparently spatial feature really reflects an underlying contrast that isn’t spatial. Remember, it’s only the phonemic features of a sound that are fixed: the non-phonemic features can vary, sometimes by quite a lot.[6] Anything higher and fronter than the neutral vowel will count as a “high front” vowel, and its exact spatial realization may not be the same in each vowel plane.
Danish, for instance, has the vowel inventory /i e ɛ a y ø œ u o ɔ/ and is analyzed by Hitch as having the primary vowel plane
Tumblr media
The three front rounded vowels /y ø œ/ form a distinct plane, one in which the only distinctive feature is height: high round, mid round, low round. The “frontness” of these vowels is a phonetic feature, but not an important phonemic feature. They don’t contrast directly with the rounded back vowels, because back vowels are usually rounded--it makes them more acoustically distinct from mid vowels, and round back vowels show up in tons of languages, like Latin, that don’t make a phonemic contrast for rounding. And rounding has a side effect on front vowels, making them sound more central: thus, the “front round plane” is, perceptually speaking, more of a mid round plane distinguished by the [+round] feature. Languages can have multiple secondary planes. According to Hitch, Jalapa Mazatec “may have six parallel planes.” 
Vowel harmony doesn’t have to operate across planes: Hitch provides the example of the three-vowel language Jingulu, which has /a i u/. A suffix in /u/ or /i/ will raise the preceding vowel unless a high vowel intervenes: bardarda, “younger brother” + -rni > birdirdirni, “younger sister.” But it often does, with apparent height distinctions being better understood as plane distinctions: “In these languages, the vowels in a particular word will all be from one plane or the other. It seems that the choice of plane is determined at the lexical level. In the lexicon, the words contain archiphonemes spanning both planes, and each word is marked with a feature indicating plane membership.”[8]  Even if a language doesn’t have clearly non-spatial articulatory features distinguishing its planes like nasalization or length, it can still have two vowel planes that exist side by side. For Ogbia, a language of Nigeria, Hitch gives two vowel planes corresponding to one with the advanced tongue root feature (+ATR) /i e u o ɐ/ and one without (-ATR) /ɪ ʊ ɛ ɔ a/. For Nez Perce, five surface vowels /i u o æ ɑ/ correspond to two planes /i u æ/ and /i o ɑ/; a word can have vowels from one plane in it, but not both. /i/ happens to exist in both planes (possibly due to a merger of two distinct underlying vowels).
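Both kinds of harmony can be sketched in a few lines of Python. The first function is a simplification of the Jingulu pattern based only on the one example above: it assumes raising always turns /a/ into /i/ and spreads right to left from the suffix until a high vowel blocks it. The second checks which Nez Perce plane(s) a set of vowels could belong to. The function names, the plane labels "A" and "B", and the test stem "kulama" are invented for illustration.

```python
HIGH_VOWELS = set("iu")


def attach_jingulu_suffix(stem, suffix):
    """Attach a suffix; if it has a high vowel, raise stem /a/ to /i/.

    Raising spreads leftward from the end of the stem and stops at the
    first high vowel it meets (a simplified reading of the pattern).
    """
    if not any(ch in HIGH_VOWELS for ch in suffix):
        return stem + suffix
    chars = list(stem)
    for i in range(len(chars) - 1, -1, -1):
        if chars[i] in HIGH_VOWELS:
            break  # an intervening high vowel blocks further raising
        if chars[i] == "a":
            chars[i] = "i"
    return "".join(chars) + suffix


# The two Nez Perce planes given above; /i/ belongs to both.
NEZ_PERCE_PLANES = {"A": set("iuæ"), "B": set("ioɑ")}


def compatible_planes(vowels):
    """Return the plane labels whose vowel set contains every vowel given."""
    return sorted(name for name, plane in NEZ_PERCE_PLANES.items()
                  if set(vowels) <= plane)


print(attach_jingulu_suffix("bardarda", "rni"))  # birdirdirni
print(compatible_planes("uæ"), compatible_planes("i"), compatible_planes("uo"))
```

A vowel sequence that returns an empty list mixes the two planes, which is exactly what a well-formed word is not allowed to do.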
3. Vowel systems
So for any vowel system on a single plane, we’re going to have a maximum of nine vowels. Secondary systems may be the same size as the primary vowel plane; or they may be smaller. Either way, our vowel systems will tend to have one of two shapes, triangular or rectangular. In a triangular vowel system, acoustic considerations are dominant. We will have fewer open vowels, and more close vowels. In a rectangular vowel system, the psychological considerations are instead dominant, and vowels will be distributed in the nine-vowel grid in a more symmetric fashion.
Tumblr media
These nine potential positions or “archiphonemes” don’t always reflect the same division of the vowel space given on the IPA. 7 and 9, for instance, might be open-mid vowels rather than true open vowels. 2 might be a rounded front close vowel. 5 may or may not be a schwa. 8, the bottom of the IPA trapezoid or the idealized acoustic triangle, is usually [a], despite [a] being, technically, a front vowel! I will simply quote Hitch at length here: 
Tumblr media
With those caveats, we can then look at the possible arrangements of vowel systems, from zero vowels to nine. 
Zero vowels. “A zero-vowel language would insert vowels according to rules of epenthesis, then colour the vowels according to phonetic context. It sounds theoretically possible, but no completely convincing cases have yet been identified.”
One vowel. “There would seem to be no indisputable examples of one-vowel systems on a primary plane.” But there are languages with one-vowel secondary planes. If a language has one long vowel, for instance, it will be /a:/. But if a language has one nasalized vowel, it can be just about anything. 
Two vowels. This includes languages with a two-vowel front round plane; also, languages with a primary plane that has just a height distinction. All Northwest Caucasian languages have /ə a/ (but feature lots of allophones). Most examples Hitch cites for two-vowel systems have some kind of central vowel (/ə/ or /ɨ/) plus /a/; but Wichita has /i a/. “But this type reveals something fundamental about vowels: that [low] is the most basic of the four spatial features.”
Three Vowels. The triangular system /i u a/ is a very common system among the world’s languages, with /i/ and /u/ having lots of vertical freedom.
Tumblr media
Hitch is very down on the idea of a vertical three-vowel system on the primary plane; of the potential examples he cites, none are undisputed. But “Parisian French has a vertical three-vowel configuration /y ø œ/ on a front-rounded plane (primary /i u e ə o ɛ a ɔ/). While vertical three-vowel systems may not exist, primary plane triangular three-vowel systems are exceedingly common.”
Four vowels. The triangular 4-vowel systems (4a and 4b) add a neutral vowel to the classic 3-vowel system. A straightforward rectangular system is possible (4c); as well as a slightly more complicated variation with more room for allophony (4d).
Tumblr media
He also gives the unusual example of the Lummi dialect of North Straits Salish, which “appears to have no low vowels” /i e ə o/, though this is clearly an outlier.
Five vowels. The Latin vowel system (5a) is an extremely common triangular system; a rectangular 5-vowel system is also pretty common (5b). Three other five-vowel systems are given that are “relatively rare,” being a triangular system that combines 4a and 4b (5c), a 4c-like rectangular system with a mid front vowel added (5d), which is “asymmetrical, because the acoustic space is dominant,” and a different variation on 4c that instead adds a high central vowel. As an unusual exception, Hitch notes that Tohono O’odham “appears not to fit the pattern of any other language, and to violate a universal by having more back than front vowels with /i ɨ u o a/.”
Tumblr media
Six vowels. Adding a central mid or central high vowel to 5a gives two common triangular six-vowel systems (6a and 6b). A rectangular six-vowel system, with no central vowels, is also possible.
Tumblr media
Seven vowels. There is one possible triangular configuration, 7a, with one low vowel. Otherwise, 7-vowel systems are rectangular systems that differ only on where they place the central vowel.
Tumblr media
Eight vowels. Similarly restricted: there are only three possible configurations of eight-vowel systems, depending on which central vowel is omitted. None appear to be very common, however.
Tumblr media
Nine vowels. Nine is the maximum number of vowels on a single plane, and therefore there is only one nine-vowel configuration possible. Any analysis that turns up more than nine “basic” vowels means you should start examining the possibility of multiple vowel planes.
Tumblr media
In the next post, we’ll take a look at consonant systems.
Footnotes:
[1] https://wals.info/chapter/1 (ctrl+f “size principle”) 
[2] Languages do have irregularities, where historic patterns have been obscured by sound change or other processes. But there’s a reason irregularities or fossilized forms tend to occur in commonly-used words and phrases rather than rarely used ones: it is harder to remember variant patterns for rarely used words, and so they tend to become regular by analogy.
[3] Cf. Ferdinand de Saussure, Course in General Linguistics. This is one of those texts that was mind-blowing to me when we read it in our Critical Theory course in undergrad, but now seems so obvious as to not be worth discussing. The key insight can be summed up very succinctly, though: human brains care about differences between symbols, not their absolute values. When it comes to the kind of meaningful differentiation required for communication, it’s the relative differences between signs that matter--so any sign-system can be simplified to the minimum required number of distinctions, without the loss of information or without impeding communication. This insight is relevant to everything from linguistics to information theory.
[4] Apparently irregular sound changes--why does the Early Modern English sound spelled <gh> get pronounced as /f/ in “enough,” but is silent in “through”?--are usually the result of patterns being obscured by analogy or borrowing. In this case, it’s because the prestige dialect of English that coalesced around London in the Early Modern period, and was influenced by speakers of English from all over England, sometimes borrowed words from other dialects that had undergone different sound changes. In some of those dialects, the <gh>-sound was lost. In others, it changed to /f/.
[5] The only sounds in either Rotokas or Hawaiian given above that don’t crop up as a phoneme or allophone in English are probably [β ɣ]. The former is just [v] pronounced with only the lips; the latter, the voiced equivalent of German or Scottish [x].
[6] For instance, the Middle English long vowels /iː eː ɛː aː ɔː oː uː/ had as their distinctive feature their length, not the exact contour of their sound. That meant that these long vowels could “break,” becoming diphthongs, but as long as they remained mostly distinct from one another, no confusion resulted. That breaking, plus the general reorganization of the vowel system that changed the pitch of the pure long vowels (the high ones, which could not acquire a high offglide because there was no space above them acoustically) later yielded the corresponding modern long vowels /aɪ iː iː eɪ oʊ uː aʊ/.