The Mandarin language (traditional Chinese 官話, simplified 官话; pinyin Guānhuà) has the most speakers in the world—approximately 900,000,000, or about a seventh of the planet's population. Most speakers live in China, where there are various forms of Mandarin, from the Beijing-based standard to close varieties in and around the Chinese capital city and those more distantly related towards the south and centre of the country. Many other Chinese learn the language at school, making it the lingua franca of the Chinese-speaking world, and increasingly people from much further afield are coming to Mandarin as a second or foreign language. Exact figures for the number of speakers vary depending on how close the speakers' native or non-native form is to the speech of northern China; often what is labelled 'Mandarin' may linguistically be more like a separate language.
The term Chinese language very often refers to Mandarin, though in Chinese culture this is the name for all varieties of Chinese, many of which constitute separate languages on linguistic grounds. 'Standard Chinese' usually means Standard Mandarin, which is based on a dialect once spoken by the educated elites of Beijing. Mandarin is an official language in China, including Taiwan, and also in Singapore.
The following sections refer to Standard Mandarin.
The phonology of Mandarin is well studied as a textbook example of a tone language, in which variations in pitch alone are enough to distinguish one word or syllable from another in meaning, alongside the more familiar distinctions made through contrasting phonemes. There are four tones in Mandarin, so the following form a minimal set: 妈 mā (high-level tone, 'mother'), 麻 má (high-rising, 'hemp'), 马 mǎ (fall-rise, 'horse') and 骂 mà (falling, 'scold'). A well-known example sentence including these four meanings is: māma qí mǎ, mǎ chī má, māma mà mǎ (妈妈骑马,马吃麻,妈妈骂马 'mother rides a horse, the horse eats hemp, mother scolds the horse').
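The four-way contrast above can be illustrated with a short Python sketch that reads the tone off a tone-marked pinyin syllable. It is only an illustration, not a full pinyin parser: the function name split_tone and the table TONE_MARKS are hypothetical names introduced here, and the sketch assumes one syllable per call and the conventional numbering of the tones from 1 (high-level) to 4 (falling).

```python
import unicodedata

# Combining diacritics used by pinyin tone marks, mapped to tone numbers.
# The numbering 1-4 is the conventional one and is an assumption of this sketch.
TONE_MARKS = {
    "\u0304": 1,  # macron: high-level (mā)
    "\u0301": 2,  # acute: high-rising (má)
    "\u030c": 3,  # caron: fall-rise (mǎ)
    "\u0300": 4,  # grave: falling (mà)
}


def split_tone(syllable):
    """Return (bare_syllable, tone_number) for one tone-marked pinyin syllable.

    The tone number is None when the syllable carries no tone mark.
    """
    tone = None
    kept = []
    # NFD splits each accented vowel into a base letter plus combining marks.
    for ch in unicodedata.normalize("NFD", syllable):
        if ch in TONE_MARKS:
            tone = TONE_MARKS[ch]
        else:
            kept.append(ch)
    # Recompose so that any remaining mark (such as the diaeresis of ü)
    # rejoins its base letter.
    return unicodedata.normalize("NFC", "".join(kept)), tone


# The four syllables from the text share the segments "ma" and differ only in tone.
for syllable in ["mā", "má", "mǎ", "mà"]:
    print(syllable, split_tone(syllable))
```

All four syllables reduce to the same bare string, which is what makes them a minimal set: once the tone mark is removed, nothing else distinguishes them.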
Most Mandarin syllables end with a vowel, with only the nasal consonants [n] and [ŋ] permitted syllable-finally (and, for some speakers, [r]). These nasals are arguably a single phoneme which varies depending on whether the preceding vowel is pronounced towards the front or back of the mouth, i.e. /n/ surfaces as [ŋ] after a back vowel. [n], [ŋ] and [m] can be syllabic, usually in interjections. Only a single consonant is permitted as the beginning or 'onset' of the syllable; this position cannot be left unfilled. Words usually have one or two syllables. The claims that Mandarin is a basically monosyllabic language, and that it gained many disyllabic words through compounding to compensate for the disappearance of final 'coda' consonants of the syllable, are disputed. Disyllabic words have long been possible in Mandarin, often as variants of monosyllabic words: mei-tan 'coal-charcoal' does not mean 'coal and charcoal', but just 'coal'. Furthermore, homophones can be disambiguated by context, and many disyllabic words are of relatively recent coinage, appearing centuries after the loss of many final consonants.
Traditional Chinese dictionaries list only single characters, which also gives the false impression that Mandarin is monosyllabic. In fact, the majority of Mandarin words are polysyllabic, and many apparent homophones, such as the dozens of characters read as yì in Mandarin, are in fact separately listed morphemes of polysyllabic words that in most cases have no independent existence as words.
Chinese characters (simplified Chinese 汉字; traditional Chinese 漢字; pinyin hànzì) are symbols used to write varieties of Chinese and, in modified form, other languages. They are the world's oldest writing system in that they have the longest record of continuous use, dating back thousands of years. Characters in mainland China are written in a 'simplified' form, whereas elsewhere 'traditional' characters are maintained. A full list of characters would run to over 47,000 entries, but most of these are variants or obsolete; standardisation took centuries, and most literate users today know up to about 4,000. Characters can be written vertically, in columns from right to left, but it is increasingly common to see them written horizontally, left to right (newspapers take advantage of this to display articles both vertically and horizontally on the same page).
A few characters, such as 山 'mountain', somewhat resemble that which they represent, but only a small number of characters are like this and it is usually difficult to recognise the meaning. Most characters are completely abstract, or what they mean is only obvious with hindsight: 馬, for example. A minority of characters, particularly some of the more frequent ones, represent words without indicating pronunciation at all (i.e. they are logographic): for example, 日 means 'sun', and is pronounced rì. However, the vast majority of characters include a pronunciation element, which gives an idea of the 'reading' (pronunciation) of the character. For example, 机 'machine', pronounced jī, incorporates a 'radical' 木 ('tree'; 'wood') which may give an idea of the meaning, and a 'phonetic' 几 which indicates the pronunciation. The phonetic represents the pronunciation of another character, whose own meaning is irrelevant (in this case, 几 jī means '[small] table'). In the same way, a children's code might use a picture of an eye to represent 'I'. The components of the character do not mean anything in themselves; 机 does not mean 'wood[en] table', for example.
A more accurate way of describing the nature of characters would refer to both the meaning and pronunciation elements found in most characters. A linguistic approach might identify most characters as 'morphosyllabic': morphemic, in that they represent basic units of meaning (morphemes), and syllabic, in that in Chinese most characters represent a single syllable, an abstract unit of phonology. Chinese characters effectively constitute a morphemic syllabary, or script based on syllables, albeit one that contains around 850 'phonetics' to represent the 1,277 possible syllables of, for example, modern Mandarin, and thus requires an extra meaning component to distinguish them.
- Ethnologue: 'Chinese, Mandarin'. Note, however, that this figure is disputable depending on what counts as 'Mandarin', whether it is a native language for a group of speakers, and so on.
- Many traditional analyses, however, assume that there is really little distinction between 'word' and 'morpheme' in Chinese varieties; see Duanmu (2000: 146).
- Duanmu (2000: 150-154).
- Duanmu (2000: 146); DeFrancis (1984: 177-188).
- Kennedy (1964: 116-117); DeFrancis (1984: 183-184).
- 'Horse'. The lower strokes were once the legs.
- DeFrancis (1984: 187).
- DeFrancis (1984: 97-104, 111).