Semantic universals of classifier systems

The aim of the project is to examine classifiers, a type of noun categorization that is widespread in the world’s languages. Classifiers exhibit a remarkable diversity in the meanings they convey and the morphological and syntactic contexts in which they occur.

Several types of classifiers are distinguished. The best known are numeral classifiers, which occur with numerals, as in the Mandarin Chinese phrase yí liàng chē (one vehicle car) ‘one car’, where liàng is the classifier for vehicles. Other classifier types include noun, possessive, verbal, deictic, and locative classifiers. Classifiers are found in typologically diverse languages, ranging from the analytic languages of Southeast Asia to the polysynthetic languages of North America.

A wide range of semantic categories serve as the basis for categorization in classifier systems, including animate vs. inanimate, male vs. female, and social status, together with categories found among inanimates, including physical properties such as shape and size as well as function and value. These properties tend to correlate with the type of classifier system. For example, possessive classifiers in Oceanic languages distinguish such categories as valuable, edible, and drinkable, while verbal classifiers in the Athabaskan languages of North America are used to classify objects as, e.g., long, round, and flat/flexible.

Goal of the project

This project aims to fill a gap in research by constructing a database of classifier types in the world’s languages and determining the distribution of semantic values in classifier languages. We will create a database of 3000+ languages that will offer:

A comprehensive and innovative set of tools for classifier analysis.
The ability to make both descriptive and theoretical generalizations about classifier systems.
Support for holistic approaches as well as detailed case studies.
The ability to search for different types of classifiers and specific semantic meanings.

The database will be based on data available from over 7000 grammars and other descriptions, and from open-access databases under a Creative Commons Licence. Automatic detection followed by manual checking will be used to identify the classifier types and semantic categories found in classifier languages.

We will thus try to answer the following questions:

Does a language have a specific type of classifier, e.g., a numeral classifier or a possessive classifier?
Which semantic categories, e.g., animate, human, long, and round, are found in the classifier system of that language?
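The two questions above map naturally onto a per-language record and a simple search over such records. The following is a minimal sketch only, not the project's actual schema: the class LanguageRecord, its field names, and the search function are assumptions made for illustration, and the toy entries merely restate the examples given in the introduction (with placeholder names for the Oceanic and Athabaskan entries).

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set


@dataclass
class LanguageRecord:
    """One hypothetical database entry; all field names are illustrative."""
    name: str
    # Maps a classifier type (e.g. "numeral") to the semantic categories
    # recorded for that type in the language.
    classifiers: Dict[str, Set[str]] = field(default_factory=dict)

    def has_classifier_type(self, classifier_type: str) -> bool:
        # Question 1: does the language have this type of classifier?
        return classifier_type in self.classifiers

    def semantic_categories(self) -> Set[str]:
        # Question 2: which semantic categories occur in the system at all?
        found: Set[str] = set()
        for categories in self.classifiers.values():
            found |= categories
        return found


def search(records: List[LanguageRecord],
           classifier_type: Optional[str] = None,
           category: Optional[str] = None) -> List[str]:
    """Return the names of languages matching a type and/or a category."""
    hits = []
    for record in records:
        if classifier_type is not None:
            if not record.has_classifier_type(classifier_type):
                continue
            categories = record.classifiers[classifier_type]
        else:
            categories = record.semantic_categories()
        if category is not None and category not in categories:
            continue
        hits.append(record.name)
    return hits


# Toy data restating the examples from the introduction; not real entries.
RECORDS = [
    LanguageRecord("Mandarin Chinese", {"numeral": {"vehicle"}}),
    LanguageRecord("Oceanic example (placeholder)",
                   {"possessive": {"valuable", "edible", "drinkable"}}),
    LanguageRecord("Athabaskan example (placeholder)",
                   {"verbal": {"long", "round", "flat/flexible"}}),
]

if __name__ == "__main__":
    print(search(RECORDS, classifier_type="numeral"))
    print(search(RECORDS, category="round"))
```

Keying the categories by classifier type, rather than storing one flat set per language, keeps the link between type and meaning that the correlation questions below depend on.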

Based on the compiled data, we will examine fundamental aspects of classifier systems:

the diversity of types of classifiers found in the world’s languages;
the diversity of semantic categories used for categorization of humans, animals, and inanimate objects;
the distinction between universal vs. language-specific features;
correlations between classifier type and the meanings that are expressed.

For example, with regard to the universality of semantic categories, our aim is to determine which categories are more common and which ones tend to occur in particular language families or areas. As regards correlations between classifier types and their semantics, we will establish whether such correlations are indeed found, or whether our knowledge of the meanings expressed by classifier systems is influenced by available descriptions of the best-known languages representing particular types, e.g., Mandarin Chinese or Japanese in the case of numeral classifiers.

Project duration

September 2023 – September 2026

Database

The database is based on data available from over 7000 grammars and other descriptions, and from open-access databases under a Creative Commons Licence. We use the Gramfinder software to access the published materials for each language and to assess which semantic values and which types of classifiers are found in it. Candidate classifier types and semantic categories are first detected automatically and then checked manually.
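The two-step workflow of automatic detection followed by manual checking can be illustrated with a small sketch. It does not reflect how Gramfinder works or what its interface looks like; the patterns, function names, and the sample sentence are all assumptions made for illustration. The sample shows why the manual step matters: a grammar that says a language has no numeral classifiers still triggers the automatic match.

```python
import re

# Hypothetical search patterns; a real project would need far richer ones.
PATTERNS = {
    "numeral": re.compile(r"\bnumeral classifiers?\b", re.IGNORECASE),
    "possessive": re.compile(r"\bpossessive classifiers?\b", re.IGNORECASE),
    "verbal": re.compile(r"\bverbal classifiers?\b", re.IGNORECASE),
}


def detect_candidates(text, context=40):
    """Step 1: flag classifier types mentioned in a description.

    Each candidate carries a context snippet for the human checker and a
    'verified' field that stays None until someone has looked at it.
    """
    candidates = []
    for classifier_type, pattern in PATTERNS.items():
        match = pattern.search(text)
        if match:
            start = max(0, match.start() - context)
            end = min(len(text), match.end() + context)
            candidates.append({
                "type": classifier_type,
                "snippet": text[start:end],
                "verified": None,
            })
    return candidates


def record_manual_check(candidate, is_correct):
    """Step 2: return a copy of the candidate with the checker's verdict."""
    checked = dict(candidate)
    checked["verified"] = bool(is_correct)
    return checked


# Invented sample sentence, not taken from any actual grammar.
SAMPLE = ("Unlike its neighbours, the language has no numeral classifiers. "
          "Possessive classifiers distinguish edible and drinkable items.")

if __name__ == "__main__":
    for candidate in detect_candidates(SAMPLE):
        print(candidate["type"], "|", candidate["snippet"])
```

Here a checker would reject the "numeral" candidate and confirm the "possessive" one, so only the latter would enter the database.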

Quantitative analyses controlling for geographic area and language family will be conducted on a global scale to determine how universal or how language-specific each semantic feature and classifier type is. The interaction between classifier types and semantic values will also be assessed independently. Finally, the interaction among semantic values, classifier types, area, family, and cultural traits will be assessed by means of quantitative methods such as mixed models and decision trees.
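To see why controlling for language family matters, consider a deliberately simple stand-in for the mixed models mentioned above: counting families rather than languages. This sketch is not the project's method, and the family labels and data are invented placeholders. It only shows how a category that is frequent in one large family can look more widespread than it is.

```python
from collections import defaultdict


def language_level_share(languages, category):
    """Share of languages attesting the category (no control for family)."""
    if not languages:
        return 0.0
    hits = sum(1 for lang in languages if category in lang["categories"])
    return hits / len(languages)


def family_level_share(languages, category):
    """Share of families with at least one language attesting the category."""
    by_family = defaultdict(list)
    for lang in languages:
        by_family[lang["family"]].append(category in lang["categories"])
    if not by_family:
        return 0.0
    families_with = sum(1 for flags in by_family.values() if any(flags))
    return families_with / len(by_family)


# Invented toy sample: four related languages share a category that the
# two languages from other families lack.
TOY_SAMPLE = [
    {"family": "Family A", "categories": {"round", "long"}},
    {"family": "Family A", "categories": {"round"}},
    {"family": "Family A", "categories": {"round", "animate"}},
    {"family": "Family A", "categories": {"round"}},
    {"family": "Family B", "categories": {"animate"}},
    {"family": "Family C", "categories": {"long"}},
]

if __name__ == "__main__":
    print(language_level_share(TOY_SAMPLE, "round"))  # 4 of 6 languages
    print(family_level_share(TOY_SAMPLE, "round"))    # 1 of 3 families
```

In the toy sample, "round" occurs in two thirds of the languages but in only one third of the families. A mixed model addresses the same problem more flexibly, for instance by treating family and area as random effects.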

At the moment, the database is under development.