ISO criteria for defining new languages

From Meta, a Wikimedia project coordination wiki
This supplementary information page has been superseded by the Language proposal policy.


The ISO has become the de facto source for language classification for most information-management and software systems. Its central standard for identifying languages is ISO 639, which comprises two current standards (639-1 and 639-2) and a proposed standard (639-3). The first two aim to cover the most important languages for archiving and classifying written documents, while the third aims to fill in any remaining gaps and classify all world languages of any size. A body of eight works towards consensus on each vote.

NB: other standards bodies also address language classification and tagging, with less formality and generally less uniform results. See also the IANA list of language-tags.

Criteria for adding new languages to ISO 639 Standards

ISO 639-1 (two-letter codes)

Intended for languages most frequently used in terminology, lexicography and linguistics, and most frequently represented in [historical] world literature.

Criteria for new language requests (from the standard, emphasis added)

  1. See ISO 639-2 criteria [below]. Any language request must also satisfy the requirements for ISO 639-2, as that list is a superset of ISO 639-1.
  2. Documentation: a significant body of documents, in both common and specialized language.
    - A significant body of existing documents (specialized texts, such as college or university textbooks, technical documentation manuals, specialized journals, subject-field related books, etc.) written in specialized languages
    - A number of existing terminologies in various subject fields (e.g. technical dictionaries, specialized glossaries, vocabularies, etc. in printed or electronic form)
  3. Recommendation by some organization
    A recommendation from, and the support of, a specialized authority (such as a standards organization, governmental body, linguistic institution, or cultural organization)
  4. Other considerations
    - the number of speakers of the language community
    - the recognized status of the language in one or more countries
    - the support of the request by one or more official bodies
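
The criteria above can be read as a checklist. The following is a minimal sketch of that reading, not part of the standard: the class name `Iso6391Request` and its fields are hypothetical, and it assumes that criteria 1 to 3 (including both bullets under criterion 2) must all be satisfied. The "other considerations" in criterion 4 are left out because the text gives no thresholds for them.

```python
from dataclasses import dataclass


@dataclass
class Iso6391Request:
    """Hypothetical record of the evidence supplied with a request."""

    meets_iso_639_2: bool               # criterion 1
    has_specialized_documents: bool     # criterion 2, first bullet
    has_terminologies: bool             # criterion 2, second bullet
    has_authority_recommendation: bool  # criterion 3

    def unmet_criteria(self) -> list[str]:
        """Return labels for the criteria that are not yet satisfied."""
        checks = [
            ("1. ISO 639-2 criteria", self.meets_iso_639_2),
            ("2. Documentation (specialized documents)", self.has_specialized_documents),
            ("2. Documentation (terminologies)", self.has_terminologies),
            ("3. Recommendation by some organization", self.has_authority_recommendation),
        ]
        return [label for label, satisfied in checks if not satisfied]


# Illustrative request with made-up values: everything except a recommendation.
request = Iso6391Request(True, True, True, False)
missing = request.unmet_criteria()
```

An empty list from `unmet_criteria()` would mean the request passes this simplified reading; the actual decision rests with the standard's maintainers.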


ISO 639-2 (three-letter codes)

Designed to supplement ISO 639-1; it includes language collections and smaller languages that are also useful in terminology and bibliography.

Criteria for new language requests (from the standard, emphasis added)

  1. Documentation: At least 50 separate documents (but see below)
    The request for a new language code shall include evidence that one agency holds 50 different documents in the language or that five agencies hold a total of 50 different documents among them in the language. Documents include all forms of material and are not limited to text.
  2. Collective codes
    If the documentation criteria above are not met, a language may still be assigned a new or existing collective language code. The words "languages" or "other" as part of a language name indicate that a language code is a collective one.
  3. Dialects: on a case-by-case basis
    A dialect of a language is usually represented by the same language code as that used for the language. If the language is assigned to a collective language code, the dialect is assigned to the same collective language code. The difference between dialects and languages will be decided on a case-by-case basis.
  4. Scripts: one per language code
    A single language code is normally provided for a language even [when] the language is written in more than one script. ISO DIS 15924, Codes for the representation of names of scripts, is under development.
  5. Orthography: one per language code
    A language using more than one orthography is not given multiple language codes.
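
The documentation threshold in criterion 1 and the naming convention in criterion 2 are mechanical enough to sketch in code. The example below is illustrative only. The function names and the `holdings` structure (agency name mapped to a set of document identifiers) are hypothetical, and "five agencies" is read here as "up to five", which is an assumption about the wording rather than something the text states.

```python
import re
from itertools import combinations

MIN_DOCUMENTS = 50  # threshold stated in criterion 1
MAX_AGENCIES = 5    # "five agencies", read here as "up to five" (assumption)


def meets_documentation_criterion(holdings: dict[str, set[str]]) -> bool:
    """Check whether up to five agencies hold 50 different documents among them.

    A single agency holding 50 documents is covered as well, because any
    group that contains that agency also reaches the threshold.
    """
    group_size = min(MAX_AGENCIES, len(holdings))
    return any(
        len(set().union(*group)) >= MIN_DOCUMENTS
        for group in combinations(holdings.values(), group_size)
    )


def is_collective_name(language_name: str) -> bool:
    """Apply the naming rule: 'languages' or 'other' marks a collective code."""
    return re.search(r"\b(languages|other)\b", language_name, re.IGNORECASE) is not None


# Illustrative data: five agencies holding ten distinct documents each.
five_agencies = {
    f"agency-{a}": {f"doc-{a}-{d}" for d in range(10)} for a in range(5)
}
qualifies = meets_documentation_criterion(five_agencies)
```

Documents are compared by identifier, so the same document held by several agencies counts once, matching the requirement for "different documents".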