ISO Country and Language Codes: The Definitive Guide

Localization
Roman Hresko
22 Nov 2024

15 min. read

To streamline global communication, the International Organization for Standardization (ISO) has standardized the nomenclature for identifying languages and countries. That’s how ISO country and language codes came about.

Whether you create a multilingual website or localize your app, you’ll likely encounter ISO codes. But which ones should you use?

Two-letter ISO 639-1 language codes or more granular three-letter ISO 639-3 codes? Or maybe it’s best to forgo language codes and opt for ISO 3166 country codes instead?

This definitive guide will break down the differences between ISO country codes and ISO language codes to help you choose the right ones for your project.

What are ISO 639 language codes?

ISO 639 codes are unique abbreviations used to declare languages of web pages, apps, and software. These two- or three-letter codes help identify and display language-specific content for the user.

💡Example: A two-letter ISO 639-1 code for English is en and a three-letter ISO 639-2 code for English is eng. The same basic pattern holds for other languages.

Language codes vs country codes vs locale codes

Before we move any further, let’s unpack the difference between language codes, country codes, and their combination known as locale codes.

Language codes

Want to show localized (translated) content to your Spanish-speaking or French-speaking visitors?

Use the language codes: es or fr.

Country codes

Want to show prices specific to Canada or the United States?

Use the country codes: CA or US.

Locale codes (identifiers) [BCP 47]

Want to deliver both country- and language-specific content to your French-speaking visitors in France and Canada?

Use the locale identifiers that combine country and language codes: fr-FR and fr-CA.
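
To see how a locale identifier splits into its language and country parts, here is a minimal Java sketch. It relies on the standard java.util.Locale class; the class name LocaleParts and its two helper methods are made up for illustration.

```java
import java.util.Locale;

class LocaleParts {
    // Language part of a locale identifier, e.g. "fr" for "fr-CA".
    static String languageOf(String tag) {
        return Locale.forLanguageTag(tag).getLanguage();
    }

    // Country part of a locale identifier, e.g. "CA" for "fr-CA".
    // Returns an empty string when the identifier has no country.
    static String countryOf(String tag) {
        return Locale.forLanguageTag(tag).getCountry();
    }

    public static void main(String[] args) {
        for (String tag : new String[] {"fr-FR", "fr-CA"}) {
            System.out.println(tag + ": language=" + languageOf(tag)
                    + ", country=" + countryOf(tag));
        }
    }
}
```

Both identifiers share the language code fr and differ only in the country code, which is what lets you serve the same translation with region-specific prices or formats.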

Types of ISO 639 language codes

ISO 639 is a family of standards that has been augmented several times to cover all known languages.

Currently in use are four ISO 639 code sets:

  • ISO 639-1
  • ISO 639-2
  • ISO 639-3
  • ISO 639-5.

Let’s break them down:

ISO 639-1

ISO 639-1 defines two-letter codes.

💡Example: ar for Arabic

ISO 639-1 covers 184 languages and is used in web localization, where two-letter coding is sufficient. Although ISO 639-1 codes are not case-sensitive, they are commonly written in lowercase.

ISO 639-2

ISO 639-2 defines three-letter codes for a wider range of languages than ISO 639-1.

💡Example: fre (or fra) for French.

ISO 639-2 covers 487 languages and is used in library cataloging and media metadata, where more granularity is required.

ISO 639-3

ISO 639-3 also defines three-letter codes, expanding on ISO 639-2 to include nearly all known languages.

💡Example: nci for Classical Nahuatl.

ISO 639-3 covers 7,892 languages, including modern, historical, and minority languages, and is widely used in linguistics and cultural preservation studies.

ISO 639-5

ISO 639-5 defines three-letter codes for language families and groups rather than individual languages.

💡Example: ine for Indo-European languages.

ISO 639-5 covers 115 codes and is used in linguistic research and grouping languages into broader categories.

⚠️Notice two things⚠️

First, both ISO 639-2 and ISO 639-3 cover three-letter codes. Although released later, ISO 639-3 doesn’t supersede ISO 639-2. Instead, it complements it by covering major, minority, and extinct languages.

Second, notice how I omitted ISO 639-4. That’s because it’s not a set of codes but a document governing the application of ISO codes. Essentially, a technical manual.

A table of ISO 639-1, ISO 639-2, and ISO 639-3 codes

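Below is a table of commonly used two-letter ISO 639-1 codes and the corresponding three-letter ISO 639-2 and ISO 639-3 codes. Where two ISO 639-2 codes are shown, the first is the bibliographic (B) code and the one in parentheses is the terminological (T) code: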

Language ISO 639-1 ISO 639-2 ISO 639-3
Abkhazian ab abk abk
Afar aa aar aar
Afrikaans af afr afr
Akan ak aka aka
Albanian sq alb (sqi) sqi
Amharic am amh amh
Arabic ar ara ara
Armenian hy arm (hye) hye
Assamese as asm asm
Avaric av ava ava
Aymara ay aym aym
Azerbaijani az aze aze
Bashkir ba bak bak
Basque eu baq (eus) eus
Belarusian be bel bel
Bengali bn ben ben
Bislama bi bis bis
Bosnian bs bos bos
Breton br bre bre
Bulgarian bg bul bul
Burmese my bur (mya) mya
Catalan ca cat cat
Chichewa ny nya nya
Chinese zh chi (zho) zho
Chuvash cv chv chv
Cornish kw cor cor
Corsican co cos cos
Croatian hr hrv hrv
Czech cs cze ces
Danish da dan dan
Dutch nl dut nld
Dzongkha dz dzo dzo
English en eng eng
Esperanto eo epo epo
Estonian et est est
Ewe ee ewe ewe
Faroese fo fao fao
Fijian fj fij fij
Finnish fi fin fin
French fr fre (fra) fra
Fula ff ful ful
Georgian ka geo (kat) kat
German de ger (deu) deu
Greek el gre ell
Greenlandic kl kal kal
Guarani gn grn grn
Gujarati gu guj guj
Haitian Creole ht hat hat
Hausa ha hau hau
Hebrew he heb heb
Hindi hi hin hin
Hmong (none) hmn hmn
Hungarian hu hun hun
Icelandic is ice isl
Ido io ido ido
Igbo ig ibo ibo
Indonesian id ind ind
Interlingua ia ina ina
Irish ga gle gle
Italian it ita ita
Japanese ja jpn jpn
Javanese jv jav jav
Kannada kn kan kan
Kazakh kk kaz kaz
Khmer km khm khm
Kikuyu ki kik kik
Kinyarwanda rw kin kin
Kirghiz ky kir kir
Korean ko kor kor
Kurdish ku kur kur
Lao lo lao lao
Latin la lat lat
Latvian lv lav lav
Lingala ln lin lin
Lithuanian lt lit lit
Luxembourgish lb ltz ltz
Macedonian mk mac mkd
Malagasy mg mlg mlg
Malay ms may msa
Malayalam ml mal mal
Maltese mt mlt mlt
Maori mi mao mri
Marathi mr mar mar
Mongolian mn mon mon
Nepali ne nep nep
Norwegian no nor nor
Nyanja ny nya nya
Odia (Oriya) or ori ori
Pashto ps pus pus
Persian fa per (fas) fas
Polish pl pol pol
Portuguese pt por por
Quechua qu que que
Romanian ro rum (ron) ron
Russian ru rus rus
Samoan sm smo smo
Sardinian sc srd srd
Serbian sr srp srp
Sesotho st sot sot
Shona sn sna sna
Sindhi sd snd snd
Sinhala si sin sin
Slovak sk slo slk
Slovenian sl slv slv
Somali so som som
Spanish es spa spa
Sundanese su sun sun
Swahili sw swa swa
Swedish sv swe swe
Tajik tg tgk tgk
Tamil ta tam tam
Tatar tt tat tat
Telugu te tel tel
Thai th tha tha
Tibetan bo tib (bod) bod
Tigrinya ti tir tir
Tonga to ton ton
Turkish tr tur tur
Turkmen tk tuk tuk
Ukrainian uk ukr ukr
Urdu ur urd urd
Uzbek uz uzb uzb
Vietnamese vi vie vie
Welsh cy wel cym
Xhosa xh xho xho
Yiddish yi yid yid
Yoruba yo yor yor
Zulu zu zul zul
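
If you work in Java, you can cross-check the two-letter and three-letter columns programmatically. The sketch below is one way to do it, assuming the JDK's built-in language data; the class name Iso639Lookup and the method toThreeLetter are hypothetical. Note that Locale.getISO3Language() is documented to return the terminological (T) code, so French comes back as fra rather than fre.

```java
import java.util.Locale;

class Iso639Lookup {
    // Maps a two-letter ISO 639-1 code to the three-letter code known to
    // the JDK (the ISO 639-2/T code, per the Locale documentation).
    static String toThreeLetter(String iso6391) {
        return Locale.forLanguageTag(iso6391).getISO3Language();
    }

    public static void main(String[] args) {
        for (String code : new String[] {"en", "fr", "de", "zh"}) {
            System.out.println(code + " -> " + toThreeLetter(code));
        }
    }
}
```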

IETF language tags

IETF language tags are alphanumeric codes (letters and numbers) used to identify human languages on the web. Each tag is composed of one or more subtags separated by hyphens. Subtags consist of Latin letters or digits.

Why do you need IETF codes if ISO 639 codes already exist?

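Some languages have two or more writing systems, and many are used differently from region to region, making it necessary to specify the preferred script or regional variant for localization. IETF tags handle this by combining an ISO 639 language code with optional subtags for script, region, and variant.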

Examples:

sr for Serbian in any script
sr-Latn for Serbian in Latin script
sr-Latn-fonipa for Serbian in Latin script with phonetic (IPA) transcription
sr-Cyrl for Serbian in Cyrillic script

IETF tag sets are documented and regulated by the Internet Engineering Task Force (IETF). They are commonly used in computing standards and formats such as HTTP, HTML, XML, and PNG.

A table of IETF language tags

Below is a table containing IETF language tags:

Language IETF Tag Usage
Afrikaans af Afrikaans language
Amharic am Amharic language
Arabic ar Default Arabic
Arabic (Egypt) ar-EG Arabic as spoken in Egypt
Arabic (Saudi Arabia) ar-SA Arabic as spoken in Saudi Arabia
Armenian hy Armenian language
Azerbaijani (Cyrillic) az-Cyrl Azerbaijani written in Cyrillic script
Azerbaijani (Latin) az-Latn Azerbaijani written in Latin script
Basque eu Basque language
Bengali bn Bengali language
Bosnian (Latin) bs-Latn Bosnian written in Latin script
Bulgarian bg Bulgarian language
Catalan ca Catalan language
Chinese (China) zh-CN Simplified Chinese as used in China
Chinese (Hong Kong) zh-HK Traditional Chinese as spoken in Hong Kong
Chinese (Simplified) zh-Hans Simplified Chinese script
Chinese (Taiwan) zh-TW Traditional Chinese as spoken in Taiwan
Chinese (Traditional) zh-Hant Traditional Chinese script
Croatian hr Croatian language
Czech cs Czech language
Danish da Danish language
Dutch nl Dutch language
English en Default English
English (United Kingdom) en-GB English as used in the United Kingdom
English (United States) en-US English as used in the United States
Esperanto eo Esperanto language
Estonian et Estonian language
Filipino fil Filipino language
Finnish fi Finnish language
French fr Default French
French (Canada) fr-CA French as spoken in Canada
French (France) fr-FR French as spoken in France
Galician gl Galician language
Georgian ka Georgian language
German de Default German
German (Germany) de-DE German as spoken in Germany
German (Switzerland) de-CH German as spoken in Switzerland
Greek el Greek language
Hausa ha Hausa language
Hebrew he Hebrew language
Hindi hi Hindi language
Hungarian hu Hungarian language
Icelandic is Icelandic language
Igbo ig Igbo language
Irish ga Irish Gaelic
Italian it Italian language
Italian (Italy) it-IT Italian as spoken in Italy
Japanese ja Japanese language
Kazakh kk Kazakh language
Korean ko Korean language
Kurdish (Kurmanji) ku-Latn Kurdish (Kurmanji) written in Latin script
Kurdish (Sorani) ku-Arab Kurdish (Sorani) written in Arabic script
Latvian lv Latvian language
Lithuanian lt Lithuanian language
Malay ms Malay language
Maltese mt Maltese language
Norwegian (Bokmål) nb Norwegian Bokmål
Norwegian (Nynorsk) nn Norwegian Nynorsk
Pashto ps Pashto language
Persian fa Persian language (Farsi)
Polish pl Polish language
Portuguese pt Default Portuguese
Portuguese (Brazil) pt-BR Portuguese as spoken in Brazil
Portuguese (Portugal) pt-PT Portuguese as spoken in Portugal
Romanian ro Romanian language
Russian ru Russian language
Scottish Gaelic gd Scottish Gaelic
Serbian (Cyrillic) sr-Cyrl Serbian written in Cyrillic script
Serbian (Latin) sr-Latn Serbian written in Latin script
Serbo-Croatian (Latin) sh-Latn Serbo-Croatian written in Latin script
Slovak sk Slovak language
Slovenian sl Slovenian language
Somali so Somali language
Spanish es Default Spanish
Spanish (Latin America) es-419 Spanish as spoken in Latin America
Spanish (Spain) es-ES Spanish as spoken in Spain
Swahili sw Swahili language
Swedish sv Swedish language
Tajik tg Tajik language
Tamil ta Tamil language
Telugu te Telugu language
Thai th Thai language
Tigrinya ti Tigrinya language
Turkish tr Turkish language
Ukrainian uk Ukrainian language
Urdu ur Urdu language
Uzbek (Cyrillic) uz-Cyrl Uzbek written in Cyrillic script
Uzbek (Latin) uz-Latn Uzbek written in Latin script
Vietnamese vi Vietnamese language
Welsh cy Welsh language
Xhosa xh Xhosa language
Yoruba yo Yoruba language
Zulu zu Zulu language

The structure of IETF language tags (BCP 47)

The structure of IETF tags is defined by the Best Current Practice (BCP 47) document. The document specifies guidelines for forming IETF tags and their extensions. It also details the best practices for tag implementation.
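
As a rough illustration of what BCP 47 structure means in practice, the Java sketch below checks whether a tag is well-formed and normalizes its casing. It uses the standard Locale.Builder and Locale.forLanguageTag APIs; the class and method names are invented for this example, and the check covers syntax only, not whether each subtag is actually registered.

```java
import java.util.IllformedLocaleException;
import java.util.Locale;

class LanguageTagCheck {
    // True if the tag follows BCP 47 syntax. This checks the shape of the
    // tag only; it does not verify that each subtag is registered.
    static boolean isWellFormed(String tag) {
        try {
            new Locale.Builder().setLanguageTag(tag);
            return true;
        } catch (IllformedLocaleException e) {
            return false;
        }
    }

    // Normalizes casing to the conventional form: lowercase language,
    // title-case script, uppercase region.
    static String canonical(String tag) {
        return Locale.forLanguageTag(tag).toLanguageTag();
    }

    public static void main(String[] args) {
        System.out.println(isWellFormed("sr-Latn-RS")); // hyphen-separated subtags
        System.out.println(isWellFormed("sr_Latn"));    // underscore is not a valid separator
        System.out.println(canonical("SR-latn-rs"));
    }
}
```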

ISO 3166 country codes

ISO 3166 defines two-letter, three-letter, and numeric codes for identifying countries, territories, and special areas. IETF language tags use the two-letter codes as region subtags.

Examples:

US for the United States
FR for France
840 for the United States
250 for France

IETF language tags containing an ISO 3166 country code look as follows:

en-US for English used in the United States
fr-FR for French used in France

Types of ISO 3166 country codes

There are three types of ISO 3166 country codes:

ISO 3166-1

ISO 3166-1 is used to identify countries and dependent territories. It has three distinct code sets:

  • ISO 3166-1 alpha 2 are two-letter country codes.
    Examples: US (United States), FR (France), JP (Japan).

  • ISO 3166-1 alpha 3 are three-letter country codes.
    Examples: USA (United States), FRA (France), JPN (Japan).

  • ISO 3166-1 numeric are three-digit country codes, useful for systems that don’t use the Latin script.
    Examples: 840 (United States), 250 (France), 392 (Japan).
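
For a quick way to move between the alpha 2 and alpha 3 sets, here is a small Java sketch built on the JDK's own country data. The class CountryCodes and its methods are hypothetical names; the standard library calls are Locale.Builder, Locale.getISO3Country(), and Locale.getISOCountries(). As far as I know, the JDK has no built-in lookup for the numeric set, so it is left out.

```java
import java.util.Arrays;
import java.util.Locale;

class CountryCodes {
    // Converts an ISO 3166-1 alpha 2 code (e.g. "US") to alpha 3 ("USA").
    static String toAlpha3(String alpha2) {
        Locale country = new Locale.Builder().setRegion(alpha2).build();
        return country.getISO3Country();
    }

    // True if the code appears in the JDK's list of alpha 2 country codes.
    static boolean isKnownAlpha2(String code) {
        return Arrays.asList(Locale.getISOCountries()).contains(code);
    }

    public static void main(String[] args) {
        for (String code : new String[] {"US", "FR", "JP"}) {
            System.out.println(code + " -> " + toAlpha3(code));
        }
    }
}
```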

ISO 3166-2

ISO 3166-2 defines administrative territories such as provinces, states, and regions of countries listed in ISO 3166-1.

Examples:

  • US-CA (California, United States)
  • CA-QC (Quebec, Canada)
  • IN-UP (Uttar Pradesh, India)
  • FR-75 (Paris, France)
  • DE-BY (Bavaria, Germany)
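
ISO 3166-2 codes follow a simple shape: an alpha 2 country code, a hyphen, and up to three letters or digits. The Java sketch below checks that shape and extracts the country part. The class and method names are made up, and the check is purely syntactic: it does not confirm that a subdivision actually exists.

```java
import java.util.regex.Pattern;

class SubdivisionCode {
    // Alpha 2 country code, a hyphen, then 1 to 3 letters or digits.
    private static final Pattern SHAPE = Pattern.compile("[A-Z]{2}-[A-Z0-9]{1,3}");

    // Checks the format only, not whether the subdivision is real.
    static boolean hasValidShape(String code) {
        return SHAPE.matcher(code).matches();
    }

    // Returns the country part of a code, e.g. "CA" for "CA-QC".
    static String countryOf(String code) {
        if (!hasValidShape(code)) {
            throw new IllegalArgumentException("Not an ISO 3166-2 shaped code: " + code);
        }
        return code.substring(0, 2);
    }

    public static void main(String[] args) {
        for (String code : new String[] {"US-CA", "CA-QC", "DE-BY"}) {
            System.out.println(code + " belongs to " + countryOf(code));
        }
    }
}
```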

ISO 3166-3

ISO 3166-3 defines four-letter codes for country names removed from ISO 3166-1. The removals were necessitated by countries’ dissolutions, separations, mergers, or renamings.

Examples:

  • CSHH (Czechoslovakia) – A country split into the Czech Republic and Slovakia in 1993.
  • YUCS (Yugoslavia) – A country name removed in 2003, when the state was renamed Serbia and Montenegro.
  • DDDE (East Germany) – A country unified with West Germany in 1990.
  • BUMM (Burma) – A country renamed Myanmar in 1989.
  • SUHH (Soviet Union) – The Soviet Union dissolved into independent countries in 1991.

A table of ISO 3166-1 codes (alpha 2, alpha 3, and numeric)

Here’s a table of ISO 3166 codes containing alpha 2, alpha 3, and numeric sets:

Country/Region ISO 3166-1 Alpha-2 ISO 3166-1 Alpha-3 ISO 3166-1 Numeric
Andorra AD AND 020
United Arab Emirates AE ARE 784
Afghanistan AF AFG 004
Albania AL ALB 008
Argentina AR ARG 032
Australia AU AUS 036
Austria AT AUT 040
Bahamas BS BHS 044
Belgium BE BEL 056
Brazil BR BRA 076
Canada CA CAN 124
China CN CHN 156
Colombia CO COL 170
Czechia CZ CZE 203
Denmark DK DNK 208
Egypt EG EGY 818
Finland FI FIN 246
France FR FRA 250
Germany DE DEU 276
Greece GR GRC 300
Hungary HU HUN 348
India IN IND 356
Indonesia ID IDN 360
Ireland IE IRL 372
Israel IL ISR 376
Italy IT ITA 380
Jamaica JM JAM 388
Japan JP JPN 392
Kenya KE KEN 404
Malaysia MY MYS 458
Mexico MX MEX 484
Netherlands NL NLD 528
New Zealand NZ NZL 554
Nigeria NG NGA 566
Norway NO NOR 578
Pakistan PK PAK 586
Philippines PH PHL 608
Poland PL POL 616
Portugal PT PRT 620
Qatar QA QAT 634
Russia RU RUS 643
Saudi Arabia SA SAU 682
Singapore SG SGP 702
South Africa ZA ZAF 710
South Korea KR KOR 410
Spain ES ESP 724
Sweden SE SWE 752
Switzerland CH CHE 756
Thailand TH THA 764
Turkey TR TUR 792
United Kingdom GB GBR 826
United States US USA 840
Vietnam VN VNM 704
Zimbabwe ZW ZWE 716

ISO code and country tag usage in localization

Now that you know what ISO codes are, let’s see why and how you can use them in localization.

Reasons to use ISO codes and country tags

With the help of ISO codes and country tags, you can efficiently deliver localized content to your audience. Other reasons to use ISO codes for localization are:

  • Improved user experience
    By using ISO codes, you can serve your website or app users content in their preferred language and relevant to their region.
  • Interoperability
    ISO codes are universally recognized across platforms and can ensure integration between tools and systems.
  • Compatibility with other standards
    ISO codes work alongside related standards, such as ISO 4217 currency codes and IANA time zone identifiers.
  • Efficient data management
    APIs and web services often rely on ISO codes to exchange localization data.

How to use ISO codes and country tags

With the groundwork laid, let’s put ISO codes and country tags into practice:

App and software localization

Here’s how you might use ISO codes and country tags when localizing your Java app:

In Java, the Locale class carries information about language and region and, less commonly, script and variant. You can create a Locale object using an ISO 639-1 language code and an ISO 3166-1 country code as follows:

import java.util.Locale;
Locale localeEnUS = new Locale("en", "US");

Let’s break it down:

  1. en is the first parameter that specifies the language code (ISO 639-1).
  2. US is the second parameter that specifies the country code (ISO 3166-1).

In this case, you’re serving your users content in the version of English spoken in the US.

For more complex locales with a larger number of parameters, you can use the Locale.Builder class:

Locale localeSerbianCyrillic = new Locale.Builder()
    .setLanguage("sr")     // ISO 639-1 language code
    .setScript("Cyrl")     // script subtag for Cyrillic
    .setRegion("RS")       // ISO 3166-1 alpha 2 country code
    .build();

Here we set three parameters:

  1. Language
    sr is the ISO 639-1 language code for Serbian
  2. Script
    Cyrl specifies that the Cyrillic script is used
  3. Region
    RS is the ISO 3166-1 Alpha-2 code for Serbia

Using the Locale.Builder class and ISO codes, you can align your app with IETF BCP 47 standards for broad compatibility.
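
To confirm that a locale built this way maps to a BCP 47 tag, you can ask Java for its tag form. The sketch below repeats the Serbian example and prints the result of Locale.toLanguageTag(); the wrapper class SerbianLocaleTag is a hypothetical name used only for this illustration.

```java
import java.util.Locale;

class SerbianLocaleTag {
    // Builds the Serbian (Cyrillic, Serbia) locale from its three codes.
    static Locale serbianCyrillic() {
        return new Locale.Builder()
                .setLanguage("sr")
                .setScript("Cyrl")
                .setRegion("RS")
                .build();
    }

    public static void main(String[] args) {
        // Prints the locale as a BCP 47 language tag: sr-Cyrl-RS
        System.out.println(serbianCyrillic().toLanguageTag());
    }
}
```

The tag is what you would typically pass to other systems, such as an HTML lang attribute or an HTTP header.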

ISO codes in HTML

In HTML, the lang attribute specifies the language of the content on your website’s page. This attribute helps search engines and browsers recognize and render the content in the correct language.

ISO 639-1 (two-letter) codes are typically used for language tags, but ISO 3166-1 (country) codes can be added to specify a region or country variant.

For example:

<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <meta name="viewport" content="width=device-width, initial-scale=1.0">
    <title>English Content</title>
</head>
<body>
    <h1>Hi, Centus fan!</h1>
    <p>Keep exploring the world of localization.</p>
</body>
</html>

Here, the lang attribute is set to "en", the ISO 639-1 code for English.

For regional variants like French in Canada, set the lang attribute to fr-CA.

ISO codes in HTTP

In HTTP requests, the Accept-Language header indicates the user’s language preferences. The user’s browser sends the header to the server, listing the languages the user prefers in order of priority.

For HTTP headers, ISO 639-1 and ISO 3166-1 codes are combined to indicate language preferences, sometimes with a quality value (q-value) to show preference order.

Accept-Language: en-US,en;q=0.9,es;q=0.8

Let’s break it down:

  • en-US: The user prefers English (United States) as the primary language. With no q-value given, the weight defaults to 1.
  • en;q=0.9: If US English is not available, the user prefers general English with a slightly lower preference (q=0.9).
  • es;q=0.8: The user also accepts Spanish but with the lowest preference of q=0.8.
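
On the server side, the header above could be matched against the languages you actually support. Here is a minimal Java sketch using the standard Locale.LanguageRange.parse and Locale.lookup methods. The class AcceptLanguageDemo, the pick method, and the supported-language lists are assumptions made for illustration, not part of any framework.

```java
import java.util.List;
import java.util.Locale;

class AcceptLanguageDemo {
    // Picks the best supported locale for an Accept-Language header value,
    // or returns the fallback when nothing matches.
    static Locale pick(String header, List<Locale> supported, Locale fallback) {
        // parse() orders the ranges by their q-values, highest first.
        List<Locale.LanguageRange> ranges = Locale.LanguageRange.parse(header);
        Locale match = Locale.lookup(ranges, supported);
        return match != null ? match : fallback;
    }

    public static void main(String[] args) {
        String header = "en-US,en;q=0.9,es;q=0.8";
        List<Locale> supported = List.of(
                Locale.forLanguageTag("es"),
                Locale.forLanguageTag("fr"));
        // No English variant is supported here, so Spanish is chosen.
        System.out.println(pick(header, supported, Locale.ENGLISH).toLanguageTag());
    }
}
```

Locale.lookup falls back from en-US to en before moving on to the next range, which mirrors the preference order described above.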

What ISO codes to choose?

When localizing an app or software, should you use two-letter or three-letter ISO 639 codes?

In most localization scenarios, use two-letter ISO 639-1 codes. They are widely supported across platforms, and IETF BCP 47 calls for the two-letter code whenever a language has one.

Below is an example of an app’s JSON file (localization file) that uses ISO 639-1 codes:

{  
    "en": "Hello",  
    "fr": "Bonjour",  
    "es": "Hola"  
}  
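
If your strings are keyed by language code like this, a lookup usually needs a fallback for regional tags such as fr-CA. The Java sketch below shows one possible approach, assuming the JSON above has already been loaded into a map. The class GreetingLookup, the GREETINGS map, and the choice of English as the default are all illustrative assumptions.

```java
import java.util.Locale;
import java.util.Map;

class GreetingLookup {
    // Stand-in for the parsed localization file shown above.
    static final Map<String, String> GREETINGS = Map.of(
            "en", "Hello",
            "fr", "Bonjour",
            "es", "Hola");

    // Tries the full tag first, then drops subtags from the right
    // ("fr-CA" -> "fr"), and finally falls back to English.
    static String greet(String tag) {
        String key = Locale.forLanguageTag(tag).toLanguageTag();
        while (!key.isEmpty()) {
            String hit = GREETINGS.get(key);
            if (hit != null) {
                return hit;
            }
            int cut = key.lastIndexOf('-');
            key = cut < 0 ? "" : key.substring(0, cut);
        }
        return GREETINGS.get("en");
    }

    public static void main(String[] args) {
        System.out.println(greet("fr-CA")); // no "fr-CA" key, so "fr" is used
        System.out.println(greet("de"));    // unsupported, so the default is used
    }
}
```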

However, if your app needs to support specialized languages such as Ancient Greek or Sumerian, opt for ISO 639-3 codes.

Here’s what your app’s JSON file with ISO 639-3 codes might look like:

{  
    "grc": "Ἀρχαία Ἑλληνικὴ γλῶσσα",   
    "sux": "𒂍𒍑𒂠"                   
}

Parting thoughts

Now you should know everything there is to know about ISO language codes and country tags. You’re all set to localize your app, software, or website.

To make your localization experience even easier, try Centus.

Centus is a localization management platform built for teams. Bring translators, editors, developers, designers, and managers on the platform where they can cooperate seamlessly to localize your product for maximum impact.

Try Centus now!
