ISO Country and Language Codes: The Definitive Guide
To streamline global communication, the International Organization for Standardization (ISO) has standardized the codes used to identify languages and countries. That’s how ISO country and language codes came about.
Whether you create a multilingual website or localize your app, you’ll likely encounter ISO codes. But which ones should you use?
Two-letter ISO 639-1 language codes or more granular three-letter ISO 639-3 codes? Or maybe it’s best to forgo language codes and opt for ISO 3166 country codes instead?
This definitive guide will break down the differences between ISO country codes and ISO language codes to help you choose the right ones for your project.
What are ISO 639 language codes?
ISO 639 codes are unique abbreviations used to declare languages of web pages, apps, and software. These two- or three-letter codes help identify and display language-specific content for the user.
💡Example: A two-letter ISO 639-1 code for English is en and a three-letter ISO 639-2 code for English is eng. The same basic pattern holds for other languages.
Language codes vs country codes vs locale codes
Before we move any further, let’s unpack the difference between language codes, country codes, and their combination known as locale codes.
Language codes
Want to show localized (translated) content to your Spanish-speaking or French-speaking visitors?
Use the language codes: es or fr.
Country codes
Want to show prices specific to Canada or the United States?
Use the country codes: CA or US.
Locale codes (identifiers) [BCP 47]
Want to deliver both country- and language-specific content to your French-speaking visitors in France and Canada?
Use the locale identifiers that combine country and language codes: fr-FR and fr-CA.
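To make the distinction concrete, here is a minimal Java sketch that splits a locale identifier into its language and country parts. It relies on the standard java.util.Locale class. The class name LocaleParts and the helper describe are illustrative names chosen for this example, not part of any library.

```java
import java.util.Locale;

class LocaleParts {
    // Splits a locale identifier such as "fr-CA" into its language and country parts.
    public static String describe(String tag) {
        Locale locale = Locale.forLanguageTag(tag);
        return "language=" + locale.getLanguage() + ", country=" + locale.getCountry();
    }

    public static void main(String[] args) {
        System.out.println(describe("fr-FR")); // language=fr, country=FR
        System.out.println(describe("fr-CA")); // language=fr, country=CA
        System.out.println(describe("es"));    // language=es, country=
    }
}
```

A bare language code such as es simply leaves the country part empty.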
Types of ISO 639 language codes
ISO 639 is a family of standards that has been augmented several times to cover all known languages.
Currently in use are four ISO 639 code sets:
- ISO 639-1
- ISO 639-2
- ISO 639-3
- ISO 639-5.
Let’s break them down:
ISO 639-1
ISO 639-1 defines two-letter codes.
💡Example: ar for Arabic
ISO 639-1 covers 184 languages and is used in web localization, where two-letter coding is sufficient. Although ISO 639-1 codes are not case-sensitive, they are commonly written in lower case.
ISO 639-2
ISO 639-2 defines three-letter codes for a wider range of languages than ISO 639-1.
💡Example: fre (or fra) for French.
ISO 639-2 covers 487 languages and is used in library cataloging and media metadata, where more granularity is required.
ISO 639-3
ISO 639-3 also defines three-letter codes, expanding on ISO 639-2 to include nearly all known languages.
💡Example: nci for Classical Nahuatl.
ISO 639-3 covers 7,892 languages, including modern, historical, and minority languages, and is widely used in linguistics and cultural preservation studies.
ISO 639-5
ISO 639-5 defines three-letter codes for language families and groups rather than individual languages.
💡Example: ine for Indo-European languages.
ISO 639-5 covers 115 codes and is used in linguistic research and grouping languages into broader categories.
⚠️Notice two things⚠️
First, both ISO 639-2 and ISO 639-3 cover three-letter codes. Although released later, ISO 639-3 doesn’t supersede ISO 639-2. Instead, it complements it by covering major, minority, and extinct languages.
Second, notice how I omitted ISO 639-4. That’s because it’s not a set of codes but a document governing the application of ISO codes. Essentially, a technical manual.
A table of ISO 639-1, ISO 639-2, and ISO 639-3 codes
Below is a table of widely used two-letter ISO 639-1 codes and the corresponding three-letter ISO 639-2 and ISO 639-3 codes:
Language | ISO 639-1 | ISO 639-2 | ISO 639-3 |
---|---|---|---|
Abkhazian | ab | abk | abk |
Afar | aa | aar | aar |
Afrikaans | af | afr | afr |
Akan | ak | aka | aka |
Albanian | sq | alb | sqi |
Amharic | am | amh | amh |
Arabic | ar | ara | ara |
Armenian | hy | arm | hye |
Assamese | as | asm | asm |
Avaric | av | ava | ava |
Aymara | ay | aym | aym |
Azerbaijani | az | aze | aze |
Bashkir | ba | bak | bak |
Basque | eu | eus | eus |
Belarusian | be | bel | bel |
Bengali | bn | ben | ben |
Bislama | bi | bis | bis |
Bosnian | bs | bos | bos |
Breton | br | bre | bre |
Bulgarian | bg | bul | bul |
Burmese | my | bur | mya |
Catalan | ca | cat | cat |
Chichewa | ny | nya | nya |
Chinese | zh | chi (zho) | zho |
Chuvash | cv | chv | chv |
Cornish | kw | cor | cor |
Corsican | co | cos | cos |
Croatian | hr | hrv | hrv |
Czech | cs | cze | ces |
Danish | da | dan | dan |
Dutch | nl | dut | nld |
Dzongkha | dz | dzo | dzo |
English | en | eng | eng |
Esperanto | eo | epo | epo |
Estonian | et | est | est |
Ewe | ee | ewe | ewe |
Faroese | fo | fao | fao |
Fijian | fj | fij | fij |
Finnish | fi | fin | fin |
French | fr | fre (fra) | fra |
Fula | ff | ful | ful |
Georgian | ka | geo | kat |
German | de | ger (deu) | deu |
Greek | el | gre | ell |
Greenlandic | kl | kal | kal |
Guarani | gn | grn | grn |
Gujarati | gu | guj | guj |
Haitian Creole | ht | hat | hat |
Hausa | ha | hau | hau |
Hebrew | he | heb | heb |
Hindi | hi | hin | hin |
Hmong | n/a | hmn | hmn |
Hungarian | hu | hun | hun |
Icelandic | is | ice | isl |
Ido | io | ido | ido |
Igbo | ig | ibo | ibo |
Indonesian | id | ind | ind |
Interlingua | ia | ina | ina |
Irish | ga | gle | gle |
Italian | it | ita | ita |
Japanese | ja | jpn | jpn |
Javanese | jv | jav | jav |
Kannada | kn | kan | kan |
Kazakh | kk | kaz | kaz |
Khmer | km | khm | khm |
Kikuyu | ki | kik | kik |
Kinyarwanda | rw | kin | kin |
Kirghiz | ky | kir | kir |
Korean | ko | kor | kor |
Kurdish | ku | kur | kur |
Lao | lo | lao | lao |
Latin | la | lat | lat |
Latvian | lv | lav | lav |
Lingala | ln | lin | lin |
Lithuanian | lt | lit | lit |
Luxembourgish | lb | ltz | ltz |
Macedonian | mk | mac | mkd |
Malagasy | mg | mlg | mlg |
Malay | ms | may | msa |
Malayalam | ml | mal | mal |
Maltese | mt | mlt | mlt |
Maori | mi | mao | mri |
Marathi | mr | mar | mar |
Mongolian | mn | mon | mon |
Nepali | ne | nep | nep |
Norwegian | no | nor | nor |
Nyanja | ny | nya | nya |
Odia (Oriya) | or | ori | ori |
Pashto | ps | pus | pus |
Persian | fa | per | fas |
Polish | pl | pol | pol |
Portuguese | pt | por | por |
Quechua | qu | que | que |
Romanian | ro | ron | ron |
Russian | ru | rus | rus |
Samoan | sm | smo | smo |
Sardinian | sc | srd | srd |
Serbian | sr | srp | srp |
Sesotho | st | sot | sot |
Shona | sn | sna | sna |
Sindhi | sd | snd | snd |
Sinhala | si | sin | sin |
Slovak | sk | slo | slk |
Slovenian | sl | slv | slv |
Somali | so | som | som |
Spanish | es | spa | spa |
Sundanese | su | sun | sun |
Swahili | sw | swa | swa |
Swedish | sv | swe | swe |
Tajik | tg | tgk | tgk |
Tamil | ta | tam | tam |
Tatar | tt | tat | tat |
Telugu | te | tel | tel |
Thai | th | tha | tha |
Tibetan | bo | tib | bod |
Tigrinya | ti | tir | tir |
Tonga | to | ton | ton |
Turkish | tr | tur | tur |
Turkmen | tk | tuk | tuk |
Ukrainian | uk | ukr | ukr |
Urdu | ur | urd | urd |
Uzbek | uz | uzb | uzb |
Vietnamese | vi | vie | vie |
Welsh | cy | wel | cym |
Xhosa | xh | xho | xho |
Yiddish | yi | yid | yid |
Yoruba | yo | yor | yor |
Zulu | zu | zul | zul |
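If you work in Java, you may not need to hard-code this mapping. The sketch below asks java.util.Locale for the three-letter code that corresponds to a two-letter one. Note that Java returns the terminological form (fra, deu, ces) where ISO 639-2 has two variants. ThreeLetterLookup and toThreeLetter are hypothetical names used only for illustration.

```java
import java.util.Locale;

class ThreeLetterLookup {
    // Returns the three-letter code Java associates with a two-letter ISO 639-1 code.
    public static String toThreeLetter(String twoLetter) {
        return Locale.forLanguageTag(twoLetter).getISO3Language();
    }

    public static void main(String[] args) {
        System.out.println(toThreeLetter("en")); // eng
        System.out.println(toThreeLetter("fr")); // fra
        System.out.println(toThreeLetter("cs")); // ces
    }
}
```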
IETF language tags
IETF language tags are alphanumeric codes (letters and numbers) used to identify human languages on the web. The tag sets are composed of subtags separated by hyphens. Subtags are Latin letters or digits.
Why do you need IETF codes if ISO 639 codes already exist?
Some languages have two or more writing systems, making it necessary to specify the preferred system for localization. ISO 639 codes identify only the language, whereas IETF tags can add subtags for the script, region, and variant.
Examples:
sr for Serbian in any script
sr-Latn for Serbian in Latin script
sr-Latn-fonipa for Serbian in Latin script, transcribed in the International Phonetic Alphabet
sr-Cyrl for Serbian in Cyrillic script
IETF tag sets are documented and regulated by the Internet Engineering Task Force (IETF). They are commonly used in computing standards and formats such as HTTP, HTML, XML, and PNG.
A table of IETF language tags
Below is a table containing IETF language tags:
Language | IETF Tag | Usage |
---|---|---|
Afrikaans | af | Afrikaans language |
Amharic | am | Amharic language |
Arabic | ar | Default Arabic |
Arabic (Egypt) | ar-EG | Arabic as spoken in Egypt |
Arabic (Saudi Arabia) | ar-SA | Arabic as spoken in Saudi Arabia |
Armenian | hy | Armenian language |
Azerbaijani (Cyrillic) | az-Cyrl | Azerbaijani written in Cyrillic script |
Azerbaijani (Latin) | az-Latn | Azerbaijani written in Latin script |
Basque | eu | Basque language |
Bengali | bn | Bengali language |
Bosnian (Latin) | bs-Latn | Bosnian written in Latin script |
Bulgarian | bg | Bulgarian language |
Catalan | ca | Catalan language |
Chinese (China) | zh-CN | Simplified Chinese as used in China |
Chinese (Hong Kong) | zh-HK | Traditional Chinese as spoken in Hong Kong |
Chinese (Simplified) | zh-Hans | Simplified Chinese script |
Chinese (Taiwan) | zh-TW | Traditional Chinese as spoken in Taiwan |
Chinese (Traditional) | zh-Hant | Traditional Chinese script |
Croatian | hr | Croatian language |
Czech | cs | Czech language |
Danish | da | Danish language |
Dutch | nl | Dutch language |
English | en | Default English |
English (United Kingdom) | en-GB | English as used in the United Kingdom |
English (United States) | en-US | English as used in the United States |
Esperanto | eo | Esperanto language |
Estonian | et | Estonian language |
Filipino | fil | Filipino language |
Finnish | fi | Finnish language |
French | fr | Default French |
French (Canada) | fr-CA | French as spoken in Canada |
French (France) | fr-FR | French as spoken in France |
Galician | gl | Galician language |
Georgian | ka | Georgian language |
German | de | Default German |
German (Germany) | de-DE | German as spoken in Germany |
German (Switzerland) | de-CH | German as spoken in Switzerland |
Greek | el | Greek language |
Hausa | ha | Hausa language |
Hebrew | he | Hebrew language |
Hindi | hi | Hindi language |
Hungarian | hu | Hungarian language |
Icelandic | is | Icelandic language |
Igbo | ig | Igbo language |
Irish | ga | Irish Gaelic |
Italian | it | Italian language |
Italian (Italy) | it-IT | Italian as spoken in Italy |
Japanese | ja | Japanese language |
Kazakh | kk | Kazakh language |
Korean | ko | Korean language |
Kurdish (Kurmanji) | ku-Latn | Kurdish (Kurmanji) written in Latin script |
Kurdish (Sorani) | ku-Arab | Kurdish (Sorani) written in Arabic script |
Latvian | lv | Latvian language |
Lithuanian | lt | Lithuanian language |
Malay | ms | Malay language |
Maltese | mt | Maltese language |
Norwegian (Bokmål) | nb | Norwegian Bokmål |
Norwegian (Nynorsk) | nn | Norwegian Nynorsk |
Pashto | ps | Pashto language |
Persian | fa | Persian language (Farsi) |
Polish | pl | Polish language |
Portuguese | pt | Default Portuguese |
Portuguese (Brazil) | pt-BR | Portuguese as spoken in Brazil |
Portuguese (Portugal) | pt-PT | Portuguese as spoken in Portugal |
Romanian | ro | Romanian language |
Russian | ru | Russian language |
Scottish Gaelic | gd | Scottish Gaelic |
Serbian (Cyrillic) | sr-Cyrl | Serbian written in Cyrillic script |
Serbian (Latin) | sr-Latn | Serbian written in Latin script |
Serbo-Croatian (Latin) | sh-Latn | Serbo-Croatian written in Latin script |
Slovak | sk | Slovak language |
Slovenian | sl | Slovenian language |
Somali | so | Somali language |
Spanish | es | Default Spanish |
Spanish (Latin America) | es-419 | Spanish as spoken in Latin America |
Spanish (Spain) | es-ES | Spanish as spoken in Spain |
Swahili | sw | Swahili language |
Swedish | sv | Swedish language |
Tajik | tg | Tajik language |
Tamil | ta | Tamil language |
Telugu | te | Telugu language |
Thai | th | Thai language |
Tigrinya | ti | Tigrinya language |
Turkish | tr | Turkish language |
Ukrainian | uk | Ukrainian language |
Urdu | ur | Urdu language |
Uzbek (Cyrillic) | uz-Cyrl | Uzbek written in Cyrillic script |
Uzbek (Latin) | uz-Latn | Uzbek written in Latin script |
Vietnamese | vi | Vietnamese language |
Welsh | cy | Welsh language |
Xhosa | xh | Xhosa language |
Yoruba | yo | Yoruba language |
Zulu | zu | Zulu language |
The structure of IETF language tags (BCP 47)
The structure of IETF tags is defined by the Best Current Practice (BCP 47) document. The document specifies guidelines for forming IETF tags and their extensions. It also details the best practices for tag implementation.
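As a rough illustration of that structure, the following Java sketch parses a tag with the standard Locale.forLanguageTag method and prints its language, script, and region subtags. It also shows that Java normalizes letter case when it writes the tag back out. TagStructure and subtags are hypothetical names introduced for this example.

```java
import java.util.Locale;

class TagStructure {
    // Breaks a BCP 47 tag into the subtags that java.util.Locale exposes.
    public static String subtags(String tag) {
        Locale locale = Locale.forLanguageTag(tag);
        return "language=" + locale.getLanguage()
                + " script=" + locale.getScript()
                + " region=" + locale.getCountry();
    }

    // Round-trips a tag so that each subtag gets its conventional letter case.
    public static String normalize(String tag) {
        return Locale.forLanguageTag(tag).toLanguageTag();
    }

    public static void main(String[] args) {
        System.out.println(subtags("sr-Latn-RS")); // language=sr script=Latn region=RS
        System.out.println(subtags("es-419"));     // language=es script= region=419
        System.out.println(normalize("SR-latn-rs")); // sr-Latn-RS
    }
}
```

The es-419 case shows that a region subtag can also be a three-digit area code rather than a two-letter country code.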
ISO 3166 country codes
ISO 3166 country codes identify countries, territories, and special areas. ISO 3166-1 defines them in two-letter, three-letter, and numeric forms. IETF language tags use the two-letter form as the region subtag.
Examples:
US for the United States
FR for France
840 for the United States
250 for France
IETF language tags containing an ISO 3166-1 country code look as follows:
en-US for English used in the United States
fr-FR for French used in France
Types of ISO 3166 country codes
ISO 3166 has three parts:
ISO 3166-1
ISO 3166-1 is used to identify countries and dependent territories. The standard has three distinct code sets:
- ISO 3166-1 alpha-2: two-letter country codes. Examples: US (United States), FR (France), JP (Japan).
- ISO 3166-1 alpha-3: three-letter country codes. Examples: USA (United States), FRA (France), JPN (Japan).
- ISO 3166-1 numeric: three-digit country codes, useful in systems that don’t use Latin script. Examples: 840 (United States), 250 (France), 392 (Japan).
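Java’s standard library knows the alpha-2 and alpha-3 sets, so you can convert between them without a lookup table of your own. The sketch below is one way to do it. CountryCodes, toAlpha3, and isAlpha2 are illustrative names. As far as I know, java.util.Locale does not expose the numeric set, so it is left out here.

```java
import java.util.Arrays;
import java.util.Locale;

class CountryCodes {
    // Converts an ISO 3166-1 alpha-2 code to its alpha-3 counterpart.
    public static String toAlpha3(String alpha2) {
        return new Locale.Builder().setRegion(alpha2).build().getISO3Country();
    }

    // Checks whether a string is one of the alpha-2 codes known to the JDK.
    public static boolean isAlpha2(String code) {
        return Arrays.asList(Locale.getISOCountries()).contains(code);
    }

    public static void main(String[] args) {
        System.out.println(toAlpha3("US")); // USA
        System.out.println(toAlpha3("FR")); // FRA
        System.out.println(toAlpha3("JP")); // JPN
        System.out.println(isAlpha2("XX")); // false
    }
}
```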
ISO 3166-2
ISO 3166-2 defines administrative territories such as provinces, states, and regions of countries listed in ISO 3166-1.
Examples:
- US-CA (California, United States)
- CA-QC (Quebec, Canada)
- IN-UP (Uttar Pradesh, India)
- FR-75 (Paris, France)
- DE-BY (Bavaria, Germany)
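Since every ISO 3166-2 code starts with an ISO 3166-1 alpha-2 country code followed by a hyphen, it is easy to split one apart. The Java sketch below does that and checks the country part against the JDK’s country list. It does not validate the subdivision part, because the JDK ships no ISO 3166-2 data. SubdivisionCode and its methods are hypothetical names for this example.

```java
import java.util.Arrays;
import java.util.Locale;

class SubdivisionCode {
    // Returns the country part of a code such as "US-CA".
    public static String countryOf(String code) {
        return parts(code)[0];
    }

    // Returns the subdivision part of a code such as "US-CA".
    public static String subdivisionOf(String code) {
        return parts(code)[1];
    }

    private static String[] parts(String code) {
        int dash = code.indexOf('-');
        if (dash != 2 || dash == code.length() - 1) {
            throw new IllegalArgumentException("Expected CC-SUB, got: " + code);
        }
        String country = code.substring(0, 2);
        // Only the country prefix is checked; subdivisions are not known to the JDK.
        if (!Arrays.asList(Locale.getISOCountries()).contains(country)) {
            throw new IllegalArgumentException("Unknown country code: " + country);
        }
        return new String[] { country, code.substring(3) };
    }

    public static void main(String[] args) {
        System.out.println(countryOf("CA-QC"));     // CA
        System.out.println(subdivisionOf("CA-QC")); // QC
    }
}
```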
ISO 3166-3
ISO 3166-3 defines codes for country names removed from ISO 3166-1. The changes were necessitated by countries’ dissolutions, separations, mergers, or renamings. Each ISO 3166-3 code has four letters and begins with the country’s former alpha-2 code.
Examples:
- CSHH (Czechoslovakia): a country split into the Czech Republic and Slovakia in 1993.
- YUCS (Yugoslavia): the name was retired in 2003, when the country became Serbia and Montenegro.
- DDDE (East Germany): a country unified with West Germany in 1990.
- BUMM (Burma): a country renamed Myanmar in 1989.
- SUHH (Soviet Union): the Soviet Union dissolved into independent countries in 1991.
A table of ISO 3166-1 codes (alpha 2, alpha 3, and numeric)
Here’s a table of selected ISO 3166-1 codes in the alpha-2, alpha-3, and numeric sets:
Country/Region | ISO 3166-1 Alpha-2 | ISO 3166-1 Alpha-3 | ISO 3166-1 Numeric |
---|---|---|---|
Andorra | AD | AND | 020 |
United Arab Emirates | AE | ARE | 784 |
Afghanistan | AF | AFG | 004 |
Albania | AL | ALB | 008 |
Argentina | AR | ARG | 032 |
Australia | AU | AUS | 036 |
Austria | AT | AUT | 040 |
Bahamas | BS | BHS | 044 |
Belgium | BE | BEL | 056 |
Brazil | BR | BRA | 076 |
Canada | CA | CAN | 124 |
China | CN | CHN | 156 |
Colombia | CO | COL | 170 |
Czechia | CZ | CZE | 203 |
Denmark | DK | DNK | 208 |
Egypt | EG | EGY | 818 |
Finland | FI | FIN | 246 |
France | FR | FRA | 250 |
Germany | DE | DEU | 276 |
Greece | GR | GRC | 300 |
Hungary | HU | HUN | 348 |
India | IN | IND | 356 |
Indonesia | ID | IDN | 360 |
Ireland | IE | IRL | 372 |
Israel | IL | ISR | 376 |
Italy | IT | ITA | 380 |
Jamaica | JM | JAM | 388 |
Japan | JP | JPN | 392 |
Kenya | KE | KEN | 404 |
Malaysia | MY | MYS | 458 |
Mexico | MX | MEX | 484 |
Netherlands | NL | NLD | 528 |
New Zealand | NZ | NZL | 554 |
Nigeria | NG | NGA | 566 |
Norway | NO | NOR | 578 |
Pakistan | PK | PAK | 586 |
Philippines | PH | PHL | 608 |
Poland | PL | POL | 616 |
Portugal | PT | PRT | 620 |
Qatar | QA | QAT | 634 |
Russia | RU | RUS | 643 |
Saudi Arabia | SA | SAU | 682 |
Singapore | SG | SGP | 702 |
South Africa | ZA | ZAF | 710 |
South Korea | KR | KOR | 410 |
Spain | ES | ESP | 724 |
Sweden | SE | SWE | 752 |
Switzerland | CH | CHE | 756 |
Thailand | TH | THA | 764 |
Turkey | TR | TUR | 792 |
United Kingdom | GB | GBR | 826 |
United States | US | USA | 840 |
Vietnam | VN | VNM | 704 |
Zimbabwe | ZW | ZWE | 716 |
ISO code and country tag usage in localization
Now that you know what ISO codes are, let’s see why and how you can use them in localization.
Reasons to use ISO codes and country tags
With the help of ISO codes and country tags, you can efficiently deliver localized content to your audience. Other reasons to use ISO codes for localization are:
- Improved user experience: By using ISO codes, you can serve your website or app users content that is in their preferred language and relevant to their region.
- Interoperability: ISO codes are universally recognized across platforms and can ensure integration between tools and systems.
- Compatibility with other standards: ISO codes are compatible with currency codes, like ISO 4217, and time zone identifiers, like those in the IANA database.
- Efficient data management: APIs and web services often rely on ISO codes to exchange localization data.
How to use ISO codes and country tags
With the groundwork laid, let’s put ISO codes and country tags into practice:
App and software localization
Here’s how you might use ISO codes and country tags when localizing your Java app:
In Java, the Locale class carries information about language and region and, optionally, script and variant. You can create a Locale object using an ISO 639-1 language code and an ISO 3166-1 country code as follows:
// On Java 19 and later, Locale.of("en", "US") is the preferred factory method.
Locale localeEnUS = new Locale("en", "US");
Let’s break it down:
- en is the first parameter, which specifies the language code (ISO 639-1).
- US is the second parameter, which specifies the country code (ISO 3166-1).
In this case, you’re serving your users content in the version of English spoken in the US.
For more complex locales with a larger number of parameters, you can use the Locale.Builder class:
Locale localeSerbianCyrillic = new Locale.Builder()
.setLanguage("sr")
.setScript("Cyrl")
.setRegion("RS")
.build();
Here we set three parameters:
- Language: sr is the ISO 639-1 language code for Serbian.
- Script: Cyrl specifies that the Cyrillic script is used.
- Region: RS is the ISO 3166-1 alpha-2 code for Serbia.
Using the Locale.Builder class and ISO codes, you can align your app with IETF BCP 47 standards for broad compatibility.
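To see what the builder produces, you can print the resulting BCP 47 tag. The sketch below also shows that Locale.Builder rejects ill-formed subtags by throwing IllformedLocaleException. Keep in mind that it checks only the shape of each subtag, not whether the code is actually registered. BuilderCheck, build, and isWellFormed are hypothetical names used for illustration.

```java
import java.util.IllformedLocaleException;
import java.util.Locale;

class BuilderCheck {
    // Builds a locale from its parts and returns the BCP 47 tag.
    public static String build(String language, String script, String region) {
        return new Locale.Builder()
                .setLanguage(language)
                .setScript(script)
                .setRegion(region)
                .build()
                .toLanguageTag();
    }

    // Reports whether the builder accepts the given subtags.
    public static boolean isWellFormed(String language, String script, String region) {
        try {
            build(language, script, region);
            return true;
        } catch (IllformedLocaleException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(build("sr", "Cyrl", "RS"));            // sr-Cyrl-RS
        System.out.println(isWellFormed("sr", "Cyrillic", "RS")); // false: script must be four letters
    }
}
```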
ISO codes in HTML
In HTML, the lang attribute specifies the language of the content on your website’s page. This attribute helps search engines and browsers recognize and render the content in the correct language.
ISO 639-1 (two-letter) codes are typically used for language tags, but ISO 3166-1 (country) codes can be added to specify a region or country variant.
For example:
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>English Content</title>
</head>
<body>
<h1>Hi, Centus fan!</h1>
<p>Keep exploring the world of localization.</p>
</body>
</html>
Here, the lang attribute is set to "en", which specifies that the content is in English, according to ISO 639-1 code.
For regional variants like French in Canada, set the lang attribute to fr-CA.
ISO codes in HTTP
In HTTP requests, the Accept-Language header is used to indicate the user’s language preferences. The user’s browser sends the header to the server, listing the languages the user prefers in order of priority.
For HTTP headers, ISO 639-1 and ISO 3166-1 codes are combined to indicate language preferences, sometimes with a quality value (q-value) to show preference order.
Accept-Language: en-US,en;q=0.9,es;q=0.8
Let’s break it down:
- en-US: The user prefers English (United States) as the primary language.
- en;q=0.9: If US English is not available, the user prefers general English with a slightly lower preference (q=0.9).
- es;q=0.8: The user also accepts Spanish but with the lowest preference of q=0.8.
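On the server side, the header above could be matched against the languages you actually support. Here is a minimal Java sketch that uses the standard Locale.LanguageRange.parse and Locale.lookup methods. The class AcceptLanguage, the method pick, and the idea of passing a fallback tag are assumptions made for this example, and a real web framework may already do this negotiation for you.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

class AcceptLanguage {
    // Picks the best supported language tag for an Accept-Language header value.
    public static String pick(String header, List<String> supportedTags, String fallback) {
        // parse() returns the ranges ordered by descending q-value.
        List<Locale.LanguageRange> ranges = Locale.LanguageRange.parse(header);
        List<Locale> supported = new ArrayList<>();
        for (String tag : supportedTags) {
            supported.add(Locale.forLanguageTag(tag));
        }
        // lookup() tries each range in order, truncating subtags (en-US, then en).
        Locale best = Locale.lookup(ranges, supported);
        return best != null ? best.toLanguageTag() : fallback;
    }

    public static void main(String[] args) {
        String header = "en-US,en;q=0.9,es;q=0.8";
        System.out.println(pick(header, Arrays.asList("en", "es"), "en")); // en
        System.out.println(pick(header, Arrays.asList("es", "fr"), "en")); // es
        System.out.println(pick(header, Arrays.asList("fr", "de"), "fr")); // fr (fallback)
    }
}
```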
What ISO codes to choose?
When localizing an app or software, should you use two-letter or three-letter ISO 639 codes?
In most localization scenarios, use two-letter ISO 639-1 codes. They are compatible across platforms and recommended by IETF BCP 47 standards.
Below is an example of an app’s JSON file (localization file) that uses ISO 639-1 codes:
{
"en": "Hello",
"fr": "Bonjour",
"es": "Hola"
}
However, if your app needs to support specialized languages such as Ancient Greek or Sumerian, opt for ISO 639-3 codes.
Here’s how your app’s JSON file with ISO 639-3 codes might look:
{
"grc": "Ἀρχαία Ἑλληνικὴ γλῶσσα",
"sux": "𒂍𒍑𒂠"
}
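Whichever code set you choose, your app needs a rule for what to do when a user’s locale is more specific than your keys. The Java sketch below shows one common approach: strip subtags from the right until a key matches, then fall back to a default. The map mirrors the first JSON file above. GreetingLookup, greet, and the choice of en as the default are assumptions made for illustration.

```java
import java.util.Map;

class GreetingLookup {
    // Mirrors the ISO 639-1 keyed localization file shown earlier.
    private static final Map<String, String> GREETINGS =
            Map.of("en", "Hello", "fr", "Bonjour", "es", "Hola");

    // Looks up a greeting, falling back from "fr-CA" to "fr" and finally to English.
    public static String greet(String tag) {
        String candidate = tag;
        while (!candidate.isEmpty()) {
            String text = GREETINGS.get(candidate);
            if (text != null) {
                return text;
            }
            int dash = candidate.lastIndexOf('-');
            candidate = dash < 0 ? "" : candidate.substring(0, dash);
        }
        return GREETINGS.get("en");
    }

    public static void main(String[] args) {
        System.out.println(greet("fr-CA"));  // Bonjour
        System.out.println(greet("es-419")); // Hola
        System.out.println(greet("de"));     // Hello
    }
}
```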
Parting thoughts
Now you should know everything there is to know about ISO language codes and country tags. You’re all set to localize your app, software, or website.
To make your localization experience even easier, try Centus.
Centus is a localization management platform built for teams. Bring translators, editors, developers, designers, and managers on the platform where they can cooperate seamlessly to localize your product for maximum impact.