File locations: All schemas are in the schemas/ directory relative to this page. Click any card to download the file directly. Composite index schemas reference their constituent language schemas by relative URI — keep all files together in the same schemas/ directory.
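Because a composite index only works while its referenced files sit next to it, a quick check for dangling references can be useful after downloading. The following is a minimal sketch, not part of the published schemas: it assumes that references are written as `ref` elements carrying an `href` attribute, and the namespace URI, the `eu-index.crepdl` filename, and the index content are all hypothetical stand-ins.

```python
import tempfile
import xml.etree.ElementTree as ET
from pathlib import Path


def missing_refs(index_path: Path) -> list[str]:
    """Return href values in index_path that do not resolve to an existing file."""
    root = ET.parse(index_path).getroot()
    missing = []
    for elem in root.iter():
        # Compare on the local name so the check does not depend on the namespace.
        local_name = elem.tag.rsplit("}", 1)[-1]
        href = elem.get("href")
        if local_name == "ref" and href:
            # Relative URIs are resolved against the index file's own directory.
            if not (index_path.parent / href).is_file():
                missing.append(href)
    return missing


# Hypothetical index content; the real composite index may look different.
INDEX_XML = """<union xmlns="http://purl.oclc.org/dsdl/crepdl/ns/structure/2.0">
  <ref href="de-german.crepdl"/>
  <ref href="fr-french.crepdl"/>
</union>"""

with tempfile.TemporaryDirectory() as tmp:
    schemas = Path(tmp) / "schemas"
    schemas.mkdir()
    (schemas / "eu-index.crepdl").write_text(INDEX_XML, encoding="utf-8")
    # Only the German file is present, so the French reference is dangling.
    (schemas / "de-german.crepdl").write_text("<placeholder/>", encoding="utf-8")
    result = missing_refs(schemas / "eu-index.crepdl")

print(result)
```

Running the same function against a real index file in your own schemas/ directory would list any constituent schema that was not downloaded alongside it.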
Latin Script
Individual language schemas for the 24 official EU languages plus Norwegian Bokmål, Norwegian Nynorsk, and Icelandic. A composite index schema covers all EU languages together. (Bulgarian is covered in the Cyrillic section; Greek in its own section — both are included in the EU index by reference.)
English (en): Standard 26-letter Latin alphabet, ASCII printable range, plus common typographic punctuation.
schemas/en-english.crepdl
German (de): Latin with Umlaut letters Ä, Ö, Ü and the Eszett ß.
schemas/de-german.crepdl
French (fr): Latin with acute, grave, circumflex, diaeresis, cedilla, and ligatures œ and æ.
schemas/fr-french.crepdl
Spanish (es): Latin with inverted punctuation ¡ ¿, tilde Ñ, and accented vowels.
schemas/es-spanish.crepdl
Portuguese (pt): Latin with acute, grave, circumflex, tilde, cedilla: ã, â, ç, ê, õ, ô and others.
schemas/pt-portuguese.crepdl
Italian (it): Latin with grave and acute accents on vowels: à, è, é, ì, ò, ó, ù.
schemas/it-italian.crepdl
Dutch (nl): Latin with diaeresis on vowels ë, ï, ü, ä, ö and acute/grave/circumflex forms.
schemas/nl-dutch.crepdl
Polish (pl): Latin with ogonek ą ę, acute ć ń ó ś ź, overdot ż, and stroke ł.
schemas/pl-polish.crepdl
Romanian (ro): Latin with comma-below letters ș ț and circumflex/breve â î ă.
schemas/ro-romanian.crepdl
Czech (cs): Latin with háček and čárka diacritics: á, č, ď, é, ě, í, ň, ó, ř, š, ť, ú, ů, ý, ž.
schemas/cs-czech.crepdl
Hungarian (hu): Latin with unique double acute accent vowels ő Ő ű Ű, plus á, é, í, ó, ö, ü.
schemas/hu-hungarian.crepdl
Swedish (sv): Latin with three extra vowels beyond English: å, ä, ö.
schemas/sv-swedish.crepdl
Danish (da): Latin with three extra letters: Æ/æ, Ø/ø, Å/å.
schemas/da-danish.crepdl
Finnish (fi): Latin with ä and ö (and å in Swedish loanwords).
schemas/fi-finnish.crepdl
Slovak (sk): Latin with caron č š ž ď ľ ň ť, acute á é í ó ú ý, and unique ŕ ĺ.
schemas/sk-slovak.crepdl
Croatian (hr): Latin with diacritic letters č, ć, đ, š, ž and the digraphs dž, lj, nj.
schemas/hr-croatian.crepdl
Slovenian (sl): Latin with three caron letters: č, š, ž.
schemas/sl-slovenian.crepdl
Lithuanian (lt): Latin with ogonek ą ę į ų, macron ū, caron č š ž, and dot-above ė.
schemas/lt-lithuanian.crepdl
Latvian (lv): Latin with macrons ā ē ī ū, cedillas ģ ķ ļ ņ ŗ, and caron č š ž.
schemas/lv-latvian.crepdl
Estonian (et): Latin with additional letters ä, ö, õ, ü, š, ž.
schemas/et-estonian.crepdl
Irish (ga): Latin with síneadh fada (acute accent) on vowels: á, é, í, ó, ú.
schemas/ga-irish.crepdl
Maltese (mt): Latin with unique characters ħ (H-bar), għ (digraph), ċ (C-dot), and ġ (G-dot).
schemas/mt-maltese.crepdl
Norwegian Bokmål (nb): Latin 29-letter Norwegian alphabet including Æ/æ, Ø/ø, Å/å. The dominant written standard (~85–90% of Norwegian writing).
schemas/nb-norwegian_bokmål.crepdl
Norwegian Nynorsk (nn): Latin 29-letter Norwegian alphabet, identical repertoire to Bokmål. The second official Norwegian written standard (~10–15% of written use).
schemas/nn-norwegian_nynorsk.crepdl
Icelandic (is): Latin 32-letter Icelandic alphabet with unique letters Ð/ð (Eth) and Þ/þ (Thorn), preserved from Old Norse. ~370,000 native speakers.
schemas/is-icelandic.crepdl
Greek Script
Modern Greek with monotonic orthography. Also referenced in the EU languages composite index.
Cyrillic Script
Schemas for Cyrillic-script languages across Eastern Europe and Central Asia. Bulgarian is also referenced in the EU index. Several languages (Uzbek, Azerbaijani, Kazakh, Tajik, Turkmen, Kyrgyz) use Cyrillic alongside Latin or Arabic — their schemas cover all scripts in use.
Bulgarian (bg): Cyrillic script plus Basic Latin for punctuation and numerals. EU official language.
schemas/bg-bulgarian.crepdl
Russian (ru): 33-letter Russian Cyrillic alphabet (А–Я). ~258M speakers across Russia and the post-Soviet states.
schemas/ru-russian.crepdl
Ukrainian (uk): Cyrillic with Ukrainian-specific letters including Ї, Є, and І absent from Russian. ~40M speakers.
schemas/uk-ukrainian.crepdl
Serbian (sr): Cyrillic (primary) and Latin (Latinica), both official scripts. ~17M speakers.
schemas/sr-serbian.crepdl
Belarusian (be): Cyrillic plus Latin Łacinka (historical alternative orthography). ~8M speakers.
schemas/be-belarusian.crepdl
Kazakh (kk): Cyrillic (current official) plus Latin (transition script) plus Arabic. Kazakhstan is progressively adopting Latin. ~19M speakers.
schemas/kk-kazakh.crepdl
Kyrgyz (ky): Cyrillic (official in Kyrgyzstan) plus Perso-Arabic (diaspora in China). ~5M speakers.
schemas/ky-kyrgyz.crepdl
Tajik (tg): Cyrillic (official in Tajikistan) plus Latin and Perso-Arabic scripts used regionally. ~9M speakers.
schemas/tg-tajik.crepdl
Mongolian (mn): Cyrillic (official since 1941) plus Traditional Mongolian script (vertical cursive, U+1800–18AF; co-official since 2020). ~6M speakers.
schemas/mn-mongolian.crepdl
Arabic / Perso-Arabic Script
Right-to-left schemas for Arabic and languages that use Arabic-derived scripts: Persian, Pashto, Kurdish (Soranî), Urdu, Sindhi, Kashmiri, Uyghur, Hausa (Ajami), and West Punjabi (Shahmukhi). Several multilingual schemas (Kazakh, Uzbek, Malay, Kurdish) also include Arabic alongside other scripts.
Arabic (ar): Arabic block (U+0600–06FF), Supplement, Extended-A/B, and Presentation Forms A/B. Covers Modern Standard Arabic including tashkeel vowel marks. ~422M speakers across 22+ countries.
schemas/ar-arabic.crepdl
Persian / Farsi (fa): Perso-Arabic (Nastaliq style). Adds پ چ ژ گ and Eastern Arabic-Indic digits ۰–۹ to the Arabic base. ~80M speakers in Iran, Afghanistan (Dari), and Tajikistan.
schemas/fa-persian.crepdl
Pashto (ps): Perso-Arabic with additional letters specific to Pashto phonology. ~45M speakers in Afghanistan and Pakistan.
schemas/ps-pashto.crepdl
Kurdish (ku): Latin (Kurmanji/Badini) plus Perso-Arabic (Soranî). ~26M speakers across Turkey, Iraq, Iran, and Syria.
schemas/ku-kurdish.crepdl
Urdu (ur): Perso-Arabic (Nastaliq style). National language of Pakistan; scheduled language of India.
schemas/ur-urdu.crepdl
West Punjabi / Shahmukhi (pnb): Shahmukhi (Arabic-script Punjabi) as used in Pakistan Punjab. ~90M total speakers.
schemas/pnb-west_punjabi_shahmukhi.crepdl
Uyghur (ug): Perso-Arabic (official in China) plus Latin and Cyrillic used in the diaspora. ~15M speakers.
schemas/ug-uyghur.crepdl
Hausa (ha): Latin (Boko, standard) plus Arabic (Ajami, traditional). ~75M speakers in West Africa.
schemas/ha-hausa.crepdl
Saraiki / Siraiki (skr): Perso-Arabic. ~20M speakers in southern Punjab, Pakistan.
schemas/skr-saraiki_siraiki.crepdl
South Azerbaijani (azb): Perso-Arabic, the primary script for Azerbaijani as spoken in Iran. ~14M speakers.
schemas/azb-south_azerbaijani_southern_azerbaijani.crepdl
CJK — Chinese, Japanese, Korean
CJK schemas are of particular value given that Unicode encodes more than 97,000 CJK unified ideographs across the base block and Extensions A–H (UTF-8 is only one encoding of that repertoire). All five main CJK schemas share a common foundation of CJK Unified Ideograph blocks (Extensions A–H) and add their language-specific scripts on top. Chinese dialect schemas cover Wu, Min Nan, Hakka, Jinyu, Xiang, and Gan.
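To illustrate how a shared ideograph foundation and language-specific scripts layer together, here is a small sketch that tallies the characters of a string by block. It uses the Hiragana, Katakana, and Hangul syllable ranges quoted in this section; the base CJK Unified Ideographs range (U+4E00–9FFF) and the function names are additions for illustration, and the sketch ignores the extension blocks entirely.

```python
from collections import Counter

# (name, first code point, last code point). The CJK Unified Ideographs base
# range is an addition here; the other three are quoted in this section.
BLOCKS = [
    ("Hiragana", 0x3040, 0x309F),
    ("Katakana", 0x30A0, 0x30FF),
    ("CJK Unified Ideographs", 0x4E00, 0x9FFF),
    ("Hangul Syllables", 0xAC00, 0xD7A3),
]


def block_of(ch: str) -> str:
    """Return the name of the listed block containing ch, or 'other'."""
    cp = ord(ch)
    for name, first, last in BLOCKS:
        if first <= cp <= last:
            return name
    return "other"


def block_counts(text: str) -> Counter:
    """Count how many characters of text fall into each listed block."""
    return Counter(block_of(ch) for ch in text)


# Two kanji, one hiragana, four katakana (the long-vowel mark counts as Katakana).
counts = block_counts("東京でコーヒー")
print(dict(counts))
```

A real schema would describe these repertoires declaratively; the tally above only shows why a Japanese text needs both the shared ideograph blocks and the kana blocks to validate.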
Japanese (ja): Hiragana (U+3040–309F), Katakana (U+30A0–30FF), Kanji (CJK Unified Ideographs + Extensions A–H), Latin rōmaji, Kana Supplement/Extended, and Hentaigana. ~125M speakers.
schemas/ja-japanese.crepdl
Chinese, Simplified (zh-hans): Simplified Hanzi (CJK Unified Ideographs), Bopomofo, Pinyin (Latin), and CJK Extensions A–H. Standard written form in mainland China.
schemas/zh-hans-chinese-simplified.crepdl
Chinese, Traditional (zh-hant): Traditional Hanzi, Bopomofo, Jyutping/Yale Latin, and CJK Extensions. Standard written form in Taiwan and Hong Kong.
schemas/zh-hant-chinese-traditional.crepdl
Cantonese (yue): Traditional Hanzi (Cantonese usage) plus Jyutping/Yale romanisation. ~85M speakers in Guangdong, Hong Kong, and the diaspora.
schemas/yue-cantonese.crepdl
Korean (ko): Hangul syllable block (U+AC00–D7A3), Hangul Jamo, Hanja (CJK Ideographs), and Latin. ~82M speakers in South and North Korea.
schemas/ko-korean.crepdl
Wu Chinese / Shanghainese (wuu): Traditional Hanzi (Wu-specific usage), CJK Extensions, plus Latin romanisation. ~74M native speakers.
schemas/wuu-wu_chinese_shanghainese.crepdl
Min Nan / Hokkien / Taiwanese (nan): Traditional Hanzi, Tai-lo/POJ Latin romanisation, and Min Nan-specific characters. ~75M total speakers.
schemas/nan-min_nan_hokkien_-_taiwanese.crepdl
Hakka Chinese / 客家話 (hak): Traditional Hanzi (Hakka usage), CJK Extensions, plus Latin romanisation. ~47M native speakers.
schemas/hak-hakka_chinese_客家話.crepdl
Jinyu Chinese / 晉語 (cjy): Simplified Hanzi (Jinyu usage) and CJK Extensions. ~46M native speakers in Shanxi and adjacent areas.
schemas/cjy-jinyu_chinese_晉語.crepdl
Xiang Chinese / 湘語 (hsn): Simplified Hanzi (Xiang/Hunanese usage) and CJK Extensions. ~36M native speakers in Hunan.
schemas/hsn-xiang_chinese_湘語.crepdl
Gan Chinese / 贛語 (gan): Simplified Hanzi (Gan usage) and CJK Extensions. ~22M native speakers in Jiangxi.
schemas/gan-gan_chinese_贛語.crepdl
Zhuang / Cuengh (za): Latin (standard Zhuang orthography) plus CJK characters (Sawndip traditional script). ~16M native speakers in Guangxi.
schemas/za-zhuang_cuengh.crepdl
Indic Scripts
The 22 constitutionally scheduled languages of India (Nepali among them; Urdu is listed under Arabic / Perso-Arabic Script), plus Sinhala, Sylheti, Chhattisgarhi, Magahi, and Bhojpuri. Scripts covered include Devanagari, Bengali/Assamese, Gurmukhi, Gujarati, Odia, Tamil, Telugu, Kannada, Malayalam, Meetei Mayek, and Ol Chiki. A composite index covers all 22 scheduled Indian languages.
Hindi (hi): Devanagari abugida plus Devanagari Extended (Vedic accent marks). ~600M L1+L2 speakers; Union official language of India.
schemas/hi-hindi.crepdl
Marathi (mr): Devanagari plus Modi script (historical). Official language of Maharashtra and Goa.
schemas/mr-marathi.crepdl
Nepali (ne): Devanagari. Official language of Nepal; scheduled language of India and official in Sikkim.
schemas/ne-nepali.crepdl
Bengali (bn): Bengali/Bangla script (U+0980–09FF). Official language of Bangladesh; scheduled language of India, official in West Bengal and Tripura. ~230M speakers.
schemas/bn-bengali.crepdl
Assamese (as): Bengali script (Assamese variant). Official language of Assam, with distinct letterforms from Bengali.
schemas/as-assamese.crepdl
Punjabi (pa): Gurmukhi script (U+0A00–0A7F). Official script of Punjabi in India. ~125M total speakers.
schemas/pa-punjabi.crepdl
Gujarati (gu): Gujarati script (U+0A80–0AFF). Official language of Gujarat. ~60M speakers.
schemas/gu-gujarati.crepdl
Odia (or): Odia/Oriya script (U+0B00–0B7F). Official language of Odisha.
schemas/or-odia.crepdl
Tamil (ta): Tamil script (U+0B80–0BFF). Official language of Tamil Nadu and Puducherry; one of the world's oldest classical languages. ~80M speakers.
schemas/ta-tamil.crepdl
Telugu (te): Telugu script (U+0C00–0C7F). Official language of Andhra Pradesh and Telangana. ~95M speakers.
schemas/te-telugu.crepdl
Kannada (kn): Kannada script (U+0C80–0CFF). Official language of Karnataka. ~60M speakers.
schemas/kn-kannada.crepdl
Malayalam (ml): Malayalam script (U+0D00–0D7F). Official language of Kerala and Lakshadweep. ~38M speakers.
schemas/ml-malayalam.crepdl
Maithili (mai): Devanagari plus Tirhuta script. Spoken in Bihar and Jharkhand.
schemas/mai-maithili.crepdl
Sanskrit (sa): Devanagari plus Sharada script. Classical language with pan-India scholarly and religious use.
schemas/sa-sanskrit.crepdl
Konkani (kok): Devanagari plus Kannada, Malayalam, and Latin scripts. Official language of Goa.
schemas/kok-konkani.crepdl
Sindhi (sd): Arabic/Nastaliq (primary) plus Devanagari. No home state in India.
schemas/sd-sindhi.crepdl
Kashmiri (ks): Arabic (primary) plus Devanagari and Sharada (historical). Official language of Jammu & Kashmir.
schemas/ks-kashmiri.crepdl
Bodo (brx): Devanagari. Scheduled language spoken in Bodoland, Assam.
schemas/brx-bodo.crepdl
Dogri (dgo): Devanagari plus Takri (historical) script. Spoken in Jammu & Kashmir and Himachal Pradesh.
schemas/dgo-dogri.crepdl
Manipuri / Meitei (mni): Meetei Mayek script plus Bengali. Official language of Manipur.
schemas/mni-manipuri.crepdl
Santali (sat): Ol Chiki script plus Devanagari, Bengali, and Odia. Austroasiatic (Munda) language of Jharkhand.
schemas/sat-santali.crepdl
Chhattisgarhi (hne): Devanagari. ~16M speakers in Chhattisgarh.
schemas/hne-chhattisgarhi.crepdl
Magahi (mag): Devanagari. ~21M native speakers in Bihar and Jharkhand.
schemas/mag-magahi.crepdl
Bhojpuri (bho): Devanagari plus Kaithi (historical script). ~52M speakers in Bihar, Uttar Pradesh, and the diaspora.
schemas/bho-bhojpuri.crepdl
Sylheti (syl): Bengali script (Sylheti variant). ~12M native speakers in Bangladesh's Sylhet Division and northeast India.
schemas/syl-sylheti.crepdl
Sinhala (si): Sinhala script (Brahmi-derived, U+0D80–0DFF). Official language of Sri Lanka. ~17M speakers.
schemas/si-sinhala.crepdl
Southeast Asian Scripts
National and major regional languages of Southeast Asia, covering Thai, Myanmar/Burmese, Khmer, Lao, and Latin-script languages. Regional scripts for Javanese (Hanacaraka), Balinese (Aksara Bali), Sundanese (Aksara Sunda), and Buginese (Lontara) are also included. A composite index covers the full set of 12 languages.
Thai (th): Thai abugida script (U+0E00–0E7F), a Brahmi-derived script encoding vowels as diacritics with tone marks. ~70M speakers.
schemas/th-thai.crepdl
Northeastern Thai / Isan (tts): Thai script (same repertoire as standard Thai). ~15M native speakers in the Isan region of northeast Thailand.
schemas/tts-northeastern_thai_isan_-_ภาษาอีสาน.crepdl
Burmese / Myanmar (my): Myanmar script (U+1000–109F), a Brahmi-derived abugida. National language of Myanmar.
schemas/my-burmese.crepdl
Khmer (km): Khmer script (U+1780–17FF), a Brahmi-derived abugida often cited as having the world's longest alphabet. Official language of Cambodia.
schemas/km-khmer.crepdl
Lao (lo): Lao script (U+0E80–0EFF), a Brahmi-derived abugida. Official language of Laos.
schemas/lo-lao.crepdl
Vietnamese (vi): Latin-based Quốc Ngữ with extensive diacritics: five tone marks and modified letters ă, â, ê, ô, ơ, ư, and đ. ~90M speakers.
schemas/vi-vietnamese.crepdl
Indonesian (id): Latin with minimal diacritics. ~270M L1+L2 speakers; national language of Indonesia.
schemas/id-indonesian.crepdl
Malay (ms): Latin (Rumi, standard) plus Arabic (Jawi, traditional). National language of Malaysia, Brunei, and Singapore.
schemas/ms-malay.crepdl
Filipino / Tagalog (fil): Latin plus Baybayin (traditional Philippine script). Official language of the Philippines.
schemas/fil-filipino-tagalog.crepdl
Javanese (jv): Latin plus Hanacaraka (Javanese script, U+A980–A9DF). ~82M speakers in Java, Indonesia.
schemas/jv-javanese.crepdl
Balinese (ban): Latin plus Aksara Bali (Balinese script, U+1B00–1B7F). Spoken in Bali, Indonesia.
schemas/ban-balinese.crepdl
Sundanese (su): Latin plus Aksara Sunda (Sundanese script, U+1B80–1BBF). ~42M speakers in West Java, Indonesia.
schemas/su-sundanese.crepdl
Buginese (bug): Latin plus Lontara (Buginese script, U+1A00–1A1F). Spoken in Sulawesi, Indonesia.
schemas/bug-buginese.crepdl
Cebuano (ceb): Latin plus Baybayin (traditional Philippine script). ~27M speakers in the Visayas and Mindanao.
schemas/ceb-cebuano.crepdl
Hiligaynon / Ilonggo (hil): Latin. ~10M native speakers in the Western Visayas, Philippines.
schemas/hil-hiligaynon_ilonggo.crepdl
Ilocano / Ilokano (ilo): Latin. ~10M native speakers in the Ilocos Region and Cagayan Valley, Philippines.
schemas/ilo-ilocano_ilokano.crepdl
Other Scripts
Languages using distinct writing systems not covered in earlier groups: Hebrew, Ethiopic/Ge'ez (Amharic, Tigrinya, Oromo), Georgian, Armenian, Tibetan, and the many Latin-script languages of Africa and the Middle East. Three composite index schemas cover grouped subsets. Languages whose schemas span multiple scripts (Uzbek, Azerbaijani, Hausa, Wolof etc.) are listed under their primary script above but their schemas cover all scripts used.
Hebrew (he): 22-letter Hebrew consonantal alphabet with optional nikud (vowel points, U+05B0–05C7). ~9M speakers; official language of Israel.
schemas/he-hebrew.crepdl
Amharic (am): Ethiopic / Ge'ez script (fidel syllabary, U+1200–137F). Official language of Ethiopia. ~57M speakers.
schemas/am-amharic.crepdl
Tigrinya (ti): Ethiopic script (Ge'ez fidel). Official language of Eritrea; co-official in the Tigray region of Ethiopia. ~9M speakers.
schemas/ti-tigrinya.crepdl
Oromo / Afaan Oromo (om): Qubee Latin (standard since 1991) plus Ethiopic script. ~42M speakers in Ethiopia and Kenya.
schemas/om-oromo_afaan_oromo.crepdl
Georgian (ka): Mkhedruli script (U+10D0–10FF) plus Asomtavruli (U+10A0–10CF) and Georgian Extended (Mtavruli). One of the world's oldest and most distinctive alphabets. ~4M speakers.
schemas/ka-georgian.crepdl
Armenian (hy): Armenian script / Aybuben (U+0530–058F), created ~405 CE by Mesrop Mashtots. 38-letter alphabet covering Classical (Grabar) and Modern Eastern/Western Armenian. ~7M speakers.
schemas/hy-armenian.crepdl
Tibetan (bo): Uchen / Tibetan script (U+0F00–0FFF), a Brahmi-derived abugida created ~620 CE. Covers Standard and Classical Tibetan (Chöke). ~6M speakers.
schemas/bo-tibetan.crepdl
Turkish (tr): Latin 29-letter alphabet with Ç, Ğ, İ, Ö, Ş, Ü (note also dotless ı). ~90M speakers in Turkey and Cyprus.
schemas/tr-turkish.crepdl
Azerbaijani (az): Latin (official in Azerbaijan since 1991) plus Cyrillic and Arabic scripts. ~32M total speakers.
schemas/az-azerbaijani.crepdl
Uzbek (uz): Latin (official since 1995) plus Cyrillic (still widely used) and Arabic (historical). ~35M speakers.
schemas/uz-uzbek.crepdl
Turkmen (tk): Latin (official since 1993) plus Cyrillic (legacy) and Perso-Arabic (historical). ~7M speakers.
schemas/tk-turkmen.crepdl
Albanian (sq): Latin 36-letter alphabet with Ë/ë and the digraph Rr/rr. ~8M speakers in Albania, Kosovo, and North Macedonia.
schemas/sq-albanian.crepdl
Somali (so): Latin (standard official script since 1972). Official language of Somalia and Djibouti. ~22M speakers.
schemas/so-somali.crepdl
Swahili / Kiswahili (sw): Standard 26-letter Latin alphabet. ~71M speakers; national/official language of Tanzania, Kenya, Uganda, and the DRC.
schemas/sw-swahili.crepdl
Yoruba (yo): Latin with dot-below letters ẹ, ọ, ṣ and tone marks. ~47M speakers in Nigeria and Benin.
schemas/yo-yoruba.crepdl
Igbo (ig): Latin with dot-below letters ị, ọ, ụ and tone marks. ~17M speakers in southeastern Nigeria.
schemas/ig-igbo.crepdl
Fula / Fulfulde (ff): Latin (standard) plus Arabic (Ajami) plus Adlam (U+1E900–1E95F), an indigenous script created in the 1980s. ~35M speakers across West Africa.
schemas/ff-fula_fulfulde.crepdl
Bambara / Bamanankan (bm): Latin plus N'Ko script (U+07C0–07FF, RTL). ~15M speakers; the lingua franca of Mali.
schemas/bm-bambara_bamanankan.crepdl
Afrikaans (af): Latin with diacritics ê, ë, î, ï, ô, û and the unique â. ~17M speakers; official language of South Africa.
schemas/af-afrikaans.crepdl
Zulu / isiZulu (zu): Standard 26-letter Latin alphabet. ~28M speakers; official language of South Africa.
schemas/zu-zulu_isizulu.crepdl
Xhosa / isiXhosa (xh): Standard 26-letter Latin alphabet. ~19M speakers; official language of South Africa.
schemas/xh-xhosa_isixhosa.crepdl
Shona (sn): Latin. ~15M speakers; major language of Zimbabwe.
schemas/sn-shona.crepdl
Northern Sotho / Sepedi (nso): Latin. ~14M total speakers; official language of South Africa.
schemas/nso-northern_sotho_sesotho_sa_leboa_-_sepedi.crepdl
Sesotho / Southern Sotho (st): Latin. ~14M total speakers; official language of South Africa and Lesotho.
schemas/st-sesotho_southern_sotho.crepdl
Setswana / Tswana (tn): Latin. ~14M total speakers; official language of South Africa and Botswana.
schemas/tn-setswana_tswana.crepdl
Kinyarwanda (rw): Latin. ~12M total speakers; official language of Rwanda.
schemas/rw-kinyarwanda.crepdl
Kirundi / Rundi (rn): Latin. ~12M total speakers; official language of Burundi.
schemas/rn-kirundi_rundi.crepdl
Lingala (ln): Latin with open-e ɛ and open-o ɔ. ~45M total L1+L2 speakers in the DRC and Republic of Congo.
schemas/ln-lingala.crepdl
Wolof (wo): Latin. ~12M native speakers; the most widely spoken indigenous language of Senegal.
schemas/wo-wolof.crepdl
Malagasy (mg): Latin (standard) plus Arabic (Sorabe, historical). ~26M speakers; official language of Madagascar.
schemas/mg-malagasy.crepdl
Nigerian Pidgin / Naijá (pcm): Latin with ọ and ẹ (dot-below) from the standardised Naijá orthography. ~75M total L1+L2 speakers.
schemas/pcm-nigerian_pidgin_naijá.crepdl
Cameroonian Pidgin / Kamtok (wes): Latin. ~12M total speakers; the most widely spoken lingua franca of Anglophone Cameroon.
schemas/wes-cameroonian_pidgin_kamtok.crepdl
Special Characters & Symbols
Script-agnostic schemas for punctuation, currency symbols, typographic characters, mathematical operators, and other non-alphabetic code points commonly required in publishing workflows. These schemas can be used standalone or combined with language schemas via CREPDL <union> to extend a language repertoire with a permitted symbol set.
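The effect of combining a language repertoire with a symbol set can be modelled as a union of code point ranges. The sketch below is plain Python rather than CREPDL syntax: it uses the Thai block (U+0E00–0E7F) and the General Punctuation block (U+2000–206F) as stand-ins for a language schema and a special-character schema, and the function name is hypothetical.

```python
# Stand-in repertoires, each a list of inclusive (first, last) code point ranges.
THAI = [(0x0E00, 0x0E7F)]
GENERAL_PUNCTUATION = [(0x2000, 0x206F)]


def rejected(text: str, ranges: list[tuple[int, int]]) -> list[str]:
    """Return the characters of text that fall outside every permitted range."""
    return [
        ch for ch in text
        if not any(first <= ord(ch) <= last for first, last in ranges)
    ]


# A Thai word followed by a horizontal ellipsis (U+2026).
text = "สวัสดี\u2026"

# With the language repertoire alone, the ellipsis is not permitted.
thai_only = rejected(text, THAI)

# Concatenating the range lists models a union of the two repertoires.
combined = rejected(text, THAI + GENERAL_PUNCTUATION)

print(thai_only, combined)
```

The language repertoire stays unchanged and the symbol set is layered on top, so the same symbol schema can be reused with any language schema.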
The cards below show the planned schemas; filenames will be confirmed once the files are available.
General Punctuation block (U+2000–206F): typographic spaces, dashes (en, em, figure), quotation marks, daggers, bullets, ellipsis, and editorial marks used across European publishing.
schemas/special-punctuation-typographic.crepdl
Currency Symbols: Currency Symbols block (U+20A0–20CF) plus the dollar sign from Basic Latin ($) and commonly used symbols from Latin-1 Supplement (¢ £ ¤ ¥), suitable for multilingual financial publishing.
schemas/special-currency-symbols.crepdl
Mathematical Operators: Mathematical Operators (U+2200–22FF) and Supplemental Mathematical Operators (U+2A00–2AFF) for STM publishing and technical documentation.
schemas/special-mathematical-operators.crepdl
Combining Diacritical Marks: Combining Diacritical Marks (U+0300–036F) and Combining Diacritical Marks Supplement (U+1DC0–1DFF) for use with Latin, Greek, or Cyrillic base characters in linguistic and critical-edition contexts.
schemas/special-combining-diacritics.crepdl
Letterlike Symbols: Letterlike Symbols block (U+2100–214F): ℃ ℉ № ™ ℗ ℠ and other frequently used symbols in legal, scientific, and commercial publishing, plus © and ® from Latin-1 Supplement.
schemas/special-letterlike-symbols.crepdl
Arrows: Arrows block (U+2190–21FF) plus Supplemental Arrows-A/B/C for technical documentation, flow diagrams, and instructional publishing.
schemas/special-arrows.crepdl
Geometric Shapes & Box Drawing: Geometric Shapes (U+25A0–25FF), Box Drawing (U+2500–257F), and Block Elements (U+2580–259F) for tabular layouts and diagrammatic use.
schemas/special-geometric-shapes.crepdl
Superscripts & Subscripts: Superscripts and Subscripts block (U+2070–209F) plus Number Forms (U+2150–218F, including Roman numerals and vulgar fractions) for scientific and legal publishing.
schemas/special-superscripts-subscripts.crepdl