# Transliteration encoding for the Quran
# Last edited on 2004-01-26 03:47:37 by stolfi
#
# JSAR is J. Stolfi's ad-hoc remapping of HAR (HTML Arabic
# transcription alphabet) to a more compact encoding, using only one
# byte for lowercase HAR "letters" and caret "^" for uppercase.
#
# JSAR was a hacked-up solution to an immediate problem. It is now
# obsolete since Unicode versions of the Quran are becoming available
# (although they are still quite buggy).
#
#                             Lowercase              Uppercase
#                             ---------------------  ---------------------
#   UC     Phon   Name        HAR              JSAR  HAR              JSAR
#   --     ----   ---------   ---------------  ----  ---------------  ----
#   21     h_u    h_ua'       kh               ©     KH               ^©
#   24     ?      hamza       /                '     (/)              ^'
#   27     a      'alif       a                a     A                ^a
#   28     b      ba'         b                b     B                ^b
#   2A     t      ta'         t                t     T                ^t
#   2B     t_-    t_-a'       th               þ     TH               ^þ
#   2C     g^v    g^vim       j                j     J                ^j
#   2D     h_.    h_.a'       h                µ     H                ^µ
#   2F     d      dal         d                d     D                ^d
#   30     d_-    d_-al       th               £     Th               ^£
#   31     r      ra'         r                r     R                ^r
#   32     z      zai         z                z     Z                ^z
#   33     s      sin         s                s     S                ^s
#   34     s^v    s^vin       sh               x     SH               ^x
#   35     s_.    s_.ad       s                ß     S                ^ß
#   36     d_.    d_.ad       d                ð     D                ^ð
#   37     t_.    t_.a'       t                ±     T                ^±
#   38     z_.    z_.a'       th               ç     Th               ^ç
#   39     `      `ayn        AA               ä     (AA)             ^ä
#   3A     g^.    g^.yn       gh               ¤     GH               ^¤
#   41     f      fa'         f                f     F                ^f
#   42     q      qaf         q                q     Q                ^q
#   43     k      kaf         k                k     K                ^k
#   44     l      lam         l                l     L                ^l
#   45     m      mim         m                m     M                ^m
#   46     n      nun         n                n     N                ^n
#   47     h      ha'         h                h     H                ^h
#   48     w      waw         w                w     W                ^w
#   4A     y      ya'         y                y     Y                ^y
#
# Vowel marks
#
#   UC     Phon   Name        HAR              JSAR  HAR              JSAR
#   --     -----  --------    ---------------  ----  ---------------  ----
#   4E     a      al.md.      a                â     A                ^â
#   50     i                  i                i     I                ^i
#   4F     u                  u                u     U                ^u
#
# Long vowels
#
#   50+4A                     ee               ë     EE               ^ë
#   27+4F                     o                o     O                ^o
#   4F+48                     oo               ö     OO               ^ö
#
# Unknown:
#
#   UC     Phon   Name        HAR              JSAR  HAR              JSAR
#   --     -----  --------    ---------------  ----  ---------------  ----
#   ??     ?                  n                ñ     N                ^ñ
#   ??     ?                  l                ¬     L                ^¬
#
# The left column (UC) is the low-order byte (in hex) of the
# corresponding Unicode code point, in the Arabic range
# (U+0600 to U+06FF).
#
# In the "Phon" and "Name" columns, "^." is dot-above, "^v" is
# hacek, "_-" is macron-below, "_." is dot-below, "_u" is
# breve-below.
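#
# The relation between the UC column and a full Unicode code point can
# be sketched in Python as below. This is only an illustration: the
# names JSAR_TO_UC, uc_to_codepoint and jsar_letter_to_unicode are
# hypothetical, and the dictionary copies just a handful of rows from
# the consonant table rather than the whole alphabet.

```python
# Base of the Unicode Arabic block; the UC column holds the low-order byte.
ARABIC_BLOCK_BASE = 0x0600

# A few rows copied from the consonant table: JSAR character -> UC byte.
# (Illustrative subset only, not the complete table.)
JSAR_TO_UC = {
    "b": 0x28,  # ba'
    "t": 0x2A,  # ta'
    "j": 0x2C,  # g^vim
    "d": 0x2F,  # dal
    "r": 0x31,  # ra'
    "s": 0x33,  # sin
    "x": 0x34,  # s^vin
    "k": 0x43,  # kaf
    "l": 0x44,  # lam
    "m": 0x45,  # mim
    "n": 0x46,  # nun
}


def uc_to_codepoint(low_byte: int) -> int:
    """Turn a UC table entry (low-order byte) into a full code point."""
    if not 0x00 <= low_byte <= 0xFF:
        raise ValueError("UC value must fit in one byte")
    return ARABIC_BLOCK_BASE + low_byte


def jsar_letter_to_unicode(ch: str) -> str:
    """Map one lowercase JSAR consonant from the subset to its Arabic letter."""
    return chr(uc_to_codepoint(JSAR_TO_UC[ch]))


if __name__ == "__main__":
    for ch in "btjd":
        print(ch, hex(uc_to_codepoint(JSAR_TO_UC[ch])), jsar_letter_to_unicode(ch))
```

# Extending the dictionary to the remaining rows is mechanical, but the
# two "Unknown" codes have no UC value and would need separate handling.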
#
# Note that the Arabic script itself has no case distinction, and
# anyway the HAR codes "AA" and "/" have no uppercase equivalent.
#
# The JSAR codes "ñ" (HAR underlined "n") and "¬" (HAR underlined "l")
# are each used only once in the whole text (respectively, in 5:107
# "faâ©arañi", and 74:9 "fa£â¬ika"). They were not defined in the HAR
# table. I assume that they may be errors in the original files.
# Perhaps ¬ is lam-'alif?
#
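# Since case carries no meaning in the Arabic script, a reader of JSAR
# text may want to separate the caret marker from the letter it
# modifies. A minimal Python sketch of that step follows; the function
# names jsar_tokens and fold_case are hypothetical and are not part of
# any tool that produced these files.

```python
def jsar_tokens(text: str) -> list[tuple[str, bool]]:
    """Split JSAR text into (character, was_uppercase) pairs.

    A caret "^" marks the following character as HAR uppercase; the
    caret itself is not emitted as a token.
    """
    tokens = []
    upper = False
    for ch in text:
        if ch == "^":
            upper = True
            continue
        tokens.append((ch, upper))
        upper = False
    if upper:
        raise ValueError("caret at end of text has no letter to modify")
    return tokens


def fold_case(text: str) -> str:
    """Drop the case markers, keeping only the underlying JSAR letters."""
    return "".join(ch for ch, _ in jsar_tokens(text))


if __name__ == "__main__":
    print(jsar_tokens("^©a"))
    print(fold_case("^©a^b"))
```

# Characters outside the JSAR table (spaces, punctuation) pass through
# unchanged with the uppercase flag set to False.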