
User:Ruud Koot/Gujarati script/How To: Use Unicode for creating Gujarati script

From Wikipedia, the free encyclopedia

This is a subpage of the main article, Gujarati script. Here you can find additional details and resources on how to use Unicode for creating Gujarati script.

Unicode Code-set for Gujarati Script

The Unicode range for Gujarati script is from U+0A80 to U+0AFF. The ISCII Code-page identifier for Gujarati script is 57010.


The table below shows the glyphs that are implemented in Unicode standard 4.0.0. Gray boxes indicate the code-points that are reserved/unused.


x= 0 1 2 3 4 5 6 7 8 9 A B C D E F
U+0A8x        
U+0A9x  
U+0AAx  
U+0ABx         િ
U+0ACx        
U+0ADx                              
U+0AEx          
U+0AFx                                
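The assigned code points in this range can also be listed programmatically. The following is a minimal sketch that assumes Python's standard unicodedata module; the function name assigned_gujarati_code_points is made up for illustration. The result reflects whichever Unicode version ships with the Python interpreter in use, which is usually newer than 4.0.0, so it may include characters that the table above does not.

```python
import unicodedata

# Block boundaries as stated in the article text.
GUJARATI_START = 0x0A80
GUJARATI_END = 0x0AFF


def assigned_gujarati_code_points():
    """Return (code point, name) pairs for assigned characters in the block."""
    result = []
    for cp in range(GUJARATI_START, GUJARATI_END + 1):
        # unicodedata.name() returns the default (None here) for
        # reserved/unassigned code points instead of raising.
        name = unicodedata.name(chr(cp), None)
        if name is not None:
            result.append((cp, name))
    return result


if __name__ == "__main__":
    for cp, name in assigned_gujarati_code_points():
        print(f"U+{cp:04X}  {name}")
```

Code points missing from the output correspond to the reserved/unused cells of the chart.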

How To: Use Unicode for creating Gujarati script

Note: In the examples in the sections below, the "+" sign denotes a sequence of key-strokes: the characters are entered one after another, in the order shown.

Half-form of consonants

Half-forms of consonants are used in pre-base position. For consonants that do not have a distinct glyph for the half-form, a Halant (્) is used to create the half-form as follows:

મ +્ + ય = મ્ય — as in રમ્ય (pleasant)

(Note the half-form of મ, which is used here in conjunction with ય.)

Note: A half-form is not created for the base glyph even if the syllable ends with a Halant.
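The same sequence can be built from code points in a program. This is a minimal sketch; the helper name conjunct is hypothetical and not part of any library. It only produces the character sequence. Whether a half-form is actually displayed depends on the font and the text-shaping engine.

```python
HALANT = "\u0ACD"  # GUJARATI SIGN VIRAMA (Halant)
MA = "\u0AAE"      # મ
YA = "\u0AAF"      # ય
RA = "\u0AB0"      # ર


def conjunct(*consonants):
    """Join consonants with a Halant between each adjacent pair.

    Every consonant except the last is followed by a Halant, which is
    what makes it a candidate for half-form rendering.
    """
    return HALANT.join(consonants)


mya = conjunct(MA, YA)   # મ + ્ + ય
word = RA + mya          # ર followed by the conjunct, as in the word for "pleasant"

print([f"U+{ord(c):04X}" for c in mya])
```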


Application of Upper-based form of Ra – (Reph)

Placing Ra with a Halant (the half-form of Ra, as seen above) before a full-form consonant produces a Reph on that consonant. This affects the pronunciation of Ra in conjunction with that consonant. A Reph can be created as follows:

ર +્ = Ra + Halant
ર +્ + થ = ર્થ — as in અર્થ (meaning)

(Ra + Halant + થ = Reph effect on થ)


Application of Lower-based form of Ra – (Vattu)

Placing a consonant with a Halant (the half-form of that consonant) before a full-form Ra produces a Vattu on that consonant. This affects the pronunciation of Ra in conjunction with that consonant. A Vattu can be created as follows:

પ +્ + ર = પ્ર — as in પ્રજા (people)

(પ + Halant + Ra = Vattu effect on પ)
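Reph and Vattu differ only in where Ra stands relative to the Halant. The following sketch makes that explicit for simple three-code-point clusters. The function name ra_form is hypothetical, and the check is a simplification for illustration: it does not handle longer conjuncts, matras or Nukta, and the actual glyph choice is made by the font.

```python
HALANT = "\u0ACD"  # GUJARATI SIGN VIRAMA (Halant)
RA = "\u0AB0"      # ર


def ra_form(cluster):
    """Classify a cluster of the form Consonant + Halant + Consonant.

    Returns "reph" if Ra comes first (Ra + Halant + C),
    "vattu" if Ra comes last (C + Halant + Ra),
    and None for anything else.
    """
    if len(cluster) != 3 or cluster[1] != HALANT:
        return None
    first, last = cluster[0], cluster[2]
    if first == RA:
        return "reph"
    if last == RA:
        return "vattu"
    return None


print(ra_form("\u0AB0\u0ACD\u0AA5"))  # ર + ્ + થ
print(ra_form("\u0AAA\u0ACD\u0AB0"))  # પ + ્ + ર
```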

Vattu variants

Vattu variants (half and full) are formed when consonants carrying the vattu mark are combined. In some cases, a special glyph is required to represent the vattu when certain consonants are combined.

ડ +્ + ર = ડ્ર — as in ડ્રમ (drum)

(special glyph ડ્ર. Notice the two lower-based marks, as compared to only one in the previous example.)

Special Marks, Characters and Nukta

Above-based marks

All above-based marks and post-based matra are created as follows:

ક +ં = કં — as in કંપન (vibration)

Below-based marks

The below-based marks and post-based matra are created as follows:

ક +ુ = કુ — as in કુતરો (dog)
ભ +ૂ = ભૂ — as in ભૂકંપ (earthquake)

Characters શ્ર, ક્ષ and જ્ઞ

The following characters are part of the Gujarati alphabet but are not encoded as separate characters in the Unicode character-set. They can be generated as indicated below:

શ +્ + ર = શ્ર
ક +્ + ષ = ક્ષ
જ +્ + ઞ = જ્ઞ
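Because these three characters have no code points of their own, each one is stored as three code points. The sketch below checks this and also confirms that NFC normalization leaves the sequences unchanged, since Unicode defines no precomposed form for them. The dictionary keys are informal romanizations chosen for illustration only.

```python
import unicodedata

HALANT = "\u0ACD"  # GUJARATI SIGN VIRAMA (Halant)

# Keys are informal labels (an assumption of this sketch), not official names.
CONJUNCTS = {
    "shra": "\u0AB6" + HALANT + "\u0AB0",  # શ + ્ + ર
    "ksha": "\u0A95" + HALANT + "\u0AB7",  # ક + ્ + ષ
    "gnya": "\u0A9C" + HALANT + "\u0A9E",  # જ + ્ + ઞ
}

for label, text in CONJUNCTS.items():
    code_points = [f"U+{ord(c):04X}" for c in text]
    # NFC has nothing to compose here, so the text is returned as-is.
    unchanged = unicodedata.normalize("NFC", text) == text
    print(label, text, code_points, "NFC unchanged:", unchanged)
```

A practical consequence is that string length counts code points, so each of these single visual characters has a length of 3.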

Application of Nukta

Nukta affects the pronunciation of the (preceding) consonant to which it is applied. A Nukta form of a consonant can be created in Unicode as follows:

ય +઼ = ય઼
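The Nukta is a combining mark, which is why it must follow the consonant it modifies. The sketch below inspects its properties with Python's standard unicodedata module and shows one related effect: when a Nukta and a Halant follow the same consonant, Unicode normalization places the Nukta first, because its canonical combining class (7) is lower than that of the Halant (9).

```python
import unicodedata

YA = "\u0AAF"      # ય
NUKTA = "\u0ABC"   # GUJARATI SIGN NUKTA
HALANT = "\u0ACD"  # GUJARATI SIGN VIRAMA (Halant)

ya_nukta = YA + NUKTA  # consonant first, then the Nukta

print(unicodedata.name(NUKTA))       # official character name
print(unicodedata.category(NUKTA))   # general category (a nonspacing mark)
print(unicodedata.combining(NUKTA))  # canonical combining class of the Nukta
print(unicodedata.combining(HALANT)) # canonical combining class of the Halant

# Typed in the "wrong" order (Halant before Nukta), then normalized.
typed = YA + HALANT + NUKTA
normalized = unicodedata.normalize("NFC", typed)
print([f"U+{ord(c):04X}" for c in normalized])
```

Normalizing text before comparing or searching avoids mismatches between sequences that were typed in different orders.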

Substitutions for specific typography of the script

Substitution, in the context applicable here, means replacing a set or group of characters with a single resultant glyph. The following are the main substitutions required to address the complexity of the language and to generate the various character forms of the script:

Pre-base substitutions

The half-form conjunctions, one of the most common occurrences of the script, are created by pre-base substitutions.

ન +્ + ન = ન્ન — as in પ્રસન્ન (happy)

A special use of this substitution is creating the I-Matra (with its appropriately aligned shape), as shown below:

ત +િ = તિ — as in તિર (arrow)
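Although the I-Matra is drawn to the left of the consonant, the text is still stored in the order in which it is entered: consonant first, then the matra. The rendering system does the visual repositioning. A minimal sketch that shows the stored order:

```python
TA = "\u0AA4"       # ત
I_MATRA = "\u0ABF"  # િ (GUJARATI VOWEL SIGN I)

# Stored (logical) order: consonant, then vowel sign.
ti = TA + I_MATRA

print([f"U+{ord(c):04X}" for c in ti])

# The first stored character is the consonant, even though the matra
# appears first on screen.
print(ti[0] == TA)
```

Entering the matra before the consonant produces a different character sequence, which a renderer will typically show as a detached or misplaced mark.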

Post-base substitutions

Consonants of the Gujarati script do not have post-based forms. Primarily, post-based substitution is used to create visarga out of vowels, and is also applied for "I-Matra" substitutions as follows (which will precede any above-based substitution, if applied as well):

જ +ી = જી — as in જીવન (life)

(Compare the special shape જી – a result of post-based substitution – with the result of a similar combination using a character like લ, which will generate: લ +ી = લી)

Above-base substitutions

Above-based substitution is mainly applied for Matra, Reph, vowel modifications and for stress and tone marks. Consider the following examples:

વ +ૈ = વૈ — as in વૈભવ (pompousness)
ર +્ + ગ +ે = ર્ગે — as in સ્વર્ગે (in heaven)
મ +ે +ં = મેં — as in મેંઢક (frog)
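When a sequence does not render as expected, it helps to list the code points that are actually stored. The helper below is a sketch using Python's standard unicodedata module; the function name describe is made up for illustration.

```python
import unicodedata


def describe(text):
    """Return (code point label, Unicode name) for each character in text."""
    return [
        (f"U+{ord(ch):04X}", unicodedata.name(ch, "<unnamed>"))
        for ch in text
    ]


rge = "\u0AB0\u0ACD\u0A97\u0AC7"  # ર + ્ + ગ + ે
men = "\u0AAE\u0AC7\u0A82"        # મ + ે + ં

for sample in (rge, men):
    for label, name in describe(sample):
        print(label, name)
    print()
```

The listing shows the Ra and Halant stored before ગ even though the Reph is drawn above it, and the vowel sign stored before the anusvara.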

Below-base substitutions

Mainly used for below-based matra, the below-based substitution can produce a conjunction or change the whole shape of the glyph. This substitution is also used for producing special tone effects such as anudatta.

More details on Gujarati Unicode

  • For further details on Gujarati Unicode, you may refer to Unicode Std 4.0.0 - Chapter 9
  • TDIL: Ministry of Communication & Information Technology, India
  • If you are creating a web-page while the OS language is not Gujarati, save the file as UTF-8 Unicode HTML. The code-points may be lost otherwise.
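If the page is generated by a program rather than saved from an editor, the same advice applies: write the file with an explicit UTF-8 encoding and declare that encoding in the page. The following is a minimal sketch; the function name write_gujarati_html and the page template are assumptions made for illustration.

```python
import html
from pathlib import Path


def write_gujarati_html(path, body_text):
    """Write a small HTML page containing body_text, encoded as UTF-8."""
    page = (
        "<!DOCTYPE html>\n"
        '<html lang="gu">\n'
        '<head><meta charset="utf-8"><title>Gujarati sample</title></head>\n'
        # html.escape() protects &, < and > in the text; Gujarati
        # characters pass through unchanged.
        f"<body><p>{html.escape(body_text)}</p></body>\n"
        "</html>\n"
    )
    # Passing the encoding explicitly avoids falling back to the
    # operating system's default code page, which may not cover Gujarati.
    Path(path).write_text(page, encoding="utf-8")
    return path


if __name__ == "__main__":
    write_gujarati_html("gujarati_sample.html", "\u0A85\u0AB0\u0ACD\u0AA5")
```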