% commonerrors.txt

This file collects common mistakes made in typing Sanskrit and establishes
the correct ways to type it.  It should be expanded over time.

There are two types of errors.  One is due to transliteration and the other
is due to wrong usage of Sanskrit terms or sandhii-s.

We use ITRANS as the transliteration scheme for the explanations.
The rules below concern both encoding and pronunciation.

One has to pay attention to some details, just as one would when writing
Devanagari characters.  Editing the text later to
make it aa-kaar or ii-kaar is time consuming, more so when the typist
makes many mistakes.
---------------------- Transliteration -----------------------------------
Transliteration-specific corrections of common errors:

Use 

   aa or A instead of a for aa-kaar
   uu or U instead of u or oo for uu-kaar
   ii or I instead of i or ee for ii-kaar
   e instead of E or ay (Telugu influence)
   aM instead of am (word ending anusvaar)
   aH  instead of .h (visarga)
   .h is used for half letter like m.h, t.h
   aa_ii or aa{}ii to have two vowels together as in Hindi bhaa{}ii for brother
                    (_ may not work in some instances)
    
    ka instead of kha (Tamil and Kannada influence), ka and kha are different
    ga instead of gha (Tamil and Kannada influence), ga and gha are different
    cha instead of ca (Indology or other transliteration influence)
    chha instead of cha
    ta instead of tha (Tamil and Kannada influence), ta and tha are different
    da instead of dha (Tamil and Kannada influence),  da and dha are different 
    va instead of ba  (Bengali influence), va and ba are different
    shha instead of sha or sa, sha-shha-sa are three distinct letters
    ksh  instead of kshh
    GYa (Hindi influence) instead of jjna or jJNa or dnya (Marathi influence)

    Ta-Tha-Da-Dha-Na instead of ta-tha-da-dha-na

Watch for the combinations aaa, hh, nD, and Nd.
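
The combinations named above can be located mechanically.  The following is
only a minimal sketch in Python: the function name find_suspicious is
hypothetical, and treating hh as acceptable after s or c (as in shh and chh)
is an assumption made here, not a rule stated in this file.

```python
import re

# Patterns for the combinations this file asks typists to watch for.
# Assumption: "hh" is legitimate inside "shh" and "chh", so it is only
# flagged when it is not preceded by s or c.
SUSPICIOUS = {
    "aaa": re.compile(r"aaa"),
    "hh": re.compile(r"(?<![sc])hh"),
    "nD": re.compile(r"nD"),
    "Nd": re.compile(r"Nd"),
}


def find_suspicious(text):
    """Return (label, offset) pairs for every suspicious combination."""
    hits = []
    for label, pattern in SUSPICIOUS.items():
        for match in pattern.finditer(text):
            hits.append((label, match.start()))
    # Report the hits in the order they occur in the text.
    return sorted(hits, key=lambda hit: hit[1])
```

A hit is only a place to look at; whether it is really an error has to be
decided by the person who knows the word.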
----------------------- Sanskrit rules: -----------------------------------
To form conjuncts with nasals, use

     N^k, N^kh, N^g, N^gh
     JNch, JNchh, JNj, JNjh
     NT, NTh, ND, NDh
     nt, nth, nd, ndh
     mp, mph, mb, mbh

    These forms keep both the printout and the pronunciation correct.
    Any of N^, JN, N, n, m can be replaced by .n (overdot), or, for the
    pa, pha, ba, bha series, the m by M.  The overdot with M or .n is an
    accepted way of writing but is technically incorrect, mostly from
    the pronunciation standpoint.
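
The table above amounts to a lookup from the following consonant to the
nasal of its class.  A minimal sketch in Python, where the names
CLASS_NASAL and class_nasal are hypothetical:

```python
# Each row pairs one consonant class with its nasal, following the
# table of conjuncts in this file.
CLASS_NASAL = [
    (("kh", "k", "gh", "g"), "N^"),
    (("chh", "ch", "jh", "j"), "JN"),
    (("Th", "T", "Dh", "D"), "N"),
    (("th", "t", "dh", "d"), "n"),
    (("ph", "p", "bh", "b"), "m"),
]


def class_nasal(following):
    """Return the class nasal for the consonant that starts `following`.

    Returns None when the consonant belongs to none of the five classes
    (y, r, l, v, sh, shh, s, h and so on).
    """
    for consonants, nasal in CLASS_NASAL:
        # str.startswith accepts a tuple; the match is case sensitive,
        # so T and t fall in different classes.
        if following.startswith(consonants):
            return nasal
    return None
```

For example, class_nasal("ga") gives N^ and class_nasal("bha") gives m.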

To use M or .n for anusvaara
    If an anusvaara (overdot) is used within a word (word internal!)
    instead of the nasals mentioned above, we suggest that you use
    .n instead of M before all letters except p, ph, b, bh, m.
    This includes the remaining letters y, r, l, v, sh, shh, s, h, L, x, GY.
    So it will be 
    sa.nskR^ita
    sa.nvaada
    sa.nlagna
    sa.nsaara
    a.nsha
    sa.nrakshaka
    sa.nyama     
    et cetera.  It is wrong to use ma-kaar for anusvaara in these words.
    These .n have a different pronunciation from a simple M, as in saMsaara,
    and sound more like ardhacha.ndrabi.nduu.

    This is not critical since the output with M and .n is the same. The note
    is added more for clarification/information.  There is a very easy fix
    for such anusvaar when editing on Unix:
    M[kgcjTDtdyrlvshLG]  changes to  .n[kgcjTDtdyrlvshLG]
    This affects an M before each letter listed in the square brackets.
    In sed
    s/M\([kgcjTDtd]\)/.n\1/g
    s/M\([yrlvshLG]\)/.n\1/g  will be useful (the g flag covers every
    occurrence on a line).  M[pbm] stay the same!
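
The same substitution can be written in Python for those who do not use sed.
This is a sketch only: the function name anusvaara_to_dot_n is hypothetical,
and the character class is copied from the sed commands above.

```python
import re

# M followed by one of the listed letters.  The lookahead leaves the
# following letter in place, so only the M itself is replaced.
_M_BEFORE_LETTER = re.compile(r"M(?=[kgcjTDtdyrlvshLG])")


def anusvaara_to_dot_n(text):
    """Replace word-internal M by .n, except before p, b, m."""
    return _M_BEFORE_LETTER.sub(".n", text)
```

A word-final M is untouched because it is followed by a space or the end
of the line, not by one of the listed letters.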

A word-ending anusvaar (M) followed by a vowel becomes ma-kaar

  (word)M and  a,aa,i,ii,u,uu,e,ai,o,au at the start of the following word
      become, respectively,

    ma, maa, mi, mii, mu, muu, me, mai, mo, mau .
  
    As an example, kiM aasiita becomes kimaasiita, 
                   ashvatthaM enaM becomes ashvatthamenaM .
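
The joining shown in these examples can be sketched in Python as follows.
The function name join_m_before_vowel is hypothetical, and accepting the
capital forms A, I, U for aa, ii, uu is an assumption based on the ITRANS
alternatives listed in this file.

```python
# Letters that can start one of the vowels a, aa, i, ii, u, uu, e, ai,
# o, au, plus the alternative capitals A, I, U.
VOWEL_START = ("a", "i", "u", "e", "o", "A", "I", "U")


def join_m_before_vowel(first, second):
    """Join two words, turning a final M into m before a vowel."""
    if first.endswith("M") and second.startswith(VOWEL_START):
        return first[:-1] + "m" + second
    # No change: keep the two words separate.
    return first + " " + second
```

With this, join_m_before_vowel("kiM", "aasiita") gives kimaasiita, while
"kiM" followed by "tu" is left as two words.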

A word-ending
   k should be k.h    as vaak.h and not vaak
   m should be m.h    as suresham.h and not suresham 
   n should be n.h    as raajan.h and not raajan
   t should be t.h    as dhyaayet.h and not dhyaayet

   and similarly for g, ch, T, et cetera.
   ITRANS version 5.0 onwards accommodates word-ending consonants
   and automatically adds hala.nta (.h) to them.
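
For text meant for older ITRANS versions, the missing .h can be added by a
small program.  This is a rough sketch in Python: the function name
add_final_halanta is hypothetical, and the set of endings that need no
hala.nta (vowel letters, M, H) is an assumption of this sketch.

```python
# Final characters that do not need a halanta: vowel letters
# (including the capitals A, I, U), anusvaara M and visarga H.
_NO_HALANTA_ENDINGS = set("aiueoAIUMH")


def add_final_halanta(word):
    """Append .h to a word that ends in a bare consonant."""
    if not word or word.endswith(".h"):
        return word  # empty, or already marked
    last = word[-1]
    if last.isalpha() and last not in _NO_HALANTA_ENDINGS:
        return word + ".h"
    return word
```

It handles one word at a time; splitting a line into words and skipping
English text inside ## signs is left to the caller.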

Rules for a word ending in visarga (H)

   Most of the visarga-s become sh, shh, or s depending on the first letter
   of the following word.

     H shhaT.h  becomes  shhshhaT.h 
   kaH chit.h  becomes kashchit.h
   vaaN^mayaH tvaM    becomes vaaN^mayastvaM
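
The three cases illustrated above can be sketched in Python.  The function
name join_visarga is hypothetical, and the rule table covers only the
letters that appear in these examples; it is not a full statement of
visarga sandhii.

```python
# (start of the following word, replacement for the visarga).
# "shh" is listed first so that it is tried before the shorter prefixes.
_VISARGA_RULES = [
    ("shh", "shh"),
    ("ch", "sh"),
    ("t", "s"),
]


def join_visarga(first, second):
    """Join two words, changing a final H as in the examples above."""
    if first.endswith("H"):
        for prefix, replacement in _VISARGA_RULES:
            # Case sensitive: a following T is not matched by "t".
            if second.startswith(prefix):
                return first[:-1] + replacement + second
    # No rule applies: keep the two words separate.
    return first + " " + second
```

So join_visarga("kaH", "chit.h") gives kashchit.h, and a visarga before
any other letter is left as it is.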

Rules for avagraha

   When a vowel-ending word is joined with a word beginning with a-kaar
   or aa-kaar, an avagraha .a is put for the a-kaar, and two avagraha-s
   .a.a are used for the aa-kaar.
   The first vowel may or may not change during this joining.

   praNata asmi       praNato.asmi
   navama adhyaayaH   navamo.adhyaayaH
   loke asmin.h       loke.asmin.h
   tathaa aatmani     tathaa.a.atmani
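
A partial sketch of this joining in Python follows.  The function name
join_with_avagraha is hypothetical.  It covers only the cases where the
first vowel does not change (a final e or o before a-kaar, and a final aa
before aa-kaar); cases such as praNata asmi, where the first vowel itself
changes, are deliberately left alone.

```python
def join_with_avagraha(first, second):
    """Join two words with avagraha where the first vowel is unchanged."""
    # Short a only: aa, ai and au are different vowels.
    starts_short_a = second.startswith("a") and not second.startswith(
        ("aa", "ai", "au")
    )
    if first.endswith(("e", "o")) and starts_short_a:
        # The a-kaar is replaced by one avagraha.
        return first + ".a" + second[1:]
    if first.endswith("aa") and second.startswith("aa"):
        # The aa-kaar is replaced by two avagraha-s.
        return first + ".a.a" + second[2:]
    # Anything else is outside this sketch: keep the words separate.
    return first + " " + second
```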

Other sandhii

   tat.h dhaama       taddhaama

Use

    sattva instead of satva
    tattva instead of tatva

-------------------------------------------------------------------------
Typing errors of these kinds could be identified automatically,
if the extraction of such patterns can be programmed.

Here is the scheme inserted for reference
**********
The ITRANS 4.0 transliteration scheme is
vowels(svara):  a aa(A) i ii(I) u uu(U) R^i R^I e ai o au aM aH L^i L^I
consonants(vya.njana):
 k kh g gh N^ ch chh j jh JN T Th D Dh N t th d dh n p ph b bh m
 y r l v sh shh s h L(Marathi) ksh(x) GY(Hindi)
 q K G z f .D .Dh are the letters k, kh, g, j, ph, D, Dh with nuktaas for Urdu.
Both .n and M produce anusvaara, .a avagraha, .h haLa.nta (leg break),
 H visarga.  Only a dot . or a vertical line | produces a da.nDa.
\. produces just a dot (puurNaviraama).
a.c and aa.c produce ardhachandra as in cat and talk.
The vowels need to be added after each consonant,
unless one wants joDaakshara.
No other letters (upper or lower cases) are allowed.
Enclose English text in two sets of ## signs (before and after the text).
For other examples, see documents on anonymous ftp chandra.cis.brown.edu 
OR contact Avinash Chopde at avinash@acm.org
******************************************************************************
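
The scheme above can be used to split a word into its ITRANS units, which
also catches letters that are not allowed.  This is a minimal sketch in
Python: the names TOKENS and tokenize are hypothetical, and reading the
input by longest match first is an assumption of the sketch.  The
ardhachandra forms a.c and aa.c, the da.nDa signs and ## English text are
not handled, and ksh is read as k followed by sh.

```python
# Units taken from the scheme above, tried longest first so that,
# for example, "chh" wins over "ch" and "R^i" over "R".
TOKENS = sorted(
    """a aa A i ii I u uu U R^i R^I e ai o au L^i L^I
       M H .n .a .h
       k kh g gh N^ ch chh j jh JN T Th D Dh N t th d dh n
       p ph b bh m y r l v sh shh s h L x GY
       q K G z f .D .Dh""".split(),
    key=len,
    reverse=True,
)


def tokenize(word):
    """Split one ITRANS word into units; raise ValueError on other input."""
    tokens = []
    pos = 0
    while pos < len(word):
        for token in TOKENS:
            if word.startswith(token, pos):
                tokens.append(token)
                pos += len(token)
                break
        else:
            # No unit of the scheme starts at this position.
            raise ValueError(
                f"unexpected character {word[pos]!r} at offset {pos}"
            )
    return tokens
```

For example, tokenize("vaak.h") gives the units v, aa, k, .h, and a word
containing a letter such as w is rejected.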
We should standardize these rules as much as possible.  
Of course, we cannot add all the grammar rules of Panini, but this file can 
assist us.

Please provide your additions or corrections!

% End of common_errors.txt
