% commonerrors.txt

Here is something we should expand into `common mistakes and established
correct ways' to type Sanskrit.

There are two types of errors. One is due to transliteration and the
other is due to wrong usage of Sanskrit terms or sandhiis.
We will use ITRANS as the transliteration scheme for explanation.
The rules below concern encoding and pronunciation.

One has to pay attention to some details, as one would when writing
Devanagari characters. Editing the text later to make it aa-kaar or
ii-kaar is time consuming, more so when the typist makes mistakes.

---------------------- Transliteration -----------------------------------
Transliteration specific corrections of common errors:

Use
aa or A instead of a for aa-kaar
uu or U instead of u or oo for uu-kaar
ii or I instead of i or ee for ii-kaar
e instead of E or ay (Telugu influence)
aM instead of am (word ending anusvaar)
aH instead of .h (visarga); .h is used for a half letter, as in m.h, t.h
aa_ii or aa{}ii to have two vowels together, as in Hindi bhaa{}ii for brother
      (_ may not work in some instances)
ka instead of kha (Tamil and Kannada influence), ka and kha are different
ga instead of gha (Tamil and Kannada influence), ga and gha are different
cha instead of ca (Indology or other transliteration influence)
chha instead of cha
ta instead of tha (Tamil and Kannada influence), ta and tha are different
da instead of dha (Tamil and Kannada influence), da and dha are different
va instead of ba (Bengali influence), va and ba are distinctly different
shha instead of sha or sa, sha-shha-sa are three distinct letters
ksh instead of kshh
GYa (Hindi influence) instead of jjna or jJNa or dnya (Marathi influence)
Ta-Tha-Da-Dha-Na instead of ta-tha-da-dha-na

Watch for the combinations aaa, hh, nD, Nd.
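
Part of the checklist above can be checked mechanically. What follows is
only a rough sketch in Python, not an existing tool: the names
SUSPECT_PATTERNS and find_suspects are made up for this illustration, and
the reading of "hh" as "hh outside chh and shh" is our assumption. It
reports places to look at by eye and does not correct anything.

```python
import re

# Hypothetical rule table: (label, pattern) pairs taken from the checklist.
SUSPECT_PATTERNS = [
    ("aaa: three a's in a row", re.compile(r"aaa")),
    ("ee: use ii or I for ii-kaar", re.compile(r"ee")),
    ("oo: use uu or U for uu-kaar", re.compile(r"oo")),
    ("kshh: use ksh", re.compile(r"kshh")),
    # Assumption: hh is legitimate only inside chh and shh.
    ("hh outside chh/shh", re.compile(r"(?<![cs])hh")),
    # .nD (anusvaara before D, as in da.nDa) is fine, so skip a leading dot.
    ("nD: dental n before retroflex D", re.compile(r"(?<!\.)nD")),
    ("Nd: retroflex N before dental d", re.compile(r"Nd")),
]


def find_suspects(text):
    """Return (line, column, label) for every suspicious spot, 1-based."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for label, pattern in SUSPECT_PATTERNS:
            for match in pattern.finditer(line):
                hits.append((lineno, match.start() + 1, label))
    return hits


if __name__ == "__main__":
    sample = "raamaa\nveena paNdita"
    for lineno, col, label in find_suspects(sample):
        print(f"line {lineno}, col {col}: {label}")
```

The sketch does not skip English text enclosed in ## signs, so words such
as "see" or "book" inside such passages will be reported as well.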
----------------------- Sanskrit rules: -----------------------------------
To form conjuncts with nasals, use
N^k, N^kh, N^g, N^gh
JNch, JNchh, JNj, JNjh
NT, NTh, ND, NDh
nt, nth, nd, ndh
mp, mph, mb, mbh

Any of N^, JN, N, n, m can be replaced by .n (overdot), and the m before
the pa, pha, ba, bha series by M. The overdot written with M or .n is an
accepted practice, but it is technically incorrect, mostly from the
pronunciation standpoint; the explicit nasals above keep both the
printout and the pronunciation correct.

To use M or .n for anusvaara

If an anusvaara (overdot) is used within a word (word internal!)
instead of the above mentioned nasals, we suggest that you use .n instead
of M before all letters except p, ph, b, bh, m. With the remaining letters,
y, r, l, v, sh, shh, s, h, L, x, GY, use .n. So it will be

sa.nskR^ita
sa.nvaada
sa.nlagna
sa.nsaara
a.nsha
sa.nrakshaka
sa.nyama
et cetera.

It is wrong to use ma-kaar for anusvaara in these words. These .n have a
different pronunciation from the simple M, as in saMsaara, and are more
like the sound written with ardhacha.ndrabi.nduu. This is not critical,
since the output with M and .n is the same. The note is added more for
clarification/information.

There is a very easy fix for such anusvaar when editing on Unix: change
   M[kgcjTDtdyrlvshLG]   to   .n[kgcjTDtdyrlvshLG]
This affects M before each letter in the square brackets.
In sed
   s/M\([kgcjTDtd]\)/.n\1/g
   s/M\([yrlvshLG]\)/.n\1/g
will be useful (the g flag changes every occurrence on a line, not just
the first). M[pbm] stay the same!

A word ending anusvaar written with M, when followed by a vowel, becomes
makaar. (word)M and a, aa, i, ii, u, uu, e, ai, o, au at the start of the
following word become, respectively,
ma, maa, mi, mii, mu, muu, me, mai, mo, mau.
As an example, kiM aasiita becomes kimaasiita,
ashvatthaM enaM becomes ashvatthamenaM.

The word ending
   k should be k.h as vaak.h and not vaak
   m should be m.h as suresham.h and not suresham
   n should be n.h as raajan.h and not raajan
   t should be t.h as dhyaayet.h and not dhyaayet
and similar for ga, cha, Ta, et cetera.
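
The sed substitution for word-internal anusvaara and the joining of a word
ending M with a following vowel, both described above, can also be sketched
in Python. This is an illustration only: the function names are made up
here, the letter class is copied from the sed example (so x is not
included), and only the lowercase vowel starts a, i, u, e, o listed above
are handled. Both functions apply their rule mechanically, so the result
still needs a human check.

```python
import re

# M directly before one of the letters from the sed example; M[pbm] is untouched.
INTERNAL_M = re.compile(r"M(?=[kgcjTDtdyrlvshLG])")

# Word ending M, one or more spaces, then a word starting with a vowel.
FINAL_M_BEFORE_VOWEL = re.compile(r"M +(?=[aiueo])")


def fix_internal_anusvaara(text):
    """Replace word-internal M with .n before the listed letters."""
    return INTERNAL_M.sub(".n", text)


def join_final_m(text):
    """Turn 'wordM vowel...' into 'wordmvowel...' (anusvaara becomes makaar)."""
    return FINAL_M_BEFORE_VOWEL.sub("m", text)


if __name__ == "__main__":
    print(fix_internal_anusvaara("saMskR^ita saMpuurNa"))
    print(join_final_m("kiM aasiita"))
```

Neither function knows about English text enclosed in ## signs; such
passages would have to be set aside before running them.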

ITRANS version 5.0 onwards accommodates word ending consonants and
automatically adds hala.nta (.h) to them.

Rules for visarga (H) ending words

Most of the visarga-s become sh, s, or shh depending on the first letter
of the following word.
   H shhaT.h becomes shhshhaT.h
   kaH chit.h becomes kashchit.h
   vaaN^mayaH tvaM becomes vaaN^mayastvaM

Rules for avagraha

When a vowel ending word is joined with a word beginning with a or aa,
an avagraha .a is put for a-kaar and two avagraha-s .a.a are used for
aa-kaar. The first vowel may or may not change during this joining.
   praNata asmi         praNato.asmi
   navama adhyaayaH     navamo.adhyaayaH
   loke asmin.h         loke.asmin.h
   tathaa aatmani       tathaa.a.atmani

Other sandhii
   tat.h dhaama         taddhaama

Use
sattva instead of satva
tattva instead of tatva

-------------------------------------------------------------------------
Typing errors of these kinds could be identified automatically, if such
checks can be programmed.

Here is the scheme inserted for reference
**********
The ITRANS 4.0 transliteration scheme is

vowels (svara):
a aa(A) i ii(I) u uu(U) R^i R^I e ai o au aM aH L^i L^I

consonants (vya.njana):
k kh g gh N^
ch chh j jh JN
T Th D Dh N
t th d dh n
p ph b bh m
y r l v sh shh s h L(Marathi) ksh(x) GY(Hindi)

q K G z f .D .Dh are the letters k, kh, g, j, ph, D, Dh with nuktaas for Urdu.

Both .n and M produce anusvaara, .a avagraha, .h haLa.nta (leg break),
H visarga.
A dot . by itself or a vertical line | produces a da.nDa.
\. produces just a dot (puurNaviraama).
a.c and aa.c produce ardhachandra, as in cat and talk.

The vowels need to be added after each consonant, unless one wants
joDaakshara. No other letters (upper or lower case) are allowed.
Enclose English text in two sets of ## signs (before and after the text).
For other examples, see documents on anonymous ftp chandra.cis.brown.edu
OR contact Avinash Chopde at avinash@acm.org
******************************************************************************

We should standardize these rules as much as possible. Of course, we cannot
add all the grammar rules of Panini, but this file can assist us.
Please provide your additions or corrections!

% End of common_errors.txt