LANGUAGE IN INDIA
http://www.languageinindia.com
Volume 5 : 8 August 2005

Strength for Today and Bright Hope for Tomorrow

Editor: M. S. Thirumalai, Ph.D.
Associate Editors: B. Mallikarjun, Ph.D.
         Sam Mohanlal, Ph.D.
         B. A. Sharada, Ph.D.
         A. R. Fatihi, Ph.D.

STUDY OF HINDI NOUN PHRASE MORPHOLOGY FOR
DEVELOPING A LINK GRAMMAR BASED PARSER
Shailly Goyal and Niladri Chatterjee


ABSTRACT

Development of Hindi Link Grammar has already been initiated [Goyal and Chatterjee, 2005. To appear in a forthcoming issue of LANGUAGE IN INDIA]. However, in that work simple sentence structures were considered, and the focus was on verb morphology only. This work considers the Noun Phrase Morphology of Hindi in detail, and suggests appropriate links by taking into account the variations in the Noun Phrase structures of English and Hindi.

1. INTRODUCTION

Link Grammar provides a systematic way of parsing sentences by establishing links between the constituent words of a sentence. Typically, these links capture the syntactic relationships between the words of a phrase, and also between the different phrases of a sentence. These two types of links are called "Intra-phrase" and "Inter-phrase" links, respectively. Although English Link Grammar (ELG) is fully developed [Sleator and Temperley, 1991], work on developing Hindi links has just begun. In this work we focus on developing Intra-phrase links for Noun Phrases in Hindi.

Typically, a link grammar is developed by creating a dictionary of all the words in a language. By judging the roles of a particular word in different contexts, a list of possible linkages that can be associated with that word is ascertained. A sentence is said to be valid if all its words have their link bindings satisfied.
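The dictionary-and-validity idea described above can be illustrated with a minimal sketch. The toy dictionary below is hypothetical and uses three English words only for illustration; it assumes each word has a single, purely conjunctive list of connectors, and it checks only that every connector is used exactly once. It does not check the other conditions a full link grammar parser would enforce (for example, that links do not cross).

```python
from collections import Counter

# Hypothetical toy dictionary: each word lists the connectors it must satisfy.
# "X+" needs a partner to the right, "X-" needs a partner to the left.
TOY_DICTIONARY = {
    "the": ["D+"],
    "cat": ["D-", "S+"],
    "sleeps": ["S-"],
}


def is_satisfied(words, links, dictionary=TOY_DICTIONARY):
    """Return True if the links use every connector of every word exactly once.

    `links` is a list of (left_index, right_index, label) triples.
    """
    used = [Counter() for _ in words]
    for left, right, label in links:
        # A link must join two distinct positions, left before right.
        if not (0 <= left < right < len(words)):
            return False
        used[left][label + "+"] += 1
        used[right][label + "-"] += 1
    for index, word in enumerate(words):
        if word not in dictionary:
            return False  # unknown word: no link requirements available
        if used[index] != Counter(dictionary[word]):
            return False
    return True
```

For instance, `is_satisfied(["the", "cat", "sleeps"], [(0, 1, "D"), (1, 2, "S")])` accepts the sentence, whereas dropping the S link leaves the connectors of "cat" and "sleeps" unsatisfied and the sentence is rejected.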

However, creating such an exhaustive dictionary for any language is arduous and time-consuming. A simpler approach may be to reuse the English links, which are already developed and available, when building a link grammar-based parser for Hindi. However, differences between the syntactic rules of the two languages make straightforward use of the English links in Hindi difficult, if not impossible. Consequently, appropriate modifications need to be made.

This work focuses on identifying the discrepancies in Noun Phrase morphology between English and Hindi. Any such work almost inevitably demands a systematic analysis of the morphologies of the two languages under consideration. This paper is therefore organized as follows. Section 2 describes the different English Noun Phrase links given in [Sleator and Temperley, 1991]. Difficulties in the straightforward adaptation of English links for Hindi are discussed in Section 3. Section 4 discusses different Hindi Noun Phrase structures, as given in [Singh, 2003], and explains how the English links need to be modified in order to capture the nuances of Hindi syntax. The proposed links are illustrated with examples.

2. STUDY OF ENGLISH NOUN PHRASES AND LINKS

English Noun Phrases (NPs) typically consist of a noun or pronoun, called the "head" of the NP. An NP may further have the following optional constituents [Singh, 2003]:

Each of the constituents along with its related links is discussed below:

Several other links, such as R, RS, B and C, are also used for English Noun Phrases. However, space limitations prevent us from discussing all of them.

3. DISSIMILARITIES IN ENGLISH AND HINDI NPs

Straightforward application of the English Intra-phrase Noun Phrase links to Hindi suffers from several difficulties. The most important ones are discussed in the following subsections.

3.1 Usage of Articles

Usage of articles in English Noun Phrases is governed by certain rules. Hindi, on the other hand, does not have articles. Generally, nouns not preceded by "ek" are considered definite, and nouns preceded by "ek" are treated as either indefinite or quantitative nouns [Singh, 2003].
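The rule of thumb about "ek" can be written as a small sketch. The function name and the returned labels are our own illustrative choices, and the sketch assumes that "ek", when present, is the first token of the Noun Phrase; it is a simplification of the general tendency described above, not a complete account of definiteness in Hindi.

```python
def definiteness(np_tokens):
    """Classify a transliterated Hindi NP by the presence of a leading "ek".

    `np_tokens` is a non-empty list of transliterated words.
    """
    if not np_tokens:
        raise ValueError("a noun phrase needs at least one token")
    if np_tokens[0] == "ek":
        return "indefinite or quantitative"
    return "definite"
```

Under this sketch, `definiteness(["ek", "laDkii"])` is treated as indefinite or quantitative, while `definiteness(["laDkii"])` is treated as definite.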

3.2 Dissimilarities in English-Hindi Adjectives

3.3 Post-modifiers

Both pre-modifiers and post-modifiers are used in English (e.g., "blue-eyed girl", "the girl with blue eyes"), whereas Hindi generally uses only pre-modifiers (e.g., "niilii aankhoon waalii laDkii"). In Hindi, post-modifiers occur only in the form of a relative clause or a ki-clause [Singh, 2003] (e.g., "yah tathya ki tumne jhooth bolaa bhulaaya nahii jaa saktaa").

3.4 Dependence of Modifiers on Gender, Number and Case-ending of Head Noun

As with Hindi adjectives, the morphology of other pre-nominal modifiers (such as genitives and participle modifiers) varies with the head noun. In this subsection, we discuss the variations for genitives in detail. Similar variations are observed for other modifiers.

In Hindi, genitives are indicated by the morpho-word kaa/ke/kii. The choice among kaa, ke and kii depends on the gender, number and case ending of the head noun. Table 1 explains and illustrates the usage of kaa/ke/kii for different variations of the head noun.

TABLE 1: Usage of kaa/ke/kii in genitive case

Gender of    Number of    Case-ending of   kaa/ke/kii   Example(s)
head noun    head noun    head noun
Masculine    Singular     Absent           kaa          laDke kaa bhai
Masculine    Plural       Absent           ke           laDke ke bhai
Masculine    Don't care   Present          ke           laDke ke bhai ne,
                                                        laDke ke bhaiyoon ne
Feminine     Don't care   Don't care       kii          laDke kii behan,
                                                        laDke kii behan ne,
                                                        laDke kii behanon ne,
                                                        laDke kii behanein
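The selection rule in Table 1 can be sketched as a small function. The function name, parameter names and string values are our own illustrative choices; the logic only restates the four rows of the table and makes no claim about forms not listed there.

```python
def genitive_marker(gender, number, case_ending_present):
    """Pick kaa/ke/kii from the properties of the head noun, following Table 1.

    gender: "masculine" or "feminine"
    number: "singular" or "plural"
    case_ending_present: True if the head noun carries a case ending (e.g. "ne")
    """
    if gender == "feminine":
        # Number and case ending are "don't care" for feminine head nouns.
        return "kii"
    if gender != "masculine":
        raise ValueError("gender must be 'masculine' or 'feminine'")
    if number not in ("singular", "plural"):
        raise ValueError("number must be 'singular' or 'plural'")
    if case_ending_present:
        # Number is "don't care" once a case ending is present.
        return "ke"
    return "kaa" if number == "singular" else "ke"
```

For example, a masculine singular head noun without a case ending yields "kaa" (as in "laDke kaa bhai"), while the same head noun followed by "ne" yields "ke" (as in "laDke ke bhai ne").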

4. PROPOSED HINDI LINKS FOR NPs

Here we explain the Hindi links (H-links) that we propose as counterparts of the English links discussed above. It may be noted that, unlike the Verb Phrase links discussed in [Goyal and Chatterjee, 2005], the direction of the Noun Phrase links can be specified, since the relative position of the various modifiers with respect to the head noun is fixed.

In the following links, the direction is given by the direction specifier '+' or '-', following the convention of English Link Grammar [Sleator and Temperley, 1991].

D Link: The 'D' link connects a Hindi determiner with the head noun. Like the English 'D' link, this link is followed by two suffixes that give number information about the head noun.

Genitive Links: As discussed in Section 3.4, the case ending kaa/ke/kii is used to construct Hindi genitives. We propose the following two links for this construction:

Thus, the case ending kaa/ke/kii has the linking requirement (Jp- & D1+). Table 2 gives the different 'D1' suffixes for kaa/ke/kii. Figure 1 gives an example of these links.

Table 2: 'D1' H-link

Figure 1: Example of Genitive Case
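The requirement (Jp- & D1+) stated above can be read mechanically using the '+' and '-' direction specifiers. The following is a minimal sketch that assumes a purely conjunctive formula written with '&'; the function name and the "left"/"right" labels are our own, and the suffixes of Table 2 are not modelled.

```python
import re


def parse_conjunction(formula):
    """Split a conjunctive formula such as "(Jp- & D1+)" into (label, side) pairs.

    side is "left" for a '-' connector and "right" for a '+' connector.
    """
    body = formula.strip().strip("()")
    connectors = []
    for part in body.split("&"):
        match = re.fullmatch(r"\s*([A-Za-z0-9]+)([+-])\s*", part)
        if match is None:
            raise ValueError(f"not a connector: {part!r}")
        label, sign = match.groups()
        connectors.append((label, "right" if sign == "+" else "left"))
    return connectors
```

On our reading of the formula, `parse_conjunction("(Jp- & D1+)")` says that kaa/ke/kii looks to its left for a Jp partner and to its right for a D1 partner; in "laDke kaa bhai" these would be the noun "laDke" and the head noun "bhai", respectively.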

Adjective Links: Below we discuss the proposed links for Hindi adjectives.

Post-Modifier Links: In Hindi, post-modifiers come either in the form of a relative clause or as a ki-clause. Due to lack of space, we discuss only the ki-clause in this work.

Figure 6: Example of ki-clause

It may be noted that, as in ELG, the 'TH' and 'C' links can also be used in various other constructions in Hindi Link Grammar. Since the focus of this work is on Noun Phrase links, we omit the discussion of those other usages.

5. CONCLUDING REMARKS

In this work, we studied Hindi Noun Phrase morphology in order to develop links that capture the syntactic relationships between the words in a Noun Phrase. We have followed an Example Based approach, in which the links given in ELG have been considered and suitably modified to capture the characteristics of Hindi morphology. Due to lack of space, many other variations (e.g., the relative clause) could not be discussed here. We are currently working on developing algorithms for parsing Hindi sentences using the proposed Hindi Link Grammar.


REFERENCES

Goyal S. and Chatterjee N.: 2005, Towards Developing a Link Grammar Based Parser for Hindi, a paper submitted to Workshop on Morphology, IIT Bombay. To appear in LANGUAGE IN INDIA http://www.languageinindia.com.

Sastri S. and Apte B.: 1968, Hindi Grammar, Dakshina Bharat Hindi Prachar Sabha, Madras, India.

Singh S.: 2003, English-Hindi Translation Grammar, Prabhat Publication, New Delhi.

Sleator D. and Temperley D.: 1991, Parsing English with a Link Grammar, Computer Science technical report CMU-CS-91-196, Carnegie Mellon University.


Shailly Goyal and Niladri Chatterjee
Department of Mathematics
Indian Institute of Technology Delhi
Hauz Khas, New Delhi 110016
India.
C/o. LANGUAGE IN INDIA