LANGUAGE IN INDIA

Strength for Today and Bright Hope for Tomorrow

Volume 13 : 1 January 2013
ISSN 1930-2940

Managing Editor: M. S. Thirumalai, Ph.D.
Editors: B. Mallikarjun, Ph.D.
         Sam Mohanlal, Ph.D.
         B. A. Sharada, Ph.D.
         A. R. Fatihi, Ph.D.
         Lakhan Gusain, Ph.D.
         Jennifer Marie Bayer, Ph.D.
         S. M. Ravichandran, Ph.D.
         G. Baskaran, Ph.D.
         L. Ramamoorthy, Ph.D.
Assistant Managing Editor: Swarna Thirumalai, M.A.


Copyright © 2012
M. S. Thirumalai



Query Optimization:
Solution for low recall problem in Hindi Language IR -
Revisited with Experimental Results and Analysis

Kumar Sourabh, Ph.D. Candidate
Prof. Vibhakar Mansotra
Rakesh Goswami, Research Scholar


Abstract

While information retrieval (IR) has been an active field of research for decades, for much of its history it has had a very strong bias towards English as the language of choice for research and evaluation purposes. The Internet is no longer monolingual, as non-English content is growing rapidly. Hindi is the third most widely spoken language in the world; an estimated 500-600 million people speak it. Information retrieval in Hindi is gaining popularity, but IR systems suffer from low recall if existing systems are used as-is. Certain characteristics of Indian languages prevent the existing algorithms from matching relevant keywords in the documents for retrieval.

The major characteristics that affect Indian language IR include language morphology, compound word formation, spelling variation, ambiguity, synonymy, foreign language influence, and the lack of spelling standards.

Taking these issues into consideration, we introduced a Hindi query optimization technique in our previous work. In this paper we extend that work by presenting various experiments carried out using the query optimization technique to address the low recall problem in Hindi language IR.

Keywords: Information retrieval, Hindi, Monolingual, Query optimization, Interface, Hindi WordNet.
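
The low recall problem described in the abstract can be illustrated with a small sketch. The Python example below is not the authors' technique; it only shows, under stated assumptions, how one of the listed characteristics (spelling variation) can hide a relevant document from exact keyword matching, and how expanding the query with known variants recovers it. The variant table, the three toy documents, and the function names are all hypothetical.

```python
# Two common spellings of the word "Hindi" in Devanagari:
# one uses the anusvara sign, the other a half "na" consonant.
HINDI_ANUSVARA = "हिंदी"
HINDI_HALF_NA = "हिन्दी"

# Hypothetical variant table. A real system might derive it from
# spelling rules or a lexical resource; here it is hand-written.
VARIANTS = {
    HINDI_ANUSVARA: {HINDI_HALF_NA},
    HINDI_HALF_NA: {HINDI_ANUSVARA},
}

# Toy document collection (hypothetical), keyed by document id.
DOCS = {
    1: f"{HINDI_ANUSVARA} भाषा में सूचना",
    2: f"{HINDI_HALF_NA} भाषा का इतिहास",
    3: "अंग्रेजी भाषा",
}


def expand_query(terms):
    """Return the query terms plus every known spelling variant."""
    expanded = set()
    for term in terms:
        expanded.add(term)
        expanded |= VARIANTS.get(term, set())
    return expanded


def search(terms):
    """Return ids of documents containing at least one query term exactly."""
    wanted = set(terms)
    return sorted(
        doc_id
        for doc_id, text in DOCS.items()
        if wanted & set(text.split())
    )


plain_hits = search([HINDI_ANUSVARA])
expanded_hits = search(expand_query([HINDI_ANUSVARA]))
print(plain_hits)     # exact matching on one spelling only
print(expanded_hits)  # matching after variant expansion
```

With exact matching, the query finds only the document that uses the same spelling; after expansion, both documents about Hindi are retrieved, while the unrelated document is still excluded.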

1. Introduction

While information retrieval (IR) has been an active field of research for decades, for much of its history it has had a very strong bias towards English as the language of choice for research and evaluation purposes. The Internet is becoming increasingly multilingual, as non-English content is growing rapidly. More people have begun to send and receive e-mails, search for information, read e-papers, blog, and launch web sites in their own languages. Hindi is the third most widely spoken language in the world (after English and Mandarin): an estimated 500-600 million people speak this language. Two American IT companies, Microsoft and Google, have played a big role in making this possible.


This is only the beginning part of the article.


Kumar Sourabh, Ph.D. Candidate
Department of Computer Science and IT
University of Jammu
J&K 180001 INDIA
Kumar9211.sourabh@gmail.com

Prof. Vibhakar Mansotra
Department of Computer Science and IT
University of Jammu
J&K 180001 INDIA
Vibhakar20@yahoo.co.in

Rakesh Goswami, Research Scholar
Department of Computer Science and IT
University of Jammu
J&K 180001 INDIA
rahulgoswami95@gmail.com
