CLOUD BASED MULTI-LANGUAGE INDEXING USING CROSS LINGUAL INFORMATION RETRIEVAL APPROACHES

Chayapathi A R et al.

doi:10.17762/itii.v9i1.269

Published: Mar 18, 2021

Abstract

The exponential growth of data volumes created by digital media (video, audio, images), physical simulations, scientific instruments and web authoring coincides with growing interest in cloud computing. The options for distributing and parallelizing information in clouds make retrieval and storage very complicated, especially for real-time data management. The number of web users accessing data over the Internet is growing steadily. A huge amount of information on the Internet is available in various languages and can be accessed by anyone at any time. Information Retrieval (IR) deals with finding useful information in a large collection of unstructured, structured and semi-structured data. At present, the diversity of information and language barriers are difficult challenges for communication and social exchange across the world. To overcome such barriers, cross-language information retrieval (CLIR) systems are now in strong demand. Query Expansion (QE) is the process of adding related and relevant terms to the original query to enhance its indexing ability and improve the relevance of retrieved documents in CLIR. In this research work, QE is investigated for Hindi-English and Kannada-English CLIR, in which Hindi and Kannada queries are used to search English documents. After query translation, the retrieved results are ranked using Okapi BM25 so that the most relevant document appears at the top, increasing the relevance of the documents retrieved using QE. We propose an architecture for Hindi-English and Kannada-English CLIR that uses QE to improve the relevance of retrieved documents. In the first experiment, QE is performed with and without Okapi BM25 ranking. The results show that the relevance of retrieved documents is higher with Okapi BM25 than without ranking.
The work clearly demonstrates that the performance of Hindi-English and Kannada-English CLIR systems can be improved significantly by query expansion using appropriate terms placed at suitable positions, and that the retrieved snippets can serve well as a continuous test collection.
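
To make the ranking step concrete, the following is a minimal sketch of Okapi BM25 scoring applied to an already translated English query. It is not the authors' implementation: the function names `bm25_scores` and `rank`, the whitespace tokenizer, the parameter defaults `k1=1.5` and `b=0.75`, and the smoothed IDF variant are all assumptions chosen for illustration. The abstract does not specify any of these.

```python
import math
from collections import Counter


def bm25_scores(query_terms, docs, k1=1.5, b=0.75):
    """Return one Okapi BM25 score per document (illustrative sketch).

    query_terms: list of lowercase English terms (assumed already translated).
    docs: list of non-empty English document strings.
    k1, b: common default parameters, not values taken from the paper.
    """
    # Naive whitespace tokenization; a real system would stem and drop stop words.
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    avg_len = sum(len(d) for d in tokenized) / n_docs

    # Document frequency: in how many documents each term occurs.
    df = Counter()
    for d in tokenized:
        df.update(set(d))

    scores = []
    for d in tokenized:
        tf = Counter(d)
        score = 0.0
        for term in query_terms:
            if term not in tf:
                continue
            # Smoothed IDF (always positive), one widely used variant.
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            # Term frequency saturation with document length normalization.
            denom = tf[term] + k1 * (1 - b + b * len(d) / avg_len)
            score += idf * tf[term] * (k1 + 1) / denom
        scores.append(score)
    return scores


def rank(query_terms, docs, **kwargs):
    """Return document indices ordered from highest to lowest BM25 score."""
    scores = bm25_scores(query_terms, docs, **kwargs)
    return sorted(range(len(docs)), key=lambda i: scores[i], reverse=True)
```

In this sketch, query expansion amounts to appending related terms to `query_terms` before scoring. A document that contains an added term then receives a higher score than it did under the original query, which is the effect the abstract attributes to QE combined with ranking.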

Issue

Vol. 9 No. 1 (2021)

Section

Articles
