TAXONOMY OF DATA INCONSISTENCIES IN BIG DATA

Ms. Vinaya Keskar, Dr. Jyoti Yadav, Dr. Ajay Kumar

Abstract

In the coming years, the common units for measuring data, viz. kilobytes, megabytes, gigabytes, or even terabytes, will begin to appear quaint, as the digital universe is expected to produce approximately 463 exabytes of data every 24 hours worldwide. This omnipresent data is potentially knowledge-rich: unprocessed data can be mined for hidden information. Essentially, the quality of the output depends on the quality of the input data; conversely, even a good analysis of faulty or bad data cannot yield meaningful results. A global challenge that arises during data analysis is therefore the quality of data. Data quality is rarely degraded deliberately by unscrupulous systemic elements; rather, inconsistencies have an uncanny way of creeping in due to a variety of factors. Data allows organizations to quantitatively measure past performance, ascertain present capabilities, and plan future performance targets, which motivates the study of data inconsistencies. This paper presents a conceptual outline of the various categories and types of data inconsistencies, and extends it with a brief explanation of the data processing life cycle and the sources of data inconsistencies.
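
The following is a minimal, hypothetical sketch (not taken from the paper) of the "quality of output depends on quality of input" point: a few common inconsistencies, such as a duplicate record, a missing value, and a unit mismatch, are enough to distort a simple aggregate. The record layout, field names, and cleaning rules are illustrative assumptions only.

```python
# Hypothetical illustration: how a duplicate record, a missing value,
# and a unit mismatch distort a simple total computed over order data.

raw_records = [
    {"order_id": 101, "amount": 250.0},    # amount in dollars
    {"order_id": 102, "amount": 310.0},
    {"order_id": 102, "amount": 310.0},    # duplicate entry (same order twice)
    {"order_id": 103, "amount": None},     # missing value
    {"order_id": 104, "amount": 27500.0},  # entered in cents instead of dollars
]

# Naive analysis: treat every record as valid.
naive_total = sum(r["amount"] or 0.0 for r in raw_records)

# Cleaned analysis: deduplicate by order_id, drop missing amounts,
# and rescale values that look like they were recorded in cents.
seen, clean_total = set(), 0.0
for r in raw_records:
    if r["order_id"] in seen or r["amount"] is None:
        continue
    seen.add(r["order_id"])
    amount = r["amount"] / 100 if r["amount"] > 10_000 else r["amount"]
    clean_total += amount

print(f"naive total:   {naive_total:>9.2f}")   # 28370.00 -- dominated by bad rows
print(f"cleaned total: {clean_total:>9.2f}")   #   835.00
```

Even this toy cleaning step changes the reported total by more than an order of magnitude, which is the sense in which a sound analysis over inconsistent inputs still produces misleading outputs.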
