We live in the era of big data. This paper gives a consolidated description of big data and its defining characteristics, drawing on definitions from both practitioners and academics. It briefly introduces big data and gives an overview of Hadoop, the core platform for big data processing, which uses the MapReduce paradigm to process data. Big data is a set of techniques and technologies that require new forms of integration to uncover hidden value in large datasets that are diverse, complex, and massive in scale. A big data environment is used to acquire, organize, and analyze many types of data. One observation about the MapReduce framework is that it generates a large amount of intermediate data. Once the tasks finish, this data must be discarded, because MapReduce cannot reuse it.
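The point about intermediate data can be illustrated with a minimal, single-process sketch of the map, shuffle, and reduce phases. This is not Hadoop code and uses no Hadoop API; the function names (`map_phase`, `shuffle_phase`, `reduce_phase`, `word_count`) and the sample documents are assumptions made for illustration only. It shows that the map phase emits one key-value pair per input word, and that these pairs are no longer needed once the reduce phase has produced its result.

```python
from collections import defaultdict


def map_phase(documents):
    """Emit one (word, 1) pair per word; these pairs are the intermediate data."""
    pairs = []
    for doc in documents:
        for word in doc.lower().split():
            pairs.append((word, 1))
    return pairs


def shuffle_phase(pairs):
    """Group intermediate values by key, as happens between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return dict(groups)


def reduce_phase(groups):
    """Sum the grouped counts for each word."""
    return {word: sum(counts) for word, counts in groups.items()}


def word_count(documents):
    """Run all three phases and report how many intermediate pairs were produced."""
    intermediate = map_phase(documents)
    grouped = shuffle_phase(intermediate)
    result = reduce_phase(grouped)
    # The intermediate pairs are dropped when this function returns;
    # only the final counts and the pair total are kept.
    return result, len(intermediate)


# Hypothetical sample input, chosen only to keep the example small.
documents = [
    "big data needs Hadoop",
    "Hadoop uses MapReduce",
    "MapReduce processes big data",
]
counts, n_intermediate = word_count(documents)
print(counts)
print(n_intermediate)
```

Even in this toy case the map phase produces more intermediate pairs than there are entries in the final result, and none of those pairs survive the job. On a real cluster the same pattern applies at a much larger scale, which is the overhead the abstract refers to.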
Big data, Map Reduce, Hadoop, HDFS, Hadoop components, Hive.