
Survey on Big Data using Apache Hadoop and Spark

Priya Dahiya, Chaitra.B, Usha Kumari

Affiliations
Information Science Dept., Acharya Doctor Sarvepalli Radhakrishnan Rd, Bengaluru, Karnataka 560107, India.
DOI: 10.22362/ijcert/2017/v4/i6/xxxx [UNDER PROCESS]


Abstract
Big data is growing rapidly in volume, variability, and velocity, which makes it difficult to capture, process, and analyze. Hadoop uses MapReduce, which has two phases, Map and Reduce, whereas Spark uses Resilient Distributed Datasets (RDDs) and a Directed Acyclic Graph (DAG) to process large datasets. Both store data in the Hadoop Distributed File System (HDFS). This paper presents the architecture and working of Hadoop and Spark, highlights the differences between them, describes the challenges MapReduce faces when processing large datasets, and explains how Spark runs on Hadoop YARN.
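The two MapReduce phases named in the abstract can be illustrated with a minimal single-process sketch in plain Python. This is not Hadoop's API and is not taken from the paper; the function names (`map_phase`, `shuffle_phase`, `reduce_phase`, `word_count`) and the word-count task are assumptions chosen for illustration. In a real cluster the framework would distribute these steps across nodes and read the input from HDFS.

```python
from collections import defaultdict


def map_phase(line):
    """Map: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle_phase(pairs):
    """Shuffle: group emitted values by key, the step a framework
    performs between Map and Reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: combine all values for one key into a single count."""
    return key, sum(values)


def word_count(lines):
    """Run the three steps in sequence on an in-memory list of lines
    (hypothetical helper; a cluster would run them in parallel)."""
    mapped = [pair for line in lines for pair in map_phase(line)]
    grouped = shuffle_phase(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    sample = ["big data with Hadoop", "big data with Spark"]
    print(word_count(sample))
```

The sketch keeps every intermediate result in memory for brevity. Hadoop MapReduce, by contrast, writes intermediate output to disk between phases, which is one of the costs that Spark's in-memory RDDs are designed to reduce.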


Citation
Priya Dahiya et al., “Survey on Big Data using Apache Hadoop and Spark”, International Journal of Computer Engineering In Research Trends, 4(6), pp. 195-201, June 2017.


Keywords: Big data, Spark, Hadoop, HDFS, MapReduce, YARN


DOI Link : Not yet assigned
