Keywords: Hadoop, MapReduce, Hadoop-A algorithm, Pipeline algorithm, cloud computing
Abstract: Hadoop is an open-source implementation of the MapReduce model. It faces a number of issues that prevent it from achieving the best performance: a serialization barrier delays the reduce phase, repetitive merges and disk accesses add overhead, and the latest high-speed interconnects are not fully leveraged. As dataset volumes keep growing, Hadoop-A, an acceleration framework, optimizes Hadoop to keep pace. To overcome the problem of repetitive merges and disk accesses, this paper introduces a novel algorithm for merging data. A full pipeline is also proposed to overlap the shuffle, merge, and reduce phases. The proposed Hadoop-A efficiently reduces disk access for intermediate data and speeds up data movement.
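The idea of merging sorted intermediate data once and overlapping the merge with the reduce phase can be illustrated with a minimal sketch. This is not the Hadoop-A algorithm itself, only a generic streaming k-way merge under stated assumptions: each map output segment is an in-memory list of (key, value) pairs already sorted by key, and the names `merge_segments` and `pipelined_reduce` are hypothetical helpers introduced here for illustration.

```python
import heapq
from itertools import groupby
from operator import itemgetter


def merge_segments(segments):
    """Lazily k-way merge sorted (key, value) segments in a single pass.

    No merged intermediate file is materialized; records are produced
    one at a time in key order.
    """
    return heapq.merge(*segments, key=itemgetter(0))


def pipelined_reduce(segments, reduce_fn):
    """Apply reduce_fn to each key's values as the merged stream arrives.

    Because the merge is lazy, reducing the first key starts before the
    remaining records have been merged, so the two steps overlap.
    """
    merged = merge_segments(segments)
    for key, group in groupby(merged, key=itemgetter(0)):
        yield key, reduce_fn(value for _, value in group)


if __name__ == "__main__":
    # Hypothetical sorted map outputs, one list per map task.
    segments = [
        [("apple", 1), ("cat", 2)],
        [("apple", 3), ("bird", 1)],
        [("bird", 4), ("cat", 1), ("dog", 5)],
    ]
    for key, total in pipelined_reduce(segments, sum):
        print(key, total)
```

In this sketch every record passes through the merge exactly once and goes straight to the reduce function, which is the property the abstract attributes to avoiding repetitive merges and disk access. A real system would stream segments over the network rather than hold them in memory.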