What are the major components of the Aneka MapReduce Programming Model?
ANEKA MAPREDUCE PROGRAMMING
Aneka follows the reference model published by Google and implemented by Hadoop to provide its own implementation of the MapReduce abstractions. The MapReduce Programming Model specifies the abstractions and runtime support required to build MapReduce applications on top of Aneka. To program MapReduce applications with Aneka, you need a basic understanding of object-oriented programming, cloud computing, and distributed systems, and a good understanding of the .NET framework and C#. Microsoft Visual Studio with C# and Aneka must also be installed.
MapReduce.NET is a MapReduce solution for data centers that is similar to Google's MapReduce but focuses on the .NET and Windows platforms. MapReduce.NET offers an object-oriented interface for programming map and reduce functions, as well as a set of storage APIs to wrap input key/value pairs into initial files and extract result key/value pairs from result files.
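To make the idea of an object-oriented interface for map and reduce functions concrete, here is a minimal word-count sketch. It is written in Python for brevity and is not the Aneka or MapReduce.NET API: the class names `WordCountMapper` and `WordCountReducer` and the helper `run_local` are hypothetical, and everything runs in a single process.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


class WordCountMapper:
    """Hypothetical mapper: emits a (word, 1) pair for every word in a line."""

    def map(self, key: int, value: str) -> Iterator[Tuple[str, int]]:
        # key is the line offset (unused here), value is the line text.
        for word in value.lower().split():
            yield word, 1


class WordCountReducer:
    """Hypothetical reducer: sums all counts collected for one word."""

    def reduce(self, key: str, values: Iterable[int]) -> Tuple[str, int]:
        return key, sum(values)


def run_local(lines: List[str]) -> Dict[str, int]:
    """Run map, group by key, then reduce, all in one process (illustration only)."""
    mapper, reducer = WordCountMapper(), WordCountReducer()
    grouped: Dict[str, List[int]] = defaultdict(list)
    for offset, line in enumerate(lines):
        for word, count in mapper.map(offset, line):
            grouped[word].append(count)
    return dict(reducer.reduce(word, counts) for word, counts in grouped.items())
```

Calling `run_local(["a b a", "b c"])` groups the intermediate pairs by word and reduces each group to a total. In a real deployment the grouping step would be carried out by the runtime across nodes rather than by a local dictionary.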
MapReduce.NET is based on a master-slave architecture.
Its main components include a manager, scheduler, executor, and storage.
1. Manager: The manager acts as a MapReduce computation agent. It sends the application to the MapReduce scheduler and gathers the final results after the execution is complete.
2. Scheduler: The scheduler assigns subtasks to available resources when a user submits a MapReduce.NET application to it. During execution, it checks the progress of each job and performs task migration if certain nodes are much slower than others owing to heterogeneity.
3. Executor: Each executor awaits a task execution order from the scheduler. Normally, the input data for a Map job is located locally. Otherwise, the executor will have to collect input data from neighbors. Before executing a Reduce job, the executor must get all of the input and combine it. Furthermore, the executor checks the progress of the task execution and sends the status to the scheduler regularly.
4. Storage: The storage component of MapReduce.NET provides a distributed storage service over the .NET platform. It arranges the disk space on all available resources as a virtual storage pool and provides an object-based interface with a flat namespace for managing the data stored in it.
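The interplay of the four components can be sketched as a small single-process simulation. This is only an illustration under stated assumptions, not the MapReduce.NET implementation: the classes `Storage`, `Executor`, `Scheduler`, and `Manager`, their methods, and the round-robin assignment are all hypothetical, and progress monitoring and task migration are left out.

```python
from collections import defaultdict
from typing import Callable, Dict, List, Tuple

Pair = Tuple[str, int]


class Storage:
    """Flat-namespace object store; a dict stands in for the virtual storage pool."""

    def __init__(self) -> None:
        self._objects: Dict[str, List[str]] = {}

    def put(self, name: str, records: List[str]) -> None:
        self._objects[name] = records

    def get(self, name: str) -> List[str]:
        return self._objects[name]


class Executor:
    """Runs the map or reduce task that the scheduler hands to it."""

    def __init__(self, name: str, storage: Storage) -> None:
        self.name = name
        self.storage = storage

    def run_map(self, input_name: str, map_fn: Callable[[str], List[Pair]]) -> List[Pair]:
        pairs: List[Pair] = []
        for record in self.storage.get(input_name):
            pairs.extend(map_fn(record))
        return pairs

    def run_reduce(self, key: str, values: List[int],
                   reduce_fn: Callable[[str, List[int]], int]) -> Pair:
        return key, reduce_fn(key, values)


class Scheduler:
    """Assigns subtasks to executors round-robin (an assumed policy)."""

    def __init__(self, executors: List[Executor]) -> None:
        self.executors = executors

    def execute(self, input_names, map_fn, reduce_fn) -> Dict[str, int]:
        grouped: Dict[str, List[int]] = defaultdict(list)
        for i, name in enumerate(input_names):
            executor = self.executors[i % len(self.executors)]
            for key, value in executor.run_map(name, map_fn):
                grouped[key].append(value)
        results: Dict[str, int] = {}
        for i, (key, values) in enumerate(sorted(grouped.items())):
            executor = self.executors[i % len(self.executors)]
            result_key, result = executor.run_reduce(key, values, reduce_fn)
            results[result_key] = result
        return results


class Manager:
    """Submits the application to the scheduler and returns the final results."""

    def __init__(self, scheduler: Scheduler) -> None:
        self.scheduler = scheduler

    def submit(self, input_names, map_fn, reduce_fn) -> Dict[str, int]:
        return self.scheduler.execute(input_names, map_fn, reduce_fn)


def demo() -> Dict[str, int]:
    """Word count over two stored input parts using two simulated executor nodes."""
    storage = Storage()
    storage.put("part-0", ["a b a"])
    storage.put("part-1", ["b c"])
    executors = [Executor("node-1", storage), Executor("node-2", storage)]
    manager = Manager(Scheduler(executors))
    return manager.submit(
        ["part-0", "part-1"],
        lambda line: [(word, 1) for word in line.split()],
        lambda key, values: sum(values),
    )
```

Here `demo()` stores two input parts, lets the manager submit the job, and returns the word counts gathered from the reduce tasks. A real scheduler would also track each task's progress and migrate work away from slow nodes, as described in the list above.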
Figure 5.24 depicts an example MapReduce.NET deployment on an Aneka setup. Client processes simply include the libraries necessary to connect to Aneka and submit MapReduce tasks for execution. In a typical configuration, the MapReduce scheduler is installed on the scheduler node and the MapReduce executor is deployed on the executor nodes. Using the Windows Shared File Services, each executor node accesses storage that is shared among all MapReduce executors.