Paper is here.
MR programming model - Map, written by the user, takes an input key/value pair (e.g. file_name/file_contents) and emits intermediate pairs such as (word, 1) for WordCount. The MR library groups all intermediate values by key and hands each group to a Reduce task, which produces the final output for that intermediate key.
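A minimal sketch of the user-supplied pieces for WordCount (the paper's real interface is C++ and uses the library's emit calls; the names and the tiny in-memory driver here are illustrative):

```python
from collections import defaultdict

def map_fn(file_name, file_contents):
    """Map: input key/value is (file_name, file_contents); emits (word, 1) pairs."""
    return [(word, 1) for word in file_contents.split()]

def reduce_fn(word, counts):
    """Reduce: the library groups all values for one intermediate key."""
    return (word, sum(counts))  # final output for this key

# Tiny single-process driver to show the data flow; the real library
# shards the input, shuffles intermediate pairs, and runs this across
# many machines.
if __name__ == "__main__":
    grouped = defaultdict(list)
    for k, v in map_fn("doc1", "the quick brown fox jumps over the lazy dog"):
        grouped[k].append(v)
    print(sorted(reduce_fn(k, vs) for k, vs in grouped.items()))
```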
Handling master failure - the current implementation simply aborts the MR job, since there is only a single master and its failure is unlikely. The alternative is to periodically checkpoint the master's state and restart from the last checkpoint.
Locality - to conserve network bandwidth, the MR master tries to schedule a map task on a machine that holds a copy of the task's input data or, failing that, on a machine near a replica of it.
Backup tasks for stragglers - when a MapReduce operation is close to completion, the master schedules backup executions of the remaining in-progress tasks. A task is marked as completed when either the primary or the backup execution finishes.
Custom partitioning - the default is hash(key) mod R, which is sometimes not enough. For example, if the keys are URLs and we want all links with the same hostname in the same output file, then hash(hostname(key)) mod R is the right partitioning function (see the sketch below).
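A sketch of the two partitioning choices; the hostname extraction and the use of crc32 as a stable hash are my additions, not the paper's API:

```python
import zlib
from urllib.parse import urlparse

def default_partition(key, R):
    # Default partitioner: hash(key) mod R (crc32 used here as a stable hash
    # so all workers agree on the partition of a given key).
    return zlib.crc32(key.encode()) % R

def hostname_partition(url_key, R):
    # Custom partitioner: hash(hostname(key)) mod R, so every URL from the
    # same host lands in the same reduce partition and hence the same output file.
    host = urlparse(url_key).netloc
    return zlib.crc32(host.encode()) % R
```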
Combiner function - after a Map task runs, a large payload of intermediate pairs is sent over the network to the reducers. To shrink it, a combiner can partially merge those pairs on the map worker before they are sent. Reduce and Combine typically share the same code; the difference is that Reduce output is written to the final output file, whereas Combiner output is sent over the network to a Reduce task.
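A sketch of a WordCount combiner running on the map worker before the shuffle (names are illustrative):

```python
from collections import defaultdict

def combine(intermediate_pairs):
    # Runs on the map worker: partially merges (word, count) pairs so fewer,
    # smaller records cross the network to the reducers.
    partial = defaultdict(int)
    for word, count in intermediate_pairs:
        partial[word] += count
    return list(partial.items())

# e.g. 10,000 occurrences of "the" in one map's input are sent as a single
# ("the", 10000) pair instead of 10,000 separate ("the", 1) pairs.
```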
Debugging - the library can execute a MapReduce operation sequentially on the local machine, which makes debugging and testing much easier.
Counters - user code can count occurrences of specific events; the counts are sent from the worker machines to the master (piggybacked on the ping response).
Implementing Sort - TeraSort is a standard map/reduce sort, except for a custom partitioner that uses a sorted list of N − 1 sampled keys to define the key range for each reduce. In particular, all keys such that sample[i − 1] <= key < sample[i] are sent to reduce i. This guarantees that every output of reduce i is less than every output of reduce i+1.
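A sketch of that range partitioner, assuming `samples` is the sorted list of N − 1 sampled keys:

```python
import bisect

def range_partition(key, samples):
    """Return the reduce index for `key`.

    `samples` is the sorted list of N-1 sampled keys. Reduce i receives all
    keys with samples[i-1] <= key < samples[i]; reduce 0 gets keys below
    samples[0] and reduce N-1 gets keys at or above samples[-1]. The
    concatenated reduce outputs are therefore globally sorted.
    """
    return bisect.bisect_right(samples, key)
```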
Some numbers - a cluster of 1800 machines, each with two 2GHz Intel Xeon processors with Hyper-Threading enabled, 4GB of memory, two 160GB IDE disks, and a gigabit Ethernet link.
Grep, which scanned through 10^10 100-byte records (1TB in total), took about 150 seconds.
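A rough back-of-the-envelope check (my arithmetic, not a figure from the paper): 10^10 records × 100 bytes = 10^12 bytes ≈ 1TB, so reading it in 150 seconds is an aggregate rate of roughly 6-7 GB/s across the cluster.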
Sorting the same 10^10 100-byte records took around 1000 seconds.
Advantages -
Code for fault tolerance, distribution, and parallelization is kept out of the core task code.
Conceptually simpler.
MapReduce exploits a restricted programming model to parallelize the user program automatically and to provide transparent fault-tolerance.