Pregel: a system for large-scale graph processing
For large-scale graph processing one way to go is, of course, to use Hadoop and code the graph algorithm as a series of chained MapReduce invocations. MapReduce, however, is a functional language, so using MapReduce requires passing the entire state of the graph from one stage to the next, which is inefficient (as I alluded to at the end of this summary ). Google Pregel provides a simple straightforward solution to the large-scale graph processing problems. The Pregel solution is to use round(superstep)-based synchronized computation at the vertices supported with message-passing between the rounds. Pregel keeps vertices and edges on the machines that perform computation, and uses network transfers only for messages. This way Pregel avoids the communication overhead and programming complexity incurred by MapReduce chained iterations. Model In Pregel, in each iteration (superstep), a vertex can receive messages sent to it in the previous iteration, send messages to other vertices