Monday, June 30, 2014

The Best Graph Processing Engine?

Graph processing is very popular these days. But what qualities should the best graph processing engine have? Here I list several naive ideas, which I believe are error-prone and easy to refute, so please give me your feedback.

  1. Good support for both in-memory and out-of-core processing. This requirement comes from the sheer size of graphs: even with a huge cluster, we may still be unable to load the whole graph into memory.
  2. Good support for time-evolving graphs. It is wasteful to rerun a complex computation from scratch on a graph in which only a small portion has changed. Speeding up the processing of time-evolving graphs with a reasonable amount of resources is essential here.
  3. Good support for graph traversal. Map-reduce or scatter-gather style processing is not the only important workload; a traversal that starts from a given vertex and ends at a set of vertices is also critical in many use cases. However, efficient graph traversal may conflict with a divide-and-conquer graph partitioning strategy. Graph traversal is usually considered a feature of graph databases, not of graph processing frameworks. Still, I think it should be weighed carefully before making that decision.
  4. Good support for rich-data graphs. The graph structure itself contains only vertices and the edges between them, but the essential thing is the rich data attached to those vertices and edges. The processing model should be aware of this data and treat vertices and edges differently depending on it.
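
To make points 3 and 4 a bit more concrete, here is a minimal sketch of a traversal that starts from a given vertex and only follows edges whose attached data passes a filter. It is only an illustration under my own assumptions: the toy graph, the "type" edge property, and the traverse function are all hypothetical and do not come from any particular engine. It also keeps everything in one process, so it says nothing about the partitioning problem mentioned in point 3.

```python
from collections import deque

# Hypothetical toy property graph; all names and values are made up for illustration.
# Each vertex maps to a list of (neighbor, edge_properties) pairs.
edges = {
    "a": [("b", {"type": "follows"}), ("c", {"type": "likes"})],
    "b": [("d", {"type": "follows"})],
    "c": [("d", {"type": "links"})],
    "d": [],
}


def traverse(start, edge_filter, max_hops):
    """Breadth-first traversal from start, following only edges accepted by edge_filter.

    Returns a dict mapping each reached vertex to its hop distance from start.
    """
    hops = {start: 0}
    queue = deque([start])
    while queue:
        vertex = queue.popleft()
        if hops[vertex] == max_hops:
            continue  # do not expand beyond the hop limit
        for neighbor, props in edges[vertex]:
            # The edge data decides whether this edge is followed at all.
            if neighbor not in hops and edge_filter(props):
                hops[neighbor] = hops[vertex] + 1
                queue.append(neighbor)
    return hops


# Follow only "follows" edges, at most two hops away from "a".
follows_only = traverse("a", lambda props: props["type"] == "follows", max_hops=2)

# Follow every edge, but stop after one hop.
one_hop = traverse("a", lambda props: True, max_hops=1)
```

The filter is passed in as a function so that the same traversal can treat edges differently depending on their data, which is roughly what i mean by a processing model that is aware of rich data.
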
To be continued...
