Computer Laboratory Systems Research Group

GraphE: Fast, Flexible, and Programmable Graph Processing


Overview

GraphE is a joint project between the Computer Laboratory and EPFL in Switzerland.

Massive datasets increasingly take the form of networks, and graph processing has become central to applications on ever-larger power distribution networks, the Internet backbone, social networks, ground transportation, and protein interaction networks. Graph algorithms are becoming increasingly important for solving many problems in scientific computing, data mining, and other domains. As these problems grow in scale, parallel computing resources are required to meet their computational and memory requirements. Unfortunately, the algorithms, software, and hardware that have worked well for developing mainstream parallel scientific applications are not necessarily effective for large-scale graph problems.

The GraphCam project tackles the difficulties and bottlenecks in large-scale graph processing, ranging from graph algorithm design to graph-parallel computing. It aims to demonstrate how parallel graph algorithms behave at different scales, from individual devices to cluster computing, and to build a framework that integrates these various approaches under the GraphCam Domain Specific Language (DSL), designed for large-scale graph processing and implementable on a variety of architectures. In particular, the following properties of graph problems present significant challenges for efficient parallelism.

  • Data-driven computations. Graph computations are often completely data-driven. The computations performed by a graph algorithm are dictated by the vertex and edge (node and link) structure of the graph on which it is operating rather than being directly expressed in code. As a result, parallelism based on partitioning of computation can be difficult to express because the structure of computations in the algorithm is not known a priori.

  • Unstructured problems. The data in graph problems are typically unstructured and highly irregular. Similar to the difficulties encountered in parallelizing a graph problem based on its computational structure, the irregular structure of graph data makes it difficult to extract parallelism by partitioning the problem data. Scalability can be quite limited by unbalanced computational loads resulting from poorly partitioned data.

  • Poor locality. Because graphs represent the relationships between entities and because these relationships may be irregular and unstructured, the computations and data access patterns tend not to have very much locality. This is particularly true for graphs that come from data analysis. Performance in contemporary processors is predicated upon exploiting locality. Thus, high performance can be hard to obtain for graph algorithms, even on serial machines.

  • High data access to computation ratio. Graph algorithms are often based on exploring the structure of a graph in preference to performing large numbers of computations on the graph data. As a result, there is a higher ratio of data access to computation than for scientific computing applications. Since these accesses tend to have a low amount of exploitable locality, runtime can be dominated by the wait for memory fetches.
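A breadth-first search illustrates several of these points at once: which vertices are touched in each step is dictated by the edges discovered in the previous step, so the access pattern is data-driven, irregular, and low in locality. The sketch below is a hypothetical illustration only (not GraphCam code); the adjacency-list representation and names are assumptions.

```python
from collections import deque

def bfs_levels(adj, source):
    """Data-driven traversal: the set of vertices visited next is
    dictated by the graph structure, not expressed in the code."""
    level = {source: 0}
    frontier = deque([source])
    while frontier:
        v = frontier.popleft()
        for w in adj[v]:          # irregular, low-locality accesses
            if w not in level:
                level[w] = level[v] + 1
                frontier.append(w)
    return level

# Tiny example graph as an adjacency list.
adj = {0: [1, 2], 1: [3], 2: [3], 3: []}
print(bfs_levels(adj, 0))  # {0: 0, 1: 1, 2: 1, 3: 2}
```

Note that the inner loop performs one dependent memory access per edge while doing almost no arithmetic, which is exactly the high access-to-computation ratio described above.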

Storage-Centric View

The determinant of performance in scale-up graph processing on a single system is the speed at which the graph can be fetched from storage: either from disk into memory or from memory into CPU cache. Algorithms that follow edges perform random accesses to the storage medium holding the graph, and this is often the determinant of performance regardless of the algorithmic complexity or runtime efficiency of the actual algorithm in use. A storage-centric viewpoint suggests that the solution lies in recognising that graphs represent a unique workload, and that they should therefore be treated as such by adopting novel ways to access graph-structured data. We approach this problem from two different aspects.

One approach is specific to graphs stored on SSDs and accelerates random access using a novel prefetcher called RASP. The second approach takes a fresh look at how graphs are accessed and suggests that trading the low cost of random access for sequentially streaming a large set of (potentially unrelated) edges can be a winning proposition under certain circumstances; this leads to a system, called X-Stream, for graphs stored on any medium (main memory, SSD, or magnetic disk). RASP and X-Stream therefore take diametrically opposite storage-centric viewpoints of the graph processing problem.
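The streaming trade-off can be sketched in a few lines: instead of chasing edges vertex by vertex (random I/O), each iteration makes one sequential pass over an unordered edge list (sequential I/O). The example below is a minimal label-propagation connected-components sketch in that edge-centric style; it is an illustration of the idea only, not X-Stream's actual implementation, and all names are assumptions.

```python
def cc_streaming(edges, num_vertices, max_iters=100):
    """Connected components via label propagation in the edge-centric
    style: each iteration streams the whole unordered edge list
    sequentially rather than performing per-vertex random accesses."""
    label = list(range(num_vertices))
    for _ in range(max_iters):
        changed = False
        for u, v in edges:        # one sequential pass over all edges
            lo = min(label[u], label[v])
            if label[u] != lo or label[v] != lo:
                label[u] = label[v] = lo
                changed = True
        if not changed:           # converged: no label moved this pass
            break
    return label

edges = [(0, 1), (2, 3), (1, 4)]
print(cc_streaming(edges, 6))  # [0, 0, 2, 2, 0, 5]
```

The sequential pass may touch many edges that contribute nothing in a given iteration, which is precisely the cost being traded against the elimination of random access.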

See [1] for a comparison of the above two approaches. We plan to develop an online algorithm that selects between the two, potentially providing the best of both worlds.

 Source Code Repository

  • tbd.

 Related Projects

 Related Publications

[1] E. Yoneki and A. Roy: Scale-up Graph Processing: A Storage-centric View. ACM SIGMOD - GRADES, New York, USA, June 2013.

[2] A. Roy, K. Nilakant, V. Dalibard, and E. Yoneki: Mitigating I/O Latency in SSD-based Graph Traversal. Technical Report, University of Cambridge, 2012 (UCAM-CL-TR-823).

[3] E. Yoneki and A. Roy: A Unified Graph Query Layer for Multiple Databases. Technical Report, University of Cambridge, 2012 (UCAM-CL-TR-820).

[4] A. Roy, I. Mihailovic, and W. Zwaenepoel: X-Stream: Edge-centric Graph Processing using Streaming Partitions. SOSP, 2013.

[5] K. Nilakant and E. Yoneki: On the Efficacy of APUs for Heterogeneous Graph Computation. EuroSys - SMFA, Amsterdam, 2014.

[6] K. Nilakant, V. Dalibard, A. Roy, and E. Yoneki: PrefEdge: SSD Prefetcher for Large-Scale Graph Traversal. ACM International Systems and Storage Conference (SYSTOR), June 2014.

[7] I. Giechaskiel: Distributed Massive Graph Triangulation. MPhil ACS Thesis, 2014.

[8] I. Giechaskiel, G. Panagopoulos, and E. Yoneki: PDTL: Parallel and Distributed Triangle Listing for Massive Graphs. 44th International Conference on Parallel Processing (ICPP), September 2015. Technical Report version (UCAM-CL-TR-866).

[9] W. Rao, E. Yoneki, and L. Chen: L-Graph: A General Graph Analytic System on a Continuous Computation Model. HotPlanet, 2015.

POSTERS:

[P1] K. Nilakant and E. Yoneki: Active Data Management for Graph Processing. Poster, EuroSys, 2013.

[P2] V. Dalibard and E. Yoneki: Optimizing Graph Computations: Trading Communications for Computations. Poster, EuroSys, 2013.

[P3] E. Yoneki, K. Nilakant, V. Dalibard, and A. Roy: RASP: Large-Scale Graph Traversal with SSD Prefetching. Poster, MSR Big Data Analytics, 2013.

[P4] E. Yoneki, K. Nilakant, V. Dalibard, and A. Roy: SAKYOMI: SSD Prefetcher for Large-Scale Graph Traversal. Poster, SOSP, 2013.

Members

  • George Panagopoulos (2014 Intern Student)

  • Marton Havasi (2014 Intern Student)

  • Weixiong Rao (Former PostDoc - Currently Tongji University in China)

  • Karthik Nilakant (PhD Student)

  • Valentin Dalibard (PhD Student)

  • Amitabha Roy (EPFL)

  • Eiko Yoneki

Contact Email

Please email eiko.yoneki@cl.cam.ac.uk.