Musketeer helps you deal with the diversity by automatically mapping your workflow to different data processing systems at the push of a button.
Our prototype implementation is available for you to try, and already supports seven data processing back-ends: Hadoop, Spark, Naiad, PowerGraph, GraphChi, Metis and serial C code.
Have a look at our experiments and the data sets we used for our EuroSys paper.
is a PhD student in the University of Cambridge Computer Laboratory. His research interests include distributed systems, data processing systems and scheduling.
is currently finishing his PhD at the University of Cambridge Computer Laboratory. His research is primarily on operating systems and scheduling for data centres.
Natacha is a PhD student at MPI-SWS, currently visiting the University of Texas at Austin. Her interests lie at the intersection of distributed systems, distributed computing and databases.
is a PhD student at the University of Cambridge Computer Laboratory. His interests lie in cross-layer optimizations of networks, with a particular focus on network latency.