This experiment uses the NetFlix movie recommendation workflow to compare the performance of Musketeer-generated jobs for Hadoop, Spark and Naiad to hand-written implementations for Hadoop and Spark, and a Lindi implementation on Naiad. Musketeer's auto-generated code is competitive in all cases.


Figure 10

Under construction: we will add information on the experimental setup and our data sets here shortly.

If you are interested in being notified when the data appears, please join our musketeer-announce mailing list.

Thanks for your patience.

-- The Musketeer team.

Experimental setup

This experiment was executed on an Amazon EC2 cluster comprising of 100 instances. Please check the clusters page for more details.

Result data set

The raw results for this experiment are available here.

To plot Figure 10, run the following command:

experiments/plotting_scripts$ python plot_netflix_new.py ../netflix/netflix_new.csv

The graph will be in netflix_runtime.pdf