Computer Laboratory

Large-Scale Data Processing and Optimisation (2020-2021 Michaelmas Term)

LSDPO - R244

Open Source Project Study

 Candidates for Open Source Project Study

The list is not exhaustive. If you would like to study a project that is not on the list, please discuss it with me first. The purpose of this assignment is to understand the proposed architecture, algorithms, and systems by running an actual prototype, and then to present to the others how the prototype runs, how you set it up, and any additional work you have done, including your own applications. This experience will give you a better understanding of the project. Each of these Open Source Projects comes with a set of published papers, and running the prototype lets you examine the aspects of the papers that interest you. Some projects are rather large and may require an extensive environment and considerable time; make sure you will be able to complete this assignment.

Suggested projects are in red colour font.

  1. Ciel https://github.com/mrry/skywriting, https://www.cl.cam.ac.uk/netos/ciel/

  2. Apache Hadoop https://hadoop.apache.org/

  3. DryadLINQ https://research.microsoft.com/en-us/projects/dryadlinq/

  4. MapReduce Online https://code.google.com/p/hop/

  5. Naiad: data-parallel dataflow computation https://research.microsoft.com/en-us/projects/naiad/, and https://github.com/frankmcsherry/timely-dataflow (Rust version)

  6. Apache Giraph: Graph processing based on BSP https://incubator.apache.org/giraph/

  7. Spark: Fast Cluster Computing https://spark-project.org/

  8. X-Stream: https://labos.epfl.ch/x-stream

  9. Storm: https://storm-project.net/

  10. GraphX: https://github.com/amplab/graphx

  11. TensorFlow: https://www.tensorflow.org/

  12. Chaos: https://github.com/epfl-labos/chaos

  13. PyTorch: https://pytorch.org/

  14. CNTK: https://docs.microsoft.com/en-us/cognitive-toolkit/  https://github.com/Microsoft/CNTK

  15. Ray: https://github.com/ray-project/ray, https://github.com/ray-project/ray/tree/master/python/ray/rllib

  16. RLgraph: https://github.com/rlgraph/rlgraph

  17. BoTorch: https://github.com/pytorch/botorch

  18. BOAT: https://github.com/VDalibard/BOAT

  19. Saber: https://github.com/lsds/Saber

  20. Snorkel / FlyingSquid: https://www.snorkel.org/ https://github.com/HazyResearch/flyingsquid

  21. Park: https://github.com/park-project/park (https://openreview.net/pdf?id=BkgfRbEPsE)

  22. Pyro: https://pyro.ai/

  23. Apache Flink: https://flink.apache.org/