Computer Laboratory

Large-Scale Data Processing and Optimisation (2022-2023 Michaelmas Term)

LSDPO - R244

 Candidates for Open Source Project Study

This list is not exhaustive. If you choose a project that is not on the list, please discuss it with me first. The purpose of this assignment is to understand the architecture, algorithms, and systems proposed in a project by running an actual prototype, and then to present to the others how the prototype runs, how you set it up, and any additional work you have done, including your own applications. This experience will give you a better understanding of the project. Each of these open source projects comes with a set of published papers, and running the prototype lets you examine the aspects of those papers that interest you. Some projects are rather large and may require an extensive environment and considerable time; make sure you are able to complete the assignment.

A project title in red is a suggestion, but choose whichever you like. Some projects are more framework-like, while others are more specific approaches. Projects 27 onwards were added this year.

  1. Ciel: https://github.com/mrry/skywriting, https://www.cl.cam.ac.uk/netos/ciel/

  2. Apache Hadoop: https://hadoop.apache.org/

  3. DryadLINQ: https://research.microsoft.com/en-us/projects/dryadlinq/

  4. MapReduce Online: https://code.google.com/p/hop/

  5. Naiad: data-parallel dataflow computation https://research.microsoft.com/en-us/projects/naiad/, and https://github.com/frankmcsherry/timely-dataflow (Rust version)

  6. Apache Giraph: Graph processing based on BSP https://incubator.apache.org/giraph/

  7. Spark: Fast Cluster Computing https://spark-project.org/

  8. X-Stream: https://labos.epfl.ch/x-stream

  9. Storm: https://storm-project.net/

  10. GraphX: https://github.com/amplab/graphx

  11. Tensorflow: https://www.tensorflow.org/

  12. Chaos: https://github.com/epfl-labos/chaos

  13. PyTorch: https://pytorch.org/

  14. CNTK: https://docs.microsoft.com/en-us/cognitive-toolkit/  https://github.com/Microsoft/CNTK

  15. Ray (+RLLib): https://github.com/ray-project/ray, https://github.com/ray-project/ray/tree/master/python/ray/rllib

  16. RLgraph: https://github.com/rlgraph/rlgraph

  17. BoTorch: https://github.com/pytorch/botorch

  18. BOAT: https://github.com/VDalibard/BOAT

  19. Saber: https://github.com/lsds/Saber

  20. Snorkel / FlyingSquid: https://www.snorkel.org/ https://github.com/HazyResearch/flyingsquid

  21. Park: https://github.com/park-project/park (https://openreview.net/pdf?id=BkgfRbEPsE)

  22. Pyro: https://pyro.ai/

  23. Apache Flink: https://flink.apache.org/

  24. Jax: https://github.com/google/jax

  25. Numpyro - building pyro on jax: https://github.com/pyro-ppl/numpyro

  26. Emukit: https://emukit.github.io

  27. PGM: https://github.com/gvinciguerra/PGM-index

  28. ALEX: https://github.com/microsoft/ALEX

  29. OtterTune: https://github.com/cmu-db/ottertune

  30. CDBTune: https://github.com/ZhengtongYan/CDBTune

  31. Bao: https://github.com/learnedsystems/baoforpostgresql

  32. NAS-MCTS (LA-MCTS): https://github.com/facebookresearch/LaMCTS

  33. World-Model: https://worldmodels.github.io/

  34. DRL-Pytorch: https://github.com/p-christ/Deep-Reinforcement-Learning-Algorithms-with-PyTorch

  35. Inverse RL-GAIL: https://github.com/hcnoh/gail-pytorch

  36. MARL: https://www.david-albert.fr/marl/html/index.html

  37. Meta-RL: https://github.com/rlworkgroup/metaworld

  38. SMAC3: https://github.com/automl/SMAC3

  39. EGG: https://egraphs-good.github.io/

  40. TVM: https://github.com/apache/tvm

  41. Ray-Tune: https://github.com/ray-project/ray/tree/master/python/ray/tune