Computer Laboratory

Large-Scale Data Processing and Optimisation (2023-2024 Michaelmas Term)

LSDPO - R244

Candidates for Open Source Project Study

The list is not exhaustive. If you choose a project that is not on the list, please discuss it with me first. The purpose of this assignment is to understand the proposed architecture, algorithms, and systems by running an actual prototype, and then to present and explain to others how the prototype runs, the setup process, and any additional work you have done, including your own applications. This experience will give you a better understanding of the project. These Open Source Projects come with a set of published papers, and running the prototype lets you examine the aspects of the papers that interest you. Some projects are rather large and may require an extensive environment and considerable time; make sure you are able to complete the assignment.

Project titles in red are suggested, but choose whichever you like. Some projects are more framework-like, while others are more specific approaches. Projects 27 and beyond were added this year.

  1. Ciel: https://github.com/mrry/skywriting, https://www.cl.cam.ac.uk/netos/ciel/

  2. DryadLINQ: https://research.microsoft.com/en-us/projects/dryadlinq/

  3. Naiad: data-parallel dataflow computation https://research.microsoft.com/en-us/projects/naiad/, and https://github.com/frankmcsherry/timely-dataflow (Rust version)

  4. Apache Giraph: Graph processing based on BSP https://incubator.apache.org/giraph/

  5. Spark: Fast Cluster Computing https://spark-project.org/

  6. X-Stream: https://labos.epfl.ch/x-stream

  7. Storm: https://storm-project.net/

  8. GraphX: https://github.com/amplab/graphx

  9. TensorFlow: https://www.tensorflow.org/

  10. PyTorch: https://pytorch.org/

  11. CNTK: https://docs.microsoft.com/en-us/cognitive-toolkit/, https://github.com/Microsoft/CNTK

  12. Ray (+RLLib): https://github.com/ray-project/ray, https://github.com/ray-project/ray/tree/master/python/ray/rllib

  13. RLgraph: https://github.com/rlgraph/rlgraph

  14. BoTorch: https://github.com/pytorch/botorch

  15. BOAT: https://github.com/VDalibard/BOAT

  16. Snorkel / FlyingSquid: https://www.snorkel.org/, https://github.com/HazyResearch/flyingsquid

  17. Park: https://github.com/park-project/park (https://openreview.net/pdf?id=BkgfRbEPsE)

  18. Pyro: https://pyro.ai/

  19. Apache Flink: https://flink.apache.org/

  20. JAX: https://github.com/google/jax

  21. NumPyro - building Pyro on JAX: https://github.com/pyro-ppl/numpyro

  22. Emukit: https://emukit.github.io

  23. PGM: https://github.com/gvinciguerra/PGM-index

  24. ALEX: https://github.com/microsoft/ALEX

  25. OtterTune: https://github.com/cmu-db/ottertune

  26. CDBTune: https://github.com/ZhengtongYan/CDBTune

  27. Bao: https://github.com/learnedsystems/baoforpostgresql

  28. NAS-MCTS (LA-MCTS): https://github.com/facebookresearch/LaMCTS

  29. World-Model: https://worldmodels.github.io/

  30. DRL-Pytorch: https://github.com/p-christ/Deep-Reinforcement-Learning-Algorithms-with-PyTorch

  31. Inverse RL-GAIL: https://github.com/hcnoh/gail-pytorch

  32. MARL: https://www.david-albert.fr/marl/html/index.html

  33. Meta-RL: https://github.com/rlworkgroup/metaworld

  34. SMAC3: https://github.com/automl/SMAC3

  35. EGG: https://egraphs-good.github.io/

  36. TVM: https://github.com/apache/tvm

  37. Ray-Tune: https://github.com/ray-project/ray/tree/master/python/ray/tune