Protocol Buffers are Google's data interchange format. They are a way of encoding structured data as a series of bytes. They can be thought of as a sort of binary XML. A particular message type is defined by a ".proto" file. The file format is rather like a C struct but includes explicit tag numbers for each field. Your project would take the ".proto" file and generate ML code with an ML type definition and functions to convert between that type and some ML binary type, probably Word8Vector.vector.
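To make the byte-level encoding concrete, here is a minimal sketch of how a single integer field might be serialised. It is written in Python purely for illustration; the project itself would produce ML code. The sketch assumes a hypothetical message with one int32 field whose tag number is 1, and the function names (`encode_varint`, `decode_varint`, `encode_int_field`) are invented for this example rather than taken from any library. It covers only non-negative integers using the base-128 "varint" scheme.

```python
from typing import Tuple

# Wire type 0 denotes a varint-encoded value in the protocol buffer wire format.
WIRE_TYPE_VARINT = 0


def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as a base-128 varint (low 7 bits first)."""
    if value < 0:
        raise ValueError("this sketch handles only non-negative integers")
    out = bytearray()
    while True:
        low7 = value & 0x7F
        value >>= 7
        if value:
            out.append(low7 | 0x80)  # high bit set: more bytes follow
        else:
            out.append(low7)         # high bit clear: last byte
            return bytes(out)


def decode_varint(data: bytes, pos: int = 0) -> Tuple[int, int]:
    """Decode a varint starting at pos; return (value, position after it)."""
    result = 0
    shift = 0
    while True:
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not (byte & 0x80):
            return result, pos
        shift += 7


def encode_int_field(field_number: int, value: int) -> bytes:
    """Encode one integer field: a key (tag number and wire type), then the value."""
    key = (field_number << 3) | WIRE_TYPE_VARINT
    return encode_varint(key) + encode_varint(value)


if __name__ == "__main__":
    # Field 1 holding the value 150 becomes the three bytes 08 96 01.
    print(encode_int_field(1, 150).hex())
```

The explicit tag number is what appears in the key byte, which is why the ".proto" file must state it for every field. An ML version would presumably build a Word8Vector.vector in the same way.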
Google provides a developer's guide as well as a source code repository that provides implementations of protocol buffers in Java, C++ and Python. Similar ideas have been implemented for OCaml in the form of the Piqi project.
The full library is very large, but the project can identify a subset to start with and larger subsets as extensions.
Google are very interested in this data structure, and would probably be impressed with any student who did this project well!
Last revised: 7 October, 2014