Julien TESSON - research intership

English version

Formal verification of Ocaml's serialisation

Context

Serialization is the process of transforming a graph of objects (we use here object in the broad sens of a datastructure which can contain references to other objects) into a sequence of bytes (or text), such that this sequence can be reconstructed as an identical object graph afterwards.

There are various reasons why serialization is needed – such as saving an object state to persistent storage or streaming an object across a communication link. More generally, an object is serialized whenever the manipulation of an equivalent object in a different program context than its origin is required.

The serialization process can also be referred to as marshalling, pickling, flattening or persisting. To serialize an object, a basic algorithm recursively traverses all the object’s references to have the complete object representation (the complete object graph, rooted at the serialized object). During this process, the first occurrence object results in this object being serialized. The subsequent references to the same entity need to avoid serializing this object again.

Goal

Most of the existing results on serialization [1,2] deal with the encoding and decoding process but not with the traversal of the graph of dependencies between entities in the memory. In this internship we propose to verify the marshalling operation performed in Ocaml, and prove properties of this algorithm concerning both the correct encoding of information and the correct traversal of the graph.

An abstract model of correct serialisation, written in Isabelle/HOL [3], will be provided as a starting point for the specification.

Our objective is to reuse the intership results in the study of the semantics of distributed programming languages and in the proof of their correct implementation.

Work Program

In a first time the intern will have to check the c code of Ocaml's serialisation, to verify that it can be fed to a program verification framework such as FramaC [4], and, if not, do minor code modifications in order to fit the language constraints of the program verification framework.

Then the necessary parts of the formalisation of correct serialisation already written with Isabelle proof assistant will have to be translated into the program verification framework's logic.

And the correctness of the C code with respect to the translated specification will be established.

The next step, if time allows it, will be to use the properties of the serialisation process in the context of parallel multi-threaded execution (Ocaml multicore [5]) in order to assert some correctness properties of the Ocaml multicore library.

This internship is proposed in the context of an ongoing collaboration between Julien Tesson, Assistant professor at university Paris-Est Créteil, and Ludovic Henrio, CNRS researcher at I3S, Scale project.

References

[1]
Alexander Ahern and Nobuko Yoshida. Formalising java rmi with explicit code mobility. In Proceedings of the 20th Annual ACM SIGPLAN Conference on Object-oriented Programming, Systems, Languages, and Applications, OOPSLA ’05.
[2]
Heather Miller, Philipp Haller, Eugene Burmako, and Martin Odersky. Instant pickles: generating object-oriented pickler combinators for fast and extensible serialization. In Proceedings of the 2013 ACM SIGPLAN International Conference on Object Oriented Programming Systems Languages & Applications, OOPSLA 2013.
[3]
Isabelle/HOL. Proof assistant. http://isabelle.in.tum.de/
[4]
FramaC. Source-code analysis of C software. http://frama-c.com/
[5]
Stephen Dolan , Leo White , Anil Madhavapeddy. Multicore OCaml. In the OCaml Users and Developers Workshop. http://ocamllabs.io/doc/multicore.html

Automatic cost analysis of BSP programs

Multi-BSP graph algorithms and implementation

Design of graph algorithms for multi-BSP architectures and implementation of such algorithms using the Multi-ML language.

details here (in french)

Multicore implementation of BSML

Implementation and benchmarks of BSML based on Ocaml Multicore, and ultimately, modification of Ocaml multicore to benefit from the BSP structuration of the parallelism.