Tuesday, December 11, 2012

A reading list on F# and other ML languages


Yin Zhu

This page will be updated regularly.    last updated on 11 Dec 2012.


For F#, Don Syme has posted a blog page:

How to reference F# in a research paper?

Most of the papers therein are, of course, written by Don himself, the creator of F#.

The above page is on F#, particularly on the novel points in F#, e.g. active patterns, async IO, quotations, etc. But the most powerful part of F# is probably its ML heritage. To have a full and authoritative review of the core of F#, we must to back to the original papers of these ideas. This also motivated me to read some of these papers. For a few of them, I have quite a good understanding. However, as I haven’t had any formal training in programming languages (nor do I intend to – data mining is my primary job!), I only have a shallow understanding of most papers. So, I set up this page to keep track on interesting papers that I have read and want to re-read.

General ML

Paul Hudak. The conception, evolution, and application of functional programming languages. ACM Computing Surveys. 1989. pdf

A comprehensive survey. Tutorial on lambda calculus and its relationship to FP languages. Note the year it published, Haskell was not mature at that time.

Andrew Appel. A Critique of Standard ML. 1992. pdf

This paper gives a detailed review on ML language features, mostly from a user’s perspective. The concerns Appel raises are what a typical ML programmer would raise, e.g. the efficiency of the low-level implementation of datatypes.

David B. MacQueen. Reflections on Standard ML. 1993. pdf

This paper is another review on ML, but from a language designer’s perspective.

Module systems

Robert Harper and Benjamin C. Pierce. Design Considerations for ML-Style Module Systems. Chapter 8 in Advanced Topics in Types and Programming Languages. MIT. 2005. book link

An up-to-date review of the ML module system. It starts with a very general introduction, independent of any concrete implementations. At the end of it, it compares different of implementations (SML and OCaml), and compares the ML module system briefly to C and Java. This chapter does not build on the type book of Pierce and can be read alone.

David MacQueen. Modules  for  Standard  ML. 1984. link

Module extension to ML was primary designed by David MacQueen, and this paper is the first paper on ML modules. MacQueen wrote several subsequent papers on ML modules, e.g. implementation issues and separate compilation of SML modules.

Derek Dreyer. Understanding and Evolving the ML Module System. PhD thesis. 2005. pdf

To learn to use a concrete ML module system, the best way is probably to read tutorials and code. For OCaml, check Jason Hickey's introduction to Objective Caml, OCaml manual, many OCaml library designs(Map, Set, and particularly Jean-Christophe FilliĆ¢tre’s ocamlgraph).

Ad-hoc polymorphism

ML does not support typeclass/ad-hoc polymorphism. Attracted by the elegance of Haskell typeclass, many ML programmers want to use typeclass in ML. Without typeclass, ML programmers have to mimic a typeclass, e.g. dictionary passing is used for implementing F# powerpack’s matrix library.

Stefan Wehr and Manuel M. T. Chakravarty. ML Modules and Haskell Type Classes: A Constructive Comparison. 2008. pdf

It implements ML’s full module system using Haskell typeclass; and implement simple typeclass suing ML’s modules. Constructor typeclass is not implemented and everyone said it is hard or impossible (but I haven’t seen on the web why….)

Jeremy Yallop. Practical Generic Programming with OCaml. ML workshop’07. slides.

Another implementation of typeclass (again simple typelcass) using OCaml.

John Peterson and Mark P. Jones. Implementing Type Classes. PLDI’93. paper.

Philip Wadler and Stephen Blott. How to make ad-hoc polymorphism less ad hoc. POPL’88. paper.

The original paper of typeclass actually has suggested a simple implementation using explicit dictionary passing, and this technique as poor man’s ad-hoc polymorphism can be used in ML.

The realpower of Haskell comes from constructor typeclasses – Functor, Monads, etc, are all based on it. (I know how to use them in simple cases, however I have no idea how it is implemented and why it cannot be implemented in .Net; people keep saying that “to support higher-order typeclasses, the IL of .Net needs to be changed” however they never give any clue on why…)

Mark P. Jones. A system of constructor classes: overloading and implicit higher-order polymorphism. FPCA '93. paper.

Inside ML polymorphism

From a programmer’ view, ML polymorphism is quite easy to understand and use. However, normal programmers have little knowledge of the underlying theory and the implementation details of it. Also .Net and Java generics are heavily influenced by ML polymorphism (the core designers are from FP community). If one wants to know the generics in Java or .Net thoroughly, understanding ML polymorphism seems a good pre-course.

Robin Milner. A theory of type polymorphism in programming. Journal of Computer and System Sciences. 1978. pdf

This paper is the second most cited paper in ML literature (the first being the definition of Standard ML), the breakthrough in this paper, plus the ML module system, gives the modern shape of ML. The Hindley-Milner type inference scheme is also described in this paper.

Gilad Bracha, Martin Odersky, David Stoutamire and Philip Wadler. Making the future safe for the past: Adding Genericity to the Java Programming Language. OOPSLA'98.pdf

Andrew Kennedy and Don Syme. Design and Implementation of Generics for the .NET Common Language Runtime. PPLDI’01. pdf

These two papers are not directly related to ML. But since Java and .NET are the two major programming platforms today, understanding their implementations of generics is critical to programmers today.

Courses and tutorials

The following courses are taught by active researchers in this field. The instructors are not only good teachers of functional programming, but also successful researchers.

Cornell CS3110. Data Structures and Functional Programming. Uses OCaml. As the title says, data structures and OCaml are taught in an interleaving way. Well designed assignments. More theoretical stuff, e.g. fixpoints.

Harvard CS51. Introduction to Computer Science II: Abstraction and Design. Uses OCaml. Introductory course. Half OCaml and half Java. Focused on core programming abstractions, e.g. higher-order functions, modules, data types, lazy, objects, parallelism.

IT University of Copenhagen. Programs as data. Uses F# to implement a small compiler. Focuses on compilers implementation. At the end of the course, it also introduces Scala, and advanced abstractions, e.g. continuations and Monads.

Yale CS421. Compilers and Interpreters. Uses SML to teach compilers.

OCaml tutorials/books.