Tuesday, December 11, 2012

A reading list on F# and other ML languages


Yin Zhu

This page will be updated regularly.    last updated on 11 Dec 2012.


For F#, Don Syme has posted a blog page:

How to reference F# in a research paper?

Most of the papers therein are, of course, written by Don himself, the creator of F#.

The above page is on F#, particularly on the novel points in F#, e.g. active patterns, async IO, quotations, etc. But the most powerful part of F# is probably its ML heritage. To have a full and authoritative review of the core of F#, we must to back to the original papers of these ideas. This also motivated me to read some of these papers. For a few of them, I have quite a good understanding. However, as I haven’t had any formal training in programming languages (nor do I intend to – data mining is my primary job!), I only have a shallow understanding of most papers. So, I set up this page to keep track on interesting papers that I have read and want to re-read.

General ML

Paul Hudak. The conception, evolution, and application of functional programming languages. ACM Computing Surveys. 1989. pdf

A comprehensive survey. Tutorial on lambda calculus and its relationship to FP languages. Note the year it published, Haskell was not mature at that time.

Andrew Appel. A Critique of Standard ML. 1992. pdf

This paper gives a detailed review on ML language features, mostly from a user’s perspective. The concerns Appel raises are what a typical ML programmer would raise, e.g. the efficiency of the low-level implementation of datatypes.

David B. MacQueen. Reflections on Standard ML. 1993. pdf

This paper is another review on ML, but from a language designer’s perspective.

Module systems

Robert Harper and Benjamin C. Pierce. Design Considerations for ML-Style Module Systems. Chapter 8 in Advanced Topics in Types and Programming Languages. MIT. 2005. book link

An up-to-date review of the ML module system. It starts with a very general introduction, independent of any concrete implementations. At the end of it, it compares different of implementations (SML and OCaml), and compares the ML module system briefly to C and Java. This chapter does not build on the type book of Pierce and can be read alone.

David MacQueen. Modules  for  Standard  ML. 1984. link

Module extension to ML was primary designed by David MacQueen, and this paper is the first paper on ML modules. MacQueen wrote several subsequent papers on ML modules, e.g. implementation issues and separate compilation of SML modules.

Derek Dreyer. Understanding and Evolving the ML Module System. PhD thesis. 2005. pdf

To learn to use a concrete ML module system, the best way is probably to read tutorials and code. For OCaml, check Jason Hickey's introduction to Objective Caml, OCaml manual, many OCaml library designs(Map, Set, and particularly Jean-Christophe Filliâtre’s ocamlgraph).

Ad-hoc polymorphism

ML does not support typeclass/ad-hoc polymorphism. Attracted by the elegance of Haskell typeclass, many ML programmers want to use typeclass in ML. Without typeclass, ML programmers have to mimic a typeclass, e.g. dictionary passing is used for implementing F# powerpack’s matrix library.

Stefan Wehr and Manuel M. T. Chakravarty. ML Modules and Haskell Type Classes: A Constructive Comparison. 2008. pdf

It implements ML’s full module system using Haskell typeclass; and implement simple typeclass suing ML’s modules. Constructor typeclass is not implemented and everyone said it is hard or impossible (but I haven’t seen on the web why….)

Jeremy Yallop. Practical Generic Programming with OCaml. ML workshop’07. slides.

Another implementation of typeclass (again simple typelcass) using OCaml.

John Peterson and Mark P. Jones. Implementing Type Classes. PLDI’93. paper.

Philip Wadler and Stephen Blott. How to make ad-hoc polymorphism less ad hoc. POPL’88. paper.

The original paper of typeclass actually has suggested a simple implementation using explicit dictionary passing, and this technique as poor man’s ad-hoc polymorphism can be used in ML.

The realpower of Haskell comes from constructor typeclasses – Functor, Monads, etc, are all based on it. (I know how to use them in simple cases, however I have no idea how it is implemented and why it cannot be implemented in .Net; people keep saying that “to support higher-order typeclasses, the IL of .Net needs to be changed” however they never give any clue on why…)

Mark P. Jones. A system of constructor classes: overloading and implicit higher-order polymorphism. FPCA '93. paper.

Inside ML polymorphism

From a programmer’ view, ML polymorphism is quite easy to understand and use. However, normal programmers have little knowledge of the underlying theory and the implementation details of it. Also .Net and Java generics are heavily influenced by ML polymorphism (the core designers are from FP community). If one wants to know the generics in Java or .Net thoroughly, understanding ML polymorphism seems a good pre-course.

Robin Milner. A theory of type polymorphism in programming. Journal of Computer and System Sciences. 1978. pdf

This paper is the second most cited paper in ML literature (the first being the definition of Standard ML), the breakthrough in this paper, plus the ML module system, gives the modern shape of ML. The Hindley-Milner type inference scheme is also described in this paper.

Gilad Bracha, Martin Odersky, David Stoutamire and Philip Wadler. Making the future safe for the past: Adding Genericity to the Java Programming Language. OOPSLA'98.pdf

Andrew Kennedy and Don Syme. Design and Implementation of Generics for the .NET Common Language Runtime. PPLDI’01. pdf

These two papers are not directly related to ML. But since Java and .NET are the two major programming platforms today, understanding their implementations of generics is critical to programmers today.

Courses and tutorials

The following courses are taught by active researchers in this field. The instructors are not only good teachers of functional programming, but also successful researchers.

Cornell CS3110. Data Structures and Functional Programming. Uses OCaml. As the title says, data structures and OCaml are taught in an interleaving way. Well designed assignments. More theoretical stuff, e.g. fixpoints.

Harvard CS51. Introduction to Computer Science II: Abstraction and Design. Uses OCaml. Introductory course. Half OCaml and half Java. Focused on core programming abstractions, e.g. higher-order functions, modules, data types, lazy, objects, parallelism.

IT University of Copenhagen. Programs as data. Uses F# to implement a small compiler. Focuses on compilers implementation. At the end of the course, it also introduces Scala, and advanced abstractions, e.g. continuations and Monads.

Yale CS421. Compilers and Interpreters. Uses SML to teach compilers.

OCaml tutorials/books.


  1. David MacQueen's "Modules for Standard ML" is available on citeseer:


  2. Some more references:

    - "A Syntactic Approach to Type Soundness", Andrew Wright and Matthias Felleisen, 1994

    This is a really important paper as it introduces the now-standard technique of soundness of a type system by establishing preservation and progress. It is also nice to read, gives a very interesting historical overview of previous approaches, and presents the value restriction (while not being the initial reference on the subject, it's earlier work of Wright) which is a very nice design compromise to combine mutable state and polymorphism in a correct yet simple way.

    On the very specific topic of the value restriction, I also like "Relaxing the Value Restriction" of Jacques Garrigue, 2004 ( http://caml.inria.fr/pub/papers/garrigue-value_restriction-fiwflp04.pdf ), because it is a surprising use of subtyping and the idea of variance, even in ML languages that don't actually have much use of subtyping.

    - "A Modular Module System", Xavier Leroy, 2000

    It is the best paper I know on the "old school module systems for ML" (I'm not speaking of the refreshing work by Russo, Dreyer, Rossberg or Montagu that came in the more recent years); I'd use it as an introduction to module systems, instead of the MacQueen reference. The TAPL chapter is also excellent but, if I remember correctly, also relatively advanced.

    - Jacques Garrigue, "Code reuse through polymorphic variants", 2000

    Despite not having been retained in F#, Polymorphic variants are a fairly interesting structural-rather-than-nominal extension of traditional sum types. Like extensible records they offer a different flexibility-at-the-cost-of-complexity trade-off than usual ML datatypes, which may not be the right choice in most applications. This paper is a very nice exposition of the idea and its uses, including a good example of inheritance-like open recursion in ML (William Cook has some more examples of this).

    - If you are interested in the compilation of nested pattern matching in classical ML (or Haskell) compilers, "Compiling Pattern Maching to Good Decision Trees" (Luc Maranget, 2008) ( http://moscova.inria.fr/~maranget/papers/ml05e-maranget.pdf ) is rather easy to read. It does not cover higher-level constructs like views or F# active patterns.

    - Everyone should read "A History of Haskell, Being Lazy with Class", Hudak, Hughes, Peyton-Jones and Wadler, 2007 ( http://research.microsoft.com/en-us/um/people/simonpj/papers/history-of-haskell/history.pdf ).

  3. Just a quick plug: you might include a link to CMU's new intro courses, 15-150 and 15-210, in your courses section. These two courses use SML to teach functional programming, parallelism, specs and verification, data structures, and analysis. Here are links to the course pages: http://www.cs.cmu.edu/~15150/previous-semesters/2012-spring/ and http://www.cs.cmu.edu/afs/cs/academic/class/15210-s12/www/

    1. I am very enjoyed for this blog. I feel strongly about it and love very important information. If possible, as you gain expertise, would you mind updating your blog with more information? It is extremely helpful for me.
      vance book advertisement

    2. I found your blog when I was looking for a different sort of information wonderful collection, but I was very happy and glad to read through your blog. The information available here is great
      subliminal advertising

  4. For module systems, the "F-ing" semantics is the most straightforward presentation that exists. The following paper definitely belongs on that list:

    Rossberg, Russo, and Dreyer. F-ing Modules, TLDI 2010.

  5. if scala can be ported to .net using ikvm...and scala support type classes and high kinded types... .is not .net able to support type classes??..I'm missing something?

  6. recently, i find that J Mitchell's book, Concepts in Programming Languages, and the cs242 course are also very good learning resources.