Saturday, October 4, 2014

A list of references in machine learning and programming languages


Today I have to dispose some papers that I’ve printed during my PhD study. Most of them are research papers or notes, and I had great joy when I first read them. For some of them, I constantly refer back to. Since all these materials are freely available online, I now write down this list and may find them more easily. (In terms of reading, I still prefer papers, however they consume too much space.)

04 Oct 2014

Machine learning/statistics

Minka, A comparison of numerical optimizers for logistic regression.

          Estimating a Gamma distribution.

          Beyond Newton’s method.

Steyvers, Multidimensional scaling.

von Luxburg, A tutorial of spectral clustering.

Das et. al, Google news personalization: scalable online collaborative filtering.

Hofmann, Probabilistic latent semantic analysis.

Deerwester et. al, Indexing by latent semantic analysis.

Yi Wang, Distributed gibbs sampling of latent dirichlet allocation, the gritty details.

Blei et. al, latent dirichlet allocation.

Herbrich et. al, TreeSkill: a bayesian skill rating system.

                      Web-scale bayesian click-through rate prediction for Sponsored search..

                      Matchbox: large scale online bayesian recommendations

Wickham, Tidy data.

                The split-apply-combine strategy for data analysis.

DiMaggio, Mapping great circles tutorial.

Breiman, Statistical modeling: the two cultures. (with comments)

Wagstaff, Machine learning that matters.

Shewchuk, An introduction to the conjugate gradient method without the agonizing pain.

Mohan et. al, Web-search ranking with initialized gradient boosted regression trees.

Burges, From RankNet to LambdaRank to KambdaMART: an overview.



Goodger, Code Like a Pythonista: Idiomatic Python.

C++ FAQ, inheritance.

Wadler: Monads for functional programming.

            How to make ad-hoc polymorphism less ad hoc.

            Comprehending monads.

            The essence of functional programming.

Jones: A system of constructor classes: overloading and implicit higher-order polymorphism.

          Functional programming with overloading and higher-order polymorphism.

          Implementing type classes.

          Monad transformers and modular interpreters.

Yorgey, The Typeclassopedia.

Augustsson, Implementing Haskell overloading.

Damas and Milner, Pricipal type-schemes for functional programs.

Wehr et. al, ML modules and haskell type classes: a constructive comparison.

Bernardy et. al, A comparison of C++ concepts and Haskell type classes.

Garcia et. al, A comparative study of language support for generic programming.

Conchon et. al, Designing a generic graph library using ML functors.

Fenwick, A New Data Structure for Cumulative Frequency Tables.

Petr Mitrchev, Fenwick tree range updates. and his blog.

Flanagan et. al, The essence of compiling with continuations.

Haynes et. al, Continuations and coroutines.

Notes on CLR garbage collector.

Odersky et. al, an overview of the scala programming language.

                       lightweight modular staging: a pragmatic approach to runtime code genration and compiled dsls.

                       javascript as an embedded dsl

                       StagedSAC: a case study in performance-oriented dsl development

F# language reference, chapter 5, types and type constraints.

Remy, Using, undertanding, and unraveling the ocaml language.

Carette et. al, Finally tagless, partially evaluated.

Gordon et. al, A model-learner pattern for bayesian reasoning.

                     Measure transformer semantics for bayesian machine learning

Daniels et. al, Experience report: Haskell in computational biology.

Kiselyo et. al, Embedded probabilistic programming.

Hudak, Conception, evollution, and application of functional programming languages.

Hudak et al, A gentle introduction to Haskell 98.

                  A history of Haskell: Being lazy with class.

Yallop et. al, First-class modules: hidden power and tantalizing promises.

Lammel et. al, Software extension and integration with type classes.

HaskellWiki, OOP vs type classes.

                   All about monads.


Course notes

Tao, linear algebra.

MIT 18.440 Probability and Random Variables.OCW link.

MIT 18.05 Introduction to Probability and Statistics. OCW link.

MIT 6.867 Machine learning. Good problem sets.  OCW link.

MIT 18.01 Single Variable Calculus. OCW link.

Princeton COS424, Interacting with Data.

Stanford CS229. Machine learning.

Cornell CS3110, Data Structures and Functional Programming.

FISH 554 Beautiful Graphics in R. (formally FISH507H, course website not available now)

Haskell course notes, CS240h, Stanford. recent offering.

Liang Huang’s course notes, Python, Programming Languages.


Balanced Search Trees Made Simple (in C#)

Fast Ensembles of Sparse Trees (fest)

An implementation of top-down splaying.

Largest Rectangle in a Histogram.