A friend passed me a link to Sho, a data processing and number crunching tool for the .NET platform, developed at Microsoft Research. I am quite impressed by this tool and would like to share a post about it here.
The frontend of Sho is IronPython. Besides the standard Python core language and its libraries, Sho also integrates part of the Intel MKL library, which supports linear algebra operations.
There are rumors that there will be a machine learning/data mining library based on Sho in the future.
Key features of Sho
1) IronPython, with a console that offers some IntelliSense
2) libraries:
* Matrix and linear algebra (Intel MKL©)
both 32bit & 64bit support + multi-threaded
* Statistics (Sho Library)
* Signal processing (Sho Library)
* Optimization (Microsoft Solver Foundation)
3) Interactive visualization
The official Sho site has some examples and good documentation, including a reference book.
IronPython?
Its language design and open nature make Python a very suitable language for scientific computing. I know that many researchers in the data mining area use Python, at least occasionally, for their research. Three friends of mine use Python for their research, and they are also very good at Matlab. One told me that his development speed in Python (SciPy/NumPy) and in Matlab is about the same. As for running speed, SciPy uses ATLAS + LAPACK, which is about the same as Matlab.
But I think the real power of Python is that your prototype is your deployment. If your ecosystem uses .NET, your number crunching module in IronPython is ready to be used in the system, while for Matlab deployment is not that easy.
I think IronPython + Sho would be as good as Python+SciPy/Numpy.
I personally don’t much favor dynamic languages. It is the libraries Sho uses that make Sho excellent; F# with the Sho libraries rocks too.
Using the Sho API in F#
The following is a sample F# script showing how to use the Sho libraries from F#.
// set the env variable SHODIR to the root of Sho's installation folder
System.Environment.SetEnvironmentVariable("SHODIR", @"C:\Program Files (x86)\Sho 2.0 for .NET 4")
#r @"C:\Program Files (x86)\Sho 2.0 for .NET 4\bin\ShoArray.dll"
#r @"C:\Program Files (x86)\Sho 2.0 for .NET 4\bin\MatrixInterf.dll"
#time
open ShoNS.Array
open System
let rnd = new Random(1)
let vec = new DoubleArray(3)
vec.FillRandom(rnd)
let arr2 = new DoubleArray(3,3)
arr2.FillRandom(rnd)
// big matrix multiplication
let bigArr = new DoubleArray(1200,20000)
bigArr.FillRandom(rnd)
let bigArr2 = bigArr.Transpose()
let bigArrMulti = bigArr * bigArr2
// multi-threaded:
// Real: 00:00:01.745, CPU: 00:00:10.389, GC gen0: 0, gen1: 0, gen2: 0
// svd
let svd = new SVD(arr2)
printfn "U = %A\nD = %A\nV = %A" svd.U svd.D svd.V
// lu
let lu = new LU(arr2)
printfn "L = %A\nU = %A" lu.L lu.U
Notice that Sho's linear algebra library is actually licensed from Intel MKL, which is multi-threaded! (Matlab also uses Intel MKL as its basic linear algebra library.)
Basic operations like matrix multiplication are very fast when they can use more cores, as illustrated above.
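One rough way to read the #time line from the script is to divide the CPU time by the real (wall-clock) time: a ratio well above 1 suggests that several cores were busy at once. The small sketch below does that arithmetic in plain Python. The helper names to_seconds and cpu_to_real_ratio are my own and are not part of Sho or F#, and the ratio is only an approximation because the CPU time covers the whole process.

```python
import re

# Timing line copied from the F# Interactive output in the script above.
TIMING_LINE = "Real: 00:00:01.745, CPU: 00:00:10.389, GC gen0: 0, gen1: 0, gen2: 0"


def to_seconds(stamp):
    """Convert an hh:mm:ss.fff stamp into seconds."""
    hours, minutes, seconds = stamp.split(":")
    return int(hours) * 3600 + int(minutes) * 60 + float(seconds)


def cpu_to_real_ratio(line):
    """Return CPU time divided by wall-clock time for one #time line."""
    match = re.search(r"Real: ([\d:.]+), CPU: ([\d:.]+)", line)
    if match is None:
        raise ValueError("not a #time line: %r" % line)
    real, cpu = (to_seconds(group) for group in match.groups())
    return cpu / real


if __name__ == "__main__":
    # 10.389 s of CPU time spent within 1.745 s of wall-clock time.
    print("CPU/real ratio: %.2f" % cpu_to_real_ratio(TIMING_LINE))
```

For the run above the ratio comes out just under 6, which is consistent with the multiplication being spread over several cores.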
Note
Fsi.exe runs in 32-bit mode, so you cannot create large matrices and other large objects in F# Interactive. There are workarounds here.
Would you mind comparing this Sho API with your LAPACK/F# PowerPack solution? Thanks.
@ elton
Sho API should be more stable than F# Math Providers. Sho API also has 64-bit.
The Sho API is a general .NET library, whereas the F# math provider targets F# only; the math provider is also more flexible in its operations (you can change its source code).
Sho licenses the MKL library from Intel, but I think there are license restrictions on its usage.
Hi Yin
How would you compare the sho libraries to those currently available in the open source .NET numerics project (also allows linking to optimized math libraries like MKL). I'm trying to decide which one to use as the base for my numerical work...
@lasami
Generally speaking, the implementation in Sho is more stable than other open source projects. But Sho is closed source.
You can also take a look at http://ilnumerics.net/.