1. BLAS: basic linear algebra, e.g. matrix multiplication
2. LAPACK: higher-level dense linear algebra, e.g. SVD and full eigendecomposition
3. ARPACK: sparse/iterative linear algebra, i.e. computing only a few eigenvalues / eigenvectors (the snippet after this list sketches how these levels map to R calls)
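A rough sketch of how the three levels correspond to R calls (the eigs_sym call assumes the rARPACK package discussed below is installed):

library(rARPACK)                         # ARPACK-style iterative solvers
A <- matrix(rnorm(1000 * 1000), 1000, 1000)
B <- crossprod(A)                        # t(A) %*% A: matrix multiplication -> BLAS
e_full <- eigen(B, symmetric = TRUE)     # full eigendecomposition -> LAPACK
e_top5 <- eigs_sym(B, k = 5)             # only the 5 largest eigenpairs -> ARPACK-style solver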
OpenBLAS and Intel MKL are implementations of these libraries. In C++, Armadillo is an interface to BLAS, LAPACK, and ARPACK. In R, Rcpp and RcppArmadillo provide an interface to Armadillo. Unfortunately, Armadillo's eigs_sym function does not work from R, because R is not linked against ARPACK, as explained here.
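For reference, here is a minimal sketch of calling Armadillo's dense symmetric eigendecomposition (which goes through LAPACK, so it does work from R) via RcppArmadillo; the function name arma_eigvals is just an illustrative choice:

library(Rcpp)
cppFunction(depends = "RcppArmadillo", '
  arma::vec arma_eigvals(const arma::mat& X) {
    arma::vec eigval;
    arma::eig_sym(eigval, X);   // dense symmetric eigendecomposition via LAPACK
    return eigval;
  }
')
X <- crossprod(matrix(rnorm(100 * 100), 100, 100))
tail(arma_eigvals(X))   # Armadillo returns eigenvalues in ascending order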
In the end, I found rARPACK, an R interface to ARPACK that uses the Lanczos algorithm. The irlba package, written in pure R, uses a similar algorithm to do the same thing, but more slowly. Using the full eigendecomposition function instead would be very slow on a large matrix.
A few benchmarks, run in RStudio on a 2014 13-inch MacBook Air:
Note that on a Mac, to use multithreaded linear algebra, you may need to replace R's default BLAS library with Apple's BLAS implementation, Accelerate, following this guide.

library("corpcor")
library("rARPACK")
#library(rbenchmark)
library(microbenchmark)
library(irlba)

# generate a large symmetric matrix
n = 5000
X = matrix(rnorm(n*n), n, n)
X = t(X) %*% X

# full eigendecomposition (LAPACK)
system.time( (s1 = eigen(X, symmetric = TRUE)) )
#    user  system elapsed
#  58.487   1.042  35.895

# full SVD, keeping 5 left singular vectors (LAPACK)
system.time( (s2 = svd(X, nu = 5, nv = 0)) )
#    user  system elapsed
# 304.342   5.497 138.519

# corpcor's fast.svd (LAPACK)
system.time( (s3 = fast.svd(X)) )
#    user  system elapsed
# 396.007   8.395 187.781

# top 5 eigenpairs with rARPACK
system.time( (s4 = eigs_sym(X, 5)) )
#    user  system elapsed
#   1.909   0.029   2.013

# top 5 singular triplets with rARPACK
system.time( (s5 = svds(X, k = 5)) )
#    user  system elapsed
#   7.686   0.082   4.113

# top 5 left singular vectors with irlba
system.time( (s6 = irlba(X, nu = 5, nv = 0)) )
#    user  system elapsed
#  27.559   0.684  12.167
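As a quick sanity check (a sketch, reusing the s1 and s4 objects from the benchmark above), the 5 eigenvalues returned by eigs_sym should match the leading values of the full eigen() decomposition up to floating-point tolerance:

all.equal(s1$values[1:5], s4$values)
# expected: TRUE (up to numerical tolerance)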