Multiple Linear Regression With Sparse Matrices In R
Alternate working title: For the love of god would it hurt them to write a fucking HOWTO?
I am not a mathematician. I'm just a humble engineer who happens to have a really big linear model that needs regressin'. And I have now spent the better part of 3 days figuring out how to do it in R so that you don't have to.
Step 1: Find yourself a sparse matrix representation. R, helpfully enough, has two popular packages for sparse matrices: Matrix and SparseM. Matrix is included in the stock R distribution these days, which means that you should use that package. Except... the best algorithm for fitting a sparse model makes use of the Cholesky decomposition of the matrix, a function which is currently busted:
cholesky_decomp <- chol(mm_cross) Error in .local(x, ...) : temporarily disabled
Of course, you wouldn't discover that until after you'd spent a couple of hours figuring out how to massage your data into the appropriate format. So... SparseM it is.
Step 2: Massage your data. I happen to be pulling my data out of a database, which makes it easy for me to create <i,j,value> triplets. That being the case I stuffed my data into a matrix.coo object, the SparseM native format for matrices stored in coordinate form. Here's how that works... suppose you have your data in a data frame like this:
i | j | x ----------- i_1|j_1|x_1 i_2|j_2|x_2 ...
Assume that the overall matrix is n X m in size. The appropriate call to create the sparse matrix is
sparse_matrix.coo <- new( "matrix.coo", ia = my_data_frame$i, ja = my_data_frame$j, ra = my_data_frame$x, dimension = c(n,m) )
So far so good? Alright... the slm.fit function requires a matrix in "compressed sparse row" (csr) format, so convert your coo matrix:
sparse_matrix.csr <- as.matrix.csr(sparse_matrix.coo)
Step 3: Fit it. Suppose you have your response data in a plain ol' (i.e. not sparse) vector. The call for the fit function is then
fit_result <- slm.fit(sparse_matrix.csr,response_vector)
The model is fitted! Great rejoicing and merriment!
That is all... be glad you didn't have to RTFM.