Data Structure for Matrix Implementation

kirylm · 9/2/09

I was wondering if anyone has any thoughts on which is the best data structure to use for matrix implementation in C++ (in case I decide to do it

). The first thing that comes to mind, vector of vectors, is not a good idea in the end, since column access will be somewhat cumbersome. Another way would be use valarray or even implement you own array-like or vector-like class. Any opinions and/or suggestions from prior experience?

alain · 9/2/09

Kiryl, use uBlas Matrix, or other matrix implementation. Don't waste your time writing your own.

Eugene Krel · 9/2/09

While it's nice to say you did it, there is a lot of work involved in a properly written matrix library. Use uBlas for c++ and be done with it.

If you have extra time for some reason, then abstract out the underlying container and user vector of vectors or whatever you want. The user shouldn't care what the underlying storage is, just that he can access X(1,1) when needed.

tobias elbert · 9/2/09

Generally, I'd agree with Alain and wouldn't waste my time. However, if it is for your personal learning and understanding, then I'd do it.

To answer your question, I'd always build a class as you can attach very useful members to it such as an inversion function, operators (for element access, arithmetics, ...), access to entire columns or rows...

If you deal with double types exclusively, I'd overload a few constructors taking double type pointers as arguments, e.g void Matrix::Matrix(double **m,...).

John Jurcevic · 9/2/09

Use a one dimensional array/vector and let your accessor methods take care of figuring out how to access a particular matrix element. Think about having a base class and then sub-classes for vectors, upper/lower-triangular, diagonal, banded, banded upper/lower-triangular, and sparse matrices. Think about an easy implementation for the transpose. Make sure you can deal with the 1000000x1000000 sparse matrix that Dan will give you on the final exam.

Or just use some matrix library like uBlas...

Andy Nguyen · 9/2/09

I came from the 2006 class that had to build our own matrix from scratch and it was like the biggest thing we did then. I think I learn a lot about coding, troubleshooting that way.
It took a lot of time. There are always tradeoff between using a package and DIY.

Eugene Krel · 9/2/09

John Jurcevic said:
Use a one dimensional array/vector and let your accessor methods take care of figuring out how to access a particular matrix element. Think about having a base class and then sub-classes for vectors, upper/lower-triangular, diagonal, banded, banded upper/lower-triangular, and sparse matrices. Think about an easy implementation for the transpose. Make sure you can deal with the 1000000x1000000 sparse matrix that Dan will give you on the final exam.

Or just use some matrix library like uBlas...

That was only one homework John.

You will have other things to worry about than building a library from ground up. This doesn't mean you should use package algorithms for things like Cholesky and LU. Those should be implemented by you.

John Jurcevic · 9/2/09

Dude, you're spoiling my fun... So it was only on the homework and not on the final exam but it was actually a 1638400x1638400 banded matrix (very sparse).

You are right about implementing the all the algorithms on your own.

bob · 9/2/09

Andy said:
I came from the 2006 class that had to build our own matrix from scratch and it was like the biggest thing we did then. I think I learn a lot about coding, troubleshooting that way.
It took a lot of time. There are always tradeoff between using a package and DIY.

Yeah, it was a lot of work, but fun for me, since I was a C++ newb coming into the program. It was a great introduction to how (and how not) to build a class. I used mine all the way through Dan's courses.

But eventually I had to move away from it. The watershed was trying to implement QR decomposition myself for getting eigenvalues and eigenvectors of matrices. IMHO it is not at all easy to implement this well.

My advice would be to start off with your own implementation if doing that interests you, but around the semester break grab a library and get comfortable with it. You'll always work faster if you use the available tools without having to build them yourself; but you'll use those tools more wisely if you have a good understanding of how they work.

Andy Nguyen · 9/2/09

Then it begs this issue:
You all know how we try to debug our code by comparing the output among others for a specific set of input.
How would you do so efficiently among various packages and DIYers?

I can already imagine incoming flush of "my numbers only matching up with yours up until 6th decimal place". I can only smile and reminisce about the good old days

Do we still need to match each other perfectly till 20th decimal place or so?

Eugene Krel · 9/2/09

Well there is always matlab or something similar to that to check decimal precision.

John Jurcevic · 9/2/09

Andy said:
Do we still need to match each other perfectly till 20th decimal place or so?

Last year, I think we tried to match our results out to at least the 9th decimal place. It actually did happen even though we all implemented our own matrix code and algorithms in quite disparate ways.

Stefan Zota · 9/2/09

Vectors of vectors provides very slow access time. It has also storage limitations.

Fastest solution is to use a simple array. This provides quick access (continous memory), optimal storage (no fragmentation).
The downside is that you need to do some work to write all functions needed for a matrix. This gets trickier when you are looking at banded/sparse matrices. In the end, I can tell you from my own experience it is worth it. You can process 10k * 10k matrices on a personal computer.

Alain's solution is good too, it depends if you want to practice C++ or you focus on numerical portion.

alain · 9/2/09

IMHO, there are more important things to learn than implementing a full Matrix class. I spent countless hours doing it (I did using valarrays following Stroustroup ideas). I'm convinced that using an off the shelf implementation will save you time.

There are plenty of matrices implementations out there. Pick one that you like and write all the algorithms against it. Avoid using the algorithms that come with those libraries so you learn the numerical side. Implementing a matrix class won't teach any of the numerical side.

ExSan · 9/2/09

consider this too
O-Matrix: High Performance Data Analysis and Modeling

alain · 9/3/09

O-Matrix is a paid product.... SKIP IT. There are plenty of free good implementations.

Data Structure for Matrix Implementation

kirylm

alain

Older and Wiser

Eugene Krel

sunmulA

tobias elbert

John Jurcevic

Andy Nguyen

Eugene Krel

sunmulA

John Jurcevic

bob

Faculty (Undercover)

Andy Nguyen

Eugene Krel

sunmulA

John Jurcevic

Stefan Zota

alain

Older and Wiser

ExSan

alain

Older and Wiser