
Computing to the billions

I tried to build a 1e6 by 1e6 matrix and write it to a file, and the program crashed, giving me the message below. How should I code this so that I can build and compute with my matrices? I am looking at matrix dimensions of 1e6 and larger.

Thanks.

[Image: screenshot of the error message, 2ai205x.jpg]


C++:
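int main()
{
	int i, j, nmat;
	double *b, **a4;
	nmat = 1000000;

	// Init_Matrix, Init_Vector and writeToCsvFile_matrix are helpers defined elsewhere.
	Init_Matrix(a4, nmat);
	Init_Vector (b, nmat);
	for (i = 0; i<nmat; ++i) {
		a4[i][i] = 4;	// diagonal: the original set 4 for both even and odd i
		// Guard the first and last rows so the off-diagonal writes stay in bounds.
		if (i > 0 && (i-1)%10 != 0) a4[i][i-1] = -1;
		if (i+1 < nmat && i%10 != 0) a4[i][i+1] = -1;
	}

	for (i = 0; i<nmat; ++i) {
		if (i%5 == 0) b[i] = 2;
		else b[i] = 1;
	}

	writeToCsvFile_matrix("a.dat", a4, nmat);
	return 0;
}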
 
Assuming you're running 32-bit XP, you have only 2^32 addressable memory locations, which is about 4 billion.

A double is 8 bytes, so you're asking for 8,000,000,000,000 bytes, or roughly 2,000 times more memory than XP can address.
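The arithmetic above can be checked with a few lines of C++. This is only a sketch: `dense_bytes`, `kAddressSpace32` and `address_spaces_needed` are names made up for illustration, and it assumes an 8-byte `double`.

```cpp
#include <cassert>
#include <cstdint>

// Bytes needed for a dense n-by-n matrix of doubles (hypothetical helper).
// 64-bit arithmetic matters here: the result does not fit in a 32-bit integer.
inline std::uint64_t dense_bytes(std::uint64_t n) {
    assert(n <= 1000000000ULL);  // keeps n * n * 8 inside 64 bits
    return n * n * sizeof(double);
}

// Size of a 32-bit address space: 2^32 bytes.
const std::uint64_t kAddressSpace32 = std::uint64_t(1) << 32;

// How many whole 32-bit address spaces the dense matrix would fill.
inline std::uint64_t address_spaces_needed(std::uint64_t n) {
    return dense_bytes(n) / kAddressSpace32;
}
```

For `n = 1000000` this gives 8,000,000,000,000 bytes, which is 1,862 full 32-bit address spaces, in line with the "about 2,000 times" figure.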
 
How should I write C++ code that is both fast and space-efficient for such a large matrix? For example, should I use vectors or deques instead of pointers? Is C++ the better language for handling large matrices, or Matlab? In various forums, comments are generally favourable towards Matlab, and in terms of speed the difference is not stark.
 
PeikLooi said:
How should I write C++ code that is both fast and space-efficient for such a large matrix? For example, should I use vectors or deques instead of pointers? Is C++ the better language for handling large matrices, or Matlab? In various forums, comments are generally favourable towards Matlab, and in terms of speed the difference is not stark.
If you need to code a 1e6 x 1e6 matrix, you won't be coding on Windows. It has nothing to do with Matlab or C++.
 
Peik Looi said:
I tried to write a file for 1e6 by 1e6 and it literally crashed, giving me the message below. I am looking along the lines of computing values larger than 1e6.
Is your algorithm working correctly for smaller, manageable sizes? How about 5x5, 10x10, 100x100, 1000x1000, etc.? If the algorithm is incorrect, it will be impossible to debug given the huge size of your output.
Alain said:
If you need to code a 1e6 x 1e6 matrix, you won't be coding on Windows. It has nothing to do with Matlab or C++.
As Dominic and Alain said, doing \(10^6 \times 10^6\) is impossible for the 32-bit XP we are using. You can wait for 64-bit Vista to come out which can utilize up to \(2^{64}\) memory addresses.
 
Although a 64-bit O/S will handle multi-terabyte arrays, it's not really likely that a student (or mid-sized bank) could afford to run it at an acceptable speed.
There do exist techniques for matrices that are huge but not fully populated.
One thing that C got wrong, but Basic, Fortran and Pascal got right is using the same notation for functions as for arrays.
C++ (nearly) rectified that error.

You can implement a function like square root or e^x as an array; indeed I have done so when desperately pushed to squeeze out performance. Of course it's a big array...
This works both ways, you can write a function that overloads the [] operator, and only store the values that actually get used.
The way to think of it is that [] is a function that maps one integer to one floating point value.

You can use something like a linked list that stores a mapping between the index and the stored value.

This allows you to have arrays whose domain is huge, but where you only actually store the values you actually use.

It is of course very much slower than simple memory arrays, but will get there in the end.
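As a rough sketch of the idea above: the class below stores only the entries that are actually written. `SparseMatrix` is a made-up name, not a library class. It swaps in a `std::map` for the linked list, and overloads `operator()` rather than `[]` because a matrix needs two indices.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <utility>

// Hypothetical sparse matrix: only entries that have been written are stored.
class SparseMatrix {
public:
    explicit SparseMatrix(std::size_t n) : n_(n) {}

    // Read access: absent entries read as zero and are not inserted.
    double operator()(std::size_t i, std::size_t j) const {
        assert(i < n_ && j < n_);
        auto it = data_.find({i, j});
        return it == data_.end() ? 0.0 : it->second;
    }

    // Write access: creates the entry on first use.
    double& operator()(std::size_t i, std::size_t j) {
        assert(i < n_ && j < n_);
        return data_[{i, j}];
    }

    std::size_t size() const { return n_; }            // logical dimension
    std::size_t stored() const { return data_.size(); } // entries actually held

private:
    std::size_t n_;
    std::map<std::pair<std::size_t, std::size_t>, double> data_;
};
```

With this, `SparseMatrix a(1000000); a(0, 0) = 4;` costs one map node rather than 8 TB. One caveat of the design: reading through a non-const object uses the write overload and inserts a zero, so read through a `const SparseMatrix&` when you only want to look.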

Of course the "list" need not be in memory.
You can read & write values to disk, and thus only be bounded by disk size.
A bit of intelligence in the code will reduce the amount of disk activity by orders of magnitude.
Before long you will have implemented your own virtual memory system.
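A bare-bones sketch of the disk idea, with none of the "intelligence" (no caching, every access hits the file). `DiskArray` is a hypothetical name. It assumes the platform zero-fills any gap created by writing past the end of the file, which is the usual behaviour but is not guaranteed by the C++ standard.

```cpp
#include <cassert>
#include <cstddef>
#include <fstream>
#include <string>

// Hypothetical disk-backed array: element i lives at byte offset i * sizeof(double).
class DiskArray {
public:
    // Creates (or empties) the file and opens it for random-access reads and writes.
    explicit DiskArray(const std::string& path)
        : file_(path, std::ios::in | std::ios::out | std::ios::binary | std::ios::trunc) {
        assert(file_.is_open());
    }

    void set(std::size_t i, double value) {
        file_.clear();  // reset any earlier end-of-file state
        file_.seekp(static_cast<std::streamoff>(i * sizeof(double)));
        file_.write(reinterpret_cast<const char*>(&value), sizeof(double));
        file_.flush();
    }

    // Elements beyond the end of the file read as zero.
    double get(std::size_t i) {
        double value = 0.0;
        file_.clear();
        file_.seekg(static_cast<std::streamoff>(i * sizeof(double)));
        file_.read(reinterpret_cast<char*>(&value), sizeof(double));
        if (file_.gcount() != static_cast<std::streamsize>(sizeof(double))) value = 0.0;
        return value;
    }

private:
    std::fstream file_;
};
```

Keeping recently used blocks in memory and writing them back in batches is where the orders-of-magnitude savings in disk activity would come from.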

Another way of mapping arrays into some sort of code: if some values in your vast matrix are defined by a rule, such as the diagonal being all zeros, then don't store these values, and merely call the code that knows what those values should be.
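Applied to the matrix in the original post, a sketch might look like this. `entry` and `multiply` are made-up names; the rules are copied from the loop in the question (with the out-of-range writes in the first and last rows left out), so nothing is stored except the vectors.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Entry (i, j) of the banded matrix from the original post, computed on demand:
// 4 on the diagonal; -1 just left of it unless (i-1) is a multiple of 10;
// -1 just right of it unless i is a multiple of 10; zero everywhere else.
double entry(std::size_t i, std::size_t j) {
    if (i == j) return 4.0;
    if (i > 0 && j == i - 1 && (i - 1) % 10 != 0) return -1.0;
    if (j == i + 1 && i % 10 != 0) return -1.0;
    return 0.0;
}

// y = A * x, touching only the three possibly non-zero entries in each row.
std::vector<double> multiply(const std::vector<double>& x) {
    const std::size_t n = x.size();
    std::vector<double> y(n, 0.0);
    for (std::size_t i = 0; i < n; ++i) {
        y[i] = entry(i, i) * x[i];
        if (i > 0)     y[i] += entry(i, i - 1) * x[i - 1];
        if (i + 1 < n) y[i] += entry(i, i + 1) * x[i + 1];
    }
    return y;
}
```

For n = 1e6 this needs two vectors of about 8 MB each instead of an 8 TB matrix, and a product of this kind is the building block of iterative solvers.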

None of this is trivial, but these are mature techniques.
 
Excerpt from a discussion I had:

Which language is best?
C++ is the language of choice for most
large application programs, including Microsoft's Office and Mozilla. It
produces code that is easy to read, write, and remember. This works for
everything, not just matrix analysis. Download LAPACK and look at some of the
tests using it, then download ppLinear and look at some of its test programs.
Which is easier to follow? Download the Cephes library (written mostly in
C) and compare it as well. The advantages are reasonably clear.
Most people use Fortran because they want to use libraries
like LAPACK and many others written years ago. Some front-end their C and
C++ libraries with LAPACK. I don't think they would use it on an entirely
new application. There are also groups at universities that are tweaking
and multi-threading BLAS. This is for matrices greater than 5000x5000 and
very high flop rates. Very specialized. Guys who do this won't touch C++.
MATLAB (and its clones like O-Matrix) are great for doodling around with
linear algebra. All the schools use it. It's the way to go. Everyone is
writing m files and handing them around. Somewhere along the line you may
have to convert m files to C++, but the languages are so close you can almost
do it on sight.
bottom line -> must learn C++
must buy MATLAB
get the Schaum's outline for Fortran
just in case.
Bill
 