

One last detail: what number should I place for nPartitions? I'm doing this for (300,1000)x(1000,500) matrices. I haven't looked at your code in detail, so it'd be a shortcut if you could state it. @tobias elbert
 
Wow, the result is amazing. See the photos and let's continue the discussion.
Initially I set nPartitions to 6 and got almost the same result as mine. Increasing nPartitions gets the time close to 0 sec. Thanks, Tobias. I wonder one thing: what is a reasonable nPartitions to stop at? For example, setting it to 5000 and 7000 gives the same result. Do you know any way...
 

Attachments

  • TibiasSuper.png (155.6 KB)
  • Tsotne.png (207.8 KB)
Ah, OK... The middle one didn't work because you set nPartitions to 1000. That basically kicks off 0 threads, hence the quick result; the resulting matrix should be empty in that case. nPartitions should never be larger than the number of rows in matrix A. I also think you need to make the matrices much larger to see better results.

By the way, I should have stated this more clearly: my example code is really just a demonstration of how you could use multiple threads for matrix multiplication. It could be implemented much better.
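The row-partitioning idea described above can be sketched as follows in C++ with std::thread. This is my own illustration, not the original posted code: the names multiplyBlock and multiplyParallel are assumptions, and nPartitions is clamped to the row count of A, matching the rule that it should never exceed the number of rows:

```cpp
#include <cstddef>
#include <thread>
#include <vector>

using Matrix = std::vector<std::vector<double>>;

// Multiply rows [rowBegin, rowEnd) of A by B, writing the results into C.
void multiplyBlock(const Matrix& A, const Matrix& B, Matrix& C,
                   std::size_t rowBegin, std::size_t rowEnd) {
    const std::size_t n = B.size();     // inner dimension
    const std::size_t p = B[0].size();  // columns of B
    for (std::size_t i = rowBegin; i < rowEnd; ++i)
        for (std::size_t j = 0; j < p; ++j) {
            double sum = 0.0;
            for (std::size_t k = 0; k < n; ++k)
                sum += A[i][k] * B[k][j];
            C[i][j] = sum;
        }
}

// Split the rows of A into nPartitions chunks, one thread per chunk.
Matrix multiplyParallel(const Matrix& A, const Matrix& B,
                        std::size_t nPartitions) {
    const std::size_t m = A.size();
    Matrix C(m, std::vector<double>(B[0].size(), 0.0));
    if (nPartitions > m) nPartitions = m;  // never more partitions than rows
    const std::size_t chunk = m / nPartitions;
    std::vector<std::thread> threads;
    for (std::size_t t = 0; t < nPartitions; ++t) {
        std::size_t begin = t * chunk;
        std::size_t end = (t + 1 == nPartitions) ? m : begin + chunk;
        threads.emplace_back(multiplyBlock, std::cref(A), std::cref(B),
                             std::ref(C), begin, end);
    }
    for (auto& th : threads) th.join();
    return C;
}
```

Because each thread writes only its own rows of C, the partitions never touch the same memory and no locking is needed.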
 
Amazing indeed: you managed to extract a whopping 100 MFLOPS out of a machine that is probably north of 50 GFLOPS peak.

Actually, I was new to multi-threading and parallel programming, so I'm going to dig into it and rewrite all my remaining code that involves heavy number crunching to use it.
 
I can multiply 5000x5000 matrices in 1 second in Matlab. What is the difference between C++/C# and Matlab that makes Matlab so much faster? Can I achieve the same speed in C++/C#?
 

This question was answered for you, some months ago, on this thread: Matlab is faster because it probably uses an optimized version of the BLAS library for matrix multiplication. You cannot come even close to this speed (in case you didn't get it, I was being ironic in my previous post about the "impressive" speeds of the C# code posted here) by coding matrix multiplication as three for loops. To reach that level of speed, you'd have to utilize the SSE processing units, take great care to reorganize the multiplication code with regard to caching, etc. - so you'd have to be very, very knowledgeable about code optimization before even thinking about approaching such a task (alternatively, if you're a C++ wizard, maybe you could come close by employing some template meta-programming magic, as in Eigen or similar libraries). For these reasons, for vector/matrix operations, one should always stick to using a vendor-supplied version of the BLAS library.
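To illustrate the caching point above: even without SSE, simply reordering the three loops so the innermost loop walks a row of B contiguously already helps, because the naive i-j-k order strides down B's columns and thrashes the cache. This is my own sketch, not code from this thread, and it still won't approach an optimized BLAS:

```cpp
#include <cstddef>
#include <vector>

using Matrix = std::vector<std::vector<double>>;

// Naive i-j-k triple loop: the inner loop reads B[k][j] with k varying,
// i.e. it jumps a full row-length between accesses (cache-unfriendly).
Matrix multiplyNaive(const Matrix& A, const Matrix& B) {
    std::size_t m = A.size(), n = B.size(), p = B[0].size();
    Matrix C(m, std::vector<double>(p, 0.0));
    for (std::size_t i = 0; i < m; ++i)
        for (std::size_t j = 0; j < p; ++j)
            for (std::size_t k = 0; k < n; ++k)
                C[i][j] += A[i][k] * B[k][j];
    return C;
}

// i-k-j ordering: the inner loop walks one row of B sequentially, so the
// accesses are contiguous and the hardware prefetcher can keep up.
Matrix multiplyIKJ(const Matrix& A, const Matrix& B) {
    std::size_t m = A.size(), n = B.size(), p = B[0].size();
    Matrix C(m, std::vector<double>(p, 0.0));
    for (std::size_t i = 0; i < m; ++i)
        for (std::size_t k = 0; k < n; ++k) {
            double a = A[i][k];  // hoist A[i][k] out of the inner loop
            for (std::size_t j = 0; j < p; ++j)
                C[i][j] += a * B[k][j];
        }
    return C;
}
```

Both functions compute the same product; on large matrices the i-k-j version is typically severalfold faster, which gives a feel for how much further blocking, SIMD, and the other tricks in a tuned BLAS go.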
 

How about using the OpenCL library? Yes, it's true you cannot beat Matlab with C#. I haven't tried such a library yet.
 

If you are thinking about coding matrix multiplication from scratch to be executed on a GPU, then indeed there is a better chance you could fare well this way - GPU architectures are typically simpler than today's CPU architectures when it comes to achieving a decent speedup for a parallel implementation of a given algorithm. Still, you'd have to learn a lot about GPU programming in order to approach this task, and on the other hand GPU vendors provide their own BLAS versions (if you re-read the thread I mentioned above, you'll find an NVIDIA guy was there to point you to CUBLAS, the BLAS implementation for CUDA), so again you probably won't be able to reach their level of performance.
 
Has anyone tried this library?

http://www.extremeoptimization.com/downloads.aspx

I added it to my references and the concepts seem quite good, but I'm experiencing some weirdness. I wonder if anyone has used this library. The thing is that in some static classes (e.g. Statistics namespace => Distributions namespace => BinomialDistribution class) I cannot invoke the members. I declared an object of the class just in case it is not static, and it is still empty. Not an urgent problem, but if anyone has experience with this library, I'd like to hear opinions. Thanks.
 