
Thoughts on why current analytical software constrains Big Data solutions for the financial industry

Many organizations have realized significant competitive advantages with Big Data, but they could do even more with new software paradigms. The financial industry in particular is constrained because it depends on complex analytics, particularly matrix math, which most Big Data architectures cannot readily accommodate. I recently wrote a blog post on why this is the case, and I would welcome thoughts and feedback from other industry professionals.
 


Ok, I'll bite :)

Here are some examples of awesome things you should be able to do with a Big Data exploratory analytics database.
  1. Build the ARCA book for one day of all exchange-traded US equities (186 million quotes) in 80 seconds on a 32-instance commodity hardware cluster. Run it in about half the time on a cluster twice as large.
  2. Run a Principal Component Analysis on a 50M x 50M sparse matrix in minutes (see the sketch after this list).
  3. Select data sets (based on complex criteria) in constant time—irrespective of how big your dataset gets.
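For point 2, here is a minimal single-node sketch of the usual approach: truncated PCA via an iterative sparse SVD. This is my illustration, not the vendor's implementation; the library choice (SciPy), the matrix size, and k are all assumptions, and a 50M x 50M problem would of course need a distributed solver.

```python
# Minimal single-node sketch: PCA on a sparse matrix via truncated SVD.
# Illustrative assumptions: SciPy as the library, a 100k x 1k matrix, and
# k = 10 components; the 50M x 50M case would need a distributed solver.
import scipy.sparse as sp
from scipy.sparse.linalg import svds

# A random sparse matrix standing in for, e.g., a returns or exposure matrix.
X = sp.random(100_000, 1_000, density=1e-4, format="csr", random_state=0)

# Iterative Lanczos-style solver: computes only the top-k factors and never
# densifies X. (A full PCA would first center the columns; doing that
# implicitly via a LinearOperator is the usual trick, skipped here for brevity.)
k = 10
U, s, Vt = svds(X, k=k)

# svds returns singular values in ascending order, so flip to descending.
components = Vt[::-1]            # principal directions, shape (k, 1000)
scores = U[:, ::-1] * s[::-1]    # projections of rows onto the components
print(components.shape, scores.shape)
```

The key point is that the solver only ever touches X through sparse matrix-vector products, which is what makes the "in minutes" claim plausible for much larger matrices once those products are distributed.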
I won't say I don't believe this, but what time frame are we talking about?

Point 2 is to be taken with a spoon of salt, IMO. Disclaimer: I know no Big Data.
 
Your competition in this space is basically q/kdb+. They do many, many things right, including ETL (much less overhead than other systems I've used), complex math, and being quant-friendly. There is no middle layer between the DB and the language, and the basic type system and its relationship to DB structures is very well thought out.

There are some limitations: notably, it isn't designed or licensed for web-scale (multi-node) problems, and it is expensive.
 
Thanks for your replies. It’s a good discussion. Yike, you’re right that P4 scales by adding commodity hardware nodes to a multi-node cluster whereas Kx scales by upgrading to a server with more cores. Adding nodes provides more aggregate computing power. And as you point out, cost is a difference, too. We’re a less expensive, scalable solution. The bigger picture is that the financial industry, which is so dependent on complex analytics, stands to gain from new software approaches. That’s good news for everyone.
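As a back-of-envelope check on the scale-out claim (my addition, not from the thread): under Amdahl's law, doubling the node count halves wall time only when the serial fraction of the work is near zero. The serial fractions below are illustrative assumptions, not measurements.

```python
# Hedged sanity check on scale-out: with serial fraction f (Amdahl's law),
# going from n_base to n nodes scales only the parallel part of the work.
# The f values below are illustrative, not measured.
def wall_time(t_base: float, n_base: int, n: int, f: float) -> float:
    """Predicted wall time on n nodes, given t_base seconds on n_base nodes."""
    return t_base * (f + (1.0 - f) * n_base / n)

for f in (0.0, 0.02, 0.10):
    t64 = wall_time(80.0, 32, 64, f)
    print(f"serial fraction {f:.2f}: 80 s on 32 nodes -> {t64:.1f} s on 64")
# f = 0.0 reproduces the "about half the time" figure quoted earlier in
# the thread; any nonzero serial fraction erodes that speedup.
```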
 