
SQL

Here's one example: data reconciliation. Let's say you work in middle office, and your IB recently purchased another IB. You now have to transfer all of their trades from their systems to yours. Of course, the trade details, cashflow information, curve data, etc. will differ between systems, so you need some method of checking that the details of the new trades in your system post-transfer match the details of the trades in the other IB's system pre-transfer. Basically, you need to ensure that nothing was lost in translation. The ability to write SQL queries to aggregate, compare, and reconcile data between the two systems would be incredibly useful in such a situation.
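To make that concrete, the core of such a check can often be a single query. Below is a minimal sketch, assuming two hypothetical tables, legacy_trades and migrated_trades, that share a trade identifier and a notional; any trade missing from one side, or with a mismatched notional, shows up as a break:

    -- Hypothetical table and column names, for illustration only.
    SELECT  COALESCE(l.trade_id, m.trade_id) AS trade_id,
            l.notional AS legacy_notional,
            m.notional AS migrated_notional
    FROM    legacy_trades l
    FULL OUTER JOIN migrated_trades m
            ON m.trade_id = l.trade_id
    WHERE   l.trade_id IS NULL
       OR   m.trade_id IS NULL
       OR   l.notional <> m.notional;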
 
I use SQL to build tables that store data for quantitative analysis, and sometimes for reporting purposes as well. SQL Server also comes with a decent reporting tool, SSRS (SQL Server Reporting Services).
 
SQL is used to access databases. If you're building a pricing system, you may use it to query market data or get instrument-specific information. I would imagine a financial engineer would spend less time building database queries than a typical programmer, but it doesn't hurt to understand basic SQL queries and how relational databases work.
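As a rough illustration (the table and column names below are made up, not from any particular system), a pricing engine might load instrument static data and the relevant curve points with something like:

    -- Illustrative only: hypothetical instrument and curve tables.
    SELECT  i.instrument_id, i.currency, i.day_count,
            c.tenor, c.zero_rate
    FROM    instruments i
    JOIN    curve_points c
            ON c.curve_id = i.discount_curve_id
    WHERE   i.instrument_id = 'SWAP_12345'   -- hypothetical instrument id
      AND   c.as_of_date = '2011-06-30';     -- hypothetical valuation date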

I'm not sure about that reconciliation scenario, though. I'm actually working on cashflow tie-outs in a merger going on right now, and it's being handled only by us IT guys, not financial engineers.

Relational databases are good for quants to understand, though, because database queries can be very expensive. Designing a model with inputs that let the system cache a lot of market data, rather than querying for it over and over again, can make the difference between microseconds and seconds.
 
Thanks all for the input.

I would like to do a class project rooted in a practical market application. If anyone has something they need done in SQL with VBA, and it's suitable in scope for the project, I'll do it gratis in exchange for a reference.

My industry references are solid. If you're interested, please drop me a PM.
 
Usefulness of SQL in Analysis

While it is possible to do simple statistical analyses in the SQL language itself, that is not its main strong point. For quants, though, there are a couple of uses for which some knowledge of SQL can be quite helpful.

1) Making a temporary copy of data.
Repeated queries against a large dataset can be very time consuming. Using SQL, you can copy from a server database onto your desktop or laptop only the subset of the data that is of interest at the time. The free Express edition of Microsoft's SQL Server is particularly useful in this regard: you can install it on your laptop and use it as a local repository for data subsets.
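For example, against a linked server (all names below are hypothetical), a single SELECT ... INTO statement pulls just the slice you need into a local table:

    -- Copy a small working set from the server down to the local instance.
    SELECT  ticker, trade_date, close_price, volume
    INTO    dbo.local_prices
    FROM    [ProdServer].MarketData.dbo.equity_prices   -- hypothetical linked-server table
    WHERE   ticker IN ('IBM', 'MSFT')
      AND   trade_date >= '2010-01-01';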

2) Exploring relational data with R.
The R statistical program has a library package, RODBC, for connecting to a relational database and using SQL to retrieve the specified data. The row data returned from the database is then organized into an R data frame for further analysis.

Dan Buskirk
 
The data-management side of the work I've done in Excel in support of some model often begs for a more efficient approach. This isn't my specialty, but I gather that the memory overhead of storing Excel row-and-column data (addresses, formatting, other cell properties) must be greater than simply storing a record's fields. Knowing how Excel can bog down, I'm guessing this becomes an issue when speed matters, especially with larger data sets.

Excel can be wired together like a relational database, but it's kind of like overloading a small truck when you ought to just step up and get the bigger one, for lack of a better analogy. And I write as a user, not a computer scientist.
 
Excel has many limitations for data analysis. It's more of a holdout for the non-technical user who needs all his analysis and feeds in Excel because that's what he knows, and the firm doesn't have front-end developers for the little day-to-day stuff. Excel does not let you nest nearly as many functions as you would like, and sometimes it can't recognize certain ones when they are combined. It's a great tool, until you start to get frustrated that it can't do what you want it to do without freezing your computer.

Putting datasets into SQL Server/Sybase databases is a quick and simple way to do more meaningful data analysis. The datatypes are defined, you can enforce data integrity, set keys and indexes, and normalize your data. How many times have you had to create a concatenated field containing 4-5 fields just to have a 'key' to do VLOOKUPs on? Doing SMF recon in Excel is a huge pain. I suppose you could try Excel's 'Query' add-in, but it's just another limited wrapper around what you can do with a real SQL relational database.
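As a small, purely illustrative sketch of that point, the natural composite key is declared once instead of being glued together in a helper column:

    -- Hypothetical positions table; the composite primary key replaces the
    -- concatenated lookup field you would otherwise build in Excel.
    CREATE TABLE dbo.positions (
        book        VARCHAR(20)   NOT NULL,
        instrument  VARCHAR(20)   NOT NULL,
        as_of_date  DATE          NOT NULL,
        quantity    DECIMAL(18,4) NOT NULL,
        price       DECIMAL(18,6) NULL,
        CONSTRAINT pk_positions PRIMARY KEY (book, instrument, as_of_date)
    );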

Obviously the aggregation functions are very useful, but by far my favourite construct is the CASE/WHEN/THEN/ELSE/END family. It can be used everywhere: inside aggregates, in sorting, in grouping, even on joins. It gives you the flexibility to define exactly how you want your data to come together. In Excel you are very limited; you can't write complex conditional statements within the out-of-the-box functions, so you end up heading to VBA to write macros and functions, which to me just complicates things when you can use a straight data store and query tool. I feel the same way about MS Access: it's better in terms of data management, but in the end you have to write eight queries in the SQL view for something I can do in one statement in real SQL. I like EXISTS/NOT EXISTS as well; they're key for doing reconciliation.
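Two quick sketches of what I mean, using the hypothetical positions table above and an equally hypothetical copy of it from another system:

    -- CASE inside aggregates: split each book's quantity into long and short buckets.
    SELECT  book,
            SUM(CASE WHEN quantity >= 0 THEN quantity ELSE 0 END) AS long_qty,
            SUM(CASE WHEN quantity <  0 THEN quantity ELSE 0 END) AS short_qty
    FROM    dbo.positions
    GROUP BY book;

    -- NOT EXISTS for reconciliation: positions present here but missing from the other system.
    SELECT  p.book, p.instrument, p.as_of_date
    FROM    dbo.positions p
    WHERE   NOT EXISTS (SELECT 1
                        FROM   dbo.positions_other_system o
                        WHERE  o.book       = p.book
                        AND    o.instrument = p.instrument
                        AND    o.as_of_date = p.as_of_date);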

Then, once the data is in a table, you can write stored procedures to perform your analysis, such as recursive routines to optimize or solve, or a regression analysis on historical data.
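Not a real production routine, but as a rough sketch of the idea in T-SQL (table and column names are invented), here is a stored procedure that fits a simple least-squares trend of price against a day counter using nothing but aggregates:

    -- Hypothetical daily price table with a day_no column (1, 2, 3, ...).
    CREATE PROCEDURE dbo.usp_price_trend
        @ticker VARCHAR(20)
    AS
    BEGIN
        DECLARE @n FLOAT, @sx FLOAT, @sy FLOAT, @sxx FLOAT, @sxy FLOAT, @slope FLOAT;

        SELECT  @n   = COUNT(*),
                @sx  = SUM(CAST(day_no AS FLOAT)),
                @sy  = SUM(close_price),
                @sxx = SUM(CAST(day_no AS FLOAT) * day_no),
                @sxy = SUM(CAST(day_no AS FLOAT) * close_price)
        FROM    dbo.daily_prices
        WHERE   ticker = @ticker;

        -- Standard least-squares formulas for slope and intercept.
        SET @slope = (@n * @sxy - @sx * @sy) / NULLIF(@n * @sxx - @sx * @sx, 0);

        SELECT  @slope AS slope,
                (@sy - @slope * @sx) / @n AS intercept;
    END;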

It's kind of a halfway point between business-level data apps and more sophisticated programming languages. You can do many things faster in C++, for instance, than in a stored procedure, but for someone with less IT knowledge the code is more readable and simpler to write, and space and memory get cheaper by the day.

When performance matters, though, you will find its limitations, but it's great for browsing and understanding data.
 
kx's kdb+ is an excellent resource if performance is a priority. It has most of the traditional SQL constructs along with a rich library of built-in functions. Besides, it's quite easy to set it up to interface with C, Java, Matlab, and the like as well.
 