• C++ Programming for Financial Engineering
    Highly recommended by thousands of MFE students. Covers essential C++ topics with applications to financial engineering. Learn more Join!
    Python for Finance with Intro to Data Science
    Gain practical understanding of Python to read, understand, and write professional Python code for your first day on the job. Learn more Join!
    An Intuition-Based Options Primer for FE
    Ideal for entry level positions interviews and graduate studies, specializing in options trading arbitrage and options valuation models. Learn more Join!

Databases and SQL- where to start?

roni

Cornell FE
Joined
3/19/09
Messages
608
Points
38
Hello,

I'm a complete novice with everything related to databases, so if some of the things I say sound stupid, I apologize.
All right, I think I would like to get a grip on databases areas that might be helpful in several positions in quant finance.
Which major topics would you recommend me to learn? I went over the database books on Amazon and there are many different books. Here are some of the books I found:

1) Apress_Beginning_Database_Design_From_Novice_to_Professional
Beginning- Databas Design Solutions, Wrox

2) Relational Database Design and Implementation, Third Edition

3) Learning SQL Alan Beaulieu
SQL Cookbook
Head First SQL: Your Brain on SQL -- A Learner's Guide by Lynn Beighley

4) Data Analysis Using SQL and Excel

I'm not trying to be an expert with SQL and databases. I would like to learn to use databases or to program with databases effectively. I think picking a book from category #3 is going to be sufficient, isn't it?

I saw Tsotne's thread but didn't want to hijack his thread :\

Thanks in advance.

Roni.
 
I use SQL on a daily basis and actually can make it do some fairly nifty tricks. (I recently implemented some needed functionality for our production software via script that our development staff considered "too hard" to build into the UI proper.) But I've never "learned" it in any formal sense--either with a course or a book--and I think a lot of its users would say the same. To me, that's a testament to how well it's conceived.

If you're feeling ambitious, MySQL is free and, if you can find a project for yourself that's suitable, you could conceivably download it along with their Workbench tool and get a sense of what it can do that way. If you're feeling even more ambitious, there is a connector available that allows you to work with a MySQL database directly from C++.

With that said, I would not advise trying this if you're a Mac user. At least in my experience, the MySQL offering for Mac is at best unreliable, and as far as I can tell in many ways just flat does not work as advertised.
 
If you just want to learn SQL syntax, in order to interact with existing databases, then "Learning SQL" book should serve you great. As far as choice of software to be used while learning concerned, I'd recommend SQLite (http://www.sqlite.org/) - all of Linux/Windows/Mac supported, database is in a plain single file (so no hassle to setup server etc.), simple and nice command line tool exists to execute SQL statements, programming APIs are available for number of programming languages, and all of the basic SQL syntax is supported.

If you're up to learning about database design, then things are more complicated. There exist number of books, starting from "Database Design for Mere Mortals" type of books (this one is actually often recommended as good started for learning about basics of database design), up to heavily theoretic classic from computer science curricula: " An Introduction to Database Systems" by C. J. Date. So it all depends on your interests and the level of proficiency you want to achieve here.
 
As far as choice of software to be used while learning concerned, I'd recommend SQLite (http://www.sqlite.org/) - all of Linux/Windows/Mac supported, database is in a plain single file (so no hassle to setup server etc.), simple and nice command line tool exists to execute SQL statements, programming APIs are available for number of programming languages, and all of the basic SQL syntax is supported..

Thanks for the recommendation. I will definitely check this out.
 
Thank you very much for the advices. I will start with "Learning SQL Alan Beaulieu" and will pick up from there.
I also will try to look for a complete project, as bob suggested.


Thanks very much.
 
There are tutorials available on the web.
Unfortunately there's no substitute for real project work, where you have to learn stuff, fast.
It's nothing like showing up on your first real job and was told that you need to get the data out of a 30GB size database where you need to learn what every table and column in the database contains.
Internet is a wonderful place. If you know where and what to look, you can get up to speed rather quickly and always can find resources and tricks on how to do something better and faster.
I was one of the folks who was handed over a mission critical database and asked to produce data to feed into our money making machines. That's the best way to learn, get things done.
Along the way, I learn of trick to optimize the workflow. Instead of going through several layers, I was able to query, format the data and dump it directly into our models in a way that saves everyone time and increase productivity.

SQL is a skill that is increasing important and many MFE students have never learned that skill. You are not going to be a DBA where you have to know the detailed structure of each database but at the very least, have some mastery at query. That will get you a job someday.
 
roni - If you want a gentle introduction and are not bothered about learning a little PHP you could download WAMP.

If you install this on your Windows machine (or XAMP for mac I believe) you'll get a small Apache based dev environment with MySQL server, where you can build databases and build some small PHP apps to interact with them.
 
The basics of SQL are not difficult to learn. My advice is to pick some sort of project that deal with data and start hacking at it. You can use any of the books mentioned already or pick a tutorial from the Internet. However, if you are into baseball, OReilly publishes a book called "Baseball Hacks" that talks about statistics in baseball. It also gives recipes on how to handle all that data with MySQL, Perl (I don't like this but I can live with it) and R.

(just my $0.02)
 
The basics of SQL are not difficult to learn. My advice is to pick some sort of project that deal with data and start hacking at it. You can use any of the books mentioned already or pick a tutorial from the Internet. However, if you are into baseball, OReilly publishes a book called "Baseball Hacks" that talks about statistics in baseball. It also gives recipes on how to handle all that data with MySQL, Perl (I don't like this but I can live with it) and R.

(just my $0.02)
alain, this book is fun!
 
Could anybody point a link where to download SQL (full) from? I couldn't download from the websites I tried. Anyone knows the correct link? Thanks
 
Books: Celko, I'm sure he has an intro book. His "SQL Puzzles" book is excellent at making you actually think about what it is you're writing. These are the only books I ever used, sparingly at that. http://www.amazon.com/Puzzles-Answe...=sr_1_1?ie=UTF8&s=books&qid=1301871334&sr=8-1

Software varies. SQL is a language, implemented with various minor caveats (for purposes of customer retention I suspect...). I wouldn't recommend sqlite, I haven't heard of it widely used, so you are very unlikely to run into it in the wild very often.

I would recommend one of the following to mess around with:

MySQL - the DB in the LAMP stack that seems to dominate all things start-up internet. http://www.mysql.com/downloads/

PostGRE - has "more features" than MySQL. I've found it to be more stable and less buggy. http://www.postgresql.org/

Microsoft SQL server - the most production ready... provide an "express" version to mess around with. Gets better performance than MySQL and PostGRE. http://www.microsoft.com/express/

Oracle - very widely used, considered the best by many, but I'm not sure if they provide a free version. http://www.oracle.com/technetwork/indexes/downloads/index.html#database

Like Ken said, SQL is something you learn by messing around with it. Google is the best reference, and aside from that, make a minor project for yourself or two and work on them. For example, download some time series from somewhere, import them into your DB and try to backtest some algos (which you would have to build).
 
This has been my experience. At my last job we ran MSSQL, MySQL and PostGRE databases. MSSQL made short work of the other two. Then again, it cost ten(s) of thousands of dollars a year.
 
All RDMS have a free/startup version including ORACLE. The performance of the app heavily depends on a qualification of a programmer.
A current fully loaded Enterprise MSSQL with all tools and a CPU license retails at around 25k while a similar version of ORACLE is at around 125K. This makes MSSQL a more attractive solution to small and medium corporations looking to go big for small amount of $.
The majority of prop desks and hedge funds are using .net technologies to build front end apps and reporting solutions. This makes MSSQL and its TSQL a good choice to learn for a financial software developer but they already know that.

Some stats:
http://www.mysql.com/why-mysql/marketshare/
 
I wouldn't recommend sqlite, I haven't heard of it widely used, so you are very unlikely to run into it in the wild very often.

Then check this: http://www.sqlite.org/famous.html. Also, Bloomberg is sponsoring SQLite development, so I guess they have some interest too... Now, SQLite certainly is not in the same league as other DMBS mentioned, it's much simpler and slower. But it has its place in some simple projects and, in my opinion, nothing could beat it in using it by novice for learning - do you really think someone should bother with learning first how to setup and connect with server etc., in order to start learning SQL?

Also, I doubt too that MS SQL has better performance than PostgreSQL, at least not in case if PostgreSQL run on Linux.
 
Back
Top