Hi all,
I was wondering if anyone can recommend, or has experience with, an open source database for storing and, especially, retrieving market depth and trade messages?
So far I cache messages in memory until they reach a certain size: I create an object for each message, push it onto a list, and then serialise that list using protobuf. The serialised lists are then zipped and stored on the filesystem using a naming convention for later retrieval. The whole operation actually works quite well in terms of efficiency and storage space.
For analysis, I then simply deserialise the list into memory and use C# LINQ for queries. This works reasonably fast. For instance, deserialising a list of about 2 million "message objects" takes about 4-6 seconds on a quad-core machine with 16GB of RAM. Add another 10 seconds or less for returning the result of a particular query.
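To make the store-and-load cycle above concrete, here is a minimal sketch. It is written in Python rather than C# to stay dependency-free, and it makes several assumptions that are not from the setup described: the `Message` fields, the `chunk_path` naming convention, and a fixed-width `struct` layout standing in for protobuf are all hypothetical, and a list comprehension plays the role of the LINQ query.

```python
import gzip
import struct
import tempfile
from pathlib import Path
from typing import NamedTuple


class Message(NamedTuple):
    """Hypothetical message layout; the real fields are not given in the post."""
    ts_ns: int
    symbol_id: int
    price: float
    size: int
    is_trade: bool


# Fixed-width little-endian encoding, used here as a stand-in for protobuf.
RECORD = struct.Struct("<qIdI?")


def chunk_path(root: Path, symbol_id: int, day: str, seq: int) -> Path:
    """Hypothetical naming convention: <symbol>_<day>_<sequence>.bin.gz"""
    return root / f"{symbol_id}_{day}_{seq:06d}.bin.gz"


def flush(messages: list[Message], path: Path) -> None:
    """Serialise the buffered list and store it compressed."""
    payload = b"".join(RECORD.pack(*m) for m in messages)
    with gzip.open(path, "wb") as fh:
        fh.write(payload)


def load(path: Path) -> list[Message]:
    """Decompress and deserialise the whole chunk back into memory."""
    with gzip.open(path, "rb") as fh:
        payload = fh.read()
    return [Message(*fields) for fields in RECORD.iter_unpack(payload)]


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        path = chunk_path(Path(tmp), 1, "20240102", 0)
        buffered = [
            Message(1000 + i, 1, 100.0 + i * 0.5, 10 * (i + 1), i % 2 == 0)
            for i in range(6)
        ]
        flush(buffered, path)
        restored = load(path)
    # In-memory query over the restored list (the LINQ step in the C# version).
    trades = [m for m in restored if m.is_trade and m.price >= 101.0]
    return buffered, restored, trades
```

Note that `load` has to decompress and decode the entire chunk before any query can run, which is the step the next paragraph wants to avoid.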
I'd like to get rid of the 'deserialising the list into memory' step and query the file directly, so I am looking at other approaches. So far I have looked at:
- Couple of OODBs,
- SQL Server and MySQL,
- KyotoCabinet
None of them were able to beat my benchmark, i.e. the rather naive implementation described above. I am currently looking at HDF5, but given my previous experiences I am not putting too much hope in it. It would be great if someone had any pointers. Note that commercial alternatives such as Kdb, OneTick, and the like are not an option at this stage.
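For reference, this is roughly what "querying the file directly" could look like without any database: a sketch only, assuming uncompressed fixed-width records sorted by timestamp. The record layout and the function names are hypothetical. The file is memory-mapped and a binary search locates the requested time range, so only the matching records are decoded.

```python
import mmap
import struct
import tempfile
from pathlib import Path

# Hypothetical fixed-width record: timestamp (ns), price, size.
RECORD = struct.Struct("<qdI")


def write_records(path: Path, rows) -> None:
    """Write rows (already sorted by timestamp) as fixed-width records."""
    with open(path, "wb") as fh:
        for row in rows:
            fh.write(RECORD.pack(*row))


def _ts_at(buf, index: int) -> int:
    """Read only the timestamp field of the record at the given index."""
    return struct.unpack_from("<q", buf, index * RECORD.size)[0]


def _lower_bound(buf, count: int, ts: int) -> int:
    """Index of the first record whose timestamp is >= ts."""
    lo, hi = 0, count
    while lo < hi:
        mid = (lo + hi) // 2
        if _ts_at(buf, mid) < ts:
            lo = mid + 1
        else:
            hi = mid
    return lo


def query_range(path: Path, start_ts: int, end_ts: int):
    """Return records with start_ts <= timestamp < end_ts."""
    with open(path, "rb") as fh, \
            mmap.mmap(fh.fileno(), 0, access=mmap.ACCESS_READ) as buf:
        count = len(buf) // RECORD.size
        first = _lower_bound(buf, count, start_ts)
        last = _lower_bound(buf, count, end_ts)
        return [RECORD.unpack_from(buf, i * RECORD.size)
                for i in range(first, last)]


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "ticks.bin"
        write_records(path, [(1000 + 10 * i, 100.0 + i, i + 1) for i in range(10)])
        return query_range(path, 1020, 1050)
```

The trade-off is that gzip-compressing the whole file, as in the current setup, rules out this kind of random access; compression would have to be applied per block, with an index of block offsets, to keep range lookups cheap.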