Low latency trading system

Lun · 11/17/09

For algo trading jobs, I frequently see the requirement "low latency".

I just wonder how low latency can be achieved. I searched on Google but found nothing discussing it.

Can anyone experienced or professional share how it is achieved?

Is it by multithreading?
Is it by code-level improvement, say skipping unnecessary loops?
Or something else I haven't mentioned here?

Thank you!

Aditya Chitral · 11/17/09

I am not sure but things that come to my mind are:

How quickly you can establish a connection?

How many bits are you transferring to carry out a trade? This brings in things like encoding.

Multithreading would certainly improve the throughput, but if it's a serial process it's difficult to improve the turnaround time.

Correct me if I am wrong.

Lun · 11/17/09

From your point of view, the bottleneck is at the network. That means, to reduce the delay, the knowledge you need is network programming, such as TCP/IP options and the tradeoff between TCP and UDP.

I mean, "low latency" is equivalent to the topic of "network programming", right?

Andy Nguyen · 11/17/09

There is no simple or short answer for this. A high-frequency, low-latency trading system consists of multiple layers of technology, each of which plays an important role. Experts in developing such systems get paid a HUGE amount of money. Does anyone remember the one who stole GS code?

There are at least three types of latency involved: built-in server latency (the time needed by the server to process a transaction), network latency, and server 'refresh rate' (the rate at which the server publishes new information).

A very simple example of how to find out the latency is to use the "ping" command from a terminal to see how long it takes to reach the server and get back to your computer. Something like "ping -t yahoo.com" where you replace "yahoo.com" with the exchange server address.

Many off the shelf trading systems will allow you to send a request to an exchange server to get a bid/ask confirmation. Using this info will give you a good idea of your system latency.
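
As a rough illustration of timing a request/confirmation round trip from application code rather than with ping, here is a minimal Python sketch. It is only a sketch under stated assumptions: a loopback echo server stands in for the exchange, and the names `start_echo_server`, `measure_round_trips` and the payload bytes are made up for the example, so the numbers it produces say nothing about a real venue.

```python
import socket
import threading
import time


def start_echo_server():
    """Start a loopback echo server (a stand-in for an exchange gateway)."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    server.listen(1)
    address = server.getsockname()

    def serve():
        conn, _ = server.accept()
        with conn:
            while True:
                data = conn.recv(64)
                if not data:
                    break
                conn.sendall(data)  # echo back, playing the role of an "ack"
        server.close()

    threading.Thread(target=serve, daemon=True).start()
    return address


def measure_round_trips(address, count=100, payload=b"PING0001"):
    """Return round-trip times in microseconds for `count` request/ack pairs."""
    samples = []
    with socket.create_connection(address) as sock:
        # Disable Nagle's algorithm so small messages are sent immediately.
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
        for _ in range(count):
            start = time.perf_counter_ns()
            sock.sendall(payload)
            received = b""
            while len(received) < len(payload):
                chunk = sock.recv(len(payload) - len(received))
                if not chunk:
                    raise ConnectionError("server closed the connection")
                received += chunk
            samples.append((time.perf_counter_ns() - start) / 1000.0)
    return samples


if __name__ == "__main__":
    rtts = measure_round_trips(start_echo_server())
    print(f"min {min(rtts):.1f} us, max {max(rtts):.1f} us over {len(rtts)} trips")
```

Pointing `measure_round_trips` at a real test gateway would need the venue's own message format and session handling, which this sketch does not attempt.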

Then you would look at the packet routing information, analyze it, and optimize the route to reduce the time the data spends travelling. This is where network programming comes in. I think the GS programmer is the guy who worked on this kind of problem.

Then you would have to check whether your trading system timestamps orders correctly. This means using an atomic clock radio card in your system, or using timestamps from a super accurate time server like the National Nuclear Lab or such. You will have to find out how the timestamp is embedded at the exchange server and at which stage in the order process it is applied, then compare the order timestamp to the ack confirmation. If you deal with multiple exchanges, this gets complicated very fast.

Just some idea of what can be involved in a low latency system. It involves "system programming" (how your system processes orders), "network programming" (how data travels between you and the exchange), and more. It's no wonder people with experience in this field are a hot commodity.

iddqd · 11/17/09

They are starting to measure in microseconds recently.

Andy Nguyen · 11/17/09

Maybe one day, this technology can be used for online gaming. Just imagine how great it would be for FPS games.

Back in 2007, a sub-200 microsecond latency network made news.
Data News | Ultra Low-Latency Connection Facilitates Sub-200 Microsecond Network Transit Times | Automated Trader

IlyaKEightSix · 11/18/09

Sub 200?

Try 25 direct market, or 70 on network. That's the stats on the software alone of the trading engine my internship boss's company produces. Oh, and I believe they also get 35,000 messages per second in throughput (don't remember if it's per second or per 25 microseconds). After hardware optimizations, I believe we're talking sub-20 microseconds. And the performance is highly deterministic (no fat tails here).

UBS uses it as their brokerage system, along with 119 other clients. (The last 20 of which were acquired in the last three months.)

To put it into perspective, DESCo uses his previous system from his previous company (called Javelin Technologies), which is like ten years old. He's in talks with their CTO to give them an upgrade to the latest version.

And I essentially wrote test cases and translation algorithms for programs that are related to said engine.

elliot · 11/18/09

The software you mentioned is just a FIX engine, not a trading engine. FIX engines don't do any complex processing. They just normalize or route the message.

Eugene Krel · 11/18/09

IlyaKEightSix said:
Sub 200?

Try 25 direct market, or 70 on network. That's the stats on the software alone of the trading engine my internship boss's company produces. Oh, and I believe they also get 35,000 messages per second in throughput (don't remember if it's per second or per 25 microseconds). After hardware optimizations, I believe we're talking sub-20 microseconds. And the performance is highly deterministic (no fat tails here).

As elliot pointed out, your product is a FIX engine, which is an important but not really "algorithmic" part of electronic trading. Fixflyer is not a broker, but a vendor that supplies a FIX engine and apparently an order management system as well as some analysis software. If one were to take their claims at face value, then they are running a pretty good operation.

Whatever Fixflyer is doing, they are not running 35k per 25 microseconds. In fact, their website states it's per second and that they just recently broke the 200 microsecond mark (on average). However, when talking about latency it's very important to describe exactly what you are measuring. Their release states the tests were done on their servers, so geographical location was a major factor.
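
To illustrate why "on average" needs qualifying, here is a small Python sketch that summarizes a set of latency samples by mean, median and 99th percentile. The sample values are invented for the example (they are not Fixflyer's numbers), and `latency_summary` is a hypothetical helper using a simple nearest-rank percentile.

```python
import statistics


def latency_summary(samples_us):
    """Summarize latency samples (in microseconds): mean, median, p99, max."""
    ordered = sorted(samples_us)
    n = len(ordered)
    # Nearest-rank 99th percentile: the smallest sample such that at least
    # 99% of all samples are at or below it. (99*n + 99) // 100 == ceil(0.99*n).
    rank = max(1, (99 * n + 99) // 100)
    return {
        "mean": statistics.fmean(ordered),
        "median": statistics.median(ordered),
        "p99": ordered[rank - 1],
        "max": ordered[-1],
    }


if __name__ == "__main__":
    # Made-up data: 98 fast messages and 2 slow outliers.
    samples = [100.0] * 98 + [5000.0] * 2
    print(latency_summary(samples))
```

With this made-up data the mean stays under 200 microseconds while the 99th percentile is 5000, which is the kind of fat tail an average alone would hide.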

All that aside, hardware optimization is important, but there are other issues to take into account before spending hundreds of millions on marginal upgrades.

elliot · 11/18/09

Latency can be reduced by colocation: "High-Frequency Trading Shops Play the Colocation Game" by Advanced Trading

AnoopRN · 11/18/09

Other than enhancing your technology platforms to establish connectivity and assimilate data with trivial delays, you need a setup which 'reacts' to this data in real time, and reacts fast. Data vendors like Wombat are becoming increasingly popular among a lot of algorithmic trading desks. Some products like kdb+ provide an analytically rich platform for real-time tick processing. Some firms also come up with customized hardware (FPGAs, for instance) for implementing their trading logic.

satyag · 11/18/09

IlyaKEightSix said:
Sub 200?

Try 25 direct market, or 70 on network. That's the stats on the software alone of the trading engine my internship boss's company produces. Oh, and I believe they also get 35,000 messages per second in throughput (don't remember if it's per second or per 25 microseconds). After hardware optimizations, I believe we're talking sub-20 microseconds. And the performance is highly deterministic (no fat tails here).

Sounds like 35000 per second. 35000 per 25 usec = 1.4 ticks per CPU cycle on 1GHz core.
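
The arithmetic behind that sanity check can be written out as a short Python sketch. The function names are made up for the example; the inputs (35,000 messages, a 25 microsecond window, a 1 GHz core) are the ones from the posts above.

```python
def messages_per_cycle(messages, window_us, clock_hz):
    """Messages handled per CPU cycle if `messages` arrive every `window_us` microseconds."""
    messages_per_second = messages * 1_000_000 / window_us
    return messages_per_second / clock_hz


def cycles_per_message(messages_per_second, clock_hz):
    """CPU cycles available for each message at a given message rate."""
    return clock_hz / messages_per_second


if __name__ == "__main__":
    one_ghz = 1_000_000_000
    # 35,000 messages per 25 microseconds: more than one message per cycle.
    print(messages_per_cycle(35_000, 25, one_ghz))
    # 35,000 messages per second: tens of thousands of cycles per message.
    print(cycles_per_message(35_000, one_ghz))
```

The first figure comes out at roughly 1.4 messages per clock cycle, which is why "per second" is the only plausible reading; at 35,000 per second there are roughly 28,500 cycles available per message.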

Does anyone have details on what hardware optimizations are available out there?

jay.berg · 11/18/09

-- latency is the time from when data is created to when it is acted upon.
-- all data is latent, due to the speed of light limitation

-- in HFT, one tries to be as fast as possible, hence "Low Latency"
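
The speed of light limit can be put into numbers with a small Python sketch. Everything here other than the speed of light itself is an assumption for illustration: the New York to London distance is taken as roughly 5,570 km, the colocation cable run as 100 m, and signals in optical fiber are assumed to travel at about two thirds of the vacuum speed.

```python
SPEED_OF_LIGHT_M_PER_S = 299_792_458  # exact, by definition of the metre


def one_way_delay_ms(distance_km, velocity_factor=1.0):
    """Minimum one-way propagation delay in milliseconds over `distance_km`.

    velocity_factor is the signal speed as a fraction of the speed of light
    in vacuum (1.0 for the physical limit, about 0.67 assumed for fiber).
    """
    seconds = distance_km * 1000 / (SPEED_OF_LIGHT_M_PER_S * velocity_factor)
    return seconds * 1000


if __name__ == "__main__":
    ny_london_km = 5570     # assumed approximate great-circle distance
    colocated_km = 0.1      # assumed 100 m cable run inside a data centre
    print(f"NY-London, vacuum: {one_way_delay_ms(ny_london_km):.2f} ms")
    print(f"NY-London, fiber:  {one_way_delay_ms(ny_london_km, 2 / 3):.2f} ms")
    print(f"Colocated, fiber:  {one_way_delay_ms(colocated_km, 2 / 3) * 1000:.2f} us")
```

Under these assumptions the transatlantic hop costs tens of milliseconds each way, while a colocated cable run costs well under a microsecond, so no amount of code tuning recovers what distance takes away.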

Latency affects HFT models in many ways:

-- time it takes to receive market data from the exchange
-- time it takes to send an order to the exchange
-- time it takes to receive fills from the exchange
-- time it takes to process market data in order to make a trading decision

90% of the latency is in the transportation of data to and from the exchange.
The other 10% comes from the processing / coding.

a company like QuantHouse can be used to minimize the latency from the transportation.

see QuantHouse, low latency data and algo trading solutions!

satyag · 11/19/09

Nice article I found on LinkedIn: Latest News, Press Releases and Events - Corvil

Andy Nguyen · 11/19/09

Nice article but my eyes... A light gray on a white background is really good for the eyes, I'm told.

Wallstyouth · 11/20/09

Low latency

I'm currently consulting for a proprietary arbitrage desk at a major IB, and we're doing about 2-3ms end to end. Our business is purely latency arbitrage, and we spend a huge amount of money shaving microseconds off various components in our stack.

To give you an idea on the type of optimizations we're doing:

Our applications use IPC shared memory for the absolute lowest latency between our trading processes.
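
For readers who have not used shared memory for IPC, here is a minimal Python sketch of the idea using the standard `multiprocessing.shared_memory` module. It is not the desk's implementation: the tick layout (`TICK`) and the `demo` function are invented for the example, and both handles live in one process to keep it short, whereas a real setup would attach to the same block by name from separate processes and would need some synchronization (for example a sequence counter) that is omitted here.

```python
import struct
from multiprocessing import shared_memory

# Hypothetical fixed layout for one tick: int64 sequence number, float64 price.
TICK = struct.Struct("<qd")


def demo():
    """Write a tick through one handle and read it back through another."""
    # The "publisher" creates the block; the OS gives it a unique name.
    owner = shared_memory.SharedMemory(create=True, size=TICK.size)
    try:
        # A "subscriber" attaches to the same memory by name. In practice this
        # would happen in a different process that was told the name.
        reader = shared_memory.SharedMemory(name=owner.name)
        try:
            TICK.pack_into(owner.buf, 0, 1, 101.25)   # write: no copy via a socket
            return TICK.unpack_from(reader.buf, 0)    # read the same bytes
        finally:
            reader.close()
    finally:
        owner.close()
        owner.unlink()  # free the block once nobody needs it


if __name__ == "__main__":
    print(demo())
```

The appeal is that reader and writer touch the same physical memory, so passing a tick involves no system call or copy through the kernel's networking stack.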

We get a 10 Gigabit handoff from our co-located market data provider going into a hardware-based ticker plant, which our servers subscribe to for market data via InfiniBand, bypassing the TCP stack using RDMA (remote direct memory access).

We cross-connect many exchange feeds directly to our infrastructure. We trade all major asset classes and are co-located in the same physical buildings as the major exchanges and venues like NYSE, TSX, BATS, CME, NYSE Arca, FXall, etc.

We use Linux Realtime and are currently looking at writing custom processes running on CUDA GPUs for our models.

We do some nifty tricks to limit resource consumption on our servers like CPU shielding/processor affinity, IRQ balancing, splitting order entry NICs and market data NICs, etc.
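
Processor affinity, one of the tricks listed above, can be sketched from Python on platforms that expose it. This is an illustration only: `pin_to_cpu` is a made-up helper, `os.sched_setaffinity` exists only on some platforms (notably Linux), and CPU shielding and IRQ steering are system-level configuration that this sketch does not cover.

```python
import os


def pin_to_cpu(cpu):
    """Pin the calling process to a single CPU and return the new affinity set.

    Returns None on platforms where os.sched_setaffinity is not available.
    """
    if not hasattr(os, "sched_setaffinity"):
        return None
    os.sched_setaffinity(0, {cpu})      # pid 0 means "the calling process"
    return os.sched_getaffinity(0)


if __name__ == "__main__":
    if hasattr(os, "sched_getaffinity"):
        allowed = os.sched_getaffinity(0)
        print("allowed CPUs before:", sorted(allowed))
        print("after pinning:", pin_to_cpu(min(allowed)))
    else:
        print("CPU affinity calls are not available on this platform")
```

Pinning a latency-critical process to a core that nothing else is scheduled on keeps its caches warm and avoids migrations between cores.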

The low latency game is very expensive but if you are willing to invest in the infrastructure and people you can still make a lot of money!!

amit patel · 11/20/09

Wallstyouth - Nice points

Which approach do you use? Non-blocking IO or multithreading?

Wallstyouth · 11/20/09

amit patel said:
Wallstyouth - Nice points

Which approach do you use? Non-blocking IO or multithreading?

It varies among the teams, but my take, after many design discussions in the past, is that the fastest sockets code uses non-blocking sockets and select to multiplex them.
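
A minimal Python sketch of that pattern, non-blocking sockets multiplexed with select, is below. It is an illustration under assumptions: local socket pairs stand in for two feed connections, and `drain_ready` and `demo` are names invented for the example.

```python
import select
import socket


def drain_ready(sockets, timeout=1.0):
    """Wait until at least one socket is readable, then read what is available.

    Returns a dict mapping each socket that delivered data to the bytes read.
    """
    readable, _, _ = select.select(sockets, [], [], timeout)
    received = {}
    for sock in readable:
        try:
            received[sock] = sock.recv(4096)
        except BlockingIOError:
            # Readiness can occasionally be spurious; a non-blocking recv
            # reports that instead of stalling the whole loop.
            continue
    return received


def demo():
    # Two socket pairs stand in for two feed connections.
    feed_a, peer_a = socket.socketpair()
    feed_b, peer_b = socket.socketpair()
    try:
        for sock in (feed_a, feed_b):
            sock.setblocking(False)
        peer_b.sendall(b"tick-from-b")   # only feed_b has data pending
        result = drain_ready([feed_a, feed_b])
        return {("a" if s is feed_a else "b"): data for s, data in result.items()}
    finally:
        for sock in (feed_a, peer_a, feed_b, peer_b):
            sock.close()


if __name__ == "__main__":
    print(demo())
```

One thread can service many connections this way, because it only touches the sockets that select reports as ready.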

You can put together something that will saturate a LAN connection without putting any strain on the CPU. The trouble is that an app written this way can't do much of anything else: it needs to be ready to shuffle bytes around at all times. Assuming that your app is actually supposed to do something more than that, threading is the optimal solution (and using non-blocking sockets will be faster than using blocking sockets).

Unfortunately, threading support in Unixes varies both in API and quality. So the normal Unix solution is to fork a subprocess to deal with each connection. The overhead for this is significant (and don't do this on Windows, where the overhead of process creation is enormous). It also means that unless each subprocess is completely independent, you'll need to use another form of IPC, say a pipe, or shared memory and semaphores, to communicate between the parent and child processes. Even though blocking sockets are somewhat slower than non-blocking, in many cases they are the right solution. After all, if your app is driven by the data it receives over a socket, there's not much sense in complicating the logic just so your app can wait on select instead of recv.

I know I probably didn't fully answer your question, but it really depends on the magnitude of the problem and what you're trying to solve.

Andy Nguyen · 11/15/10

A new technique exploits geography to maximize high-frequency trading opportunities between markets.

But two scientists have devised a new technique to exploit geography in high-frequency trading and minimize the delays inherent in data transmission, even at the speed of light. "At the highest level, rather than delivering orders for execution at a trading center like New York or London, the idea is to utilize a third location that is intermediate to two geographic cities and use it as a coordinator of trading between the first two," says Alex Wissner-Gross, a research affiliate at the Massachusetts Institute of Technology Media Laboratory.

Together with Cameron Freer, a junior researcher in the department of Mathematics at the University of Hawaii, Wissner-Gross examined pairs of 52 exchanges located around the world. "If you have some stocks traded on the New York Stock Exchange and the London Stock Exchange, but more heavily in New York, you will want to be located between the two cities, but closer to New York," Freer explains. "You want to be where ... you have the largest short-term discrepancy in prices," he says, adding, if pricing is much higher in London, for example, a trader will want to buy at the New York price and sell at the London price.

http://www.wallstreetandtech.com/articles/228200829
