Architecture for a live data feed provider

Discussion in 'Data Sets and Feeds' started by asdwilmott, Oct 8, 2009.

  1. Can someone please give an example of a typical architecture employed by a trading application to display a live data feed to its clients?

    E.g., assuming the company behind the trading application is called X, is this architecture possible?
    1. X subscribes to Bloomberg via its data feed API. A Windows service or application process (process1) runs on one of X's servers. It maintains a socket connection to Bloomberg by calling the data API with a specific port, data address, and list of symbols.
    2. Each client machine runs a Visual C++ application (the front end of the trading application) on its desktop.
    3. Another process (process2) runs on one of X's servers and hosts a TCP/IP socket listener. It is connected to process1 over a named pipe to receive the streaming data.
    4. The VC++ GUI on each client machine connects to process2 over a socket connection, and the live data stream is supplied to the client (a rough sketch of what I picture process2 doing follows this list).
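
    To make the question concrete, here is a minimal C# sketch of what I picture process2 doing. The pipe name ("ticks_pipe"), the port (9000) and the line-per-tick text format are made up purely for illustration - this is not code from any vendor API, and the real client GUI would be the VC++ front end, not C#:

        using System;
        using System.Collections.Generic;
        using System.IO;
        using System.IO.Pipes;
        using System.Net;
        using System.Net.Sockets;
        using System.Text;
        using System.Threading;

        class FeedFanout
        {
            static readonly List<TcpClient> clients = new List<TcpClient>();
            static readonly object clientsLock = new object();

            static void Main()
            {
                // Accept client GUI connections on a background thread.
                var listener = new TcpListener(IPAddress.Any, 9000);
                listener.Start();
                new Thread(() =>
                {
                    while (true)
                    {
                        TcpClient c = listener.AcceptTcpClient();
                        lock (clientsLock) clients.Add(c);
                    }
                }) { IsBackground = true }.Start();

                // Read line-oriented quotes from process1 via a named pipe
                // and broadcast every line to each connected client.
                using (var pipe = new NamedPipeClientStream(".", "ticks_pipe", PipeDirection.In))
                using (var reader = new StreamReader(pipe))
                {
                    pipe.Connect();
                    string line;
                    while ((line = reader.ReadLine()) != null)
                    {
                        byte[] payload = Encoding.ASCII.GetBytes(line + "\n");
                        lock (clientsLock)
                        {
                            // Drop clients whose connection has died; real code
                            // would do the sends off this thread.
                            clients.RemoveAll(c =>
                            {
                                try { c.GetStream().Write(payload, 0, payload.Length); return false; }
                                catch (IOException) { c.Close(); return true; }
                            });
                        }
                    }
                }
            }
        }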

    If this is possible, I am confused about how the server process (process2) is able to serve socket connections for so many clients (e.g. 500) at the same time. Does that mean process2 creates 500 threads, one to service each client?

    Any tips or suggestions will be highly appreciated
     
  2. Bad architecture.

    ;) Use a thread pool and queue work items ;) You can easily scale to 20,000 to 30,000 connections that way. I'm not saying you have the CPU power for that, but on the other hand it should not be THAT complicated.

    There is NO need to have one thread per socket. Java made the same beginner mistake in its old network API, and it killed them on the server side. You waste tons of memory (stack space per thread), kill CPU caching, AND... introduce tons of useless context switches.

    * Get a thread pool. Windows / IIS is a good example of a similar architecture. IIRC they use 50 or 100 threads per CPU core.
    * Queue work items for the work to be done.
    * Decouple your architecture AS MUCH AS POSSIBLE, so things stay in queues as much as possible.
    * Use spinlocks for queue access - do not use classical mutex locking, as it introduces a significant performance hit compared to a spinlock ;)

    I do the same in C# now - I'm working on a data distribution system. This is the way I go: C#, ThreadPool, QueueUserWorkItem.
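
    Something like this skeleton - the Tick type, Subscribers() and SendToClient() are placeholders for illustration, not a real feed API:

        using System;
        using System.Collections.Generic;
        using System.Net.Sockets;
        using System.Threading;

        class Tick { public string Symbol; public double Price; }

        class Distributor
        {
            readonly Queue<Tick> pending = new Queue<Tick>();
            int drainScheduled;                 // 0 = no drain queued, 1 = queued

            // Called by the feed reader thread for every incoming update.
            public void Publish(Tick t)
            {
                lock (pending) pending.Enqueue(t);

                // Schedule at most one drain at a time on the thread pool
                // instead of dedicating one thread per client.
                if (Interlocked.CompareExchange(ref drainScheduled, 1, 0) == 0)
                    ThreadPool.QueueUserWorkItem(_ => Drain());
            }

            void Drain()
            {
                while (true)
                {
                    Tick t;
                    lock (pending)
                    {
                        if (pending.Count == 0) { drainScheduled = 0; return; }
                        t = pending.Dequeue();
                    }
                    // Fan the tick out to subscribers; the actual send is elided.
                    foreach (Socket s in Subscribers(t.Symbol))
                        SendToClient(s, t);
                }
            }

            // Placeholders only.
            IEnumerable<Socket> Subscribers(string symbol) { yield break; }
            void SendToClient(Socket s, Tick t) { /* write bytes, handle errors */ }
        }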
     
  3. You have been able to use 'select' in Java NIO for a rather long time.

    To the OP: read any *nix documentation on listen(), accept() and, most importantly, select() to understand how a single server process or thread may have many open TCP connections.

    I say go to *nix documentation, because that's where BSD sockets come from. It is the source of most modern TCP networking. Stevens is generally accepted as the best book, if you want chapter and verse.

    It's not really a matter of thread pools, though on a multi-core or multi-CPU machine they're obviously the way to go.
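
    Since the OP is on Windows, here is roughly what that loop looks like using the .NET wrapper Socket.Select, which mirrors the BSD listen()/accept()/select() calls directly. The port is arbitrary, and a real server would also need write buffering and per-client message framing:

        using System;
        using System.Collections.Generic;
        using System.Net;
        using System.Net.Sockets;

        class SelectServer
        {
            static void Main()
            {
                var listener = new Socket(AddressFamily.InterNetwork,
                                          SocketType.Stream, ProtocolType.Tcp);
                listener.Bind(new IPEndPoint(IPAddress.Any, 9000));
                listener.Listen(128);

                var clients = new List<Socket>();
                var buffer = new byte[4096];

                while (true)
                {
                    // One call tells us which sockets are readable: the listener
                    // (a new connection) or any client (incoming data / disconnect).
                    var readable = new List<Socket>(clients) { listener };
                    Socket.Select(readable, null, null, 1000000);   // 1 s timeout

                    foreach (Socket s in readable)
                    {
                        if (s == listener)
                        {
                            clients.Add(listener.Accept());               // new client
                        }
                        else
                        {
                            int n = s.Receive(buffer);
                            if (n == 0) { clients.Remove(s); s.Close(); } // disconnect
                            // else: handle n bytes of request data here
                        }
                    }
                }
            }
        }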
     
  4. Define "a rather long time" ;) I remember the Java days BEFORE "NIO", which stands for "New I/O" ;)
     
  5. Manni,

    Spinlocks will use up more CPU than a simple mutex, as they spin in a loop waiting for a resource... I don't understand how mutex locking can incur a performance hit compared to a spinlock. Am I missing something here?
     
  6. Yes. You're missing the context.

    Spinlocks are great for RARELY CONTESTED FAST LOCKS.

    Like:
    * Check whether something is in the queue (a count variable).
    * Take an item out (linked list, round-robin buffer).
    And:
    * Lock to insert.

    In these cases... the spinlock MAY, RARELY, burn 200 or so CPU cycles waiting for the lock to be released.

    Compare that to a mutex context switch, which is a LOT more CPU intensive, and you see the issue. If the mutex lock is contested, it has to do context switches, which cost MORE than the CPU wait cycles of the spinlock for extremely fast operations.

    Do NOT use a spinlock around anything that does real processing / takes more time ;) But for synchronizing access to a queued list they are great.
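
    To make it concrete, this is the sort of thing I mean: a hand-rolled spin lock guarding nothing but the enqueue/dequeue of a work queue, never the processing itself. The class name and spin count are illustrative, and whether it actually beats a plain lock() is something you would have to measure on your own box:

        using System.Collections.Generic;
        using System.Threading;

        class SpinGuardedQueue<T>
        {
            readonly Queue<T> items = new Queue<T>();
            int gate;   // 0 = free, 1 = held

            void Acquire()
            {
                // Spin; the critical sections below are a handful of instructions,
                // so a waiter should only burn a few hundred cycles at worst.
                while (Interlocked.CompareExchange(ref gate, 1, 0) != 0)
                    Thread.SpinWait(20);
            }

            void Release()
            {
                Interlocked.Exchange(ref gate, 0);   // full fence publishes the change
            }

            public void Enqueue(T item)
            {
                Acquire();
                try { items.Enqueue(item); }
                finally { Release(); }
            }

            public bool TryDequeue(out T item)
            {
                Acquire();
                try
                {
                    if (items.Count == 0) { item = default(T); return false; }
                    item = items.Dequeue();
                    return true;
                }
                finally { Release(); }
            }
        }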
     
  7. The argument goes that a mutex implies a context switch while a spinlock doesn't, and that under some high-load circumstances the spinlock may have an edge.

    As with all such arguments about performance, I'm skeptical without real performance data. Spinlocks may be best left to kernel developers who know what they are doing.
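
    If anyone wants numbers, a throwaway measurement along these lines would settle it better than arguing. It only times the uncontended acquire/release cost in a single thread (a real test would add contended threads), and note that the uncontended lock() fast path in .NET is itself basically a compare-and-swap, so don't expect dramatic differences:

        using System;
        using System.Diagnostics;
        using System.Threading;

        class LockCost
        {
            static readonly object mutex = new object();
            static int gate;
            static long counter;

            static void Main()
            {
                const int N = 10000000;

                // Classic mutex-style lock around a trivial critical section.
                var sw = Stopwatch.StartNew();
                for (int i = 0; i < N; i++)
                {
                    lock (mutex) { counter++; }
                }
                sw.Stop();
                Console.WriteLine("lock():   {0} ms", sw.ElapsedMilliseconds);

                // CAS-based spin acquire around the same critical section.
                sw = Stopwatch.StartNew();
                for (int i = 0; i < N; i++)
                {
                    while (Interlocked.CompareExchange(ref gate, 1, 0) != 0)
                        Thread.SpinWait(1);
                    counter++;
                    Interlocked.Exchange(ref gate, 0);
                }
                sw.Stop();
                Console.WriteLine("CAS spin: {0} ms", sw.ElapsedMilliseconds);
            }
        }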
     
  8. One of the cornerstones of the philosophy of *nix is to avoid premature optimization. It is all too easily forgotten.
     
  9. Manni,

    I can't believe that context switching will be an issue. If you are using a single-processor machine, then spinlocks are useless, as they will livelock the CPU; you need context switching for the threads to do their work. Spinlocks can only work efficiently on multi-processor machines, and even then you need to be very selective about how you use them.

    Spinlocks are rarely used outside of low-level kernel development, and in real-life scenarios it is almost never the case that spinlocks give you better performance than mutexes.

    Even if you have a multi-processor machine and you use spinlocks, you can't guarantee that the OS will distribute the threads so that the spinning thread is on a different CPU than the thread holding the lock.
     
  10. You miss one critical point here. I can basically guarantee that the spinlock will almost never actually have to spin ;)
     