Recently I stumbled across a question on stackoverflow.com about how to
significantly improve Java performance.
The question was inspired by Martin Fowler's article The LMAX
Architecture, in which he describes the architecture of a trading
system built by the company LMAX. They built a Business Logic
Processor that can handle 6 million orders per second. What's so great
about it? Well, it is single-threaded. Yes, there is only one thread
that executes the whole business logic.
The main trick is to keep all the data
needed by the business logic thread in memory. This brought them to
a throughput of 10K TPS. They then tuned memory handling
(allocation/garbage collection) and removed unwanted work (mainly
synchronization overhead).
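To make the single-threaded idea concrete, here is a minimal Java sketch. It is not LMAX's code: the Order type, the account balances and all method names are made up for illustration, and a plain LinkedBlockingQueue stands in for the Disruptor structure that Fowler's article describes. The point is only that one thread owns all in-memory state, so the business logic itself needs no locks.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

class SingleThreadedProcessor {
    // Hypothetical order type, for illustration only.
    static final class Order {
        final String account;
        final long amount;
        Order(String account, long amount) {
            this.account = account;
            this.amount = amount;
        }
    }

    // Sentinel that tells the logic thread to stop (compared by identity).
    private static final Order STOP = new Order("", 0);

    // All business state lives in memory and is only touched by the thread
    // that calls run(), so the map itself needs no locking.
    private final Map<String, Long> balances = new HashMap<>();

    // The queue is the only structure shared between threads.
    private final BlockingQueue<Order> inbox = new LinkedBlockingQueue<>();

    void submit(Order order) { inbox.add(order); }

    void shutdown() { inbox.add(STOP); }

    // The whole business logic: take the next order, apply it, repeat.
    void run() {
        try {
            for (Order o = inbox.take(); o != STOP; o = inbox.take()) {
                balances.merge(o.account, o.amount, Long::sum);
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    // Read this only after run() has finished.
    Map<String, Long> balances() { return balances; }

    public static void main(String[] args) throws InterruptedException {
        SingleThreadedProcessor p = new SingleThreadedProcessor();
        Thread logic = new Thread(p::run, "business-logic");
        logic.start();
        p.submit(new Order("alice", 100));
        p.submit(new Order("bob", 40));
        p.submit(new Order("alice", 50));
        p.shutdown();
        logic.join(); // join() makes the logic thread's writes visible here
        System.out.println(p.balances());
    }
}
```

Any number of threads may call submit(), but only the one running run() ever reads or writes the balances, which is why a plain HashMap is enough in this sketch.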
It's amazing what performance you get when you
respect the memory
hierarchy of computers and avoid unnecessary operations. The
drawback of this solution is that it doesn't address the processing
time that a user actually notices. When the business logic needs some
external information, it stops processing the order and lets
another thread fetch the data, which is then fed back in as input for
the next calculation. Imagine that 90% of the time a user spends
waiting is taken up by bytes being sent over the wire. Their solution
won't help you much, as it can only affect the remaining 10%. You have
to cut back the time or amount of network transfer instead.
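The 90%/10% argument is an instance of what is usually called Amdahl's law, and a few lines of Java make the numbers visible. This is only a back-of-the-envelope sketch: the class and method names are made up, and the 90% network share is the imagined figure from the paragraph above, not a measurement.

```java
class LatencyBudget {
    // Overall speedup when a fraction `improvedFraction` of the total time
    // is made `factor` times faster and the rest stays unchanged.
    static double overallSpeedup(double improvedFraction, double factor) {
        double untouched = 1.0 - improvedFraction;
        return 1.0 / (untouched + improvedFraction / factor);
    }

    public static void main(String[] args) {
        // Business logic is 10% of the wait: make it 10x, then 1000x faster.
        System.out.println("logic 10x:   " + overallSpeedup(0.10, 10));
        System.out.println("logic 1000x: " + overallSpeedup(0.10, 1000));
        // Network is 90% of the wait: merely halve it.
        System.out.println("network 2x:  " + overallSpeedup(0.90, 2));
    }
}
```

Under these assumptions, no amount of tuning the business logic can get past a factor of 1/0.9, roughly 1.11, while halving the network time alone already gives roughly 1.8.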