Single threaded async

April 11, 2012

This is an excellent explanation of the single threaded asynchronous model of programming. It refers to Twisted, but many of the points apply generally.
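To make the model concrete, here is a minimal sketch of two tasks interleaving on one thread. It uses Python's standard asyncio rather than Twisted, so it is an illustration of the general idea and not code from the linked explanation; the names fetch and main are made up for the example.

```python
import asyncio


async def fetch(name, delay, log):
    """Pretend to wait on I/O; 'await' hands control back to the event loop."""
    log.append(name + " start")
    await asyncio.sleep(delay)  # no thread blocks here, the loop runs other tasks
    log.append(name + " done")
    return name


async def main():
    log = []
    # Both coroutines run on the same thread; the loop switches at each await.
    results = await asyncio.gather(
        fetch("slow", 0.2, log),
        fetch("fast", 0.05, log),
    )
    return results, log


results, log = asyncio.run(main())
print(results)
print(log)
```

Both tasks start before either finishes, and the faster one completes first, all without a second thread or any locking.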

Magic Ink II

March 22, 2012

I posted earlier on Bret Victor’s Magic Ink. Now I’ve finished reading this quite lengthy paper. In the second half Victor goes into more detail on how information applications – which include most financial apps – should be implemented. Info apps need to learn from the history of user behaviour: Victor uses the example of his own implementation in the BART scheduler app for this. He points out that decades of research have been done on machine learning, but we still don’t have neatly packaged abstractions around the results of that research that would make it usable for Joe Average app developer. This shouldn’t be the case, as we have nice usable abstractions for all sorts of other areas of comp sci research: file systems, sort algorithms, GUIs etc.

The other major pillar of info apps in the Victor scheme is context sensitivity. He makes a compelling case for achieving this via a dynamically bound component model that sounds similar to that advocated by Brad Cox 20 years ago. Plus ca change !

Finally, Victor discusses the kind of device that info apps should run on. It reads like he’s describing the iPad. Bear in mind that the iPad launched in 2010, and Victor wrote Magic Ink in 2006. So he was remarkably prescient.

Turing’s Cathedral

March 20, 2012

So I’m reading Turing’s Cathedral, and have greatly enjoyed the first couple of chapters. The Guardian’s review prompted me to get a copy. There is a mixture of history and technical theory in this book, as Spufford pointed out in his review. Dyson is drawing out the connections between the bomb project and early computing development too, which is fascinating. I’m sure I’ll be struck by parallels with the development of financial pricing models as I read. In chapter one Dyson describes how MANIAC’s memory was built out of 40 cathode ray tubes, each of which could store 1024 bits, giving 40K bits of addressable storage. He then comments: “Since a 10 bit order code, combined with 10 bits specifying a memory address, returned a string of 40 bits, the result was a chain reaction analogous to the two-for-one fission of neutrons within the core of an atomic bomb. All hell broke loose as a result. Random-access memory gave the world of machines access to the powers of numbers – and gave the world of numbers access to the powers of machines.”

Fundamentally, Dyson is describing a mechanism for indirection. As I read his description I was struck by the parallel with Gödel numbering, and how it allows mathematical statements to be turned into numbers, which can then be quantified over by further statements. That opens up the possibility of a self referential statement, which enables Gödel to prove the incompleteness theorem.
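A toy version of the numbering idea can be sketched in a few lines of Python. The symbol codes below are invented for the example and are not Gödel's own assignment; the point is only that a string of symbols becomes a single integer (a product of prime powers) from which the string can be recovered.

```python
def primes():
    """Yield 2, 3, 5, ... by trial division (fine for short formulas)."""
    found = []
    n = 2
    while True:
        if all(n % p for p in found):
            found.append(n)
            yield n
        n += 1


# Hypothetical symbol codes, chosen only for this illustration.
SYMBOLS = {"0": 1, "S": 2, "=": 3, "+": 4}
CODES = {code: symbol for symbol, code in SYMBOLS.items()}


def encode(formula):
    """Map the i-th symbol to (i-th prime) ** code and multiply the lot."""
    number = 1
    for p, symbol in zip(primes(), formula):
        number *= p ** SYMBOLS[symbol]
    return number


def decode(number):
    """Recover the formula by reading off the exponent of each prime in turn."""
    symbols = []
    for p in primes():
        if number == 1:
            break
        exponent = 0
        while number % p == 0:
            number //= p
            exponent += 1
        symbols.append(CODES[exponent])
    return "".join(symbols)


print(encode("0=0"))          # 2**1 * 3**3 * 5**1
print(decode(encode("S0+0=S0")))
```

Once a statement is just a number, other statements can talk about it, which is the same move as storing an instruction at an address that another instruction can fetch.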

Magic Ink

March 1, 2012

Thanks to reddit I’ve just discovered Bret Victor. I watched the Invention video, and enjoyed the whole theme on tightening the feedback loop between changing code and seeing results. The later part on moral crusading was interesting if not entirely convincing. So I checked out the web site, and am reading Magic Ink. Wow ! This is a full blown vision of doing software differently. Back in the 90s I got really excited by, in turn, Brad Cox’s vision, Patterns, and Open Source. About 10 years ago I discovered dynamically typed languages with Python and Smalltalk. And that’s the last time I had a real rush of excitement about some new approach in software. Sure, I’ve dabbled in functional languages like F#, and played with various OSS projects. But for the most part my attention has been on the trading topics that fascinate me, like electronic limit order books.

So what’s Magic Ink about ?  Victor divides software into three categories: information, manipulation and communication software. He focuses on information software, which is most apps really. And that includes most financial and trading apps. And then he proceeds to argue that there’s too much interactivity, and that interaction is bad. The way forward is context sensitivity combined with history and graphic design. Counterintuitive, and utterly convincing. A joy to read !

I can’t help wondering what the UX crew over at Caplin think of this ?  I haven’t seen them blogging on it. Victor’s views have radical implications for how etrading apps should work. I’d expect Sean Park to be pushing this angle with his portfolio companies too…

Early vs late binding

January 29, 2012

I agree with pretty much everything Jeff has to say on strongly typed, statically bound languages vs weakly typed late binding systems. But I don’t agree that we must decide in favour of the former. There is no one size fits all language, and I believe the right approach is to use a mix of early and late binding. Early binding when we want efficiency, compactness and speed. And late binding when we want expressiveness and rapid time to market. Electronic trading systems are a perfect example of a class of problem where we want all those contradictory virtues in the same system. A combination of languages is the only way to satisfy all those requirements. Brad Cox laid out this approach with great prescience in Planning the Software Industrial Revolution in 1990. What we need are development environments that allow us to mix languages at will. Debuggers that cope seamlessly when we examine a stack with different languages in different frames. Consistent threading models across runtimes. And consistent memory management models. Microsoft have come closest to achieving this with the .Net CLR. The Java VM can host multiple languages too, but not with the same quality tooling.

Personally I like the combination of C++ and Python. But there are tensions implicit in building systems with that mix. The first is the threading model. If you’re doing server engineering you must be GIL aware. A single C++ thread works well with Python. A limited number of C++ threads on dedicated tasks, with just one of them invoking Python works well too. Multiple C++ threads as pooled workers using locking to cooperate in executing identical logic and each invoking Python will not work well.
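The second of those arrangements can be sketched in pure Python as an analogy: several threads on dedicated tasks hand their work to a single thread that runs the scripted logic. In a real system the producers would be C++ threads and only the consumer would call into the interpreter; the function name run_pipeline and the use of upper() as the "scripted logic" are stand-ins for the example.

```python
import queue
import threading


def run_pipeline(messages, n_producers=3):
    """Several producer threads feed one consumer thread through a queue."""
    inbox = queue.Queue()
    handled = []

    def producer(chunk):
        # Stands in for a C++ thread on a dedicated task (e.g. a socket reader).
        for message in chunk:
            inbox.put(message)

    def consumer():
        # Only this thread runs the scripted logic, mirroring the one
        # thread that is allowed to invoke Python.
        while True:
            message = inbox.get()
            if message is None:  # sentinel: no more work
                break
            handled.append(message.upper())

    worker = threading.Thread(target=consumer)
    worker.start()

    chunks = [messages[i::n_producers] for i in range(n_producers)]
    producers = [threading.Thread(target=producer, args=(c,)) for c in chunks]
    for t in producers:
        t.start()
    for t in producers:
        t.join()

    inbox.put(None)
    worker.join()
    return handled


print(sorted(run_pipeline(["bid", "ask", "trade", "quote"])))
```

The queue is the only shared structure, so the threads never contend over the logic itself, which is what makes this shape sit comfortably with the GIL.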

Another tension is the different memory management models. The Python C runtime organises its own memory pool of PyObject* allocations. There are separate sub pools for different types, with different rules for the management of strings, integers and larger objects. Python’s memory pool tends to only grow, unlike a C++ program whose memory profile we can see fall when the C RT lib hands memory back to the OS.

So if we have multiple languages in the same run time one of the biggest challenges is making the right architectural decisions so that those languages cooperate despite drawing on the same OS provided resources in different ways.

mod_wsgi on Symbiosis

January 8, 2012

Bytemark’s Symbiosis Linux doesn’t have mod_wsgi installed in its Apache2 deployment. To make it work with the Python 2.7 I built from source I had to build mod_wsgi from source too. mod_wsgi’s Makefile needs Apache’s apxs2 tool, which isn’t in the standard Symbiosis. Fortunately that is in the apache2-threaded-dev package. Here are the steps I used to make mod_wsgi work…

# get apxs2 so we can install mod_wsgi from source. Needs to be built from source to
# get the Python.h from the Python we built from source.
sudo apt-get install apache2-threaded-dev
# for mod_wsgi: configure should find apxs2
# the soft link is needed so that libtool can find gcc
./configure
ln -s /usr/bin/gcc-4.3 /usr/bin/i486-linux-gnu-gcc
make
make install
/etc/init.d/apache2 restart
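Once Apache has restarted, a tiny WSGI script is a handy way to check that mod_wsgi is really serving Python. This is a minimal sketch, not part of the steps above: the response text is arbitrary, and the script still has to be wired into the Apache config, which is not shown here. By default mod_wsgi looks for a callable named application.

```python
def application(environ, start_response):
    """Smallest useful WSGI app: always answer 200 with a plain text body."""
    body = b"mod_wsgi is working\n"
    headers = [
        ("Content-Type", "text/plain"),
        ("Content-Length", str(len(body))),
    ]
    start_response("200 OK", headers)
    # WSGI expects an iterable of byte strings.
    return [body]
```

If the page comes back as plain text rather than a 500 or the script source, the module is loaded and linked against a working Python.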

 

Satchmo on Symbiosis

December 20, 2011

I’m going a little bit off topic here, as this post has no trading or finance content. However, it’s good to share, and hopefully this will help anyone out there struggling with a Django based web dev stack on a Bytemark Symbiosis host. I’m using Satchmo for an ecommerce site. But a lot of the detail will be applicable to any Django based web site. If you’re hosting plain old HTML, or PHP, then Symbiosis is fine and dandy. If you want to get Satchmo or Django running you’ve got a bit more work to do as Symbiosis is Bytemark’s own Debian Lenny distro with Python 2.5 and no GCC. The ecommerce site I’m deploying has been developed with Python 2.7.2, Django 1.3.1 and Satchmo 0.9.2. So I needed to build up the whole thing from scratch, starting with an apt-get for GCC and a Python 2.7.2 source build. Here’s the recipe I followed. Note that I’m omitting all the directory tree specific cds, gunzips and ‘tar xvf’s from the commands, but leaving in all the fiddly cmd line options that can take ages to figure out…

## GCC + Python 2.7
apt-get install gcc-4.3
wget www.python.org/ftp/python/2.7.2/Python-2.7.2.tgz
# We need to bld Python with zlib support compiled in, otherwise 
# Satchmo's easy_install setuptools will fail with a zlib ImportError.
# Check that /usr/include/zlib.h exists
apt-get install zlib1g-dev
# We need the mercurial hg client to pull stuff off bitbucket repositories
apt-get install mercurial
# Also get pip installer
wget http://python-distribute.org/distribute_setup.py
python distribute_setup.py
wget https://raw.github.com/pypa/pip/master/contrib/get-pip.py
python get-pip.py
# Now build Python
./configure -with-zlib=/usr/include
make
make install 
# Python 2.7.2 is now the default on this host
## sqlite 2.6.3 - not a std part of Linux source build
apt-get install libsqlite3-dev
wget pysqlite.googlecode.com/files/pysqlite-2.6.3.tar.gz
python setup.py install
## Django
wget https://www.djangoproject.com/download/1.3.1/tarball
python setup.py install
python
>>> import django
## Satchmo
wget https://bitbucket.org/chris1610/satchmo/get/v0.9.1.tar.gz
# gunzip, untar then follow Satchmo install notes. First, get easy_install. This will fail if
# you didn't get the zlib stuff right.
wget pypi.python.org/packages/2.7/s/setuptools/setuptools-0.6c11-py2.7.egg
sh setuptools-0.6c11-py2.7.egg
easy_install pycrypto
easy_install http://www.satchmoproject.com/snapshots/trml2pdf-1.2.tar.gz
easy_install django-registration
wget effbot.org/downloads/Imaging-1.1.7.tar.gz
python setup.py install
# ReportLab
wget www.reportlab.com/ftp/reportlab-2.5.tar.gz
python setup.py install
hg clone http://bitbucket.org/bkroeze/django-threaded-multihost/
python setup.py install
hg clone http://bitbucket.org/bkroeze/django-caching-app-plugins/
python setup.py install
pip install sorl-thumbnail==3.2.5
hg clone http://bitbucket.org/bkroeze/django-signals-ahoy/
python setup.py install
hg clone http://bitbucket.org/bkroeze/django-livesettings/
python setup.py install
hg clone http://bitbucket.org/bkroeze/django-keyedcache/
python setup.py install
hg clone http://bitbucket.org/chris1610/satchmo/
python setup.py install
python
>>> import django
>>> import satchmo_store
# Now clone the default Satchmo store
python scripts/clonesatchmo.py
python manage.py runserver
# Fix Symbiosis firewall to allow incoming on port 8000
# http://symbiosis.bytemark.co.uk/docs/ch-reference-firewall.html
cd /etc/symbiosis/firewall/incoming.d
touch 11-8000
firewall
# Finally, run satchmo. Note 0.0.0.0 to be available to any connection
python manage.py runserver 0.0.0.0:8000
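Since a missing zlib or sqlite only shows up several steps later, it can save time to check the freshly built interpreter before going further. Here is a small sketch of such a check; the helper name missing_modules is made up, and the module list is just the ones imported or mentioned in the recipe above.

```python
import importlib


def missing_modules(names):
    """Return the subset of module names that fail to import."""
    missing = []
    for name in names:
        try:
            importlib.import_module(name)
        except ImportError:
            missing.append(name)
    return missing


# Modules the recipe depends on: zlib must be compiled into Python,
# django and satchmo_store come from the source installs.
print(missing_modules(["zlib", "django", "satchmo_store"]))
```

An empty list means the build picked everything up; anything listed points at the step to revisit.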

Tales notes the trend towards Python in a recent post. A couple of points in response: firstly, slang at Goldman. I’m told by ex Goldmanites that slang is very close to Python anyway. Secondly, Python’s Global Interpreter Lock. The GIL always seems a disadvantage to developers accustomed to ‘traditional’ multi-threaded environments like C++, Java and C#. Those who criticise Python from that perspective are right to point out that the GIL prevents Python code from running in parallel on multiple threads within a single process, because it puts a lock around access to all memory owned by the Python runtime. That is a real constraint, but one that has benefits when dealing with mixed language environments. For example, if you’re implementing a pub sub API in C++ using Python’s C API, you’ll want a callback mechanism to handle incoming messages. Native pub sub APIs typically spawn another thread on which they dispatch new message callbacks. When that thread invokes the callback implementation, the callback only has to acquire the GIL (for example via PyGILState_Ensure) before invoking Python code; it needs no further locking or mutexes of its own, even though it’s invoking Python’s C API on a different thread than the main app thread. It’s not a concern, because the GIL serializes all access to Python’s memory. Which simplifies things hugely…
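The shape of that callback arrangement can be sketched in pure Python. FakeSubscriber is a hypothetical stand-in for a native pub sub API, not a real library; it dispatches messages on its own thread, the way a C++ API would. Because the thread here is created by the interpreter, GIL handling is automatic, whereas a thread created in C++ would have to bracket its calls with PyGILState_Ensure and PyGILState_Release.

```python
import threading


class FakeSubscriber:
    """Hypothetical stand-in for a native pub sub API with its own dispatch thread."""

    def __init__(self, callback):
        self._callback = callback

    def deliver(self, messages):
        # Callbacks fire on a separate thread, not the caller's thread.
        thread = threading.Thread(target=self._dispatch, args=(list(messages),))
        thread.start()
        return thread

    def _dispatch(self, messages):
        for message in messages:
            self._callback(message)


received = []


def on_message(message):
    # Plain Python logic with no explicit lock of its own.
    received.append(message)


def demo():
    del received[:]
    subscriber = FakeSubscriber(on_message)
    subscriber.deliver(["EURUSD 1.31", "GBPUSD 1.58"]).join()
    return list(received)


print(demo())
```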

C++ is back ?

September 14, 2011

Tales blogs that C++ is back, and attributes its renewed significance to C++11 and Objective-C. I’d say it never went away. Of course C++ isn’t the right language for DB backed web form apps. But it’s absolutely the right choice for operating systems, drivers, gaming engines, pricing algorithms, HFT systems. Anything where efficiency, control and working closely with the hardware matter. I don’t see that changing anytime soon, since platform and language vendors are targeting the vast middle ground of “enterprise apps”. For “enterprise”, read boring…

Fascinating post from Quantivity – I’m hoping for more on the same topic from him. Many of the advantages listed would be enjoyed by any small non real money fund: hedge, prop, family office etc. Of course there are some serious obstacles that small (relatively) unregulated funds face, and Lars Kroijer describes them in detail in Money Mavericks. And a lack of legacy technology is indeed an advantage in building trading systems quickly. A relatively recent pre existing framework, either from a vendor or built in house, can be a big advantage though. A classic example is gateways for exchange/ECN connectivity.