I’ve been heads down working on SpreadServe recently, so haven’t paid so much attention to the etrading topics that I used to blog about so much. Thanks to an update from mdavey, I’ve been catching up on the excellent, thought provoking content that jgreco has been posting on his plans for a new US Treasury trading venue, organised as a limit order book, with buy and sell side trading on an equal footing. I enjoyed the post on internalization and adverse selection. His points about single dealer platforms are well founded too, though my own experience in rates trading is that it’s difficult to get client flow on to SDPs as by their very nature they can’t offer multi dealer RFQs, which are critical for real money clients that must prove best execution for regulatory reasons. Of course, if the inter dealer prices from BrokerTec, eSpeed and EuroMTS were public in the same way as equity prices from major exchanges are public, then more solutions to the best execution problem would be possible. As jgreco rightly points out, transparency is key.

Now I want to raise a few questions prompted by jgreco’s posts, both pure tech, and market microstructure…

  • Java? Really? I wonder if it’s inspired by LMAX’s Java exchange implementation, their custom collections and Disruptor. I would have expected C++, but then I’m an old school C++er.
  • Is that really the workflow ? That must be a tier 2 or 3 bank. All my experience has been at tier 1 orgs where all pricing and RFQ handling is automated. If a trader quotes a price by voice, it’s a price made by the bank’s own pricing engines. Those engines will be coded in C++, driven by Eurex futures or UST on the runs, and showing ticking prices on the trader desktop. Mind you, obfuscation techniques were used to frustrate step 2: copy and paste quote. After you’ve spent a fortune building a rates etrading infrastructure, you don’t want everyone backing out your curve from your Bloomberg pages.
  • Will DirectMatch have decimal pricing, or are you going to perpetuate that antiquated 1/32nd stuff?
  • How will you handle settlement/credit risk? Will each trade result in two, with counterparties facing off with a clearing house?
  • How do you shift liquidity? When liquidity is concentrated at a single venue, it’s difficult to move. The only case I know of is German Govt Futures moving from Liffe to Eurex. I guess UST liquidity is fragmented across D2D and D2C venues, so it’s not concentrated all in one place, improving DirectMatch’s chances of capturing some of the flow.

The old ones are the best ones, and I spent too many hours yesterday butting my head against a classic Windows bug. I was testing an optimisation in the SpreadServeEngine‘s handling of nested invocations of XLL supplied functions. I was using QuantLib‘s YieldCurveBootstrapping spreadsheet, and the QuantLib 1.4.0 XLL addin. The sheet invokes the qlPiecewiseYieldCurve function, and the fourth parameter, RateHelpers, is supplied by an invocation of ohPack. Normally in this scenario, the XLOPER returned by ohPack would be marshalled to the engine’s internal representation, before being marshalled back to an XLOPER for handing into qlPiecewiseYieldCurve. I added a shortcut that made the XLOPER available as well as the internal representation, so the marshalling process could proceed by shortcut if possible. Unwittingly I ended up newing an object in one DLL which was released by another DLL when qlPiecewiseYieldCurve returned and released it’s stack frame. My debug build threw a run time assert on _BLOCK_TYPE_IS_VALID. Cue a bug hunt with me toothcombing through the codebase for a double delete or a buffer overrun. Spreadsheet engines are complex beasts since they’re basically a runtime for a functional programming language, so must maintain an object graph and dispatch imperatively at the nodes to tokenised code and XLL supplied C/C++ functions. There’s a lot of memory pool and stack management to do in all that! So I was sure there was a subtle memory management bug somewhere that had been exposed by my optimisation for the nested XLL functions corner case. While I was looking for the bug I did get some good reading on Windows heap debugging done: this and this are recommended. When debugging it’s easy to keep going deeper and deeper in pursuit of a supposedly subtle bug. Better to take a breath and try and widen one’s focus. It was this article that made me think again about the fact that each DLL has it’s own C run time linked in, and if their memory management implementations differ, there will be problems. So it was in this case. Adding a static allocator function to the DLL responsible for freeing the object ensured that the new & delete were done by the same DLL, and the bootstrapping model started running through smoothly. Of course what I need to do now is revisit my build system’s link model and ensure all DLLs and EXEs are linking the same CRT implementation…

SpreadServe Beta

December 6, 2014

For the last few months I’ve been working on a new product: SpreadServe. SpreadServe’s mission is to take all those unwieldy spreadsheets full of XLL addins and VBA macros off trader desktops, and turn them into resilient, automated, scalable enterprise services. Excel spreadsheets are great, because they empower users to create their own solutions quickly. But on the other hand, they’re a liability, because they’re usually poorly tested and they have to be manually operated on the desktop. SpreadServe’s goal is to fix that by providing an alternate runtime for Excel spreadsheets. Retain business agility by continuing to design and build models in Excel. Auto-test, deploy, scale, log and automate those models in SpreadServe. Do contact us if you’re interested in joining the beta program.

Back in April I posted on my use of tornado for real time push up to a browser client using websockets. Then I was building a proof of concept. I’ve come a lot further since; I’m now building the beta of my product. I’m still using tornado on the server side, and I’ve adopted Twitter’s Bootstrap framework for the browser GUI. There’s a bit of a learning curve to become productive in Bootstrap & jQuery. But it’s been worthwhile, and I’ve been pleasantly surprised by how far browser GUI development has advanced in the last five years. Anyway, the purpose of this post is to flag an excellent resource for anyone digging deeper into tornado server dev: Oscar Vilaplana‘s Tornado In Depth Europython talk. There’s no sound on that youtube video for the first couple of minutes, and even after that the audio is a little scratchy. But bear with it. Oscar spends an hour walking you through tornado based code and the supporting internals, and makes it all beautifully simple. All the examples from his talk can be found here on bitbucket. Thanks Oscar!

Suppose you have a third party Win32 DLL, but don’t have access to the source code. The DLL exports a well known init function that makes callbacks into your code to register more functions. So you can call Win32’s LoadLibrary to load up the DLL, and you can use GetProcAddress to get hold of a pointer to the init func. You can invoke the init function from your code because you know the function prototype – the number and type of parameters. Then you get callbacks into a Register function you supply, which gives your code the names of other functions exported by the third party DLLs, as well as the number and type of parameters. Excel developers will recognise the description of the XLL function registration mechanism. So, given the names of those functions you can use GetProcAddress to get function pointers for them. But how do you invoke them? You don’t have function declarations available at compile time in your code. The functions don’t use varargs, so you can’t use va_start and va_end to figure out the params at run time.

The only way to resolve this dilemma is to kick down to assembler, and hand code some x86 that follows the Win32 calling conventions, which are well explained here and here. So here’s the code I wrote to invoke arbitray functions exported from a DLL. I used a couple of great resources to refresh my ASM chops, which have become very rusty after years of neglect: this primer and this x86 instruction set reference. It’s in inline assembler, together with the C++ preamble that sets up parameters to simplify the assembler.

bool cc_cdecl = true;                         // stdcall if false
int parmBytes = ( parmCount - 1) * 4;         // parmCount includes ret val, so subtract 1
int parmPop = ( cc_cdecl ? parmBytes : 0);    // number of bytes to pop off the stack after call
void* rvInt = 0;                              // for receiving int or ptr return value
double rvDbl = 0.0;                           // for a float return value from ST(0)
int paddr = ( int)parms;                      // parms is void** array of parameters. Cast to int
                                              // to prevent implicit ptr deref by asm
// Then asm code to do a cdecl or stdcall dispatch and call xf.
__asm {                      // push parms onto stack in reverse order
        push eax             // save eax
        mov eax, paddr       // point to start of parms
        add eax, parmBytes   // point to last parm
    pp: push [eax]           // stack a parm
        sub eax, 4           // point to next parm
        cmp eax, paddr       // have we hit the start yet?
        jg pp                // if eax > parms goto pp
        call xf              // invoke the function!
        add esp, parmPop     // pop parms if cdecl
        mov rvInt, eax       // int or ptr retvals are in eax
        fst rvDbl            // float ret vals are in st0
        pop eax              // restore eax

Invisible Engines

June 4, 2014

I first read Invisible Engines back in 2007. I still rate it highly now, and I’m pleased to see you can download the whole book as a PDF from the MIT site. It’s topic is the economic and technical aspects of software platforms. Anyone who’s followed the fortunes of IBM, Microsoft, Apple and Sun, and their respective software and hardware platforms should get a lot from this book. I had high expectations when I originally read it, and I wasn’t disappointed. Looking at the download again today, I can see it’s stood the test of time. The book goes well beyond operating systems as platforms and has excellent material on gaming consoles and the three way tension between mobile handsets, their OSes and network operators. It came out in 2006, so pre dates the rise of social software platforms. But the principles it elucidates on multi sided markets and APIs are obviously applicable to Facebook, Google and Twitter. And in the financial sector they apply equally to industry giants like Bloomberg who are a classic multi sided play. Bloomberg as a platform derives revenue from clients paying $1000/month for a terminal. The other sides of its market are dealers contributing quotes and liquidity to Bloomberg as an ECN, and software developers using Bloomberg APIs.

When you’re building a server product that will support C++ APIs you need to consider your ABI – the binary interface. Typically, C++ APIs are distributed as headers and libs. If the functions your API exports include parameters that use, for instance, std::string, you immediately have a problem, as you’re requiring client code to use the same STL implementation as you did to build the lib. That’s OK if client code has access to the source, and can rebuild. But commercial, proprietary products, tend not to distribute source. So how to avoid forcing dependencies on API client code? I went searching for some resources, and found two especially good ones I had to flag.

Here’s  Thiago Macieira on binary compatibility: an excellent presentation with guidelines for library authors. Here’s a summary of Thiago’s recommendations…


  1. Use pimpl idiom to hide object size
  2. Use plain old data types in function signatures
  3. Don’t hand out ptrs or refs to internals
  4. No inline funcs
  5. All classes need one non inline virtual; probably the dtor
  6. Avoid virtuals in template classes
  7. Do not reorder or remove public members, or change access levels

2 means no STL or Boost types in function parameters. I’ll address 6 by avoiding templates in my API.

This article by Agner Fog is a superb detailed survey of data sizes & alignments,  stack alignment, call conventions for register usage and parameter handling, name mangling schemes and object file formats inc COFF for all the major x86 & x64 OSes and C++ compilers. Strongly recommended.

I’ve been getting into windbg while working on my POC. I’ve long been a fan of Microsoft’s Visual Studio debugger. Even back in the late 90s, when I got a serious case of Open Source Religion after falling under the spell of Eric Raymond’s Cathedral and the Bazaar, and went through an anti Microsoft phase, I never stopped rating the debugger. Visual Studio debugger is great, but windbg is a whole ‘nother thing. It’s the debugger MS use themselves for debugging Windows. Yes, the interface is a little clunky compared to the VS debugger, but the power of the command set more than compensates. It’s got it’s own scripting system built in, so you can construct custom breakpoints: for instance, break the 5th time round the loop when this int is greater than 100. And it has an API too, so the debug engine can be driven from other languages like Python. Personally, I’m really enjoying discovering the power of windbg. While I do so I’m capturing tips and tricks here.

Python DAGs

May 5, 2014

Tales flags some interesting developments in the Python world. The demand for Python developers in finance does seem to be building. Both Man and Getco are big users, and as Tales points out, JP Morgan and BAML both use Python as the primary programming languages in their Athena and Quartz systems, both of which are inspired by Goldman’s SecDB/Slang. Tales wonders if Washington Square Tech will the fourth implementation of this paradigm; I believe it may be the fifth, as Morgan Stanley had an Athena like project called Pioneer during Jay Dweck’s ill fated tenure. Apparently that project is now defunct. The Athena paradigm is technically a very powerful solution for trading businesses that have run on ad hoc solutions using Excel for pricing and risk. Partly because they seek to replace Excel based pricing and risk, and partly because it’s compute efficient, all implementations of the Athena SecDB/Slang paradigm implement Directed Acyclic Graphs. I’m guessing that Washington Square Tech will think this could be very appealing for buy side firms that don’t have big in house tech stacks, together with incumbent tech teams defending them against replacements.

Directed Acyclic Graphs (DAGs) are a powerful implementation technique for minimising the load of compute intensive tasks like pricing and risk, and this is one good reason why Excel has been so successful in this area, as Ben Lerner of DataNitro explains persuasively here. So it should be no surprise to see that Man Groups own open source Python codebases include a DAG implementation, MDF. The MDF docs include a very good illustration of the power of the DAG approach.

tornado & websockets

April 26, 2014

The core of my POC prototype is a server engine, which doesn’t really make for a great demo. Generally, people grasp concepts quicker if they can see a tangible realization. So I needed a realistic way to show live ticking data getting cranked out by the server. A browser GUI seemed a natural candidate. And being a Pythonista I wanted to do the server coding in Python. Until recently getting live ticking data pushed up to a browser was a big deal, requiring sophisticated server products like the Caplin Liberator, and rich GUI toolkits like Caplin Trader. Fortunately, it’s now possible to hack some demoware in the form of a live, ticking webpage using some really simple jQuery & websockets in the browser, and tornado on the server side. JavaScript and browser GUIs are not my forte, so I won’t comment any further, except to note how much easier it seems than five years ago. On the server side, though, I do have more experience. Back in 2000 I was doing server side web dev in Python using Zope. Zope is a very powerful system, featuring a built in Object DB and an inheritance by instance rather than class mechanism called acquisition. Consequently it has a rather steep learning curve. In recent years Plone has had some traction as a CMS built on top of Zope. In 2001/2 I discovered Twisted Matrix, a general networking toolkit you can use to build any IP based networking functionality. Again there’s a steep learning curve, but it’s much lighter than Zope, and is now very mature. I will be using Twisted to build a general socket server capability for my core product: I’ve got C++ and Python APIs, but I’ll need a socket server for Java support. But what I needed for my demo purpose was real time server push to the browser. And tornado proved to be a good choice. Simple, lightweight, lots of worked examples and focused entirely on websockets. It didn’t take long to get ticking data into a webpage. Recommended!